We examine Euclidean distance preserving data perturbation as a tool for
privacy-preserving data mining. Such perturbations allow many important data
mining algorithms (e.g., hierarchical clustering and k-means clustering) to be
applied, with only minor modification, to the perturbed data and to produce
exactly the same results as on the original data. However, the issue of how
well the original data is hidden needs careful study. We take a step in this
direction by assuming the role of an attacker armed with two types of prior
information regarding the original data.
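
To make the distance-preservation property concrete, the following is a minimal sketch, not the perturbation scheme studied here. It assumes, purely for illustration, a 2-D rotation followed by a translation as the distance-preserving map; the names `perturb`, `pairwise_distances`, the angle, the shift, and the sample points are all hypothetical. It checks that every pairwise Euclidean distance is unchanged, which is why distance-based algorithms such as hierarchical clustering can run on the perturbed data.

```python
import math
from itertools import combinations


def perturb(points, theta, shift):
    """Rotate each 2-D point by theta radians, then translate by shift.

    Rotation plus translation is one example of a Euclidean distance
    preserving map (an illustrative assumption, not the source's scheme).
    """
    c, s = math.cos(theta), math.sin(theta)
    dx, dy = shift
    return [(c * x - s * y + dx, s * x + c * y + dy) for x, y in points]


def pairwise_distances(points):
    """Return Euclidean distances for all unordered pairs, in a fixed order."""
    return [math.dist(p, q) for p, q in combinations(points, 2)]


# Hypothetical sample data.
original = [(0.0, 0.0), (3.0, 4.0), (-1.0, 2.5), (6.0, -2.0)]
perturbed = perturb(original, theta=1.1, shift=(10.0, -7.0))

d_orig = pairwise_distances(original)
d_pert = pairwise_distances(perturbed)

# The coordinates change, but pairwise distances agree up to rounding error.
distances_match = all(
    math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-9)
    for a, b in zip(d_orig, d_pert)
)
```

Because only the distances are preserved, not the coordinates, the open question raised above is how much an attacker with prior information could still recover about `original` from `perturbed`.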

To address the problem of unsupervised outlier detection in wireless sensor
networks, we develop an approach that (1) is flexible with respect to the
outlier definition, (2) computes the result in-network to reduce both bandwidth
and energy usage, (3) uses only single-hop communication, thus permitting very
simple node failure detection and message reliability assurance mechanisms
(e.g., carrier-sense), and (4) seamlessly accommodates dynamic updates to data.
We examine performance using simulation with real sensor data streams.
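
As a rough illustration of what flexibility with respect to the outlier definition could look like, the sketch below treats the outlier definition as a pluggable scoring function. This is a centralized toy example under stated assumptions, not the in-network algorithm described above: the names `nn_distance_score` and `top_outliers`, the nearest-neighbor-distance definition, and the sample readings are all hypothetical.

```python
import math


def nn_distance_score(point, others):
    """One possible outlier score: distance to the nearest other point."""
    return min(math.dist(point, q) for q in others)


def top_outliers(points, n, score=nn_distance_score):
    """Return the n points with the highest outlier score.

    The score function is a parameter, so a different outlier
    definition can be swapped in without changing this routine.
    """
    scored = []
    for i, p in enumerate(points):
        others = points[:i] + points[i + 1:]
        scored.append((score(p, others), p))
    # Sort by score only, highest first.
    scored.sort(key=lambda t: t[0], reverse=True)
    return [p for _, p in scored[:n]]


# Hypothetical two-attribute sensor readings.
readings = [(20.1, 0.50), (20.3, 0.52), (19.9, 0.49), (35.0, 0.10), (20.2, 0.51)]
outliers = top_outliers(readings, n=1)
```

Passing a different `score` function, for example the average distance to the k nearest neighbors, changes the outlier definition while leaving the selection logic untouched.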