Anonymization of moving objects databases by clustering and perturbation
Published in: | Information systems (Oxford) 2010-12, Vol.35 (8), p.884-910 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | Preserving individual privacy when publishing data is a problem that is receiving increasing attention. Thanks to its simplicity, the concept of k-anonymity, introduced by Samarati and Sweeney [1], has established itself as a fundamental principle for privacy-preserving data publishing. According to the k-anonymity principle, each release of data must be such that each individual is indistinguishable from at least k−1 other individuals. In this article we tackle the problem of anonymization of moving objects databases. We propose a novel concept of k-anonymity based on co-localization, which exploits the inherent uncertainty of the moving object's whereabouts. Due to sampling and imprecision of the positioning systems (e.g., GPS), the trajectory of a moving object is no longer a polyline in a three-dimensional space; instead, it is a cylindrical volume whose radius δ represents the possible location imprecision: we know that the trajectory of the moving object is within this cylinder, but we do not know exactly where. If another object moves within the same cylinder, the two are indistinguishable from each other. This leads to the definition of (k, δ)-anonymity for moving objects databases. We first characterize the (k, δ)-anonymity problem, then we recall NWA (Never Walk Alone), a method that we introduced in [2], based on clustering and spatial perturbation. Starting from a discussion of the limits of NWA, we develop a novel clustering method that, being based on EDR distance [3], has the important feature of being time-tolerant. As a consequence, it perturbs trajectories both in space and time. The novel method, named W4M (Wait for Me), is empirically shown to produce higher quality anonymization than NWA, at the price of higher computational requirements. Therefore, in order to make W4M scalable to large datasets, we introduce two variants based on a novel (and computationally cheaper) time-tolerant distance function, and on chunking. All the variants of W4M (software freely available at: www-kdd.isti.cnr.it/W4M/) are empirically evaluated in terms of data quality and efficiency, and thoroughly compared to their predecessor NWA (software freely available at: www-kdd.isti.cnr.it/NWA/). Data quality is assessed both by means of objective measures of information distortion, and by more usability-oriented measures, i.e., by comparing the results of (i) spatio-temporal range queries and (ii) frequent pattern mining, executed on the or |
ISSN: | 0306-4379 1873-6076 |
DOI: | 10.1016/j.is.2010.05.003 |
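
The abstract relies on two notions that a short example may make concrete: co-localization within radius δ, and the time-tolerant EDR distance. The sketch below is illustrative only and is not the authors' NWA or W4M code. It assumes trajectories are lists of (x, y) points, that co-localization means two trajectories sampled at the same timestamps stay within δ of each other at every sample, and it uses the commonly cited definition of EDR (two points match when both coordinates differ by at most a threshold, here called `eps`; every mismatch, insertion, or deletion costs 1). The names `colocalized`, `edr`, and `eps` are hypothetical.

```python
from math import hypot
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) position at one sampled timestamp


def colocalized(a: List[Point], b: List[Point], delta: float) -> bool:
    """Check whether two trajectories stay within delta of each other.

    Assumption: both trajectories are sampled at the same timestamps,
    so point i of `a` and point i of `b` refer to the same instant.
    """
    if len(a) != len(b):
        return False  # not defined over the same samples
    return all(hypot(ax - bx, ay - by) <= delta
               for (ax, ay), (bx, by) in zip(a, b))


def edr(a: List[Point], b: List[Point], eps: float) -> int:
    """Edit Distance on Real sequence between two trajectories.

    Two points match (cost 0) when both coordinates differ by at most eps;
    a mismatch, an insertion, or a deletion each cost 1.
    """
    n, m = len(a), len(b)
    prev = list(range(m + 1))  # distance from the empty prefix of a
    for i in range(1, n + 1):
        cur = [i] + [0] * m
        for j in range(1, m + 1):
            match = (abs(a[i - 1][0] - b[j - 1][0]) <= eps
                     and abs(a[i - 1][1] - b[j - 1][1]) <= eps)
            cur[j] = min(prev[j - 1] + (0 if match else 1),  # match or substitute
                         prev[j] + 1,                        # delete from a
                         cur[j - 1] + 1)                     # insert from b
        prev = cur
    return prev[m]


if __name__ == "__main__":
    walker = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
    companion = [(0.0, 0.5), (1.0, 0.5), (2.0, 0.5)]
    # Same route, but the object waits one extra sample at the start.
    delayed = [(0.0, 0.0), (0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]

    print(colocalized(walker, companion, delta=1.0))
    print(edr(walker, delayed, eps=0.1))
```

In this sketch the delayed trajectory cannot even be compared point by point, yet its EDR distance to the original is a single edit, which is one way to read the abstract's claim that an EDR-based clustering is time-tolerant.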