A Differentially Private Algorithm for Range Queries on Trajectories
We propose a novel algorithm to ensure $\epsilon$-differential privacy for answering range queries on trajectory data. In order to guarantee privacy, differential privacy mechanisms add noise to either data or query, thus introducing errors to queries made and potentially decreasing the utility of i...
Gespeichert in:
Hauptverfasser: | , , |
---|---|
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | We propose a novel algorithm to ensure $\epsilon$-differential privacy for
answering range queries on trajectory data. In order to guarantee privacy,
differential privacy mechanisms add noise to either data or query, thus
introducing errors to queries made and potentially decreasing the utility of
information. In contrast to the state-of-the-art, our method achieves
significantly lower error as it is the first data- and query-aware approach for
such queries. The key challenge for answering range queries on trajectory data
privately is to ensure an accurate count. Simply representing a trajectory as a
set instead of \emph{sequence} of points will generally lead to highly
inaccurate query answers as it ignores the sequential dependency of location
points in trajectories, i.e., will violate the consistency of trajectory data.
Furthermore, trajectories are generally unevenly distributed across a city and
adding noise uniformly will generally lead to a poor utility. To achieve
differential privacy, our algorithm adaptively adds noise to the input data
according to the given query set. It first privately partitions the data space
into uniform regions and computes the traffic density of each region. The
regions and their densities, in addition to the given query set, are then used
to estimate the distribution of trajectories over the queried space, which
ensures high accuracy for the given query set. We show the accuracy and
efficiency of our algorithm using extensive empirical evaluations on real and
synthetic data sets. |
---|---|
DOI: | 10.48550/arxiv.1907.08038 |