Differentially private counting of users’ spatial regions


Bibliographic details
Published in: Knowledge and Information Systems 2018, Vol. 54 (1), pp. 5-32
Main authors: Fanaeepour, Maryam; Rubinstein, Benjamin I. P.
Format: Article
Language: English
Online access: Full text
Description
Summary: Mining of spatial data is an enabling technology for mobile services, Internet-connected cars and the Internet of Things. But the very distinctiveness of spatial data that drives utility can cost user privacy. Past work has focused upon points and trajectories for differentially private release. In this work, we continue the tradition of privacy-preserving spatial analytics, focusing not on point or path data, but on planar spatial regions. Such data represent the area of a user’s most frequent visitation, such as “around home and nearby shops”. Specifically, we consider the differentially private release of data structures that support range queries for counting users’ spatial regions. Counting planar regions leads to unique challenges not faced in existing work. A user’s spatial region that straddles multiple data structure cells can lead to duplicate counting at query time. We provably avoid this pitfall by leveraging the Euler characteristic for the first time with differential privacy. To address the increased sensitivity of range queries to spatial region data, we calibrate privacy-preserving noise using bounded user region size and a constrained inference that uses robust least absolute deviations. Our novel constrained inference reduces noise and promotes covertness by (privately) imposing consistency. We provide a full end-to-end theoretical analysis of both differential privacy and high-probability utility for our approach using concentration bounds. A comprehensive experimental study on several real-world datasets establishes practical validity.
ISSN: 0219-1377, 0219-3116
DOI: 10.1007/s10115-017-1113-6
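
The duplicate-counting problem mentioned in the summary, and the way an Euler characteristic avoids it, can be illustrated with a small non-private sketch. Everything here is an assumption for illustration and not the authors' implementation: the grid has unit cells, each user region is an open axis-aligned rectangle `(x0, y0, x1, y1)` with positive area, and the function names are hypothetical. The sketch keeps separate counts for grid faces, interior edges and vertices, then answers a range query as faces minus edges plus vertices, so that each region overlapping the query contributes exactly 1. It omits the noise calibration and constrained inference described in the summary.

```python
from collections import Counter
from math import ceil, floor


def cells_hit(lo, hi):
    """Indices i of unit cells (i, i+1) overlapped by the open interval (lo, hi)."""
    return range(floor(lo), ceil(hi))


def points_hit(lo, hi):
    """Integer grid lines i lying strictly inside the open interval (lo, hi)."""
    return range(floor(lo) + 1, ceil(hi))


def build_euler_histograms(regions):
    """Count, per grid face, edge and vertex, how many regions touch it."""
    faces, v_edges, h_edges, verts = Counter(), Counter(), Counter(), Counter()
    for x0, y0, x1, y1 in regions:
        xc, yc = cells_hit(x0, x1), cells_hit(y0, y1)
        xp, yp = points_hit(x0, x1), points_hit(y0, y1)
        faces.update((i, j) for i in xc for j in yc)
        # Vertical edge (i, j): the segment x = i between y = j and y = j + 1.
        v_edges.update((i, j) for i in xp for j in yc)
        # Horizontal edge (i, j): the segment y = j between x = i and x = i + 1.
        h_edges.update((i, j) for i in xc for j in yp)
        verts.update((i, j) for i in xp for j in yp)
    return faces, v_edges, h_edges, verts


def count_regions(hists, a, b, c, d):
    """Number of regions overlapping the cell range [a, b) x [c, d).

    Only edges and vertices strictly inside the query range are used, so
    each overlapping region contributes faces - edges + vertices = 1.
    """
    faces, v_edges, h_edges, verts = hists
    f = sum(faces[i, j] for i in range(a, b) for j in range(c, d))
    ev = sum(v_edges[i, j] for i in range(a + 1, b) for j in range(c, d))
    eh = sum(h_edges[i, j] for i in range(a, b) for j in range(c + 1, d))
    v = sum(verts[i, j] for i in range(a + 1, b) for j in range(c + 1, d))
    return f - ev - eh + v


# Three made-up user regions on a small grid.
regions = [(0.5, 0.5, 2.5, 1.5), (3.2, 3.2, 3.8, 3.8), (1.1, 0.2, 1.9, 3.5)]
hists = build_euler_histograms(regions)

# Summing face counts alone over-counts regions that straddle several cells.
naive = sum(hists[0][i, j] for i in range(0, 3) for j in range(0, 2))
print(naive, count_regions(hists, 0, 3, 0, 2))
```

In this toy setting the face-only sum counts a straddling region once per overlapped cell, while the alternating sum counts it once. A private release would add noise to all three histograms before answering queries, which this sketch leaves out.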