Differential Aggregation against General Colluding Attackers

Local Differential Privacy (LDP) is now widely adopted in large-scale systems to collect and analyze sensitive data while preserving users' privacy. However, almost all LDP protocols rely on a semi-trust model where users are curious-but-honest, which rarely holds in real-world scenarios. Recen...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Hauptverfasser: Du, Rong, Ye, Qingqing, Fu, Yue, Hu, Haibo, Li, Jin, Fang, Chengfang, Shi, Jie
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Local Differential Privacy (LDP) is now widely adopted in large-scale systems to collect and analyze sensitive data while preserving users' privacy. However, almost all LDP protocols rely on a semi-trust model where users are curious-but-honest, which rarely holds in real-world scenarios. Recent works show poor estimation accuracy of many LDP protocols under malicious threat models. Although a few works have proposed some countermeasures to address these attacks, they all require prior knowledge of either the attacking pattern or the poison value distribution, which is impractical as they can be easily evaded by the attackers. In this paper, we adopt a general opportunistic-and-colluding threat model and propose a multi-group Differential Aggregation Protocol (DAP) to improve the accuracy of mean estimation under LDP. Different from all existing works that detect poison values on individual basis, DAP mitigates the overall impact of poison values on the estimated mean. It relies on a new probing mechanism EMF (i.e., Expectation-Maximization Filter) to estimate features of the attackers. In addition to EMF, DAP also consists of two EMF post-processing procedures (EMF* and CEMF*), and a group-wise mean aggregation scheme to optimize the final estimated mean to achieve the smallest variance. Extensive experimental results on both synthetic and real-world datasets demonstrate the superior performance of DAP over state-of-the-art solutions.
DOI:10.48550/arxiv.2302.09315