Analyzing Privacy Leakage in Machine Learning via Multiple Hypothesis Testing: A Lesson From Fano
Format: Article
Language: English
Abstract: Differential privacy (DP) is by far the most widely accepted framework for mitigating privacy risks in machine learning. However, exactly how small the privacy parameter $\epsilon$ needs to be to protect against certain privacy risks in practice is still not well understood. In this work, we study data reconstruction attacks for discrete data and analyze them under the framework of multiple hypothesis testing. We utilize different variants of the celebrated Fano's inequality to derive upper bounds on the inferential power of a data reconstruction adversary when the model is trained differentially privately. Importantly, we show that if the underlying private data takes values from a set of size $M$, then the target privacy parameter $\epsilon$ can be $O(\log M)$ before the adversary gains significant inferential power. Our analysis offers theoretical evidence for the empirical effectiveness of DP against data reconstruction attacks even at relatively large values of $\epsilon$.
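
As a rough illustration of the Fano-based argument summarized above (a heuristic sketch, not the paper's theorem: the uniform prior over $M$ candidate values and the assumption that an $\epsilon$-DP training algorithm reveals at most on the order of $\epsilon$ nats about the target record are simplifications introduced here), the classical Fano inequality already suggests the $O(\log M)$ scaling. If the adversary must recover a record $X$ drawn uniformly from a set of size $M$ after observing the trained model $Y$, then its reconstruction error $P_{\mathrm{err}}$ satisfies

$$ P_{\mathrm{err}} \;\ge\; 1 - \frac{I(X;Y) + \log 2}{\log M}. $$

Under the heuristic leakage bound $I(X;Y) \lesssim \epsilon$, this becomes

$$ P_{\mathrm{err}} \;\gtrsim\; 1 - \frac{\epsilon + \log 2}{\log M}, $$

which stays bounded away from zero until $\epsilon$ grows to roughly $\log M$, matching the regime highlighted in the abstract.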
DOI: 10.48550/arxiv.2210.13662