An Extension of Fano's Inequality for Characterizing Model Susceptibility to Membership Inference Attacks
Main authors: | , , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | Deep neural networks have been shown to be vulnerable to membership inference
attacks wherein the attacker aims to detect whether specific input data were
used to train the model. These attacks can potentially leak private or
proprietary data. We present a new extension of Fano's inequality and employ it
to theoretically establish that the probability of success for a membership
inference attack on a deep neural network can be bounded using the mutual
information between its inputs and its activations. This enables the use of
mutual information to measure the susceptibility of a DNN model to membership
inference attacks. In our empirical evaluation, we show that the correlation
between the mutual information and the susceptibility of the DNN model to
membership inference attacks is 0.966, 0.996, and 0.955 for CIFAR-10, SVHN and
GTSRB models, respectively. |
---|---|
DOI: | 10.48550/arxiv.2009.08097 |