GAMIN: An Adversarial Approach to Black-Box Model Inversion
Main authors: , ,
Format: Article
Language: English
Subjects:
Online access: Order full text
Abstract: Recent works have demonstrated that machine learning models are vulnerable to model inversion attacks, which lead to the exposure of sensitive information contained in their training dataset. While some model inversion attacks have been developed in the black-box setting, in which the adversary does not have direct access to the structure of the model, few have so far been conducted against complex models such as deep neural networks. In this paper, we introduce GAMIN (for Generative Adversarial Model INversion), a new black-box model inversion attack framework achieving significant results even against deep models such as convolutional neural networks at a reasonable computing cost. GAMIN is based on the continuous training of a surrogate model for the target model under attack and a generator whose objective is to produce inputs resembling those used to train the target model. The attack was validated against various neural networks used as image classifiers. In particular, when attacking models trained on the MNIST dataset, GAMIN is able to extract recognizable digits for up to 60% of labels produced by the target. Attacks against skin classification models trained on the pilot parliament dataset also demonstrated the capacity to extract recognizable features from the targets.
DOI: 10.48550/arxiv.1909.11835
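
The abstract describes GAMIN's core loop: a surrogate network is continuously fitted to the black-box target's answers, while a generator learns to produce inputs that the surrogate, and hence ideally the target, assigns to a chosen label. The following is a minimal, hypothetical PyTorch sketch of such a two-network loop; the architectures, losses, hyperparameters, and the `query_target` oracle are illustrative assumptions, not the paper's exact configuration.

```python
# Hypothetical sketch of a GAMIN-style black-box inversion loop.
# Architectures and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn

LATENT_DIM, N_CLASSES, IMG = 100, 10, 28 * 28

generator = nn.Sequential(          # maps noise -> candidate input image
    nn.Linear(LATENT_DIM, 256), nn.ReLU(),
    nn.Linear(256, IMG), nn.Tanh(),
)
surrogate = nn.Sequential(          # local model trained to mimic the target
    nn.Linear(IMG, 256), nn.ReLU(),
    nn.Linear(256, N_CLASSES),
)

def query_target(x):
    """Black-box oracle: returns the target model's output for input x.
    Placeholder here; in a real attack this would be a prediction API."""
    return torch.softmax(torch.randn(x.size(0), N_CLASSES), dim=1)

opt_s = torch.optim.Adam(surrogate.parameters(), lr=1e-4)
opt_g = torch.optim.Adam(generator.parameters(), lr=1e-4)
ce = nn.CrossEntropyLoss()
target_label = 3                    # the class whose training data we invert

for step in range(5000):
    # 1) Surrogate step: fit the surrogate to the target's answers on
    #    generated queries (the only access the black-box setting allows).
    x = generator(torch.randn(64, LATENT_DIM)).detach()
    with torch.no_grad():
        y_target = query_target(x)
    loss_s = ce(surrogate(x), y_target.argmax(dim=1))
    opt_s.zero_grad(); loss_s.backward(); opt_s.step()

    # 2) Generator step: push generated inputs toward the chosen label,
    #    using the surrogate as a differentiable stand-in for the target.
    x = generator(torch.randn(64, LATENT_DIM))
    labels = torch.full((64,), target_label, dtype=torch.long)
    loss_g = ce(surrogate(x), labels)
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```

Note the black-box constraint in step 1: gradients never flow through the target itself, only through the locally trained surrogate, which is what allows the generator to be optimized without access to the target's internals.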