Explanation-based data-free model extraction attacks
Deep learning (DL) has dramatically pushed the previous limits of various tasks, ranging from computer vision to natural language processing. Despite its success, the lack of model explanations thwarts the usage of these techniques in life-critical domains, e.g., medical diagnosis and self-driving s...
Gespeichert in:
Veröffentlicht in: | World wide web (Bussum) 2023-09, Vol.26 (5), p.3081-3092 |
---|---|
Hauptverfasser: | , , , |
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | Deep learning (DL) has dramatically pushed the previous limits of various tasks, ranging from computer vision to natural language processing. Despite its success, the lack of model explanations thwarts the usage of these techniques in life-critical domains, e.g., medical diagnosis and self-driving systems. To date, the core technology to solve the explainable issue is explainable artificial intelligence (XAI). XAI methods have been developed to produce human-understandable explanations by leveraging intermediate results of the DL models, e.g., gradients and model parameters. While the effectiveness of XAI methods has been demonstrated in benign environments, their privacy against model extraction attacks (i.e., attacks at the model confidentially) requires to be studied. To this end, this paper proposes DMEAE, a
d
ata-free
m
odel
e
xtraction
a
ttack using
e
xplanation-guided, to explore XAI privacy threats. Compared with previous works, DMEAE does not require collecting any data and utilizes model explanation loss. Specifically, DMEAE creates synthetic data using a generative model with model explanation loss items. Extensive evaluations verify the effectiveness and efficiency of the proposed attack strategy on SVHN and CIFAR-10 datasets. We hope that our research can provide insights for the development of practical tools to trade off the relationship between privacy and model explanations. |
---|---|
ISSN: | 1386-145X 1573-1413 |
DOI: | 10.1007/s11280-023-01150-6 |