Model Compression Hardens Deep Neural Networks: A New Perspective to Prevent Adversarial Attacks

Bibliographic Details
Published in:	IEEE Transactions on Neural Networks and Learning Systems 2023-01, Vol.34 (1), p.3-14
Main authors:	Liu, Qi; Wen, Wujie
Format:	Article
Language:	English
Subjects:	Adversarial defense; adversarial examples; Artificial neural networks; Classifiers; Compression; Computational modeling; deep neural network (DNN); Defense; Iterative methods; model compression; Neural networks; Optimization; Perturbation; Perturbation methods; Robustness; Security; Training

Description
Abstract:	Deep neural networks (DNNs) have demonstrated phenomenal success in many real-world applications. However, recent work shows that a DNN's decision can easily be misled by adversarial examples: inputs with imperceptible perturbations crafted by an adversary. This raises growing security concerns for DNN-based systems. Unfortunately, current defense techniques face two issues: 1) they usually cannot mitigate all types of attacks, because the diverse attacks that may occur in practice differ in nature, and 2) most of them incur considerable implementation cost, such as complete retraining. This creates an urgent need for a comprehensive defense framework with low deployment cost. In this work, we reveal that a "defensive decision boundary" and a "small gradient" are two critical conditions for reducing the effectiveness of adversarial examples with different properties. We propose to use "hash compression" to reconstruct a low-cost "defensive hash classifier" that forms the first line of our defense. We then propose a set of retraining-free "gradient inhibition" (GI) methods to strongly suppress and randomize the gradient used to craft adversarial examples. Finally, we develop a comprehensive defense framework by orchestrating the "defensive hash classifier" and "GI." We evaluate our defense in traditional white-box, strong adaptive white-box, and black-box settings. Extensive studies show that our solution can greatly decrease the attack success rate of various adversarial attacks on diverse datasets.
ISSN:	2162-237X; 2162-2388
DOI:	10.1109/TNNLS.2021.3089128