Detection defense against adversarial attacks with saliency map

It is well established that neural networks are vulnerable to adversarial examples, which are almost imperceptible on human vision and can cause the deep models misbehave. Such phenomenon may lead to severely inestimable consequences in the safety and security critical applications. Existing defense...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Veröffentlicht in:	International journal of intelligent systems 2022-12, Vol.37 (12), p.10193-10210
Hauptverfasser:	Ye, Dengpan, Chen, Chuanxi, Liu, Changrui, Wang, Hao, Jiang, Shunzhi
Format:	Artikel
Sprache:	eng
Schlagworte:	adversarial defense adversarial example deep neural network Intelligent systems machine learning Neural networks Salience saliency map
Online-Zugang:	Volltext
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	It is well established that neural networks are vulnerable to adversarial examples, which are almost imperceptible on human vision and can cause the deep models misbehave. Such phenomenon may lead to severely inestimable consequences in the safety and security critical applications. Existing defenses are trend to harden the robustness of models against adversarial attacks, for example, adversarial training technology. However, these are usually intractable to implement due to the high cost of retraining and the cumbersome operations of altering the model architecture or parameters. In this paper, we discuss the saliency map method from the view of enhancing model interpretability, it is similar to introducing the mechanism of the attention to the model, so as to comprehend the progress of object identification by the deep networks. We then propose a novel method combined with additional noises and utilize the inconsistency strategy to detect adversarial examples. Our experimental results of some representative adversarial attacks on common data sets including ImageNet and popular models show that our method can detect all the attacks with high detection success rate effectively. We compare it with the existing state‐of‐the‐art technique, and the experiments indicate that our method is more general.
ISSN:	0884-8173 1098-111X
DOI:	10.1002/int.22458