ISDAT: An image-semantic dual adversarial training framework for robust image classification

Bibliographic details
Published in: Pattern Recognition, 2025-02, Vol. 158, p. 110968, Article 110968
Main authors: Sui, Chenhong; Wang, Ao; Wang, Haipeng; Liu, Hao; Gong, Qingtao; Yao, Jing; Hong, Danfeng
Format: Article
Language: English
Description
Abstract: Adversarial training is known as one of the most effective heuristic defense methods. Unfortunately, most existing work focuses solely on image-space adversarial training and neglects the complementary semantic space. Semantic-space adversarial training can compensate for the limited diversity of adversarial examples produced in image space alone, and thereby helps improve model robustness. It is therefore sensible to learn from both adversarial images and adversarial features. To this end, this paper proposes an image-semantic dual adversarial training framework (ISDAT) that enhances the robustness of a classification model against multiple attacks. In the inner loop of ISDAT, adversarial images and adversarial features are crafted by perturbing the benign images through the image-space path and the semantic features through the semantic-space path, respectively. To determine which intermediate layer's semantic features, when attacked, contribute most to the model's anti-attack capability, we provide a theoretical analysis for guidance, which avoids invalid neuron-importance predictions and excessive computation. To ensure that adversarial images and adversarial features each contribute to model robustness, we advocate crafting them from different loss perspectives. Specifically, we develop a C2 loss for adversarial feature generation that involves semantic variance, aggressiveness, and high confidence. In the outer loop of ISDAT, to promote the model's comprehensive understanding of both adversarial images and adversarial features, we propose a joint image-semantic-guided defense method. Specifically, we develop an adversarial image-semantic perception loss (IS). Driven by this loss, we further establish an end-to-end image-semantic optimization process, which allows dual learning from both adversarial images and features.
Experimental results on the CIFAR-10, CIFAR-100, and SVHN datasets demonstrate the effectiveness of ISDAT in defending against multiple white-box and black-box attacks. The code will be available at https://github.com/flower6top.
• We propose an image-semantic dual adversarial training framework (ISDAT) against multi-attacks.
• We provide theoretical analysis about why and how to generate adversarial features.
• In the inner loop of ISDAT, we devise a C2 loss for adversarial feature generation.
• In the outer loop of ISDAT, we develop an adversarial image-semantic perceptio
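
The inner-loop/outer-loop structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not define the C2 or IS losses, so plain cross-entropy stands in for both, the attack is a single sign-gradient step (FGSM-style), and the model is a hypothetical two-layer linear classifier. All names (`dual_adversarial_examples`, `joint_loss`, `eps_img`, `eps_feat`, `lam`) are assumptions made for illustration.

```python
import math

def matvec(W, v):
    """Compute W v for a list-of-rows matrix W."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def matTvec(W, v):
    """Compute W^T v."""
    return [sum(W[i][j] * v[i] for i in range(len(W))) for j in range(len(W[0]))]

def softmax(z):
    m = max(z)
    e = [math.exp(a - m) for a in z]
    s = sum(e)
    return [a / s for a in e]

def cross_entropy(logits, label):
    return -math.log(softmax(logits)[label])

def sign(a):
    return (a > 0) - (a < 0)

def dual_adversarial_examples(W1, W2, x, label, eps_img, eps_feat):
    """Inner loop (sketch): perturb the image x and the feature h = W1 x.

    Assumption: cross-entropy replaces the paper's C2 loss, and one
    sign-gradient step replaces whatever attack the paper uses.
    """
    h = matvec(W1, x)                      # semantic features
    p = softmax(matvec(W2, h))             # class probabilities
    # Gradient of cross-entropy w.r.t. logits is (p - one_hot(label)).
    dlogits = [pi - (1.0 if i == label else 0.0) for i, pi in enumerate(p)]
    grad_h = matTvec(W2, dlogits)          # gradient w.r.t. features
    grad_x = matTvec(W1, grad_h)           # gradient w.r.t. image
    x_adv = [xi + eps_img * sign(g) for xi, g in zip(x, grad_x)]
    h_adv = [hi + eps_feat * sign(g) for hi, g in zip(h, grad_h)]
    return x_adv, h_adv

def joint_loss(W1, W2, x_adv, h_adv, label, lam=1.0):
    """Outer loop objective (sketch): image-path loss plus weighted feature-path loss.

    Assumption: a weighted sum of two cross-entropy terms stands in for the IS loss.
    """
    loss_img = cross_entropy(matvec(W2, matvec(W1, x_adv)), label)
    loss_feat = cross_entropy(matvec(W2, h_adv), label)
    return loss_img + lam * loss_feat

# Usage with hypothetical toy weights and a two-pixel "image".
W1 = [[1.0, -0.5], [0.3, 0.8]]
W2 = [[0.7, -0.2], [-0.4, 0.9]]
x, label = [0.5, -1.0], 0
x_adv, h_adv = dual_adversarial_examples(W1, W2, x, label, eps_img=0.1, eps_feat=0.1)
objective = joint_loss(W1, W2, x_adv, h_adv, label)
```

In a full training loop, the model weights would be updated to minimize `objective` over mini-batches, which is the role the abstract assigns to the outer loop. Because this toy model is linear, both perturbations are guaranteed not to decrease the loss; for deep networks that holds only approximately.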
ISSN: 0031-3203
DOI: 10.1016/j.patcog.2024.110968