Compression-resistant backdoor attack against deep neural networks

In recent years, a number of backdoor attacks against deep neural networks (DNN) have been proposed. In this paper, we reveal that backdoor attacks are vulnerable to image compressions, as backdoor instances used to trigger backdoor attacks are usually compressed by image compression methods during...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Applied intelligence (Dordrecht, Netherlands) Netherlands), 2023-09, Vol.53 (17), p.20402-20417
Hauptverfasser: Xue, Mingfu, Wang, Xin, Sun, Shichang, Zhang, Yushu, Wang, Jian, Liu, Weiqiang
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:In recent years, a number of backdoor attacks against deep neural networks (DNN) have been proposed. In this paper, we reveal that backdoor attacks are vulnerable to image compressions, as backdoor instances used to trigger backdoor attacks are usually compressed by image compression methods during data transmission. When backdoor instances are compressed, the feature of backdoor trigger will be destroyed, which could result in significant performance degradation for backdoor attacks. As a countermeasure, we propose the first compression-resistant backdoor attack method based on feature consistency training. Specifically, both backdoor images and their compressed versions are used for training, and the feature difference between backdoor images and their compressed versions are minimized through feature consistency training. As a result, the DNN treats the feature of compressed images as the feature of backdoor images in feature space. After training, the backdoor attack will be robust to image compressions. Furthermore, we consider three different image compressions (i.e., JPEG, JPEG2000, WEBP) during the feature consistency training, so that the backdoor attack can be robust to multiple image compression algorithms. Experimental results demonstrate that when the backdoor instances are compressed, the attack success rate of common backdoor attack is 6.63% (JPEG), 6.20% (JPEG2000) and 3.97% (WEBP) respectively, while the attack success rate of the proposed compression-resistant backdoor attack is 98.77% (JPEG), 97.69% (JPEG2000), and 98.93% (WEBP) respectively. The compression-resistant attack is robust under various parameters settings. In addition, extensive experiments have demonstrated that even if only one image compression method is used in the feature consistency training process, the proposed compression-resistant backdoor attack has the generalization ability to resist multiple unseen image compression methods.
ISSN:0924-669X
1573-7497
DOI:10.1007/s10489-023-04575-8