Compression-resistant backdoor attack against deep neural networks
In recent years, a number of backdoor attacks against deep neural networks (DNN) have been proposed. In this paper, we reveal that backdoor attacks are vulnerable to image compressions, as backdoor instances used to trigger backdoor attacks are usually compressed by image compression methods during...
Gespeichert in:
Veröffentlicht in: | Applied intelligence (Dordrecht, Netherlands) Netherlands), 2023-09, Vol.53 (17), p.20402-20417 |
---|---|
Hauptverfasser: | , , , , , |
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | In recent years, a number of backdoor attacks against deep neural networks (DNN) have been proposed. In this paper, we reveal that backdoor attacks are vulnerable to image compressions, as backdoor instances used to trigger backdoor attacks are usually compressed by image compression methods during data transmission. When backdoor instances are compressed, the feature of backdoor trigger will be destroyed, which could result in significant performance degradation for backdoor attacks. As a countermeasure, we propose the first compression-resistant backdoor attack method based on feature consistency training. Specifically, both backdoor images and their compressed versions are used for training, and the feature difference between backdoor images and their compressed versions are minimized through feature consistency training. As a result, the DNN treats the feature of compressed images as the feature of backdoor images in feature space. After training, the backdoor attack will be robust to image compressions. Furthermore, we consider three different image compressions (i.e., JPEG, JPEG2000, WEBP) during the feature consistency training, so that the backdoor attack can be robust to multiple image compression algorithms. Experimental results demonstrate that when the backdoor instances are compressed, the attack success rate of common backdoor attack is 6.63% (JPEG), 6.20% (JPEG2000) and 3.97% (WEBP) respectively, while the attack success rate of the proposed compression-resistant backdoor attack is 98.77% (JPEG), 97.69% (JPEG2000), and 98.93% (WEBP) respectively. The compression-resistant attack is robust under various parameters settings. In addition, extensive experiments have demonstrated that even if only one image compression method is used in the feature consistency training process, the proposed compression-resistant backdoor attack has the generalization ability to resist multiple unseen image compression methods. |
---|---|
ISSN: | 0924-669X 1573-7497 |
DOI: | 10.1007/s10489-023-04575-8 |