Random transformations to improve mitigation of query-based black-box attacks
Published in: Expert Systems with Applications, 2025-03, Vol. 264, p. 125840, Article 125840
Main authors: , , , , ,
Format: Article
Language: English
Subjects:
Online access: Full text
Abstract: This paper proposes methods to surpass the best-known defences against query-based black-box attacks. These benchmark defences incorporate Gaussian noise into input data during inference to achieve state-of-the-art performance in protecting image classification models against the most advanced query-based black-box attacks. Even so, there is room for improvement; for example, the widely benchmarked Random Noise Defense (RND) method has demonstrated limited robustness, achieving only 53.5% and 18.1% robust accuracy with a ResNet-50 model on the CIFAR-10 and ImageNet datasets, respectively, against the Square Attack, which is commonly regarded as the state-of-the-art black-box attack. Therefore, in this work, we propose two alternatives to Gaussian noise addition at inference time: random crop-resize and random rotation of the input images. Although these transformations are generally used for data augmentation during training to improve model invariance and generalisation, their protective potential against query-based black-box attacks at inference time is unexplored. For the first time, we report that for such well-trained models either of the two transformations can also blunt powerful query-based black-box attacks when used at inference time on three popular datasets. The results show that the proposed randomised transformations outperform RND in terms of robust accuracy against a strong adversary that uses a high budget of 100,000 queries based on expectation over transformation (EOT) of 10, by 0.9% on the CIFAR-10 dataset, 9.4% on the ImageNet dataset and 1.6% on the Tiny ImageNet dataset. Crucially, in two even tougher attack settings, namely high-confidence adversarial examples and an EOT-50 adversary, these transformations are even more effective, as the margin of improvement over the benchmarks increases further.
Highlights:
- Proposed randomised transformations outperformed the best-known randomised defences against a state-of-the-art black-box adversarial attack.
- Randomised transformations are shown to be more effective at mitigating query-based black-box attacks than noise-based defences.
- The experiments are conducted on three popular computer vision datasets using adversarially trained models.
- The defences are tested under an exceptionally strong adversary with up to a 500,000-query budget.
- Proposed randomised transformations can also blunt high-confidence adversarial examples.
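The crop-resize defence described in the abstract can be illustrated with a minimal NumPy sketch: at inference time the input image is randomly cropped and then resized back to its original dimensions, so each repeated query sees a slightly different input, which disrupts the attacker's query-based gradient or score estimation. The function name, the `scale_min` parameter, and the nearest-neighbour resize are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def random_crop_resize(img, scale_min=0.8, rng=None):
    """Inference-time random crop-resize defence (illustrative sketch).

    Crops a random sub-window covering at least `scale_min` of each
    spatial dimension, then resizes it back to the original (H, W)
    with nearest-neighbour sampling.  `scale_min` is a hypothetical
    knob, not a value from the paper.
    """
    rng = rng or np.random.default_rng()
    h, w = img.shape[:2]
    scale = rng.uniform(scale_min, 1.0)          # fresh randomness per query
    ch, cw = max(1, int(h * scale)), max(1, int(w * scale))
    top = rng.integers(0, h - ch + 1)
    left = rng.integers(0, w - cw + 1)
    crop = img[top:top + ch, left:left + cw]
    # Nearest-neighbour resize back to (h, w) so the classifier's
    # expected input shape is unchanged.
    rows = (np.arange(h) * ch / h).astype(int)
    cols = (np.arange(w) * cw / w).astype(int)
    return crop[rows][:, cols]

# Each call transforms the same input differently, which is what
# degrades query-based black-box attacks.
img = np.arange(32 * 32 * 3, dtype=np.float32).reshape(32, 32, 3)
defended = random_crop_resize(img, rng=np.random.default_rng(0))
```

In a deployed pipeline this transform would be applied to every incoming query before the model's forward pass; random rotation, the paper's second proposed transformation, would slot into the same position.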
ISSN: 0957-4174
DOI: 10.1016/j.eswa.2024.125840