Random transformations to improve mitigation of query-based black-box attacks
Published in: Expert Systems with Applications, 2025-03, Vol. 264, p. 125840, Article 125840
Main authors: , , , , ,
Format: Article
Language: English
Subjects:
Online access: Full text
Abstract: This paper proposes methods to surpass the best-known defences against query-based black-box attacks. These benchmark defences incorporate Gaussian noise into input data during inference to achieve state-of-the-art performance in protecting image classification models against the most advanced query-based black-box attacks. Even so, there is room for improvement; for example, the widely benchmarked Random Noise Defense (RND) method has demonstrated limited robustness, achieving only 53.5% and 18.1% robust accuracy with a ResNet-50 model on the CIFAR-10 and ImageNet datasets, respectively, against the Square Attack, which is commonly regarded as the state-of-the-art black-box attack. Therefore, in this work, we propose two alternatives to Gaussian noise addition at inference time: random crop-resize and random rotation of the input images. Although these transformations are generally used for data augmentation during training to improve model invariance and generalisation, their protective potential against query-based black-box attacks at inference time is unexplored. For the first time, we report that for such well-trained models either of the two transformations can also blunt powerful query-based black-box attacks when used at inference time on three popular datasets. The results show that the proposed randomised transformations outperform RND in terms of robust accuracy against a strong adversary that uses a high budget of 100,000 queries based on expectation over transformation (EOT) of 10, by 0.9% on the CIFAR-10 dataset, 9.4% on the ImageNet dataset and 1.6% on the Tiny ImageNet dataset. Crucially, in two even tougher attack settings, namely high-confidence adversarial examples and an EOT-50 adversary, these transformations are even more effective, as the margin of improvement over the benchmarks increases further.
Highlights:
- Proposed randomised transformations outperformed the best-known randomised defences against a state-of-the-art black-box adversarial attack.
- Randomised transformations are shown to be more effective at mitigating query-based black-box attacks than noise-based defences.
- The experiments are conducted on three popular computer vision datasets using adversarially trained models.
- The defences are tested under an exceptionally strong adversary with up to a 500,000-query budget.
- Proposed randomised transformations can also blunt high-confidence adversarial examples.
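The crop-resize defence described in the abstract can be illustrated with a minimal NumPy sketch: at inference time the input image is randomly cropped and then resized back to its original dimensions, so each repeated query sees a slightly different input, which disrupts the attacker's query-based gradient or score estimation. The function name, the `scale_min` parameter, and the nearest-neighbour resize are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def random_crop_resize(img, scale_min=0.8, rng=None):
    """Inference-time random crop-resize defence (illustrative sketch).

    Crops a random sub-window covering at least `scale_min` of each
    spatial dimension, then resizes it back to the original (H, W)
    with nearest-neighbour sampling.  `scale_min` is a hypothetical
    knob, not a value from the paper.
    """
    rng = rng or np.random.default_rng()
    h, w = img.shape[:2]
    scale = rng.uniform(scale_min, 1.0)          # fresh randomness per query
    ch, cw = max(1, int(h * scale)), max(1, int(w * scale))
    top = rng.integers(0, h - ch + 1)
    left = rng.integers(0, w - cw + 1)
    crop = img[top:top + ch, left:left + cw]
    # Nearest-neighbour resize back to (h, w) so the classifier's
    # expected input shape is unchanged.
    rows = (np.arange(h) * ch / h).astype(int)
    cols = (np.arange(w) * cw / w).astype(int)
    return crop[rows][:, cols]

# Each call transforms the same input differently, which is what
# degrades query-based black-box attacks.
img = np.arange(32 * 32 * 3, dtype=np.float32).reshape(32, 32, 3)
defended = random_crop_resize(img, rng=np.random.default_rng(0))
```

In a deployed pipeline this transform would be applied to every incoming query before the model's forward pass; random rotation, the paper's second proposed transformation, would slot into the same position.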
ISSN: 0957-4174
DOI: 10.1016/j.eswa.2024.125840