Subtle adversarial image manipulations influence both human and machine perception


Bibliographic Details
Published in: Nature Communications 2023-08, Vol. 14 (1), p. 4933, Article 4933
Main Authors: Veerabadran, Vijay, Goldman, Josh, Shankar, Shreya, Cheung, Brian, Papernot, Nicolas, Kurakin, Alexey, Goodfellow, Ian, Shlens, Jonathon, Sohl-Dickstein, Jascha, Mozer, Michael C., Elsayed, Gamaleldin F.
Format: Article
Language: English
Online Access: Full text
Description
Abstract: Although artificial neural networks (ANNs) were inspired by the brain, ANNs exhibit a brittleness not generally observed in human perception. One shortcoming of ANNs is their susceptibility to adversarial perturbations—subtle modulations of natural images that result in changes to classification decisions, such as confidently mislabelling an image of an elephant, initially classified correctly, as a clock. In contrast, a human observer might well dismiss the perturbations as an innocuous imaging artifact. This phenomenon may point to a fundamental difference between human and machine perception, but it drives one to ask whether human sensitivity to adversarial perturbations might be revealed with appropriate behavioral measures. Here, we find that adversarial perturbations that fool ANNs similarly bias human choice. We further show that the effect is more likely driven by higher-order statistics of natural images to which both humans and ANNs are sensitive, rather than by the detailed architecture of the ANN. Artificial neural networks (ANNs) are vulnerable to subtle adversarial perturbations that yield misclassification errors. Here, behavioral studies demonstrate that adversarial perturbations that fool ANNs similarly bias human choice.
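To make the notion of an adversarial perturbation concrete, the following is a minimal toy sketch of the fast gradient sign method (FGSM), one standard way to craft such perturbations. The two-class linear classifier, its weights, and the step size `eps` here are illustrative assumptions, not the setup used in the paper's experiments.

```python
import numpy as np

def fgsm_perturb(x, w, b, y, eps):
    """Perturb input x of a logistic classifier sigma(w.x + b) so as to
    increase the cross-entropy loss for the true label y (y in {0, 1})."""
    z = w @ x + b
    p = 1.0 / (1.0 + np.exp(-z))     # predicted probability of class 1
    grad_x = (p - y) * w             # gradient of the loss w.r.t. the input
    return x + eps * np.sign(grad_x) # small step along the gradient's sign

# Illustrative linear classifier: predicts class 1 when w.x + b > 0.
w = np.array([1.0, -2.0, 0.5])
b = 0.1
x = np.array([0.2, -0.1, 0.3])       # originally classified as class 1

x_adv = fgsm_perturb(x, w, b, y=1, eps=0.3)

print(w @ x + b > 0)      # True: clean input lands in class 1
print(w @ x_adv + b > 0)  # False: the bounded perturbation flips the decision
```

Each input coordinate moves by at most `eps`, which is why such perturbations can remain visually subtle while still changing the classifier's decision, the core phenomenon the study probes in human observers.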
ISSN: 2041-1723
DOI: 10.1038/s41467-023-40499-0