High Intrinsic Dimensionality Facilitates Adversarial Attack: Theoretical Evidence

Bibliographic details
Published in:	IEEE Transactions on Information Forensics and Security, 2021-01, Vol. 16, p. 854-865
Main authors:	Amsaleg, Laurent; Bailey, James; Barbe, Amelie; Erfani, Sarah M.; Furon, Teddy; Houle, Michael E.; Radovanovic, Milos; Nguyen, Xuan Vinh
Format:	Article
Language:	English
Subjects:	Adversarial attack; Classification; Computer Science; Content-based retrieval; Feature extraction; intrinsic dimensionality; Learning systems; Machine learning; nearest neighbor; Neighborhoods; Neural networks; Object recognition; Perturbation; Perturbation methods; Queries; Ranking; Retrieval

Description
Abstract:	Machine learning systems are vulnerable to adversarial attack. By applying to the input object a small, carefully-designed perturbation, a classifier can be tricked into making an incorrect prediction. This phenomenon has drawn wide interest, with many attempts made to explain it. However, a complete understanding is yet to emerge. In this paper we adopt a slightly different perspective, still relevant to classification. We consider retrieval, where the output is a set of objects most similar to a user-supplied query object, corresponding to the set of k-nearest neighbors. We investigate the effect of adversarial perturbation on the ranking of objects with respect to a query. Through theoretical analysis, supported by experiments, we demonstrate that as the intrinsic dimensionality of the data domain rises, the amount of perturbation required to subvert neighborhood rankings diminishes, and the vulnerability to adversarial attack rises. We examine two modes of perturbation of the query: either 'closer' to the target point, or 'farther' from it. We also consider two perspectives: 'query-centric', examining the effect of perturbation on the query's own neighborhood ranking, and 'target-centric', considering the ranking of the query point in the target's neighborhood set. All four cases correspond to practical scenarios involving classification and retrieval.
ISSN:	1556-6013 1556-6021
DOI:	10.1109/TIFS.2020.3023274
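
Illustration (not part of the catalog record or the paper): the 'query-centric', 'closer' case described in the abstract can be sketched numerically. The sketch below is a toy setup under stated assumptions, not the authors' method or experiments: data are isotropic Gaussian samples, so the ambient dimension `dim` stands in for intrinsic dimensionality; the query moves in a straight line toward the target; and the perturbation is measured as a fraction of the original query-to-target distance. The function names (`rank_of`, `min_shift_to_reach_rank`, `average_relative_shift`) and all parameter defaults are hypothetical.

```python
import numpy as np


def rank_of(target_idx, query, data):
    """1-based rank of data[target_idx] among all rows of data, by distance to query."""
    dists = np.linalg.norm(data - query, axis=1)
    return int(np.sum(dists < dists[target_idx])) + 1


def min_shift_to_reach_rank(query, data, target_idx, k, steps=200):
    """Smallest grid fraction of the query-to-target segment that the query
    must travel so that the target ranks within the query's top k."""
    target = data[target_idx]
    for frac in np.linspace(0.0, 1.0, steps + 1):
        moved = query + frac * (target - query)
        if rank_of(target_idx, moved, data) <= k:
            return float(frac)
    return 1.0  # fallback; the loop already succeeds at frac = 1.0 at the latest


def average_relative_shift(dim, n=500, start_rank=100, k=10, trials=20, seed=0):
    """Average fraction needed to pull a rank-`start_rank` point into the top k.

    Assumption: isotropic Gaussian data, so `dim` is a proxy for intrinsic
    dimensionality. This is a toy model, not the paper's estimator.
    """
    rng = np.random.default_rng(seed)
    fracs = []
    for _ in range(trials):
        data = rng.standard_normal((n, dim))
        query = rng.standard_normal(dim)
        order = np.argsort(np.linalg.norm(data - query, axis=1))
        target_idx = int(order[start_rank - 1])  # point currently at start_rank
        fracs.append(min_shift_to_reach_rank(query, data, target_idx, k))
    return float(np.mean(fracs))


if __name__ == "__main__":
    for dim in (2, 8, 32):
        print(dim, round(average_relative_shift(dim), 3))
```

Under these assumptions, the average fraction would be expected to shrink as `dim` grows, which is in line with the abstract's claim that less perturbation is needed to subvert neighborhood rankings at higher intrinsic dimensionality. The 'farther' mode and the 'target-centric' perspective are not covered by this sketch.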