Multi-targeted audio adversarial example for use against speech recognition systems

Bibliographic Details
Published in: Computers & Security, 2023-05, Vol. 128, p. 103168, Article 103168
Authors: Ko, Kyoungmin; Kim, SungHwan; Kwon, Hyun
Format: Article
Language: English
Description
Abstract: Deep neural networks are widely used in fields such as image recognition, speech recognition, text recognition, and pattern recognition. However, such networks are vulnerable to adversarial examples. An adversarial example is a sample that humans recognize correctly but that the target model misclassifies; it is typically created by adding a minimal amount of noise to an original sample. In this paper, we propose a method for creating a multi-targeted audio adversarial example, designed so that each of several models misinterprets it in a different way. The proposed method configures the loss function to maximize, for each model, the probability of misclassification into that model's desired class, so that the optimal amount of adversarial noise is inserted into the original sample. In the experimental evaluation, the Mozilla Voice dataset was used as the test data source, and TensorFlow was used as the machine learning library. The results show that the proposed method creates a multi-targeted adversarial example that achieves a 98.02% attack success rate, with a different target class for each of the three models, while limiting the average distortion from the original sample to 137.28.
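
The record contains no implementation details beyond the abstract. As a rough illustration of the technique the abstract describes, the sketch below optimizes a single noise tensor so that several models are simultaneously pushed to misclassify the same audio sample, each into its own target class, with an L2 penalty standing in for the distortion constraint. This is a minimal sketch under assumed interfaces, not the authors' code: the function name multi_targeted_attack, the arguments models, target_labels, and the distortion weight c are all hypothetical, and the paper's exact loss formulation may differ.

import tensorflow as tf

def multi_targeted_attack(x, models, target_labels, steps=1000, lr=0.01, c=0.05):
    # Hypothetical interface: x is the original audio, shape (1, num_samples);
    # models is a list of pretrained callables returning class logits;
    # target_labels gives one desired (wrong) class index per model.
    delta = tf.Variable(tf.zeros_like(x))  # adversarial noise to be optimized
    opt = tf.keras.optimizers.Adam(learning_rate=lr)
    ce = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
    for _ in range(steps):
        with tf.GradientTape() as tape:
            adv = x + delta
            # One targeted cross-entropy term per model: minimizing each term
            # raises model i's probability for its own target class t_i.
            attack_loss = tf.add_n(
                [ce([t], m(adv)) for m, t in zip(models, target_labels)]
            )
            # L2 penalty keeps the inserted noise (distortion) small.
            loss = attack_loss + c * tf.reduce_sum(tf.square(delta))
        grads = tape.gradient(loss, [delta])
        opt.apply_gradients(zip(grads, [delta]))
    return x + delta

In a sketch like this, the weight c trades attack success against noise level, which corresponds to the trade-off the abstract reports: a 98.02% success rate across three models at an average distortion of 137.28.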
ISSN: 0167-4048, 1872-6208
DOI: 10.1016/j.cose.2023.103168