Multidiscriminator Sobolev Defense-GAN Against Adversarial Attacks for End-to-End Speech Systems

Bibliographic details
Published in:	IEEE Transactions on Information Forensics and Security, 2022, Vol. 17, pp. 2044-2058
Main authors:	Esmaeilpour, Mohammad; Cardinal, Patrick; Koerich, Alessandro Lameiras
Format:	Article
Language:	English
Subjects:	adversarial defense; Algorithms; chordal distance; Defense; Fourier transforms; Generative adversarial networks; Generators; Hidden Markov models; Perturbation methods; Psychoacoustic models; Regularization; Schur decomposition; short time Fourier transform; Signal quality; Spectrogram; Spectrograms; Speech adversarial attack; Speech recognition; Synthesis; Training

Description
Abstract:	This paper introduces a defense approach against end-to-end adversarial attacks developed for cutting-edge speech-to-text systems. The proposed defense algorithm has four steps. First, we use the short-time Fourier transform to represent speech signals as 2D spectrograms. Second, we iteratively find a safe vector using a spectrogram subspace projection operation. This operation minimizes the chordal distance adjustment between spectrograms with an additional regularization term. Third, we synthesize a spectrogram from this safe vector using a novel GAN architecture trained with the Sobolev integral probability metric. We impose an additional constraint on the generator network to improve the model's performance in terms of stability and the total number of learned modes. Finally, we reconstruct the signal from the synthesized spectrogram using the Griffin-Lim phase approximation technique. We evaluate the proposed defense approach against six strong white-box and black-box adversarial attacks on DeepSpeech, Kaldi, and Lingvo models. The experimental results show that our algorithm outperforms other state-of-the-art defense algorithms in terms of accuracy and signal quality.
ISSN:	1556-6013 1556-6021
DOI:	10.1109/TIFS.2022.3175603
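
Illustrative note: the first and last steps named in the abstract (short-time Fourier transform to a 2D spectrogram, then Griffin-Lim phase approximation to recover a waveform) can be sketched as below. This is a minimal NumPy-only sketch and not the authors' implementation. The function names, the Hann window, the frame size `n_fft=256`, the hop `hop=64`, the iteration count, and the toy 440 Hz input are all assumptions chosen for illustration. The safe-vector search and the Sobolev GAN (steps two and three) are not represented; the magnitude spectrogram is passed straight from step one to step four.

```python
import numpy as np

def stft(x, n_fft=256, hop=64):
    """Short-time Fourier transform with a Hann window; returns (frames, bins)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def istft(S, n_fft=256, hop=64):
    """Inverse STFT by windowed overlap-add, normalized by the summed squared window."""
    window = np.hanning(n_fft)
    n_frames = S.shape[0]
    length = n_fft + hop * (n_frames - 1)
    x = np.zeros(length)
    norm = np.zeros(length)
    frames = np.fft.irfft(S, n=n_fft, axis=1)
    for i in range(n_frames):
        x[i * hop:i * hop + n_fft] += frames[i] * window
        norm[i * hop:i * hop + n_fft] += window ** 2
    # The floor avoids amplifying the nearly zero-weighted samples at both ends.
    return x / np.maximum(norm, 1e-3)

def griffin_lim(magnitude, n_iter=50, n_fft=256, hop=64, seed=0):
    """Estimate a waveform whose STFT magnitude approximates `magnitude`."""
    rng = np.random.default_rng(seed)
    # Start from random phase, then alternate between the time domain and
    # the STFT domain, keeping the given magnitude and updating only the phase.
    phase = np.exp(2j * np.pi * rng.random(magnitude.shape))
    for _ in range(n_iter):
        x = istft(magnitude * phase, n_fft, hop)
        phase = np.exp(1j * np.angle(stft(x, n_fft, hop)))
    return istft(magnitude * phase, n_fft, hop)

# Toy input: a 440 Hz tone sampled at 8 kHz (illustrative values only).
sr = 8000
t = np.arange(4096) / sr
signal = np.sin(2 * np.pi * 440 * t)

spec = np.abs(stft(signal))      # step one: magnitude spectrogram
restored = griffin_lim(spec)     # step four: phase approximation and inversion
```

In the pipeline the abstract describes, the spectrogram handed to the Griffin-Lim step would be the one synthesized by the GAN rather than `spec` itself; the sketch only shows that a waveform can be recovered from magnitudes alone.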