Robust Voice Liveness Detection and Speaker Verification Using Throat Microphones

Bibliographic details
Published in: IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2018-01, Vol. 26 (1), p. 44-56
Main authors: Sahidullah, Md; Thomsen, Dennis Alexander Lehmann; Gonzalez Hautamaki, Rosa; Kinnunen, Tomi; Tan, Zheng-Hua; Parts, Robert; Pitkanen, Martti
Format: Article
Language: English
Description
Summary: While they have a wide range of applications, automatic speaker verification (ASV) systems are vulnerable to spoofing attacks, in particular replay attacks, which are effective and easy to implement. Most prior work on detecting replay attacks uses audio from a single acoustic microphone only, which makes it difficult to detect high-end replay attacks that are close to indistinguishable from live human speech. In this paper, we study the use of a special body-conducted sensor, the throat microphone (TM), for combined voice liveness detection (VLD) and ASV in order to improve both the robustness and the security of ASV against replay attacks. We first investigate the possibility and methods of attacking a TM-based ASV system, followed by a pilot data collection. Second, we study the use of spectral features for VLD using both single-channel and dual-channel ASV systems. We carry out speaker verification experiments using Gaussian mixture model with universal background model (GMM-UBM) and i-vector based systems on a dataset of 38 speakers that we collected. The dual-microphone setup yields a considerable improvement in recognition accuracy. In experiments with noisy test speech, the false acceptance rate (FAR) of the dual-microphone GMM-UBM based system for recorded speech falls from 69.69% to 18.75%. The FAR in the replay condition drops further to 0% when this dual-channel ASV system is integrated with the new dual-channel voice liveness detector.
ISSN: 2329-9290, 2329-9304
DOI: 10.1109/TASLP.2017.2760243