DNN-Based Score Calibration With Multitask Learning for Noise Robust Speaker Verification

Bibliographic details
Published in: IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2018-04, Vol. 26 (4), p. 700-712
Main authors: Zhili Tan, Man-Wai Mak, Brian Kan-Wing Mak
Format: Article
Language: English
Description
Abstract: This paper proposes and investigates several deep neural network (DNN) based score compensation, transformation, and calibration algorithms for enhancing the noise robustness of i-vector speaker verification systems. Unlike conventional calibration methods where the required score shift is a linear function of SNR or log-duration, the DNN approach learns the complex relationship between the score shifts and the combination of i-vector pairs and uncalibrated scores. Furthermore, with the flexibility of DNNs, it is possible to explicitly train a DNN to recover the clean scores without having to estimate the score shifts. To alleviate the overfitting problem, multitask learning is applied to incorporate auxiliary information such as SNRs and speaker ID of training utterances into the DNN. Experiments on NIST 2012 SRE show that score calibration derived from multitask DNNs can improve the performance of the conventional score-shift approach significantly, especially under noisy conditions.
ISSN: 2329-9290, 2329-9304
DOI: 10.1109/TASLP.2018.2791105