Efficient speaker verification system using speaker model clustering for T and Z normalizations

Bibliographic details
Main authors: Ravulakollu, K., Apsingekar, V.R., De Leon, P.L.
Format: Conference proceeding
Language: English
Description
Summary: In speaker verification (SV) systems based on the Gaussian mixture model-universal background model (GMM-UBM), normalization is an important component of the decision stage. Many normalization methods, including the T- and Z-norms, have been proposed and investigated, and these have contributed to state-of-the-art SV systems with extremely low equal-error rates (EERs). In this paper, we consider applying both T- and Z-norms to a carefully selected subset of speakers using a data-driven approach, which can significantly reduce computation, resulting in faster SV decisions and lower EER. However, selection of the subset is critical: it must be representative of the entire speaker model space, otherwise error rates will increase. To properly select the subset of speakers for the normalizations, we propose a novel method that first clusters the speaker models using the K-means algorithm and the Kullback-Leibler (KL) divergence and then selects a set of speakers within the cluster. We evaluate the approach using the TIMIT, NTIMIT, and NIST-2002 corpora and compare against standard T- and Z-normalizations.
ISSN: 1071-6572, 2153-0742
DOI: 10.1109/CCST.2008.4751277
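
Illustrative note: the T- and Z-normalizations named in the summary are commonly described as shifting and scaling a raw verification score by the mean and standard deviation of a set of reference scores. The sketch below shows only that general idea and is not taken from the paper. The function names, the use of the population standard deviation, and the toy score values are all assumptions made for illustration. Under this reading, Z-norm draws its statistics from impostor utterances scored against the claimed speaker's model, while T-norm draws them from the test utterance scored against a cohort of other speaker models. A smaller cohort, such as the selected subset the summary describes, means fewer scores to compute per decision.

```python
from statistics import mean, pstdev


def normalize_score(raw_score, reference_scores):
    """Shift and scale a raw score by the statistics of reference scores.

    Assumption: population standard deviation is used; some systems use
    the sample standard deviation instead.
    """
    mu = mean(reference_scores)
    sigma = pstdev(reference_scores)
    if sigma == 0:
        raise ValueError("reference scores have zero variance")
    return (raw_score - mu) / sigma


def z_norm(raw_score, impostor_scores):
    """Z-norm: reference scores come from impostor utterances scored
    against the claimed speaker's model (can be computed offline)."""
    return normalize_score(raw_score, impostor_scores)


def t_norm(raw_score, cohort_scores):
    """T-norm: reference scores come from the test utterance scored
    against a cohort of other speaker models (computed at test time)."""
    return normalize_score(raw_score, cohort_scores)


# Hypothetical toy values, not results from the paper.
raw = 2.0
cohort = [-1.0, 1.0]      # scores of the test utterance against 2 cohort models
impostors = [0.0, 4.0]    # scores of impostor utterances against the claimed model

t_score = t_norm(raw, cohort)
z_score = z_norm(raw, impostors)
```

Because T-norm statistics must be computed for every test utterance, its cost grows with the number of cohort models, which is why restricting the cohort to a representative subset can speed up decisions.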