Rapid speaker adaptation in eigenvoice space

Bibliographic details
Published in:	IEEE Transactions on Speech and Audio Processing, 2000-11, Vol. 8 (6), pp. 695-707
Authors:	Kuhn, R., Junqua, J.-C., Nguyen, P., Niedzielski, N.
Format:	Article
Language:	English
Subjects:	Adaptation; Adaptation model; Algorithms; Applied sciences; Clustering algorithms; Error analysis; Exact sciences and technology; Information, signal and communications theory; Loudspeakers; Mathematical analysis; Mathematical models; Maximum likelihood linear regression; Parameter estimation; Principal component analysis; Recognition; Signal processing; Speech; Speech processing; Speech processing and communication systems; Speech recognition; Studies; System testing; Tasks; Telecommunications and information theory; Vectors; Vectors (mathematics)

Description
Abstract:	This paper describes a new model-based speaker adaptation algorithm called the eigenvoice approach. The approach constrains the adapted model to be a linear combination of a small number of basis vectors obtained offline from a set of reference speakers, and thus greatly reduces the number of free parameters to be estimated from adaptation data. These "eigenvoice" basis vectors are orthogonal to each other and guaranteed to represent the most important components of variation between the reference speakers. Experimental results for a small-vocabulary task (letter recognition) given in the paper show that the approach yields major improvements in performance for tiny amounts of adaptation data. For instance, we obtained 16% relative improvement in error rate with one letter of supervised adaptation data, and 26% relative improvement with four letters of supervised adaptation data. After a comparison of the eigenvoice approach with other speaker adaptation algorithms, the paper concludes with a discussion of future work.
ISSN:	1063-6676 2329-9290 1558-2353 2329-9304
DOI:	10.1109/89.876308
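
Illustration:	The idea summarized in the abstract (an adapted model constrained to a linear combination of a few orthogonal basis vectors learned offline from reference speakers) can be sketched in a few lines of NumPy. This is a minimal sketch under stated assumptions, not the paper's implementation: each speaker is assumed to be represented by a single "supervector" of concatenated model parameters, the basis is obtained with plain PCA via SVD, and the weights for a new speaker are fitted by ordinary least squares on the few dimensions covered by adaptation data, which stands in for whatever likelihood-based estimator the paper actually uses. The function names, dimensions, and synthetic data are all hypothetical.

```python
import numpy as np


def train_eigenvoices(supervectors, num_eigenvoices):
    """Return (mean, basis); basis rows are orthonormal 'eigenvoices'.

    supervectors: array of shape (num_reference_speakers, dim).
    Assumption: plain PCA via SVD of the mean-centered supervectors.
    """
    mean = supervectors.mean(axis=0)
    centered = supervectors - mean
    # Rows of vt are principal directions, sorted by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:num_eigenvoices]


def adapt(mean, basis, observed_idx, observed_values):
    """Estimate eigenvoice weights from the observed dimensions only.

    Assumption: least-squares fit, used here as a simple stand-in
    for a likelihood-based weight estimator.
    """
    a = basis[:, observed_idx].T              # (num_observed, num_eigenvoices)
    b = observed_values - mean[observed_idx]  # deviation from the mean voice
    weights, *_ = np.linalg.lstsq(a, b, rcond=None)
    adapted = mean + weights @ basis          # full adapted supervector
    return adapted, weights


# Synthetic data (hypothetical): 20 reference speakers whose 12-dimensional
# supervectors vary along only 2 hidden directions.
rng = np.random.default_rng(0)
factors = rng.normal(size=(2, 12))
base = rng.normal(size=12)
references = base + rng.normal(size=(20, 2)) @ factors

mean, basis = train_eigenvoices(references, num_eigenvoices=2)

# A new speaker from the same family; adaptation data covers only
# 4 of the 12 dimensions, yet only 2 weights need to be estimated.
new_speaker = base + np.array([0.5, -1.2]) @ factors
observed_idx = np.array([0, 3, 7, 10])
adapted, weights = adapt(mean, basis, observed_idx, new_speaker[observed_idx])

print("estimated weights:", weights)
print("max reconstruction error:", np.abs(adapted - new_speaker).max())
```

In this toy setting the reference speakers lie exactly in a two-dimensional subspace, so two weights fitted on four observed dimensions are enough to recover the remaining eight. Real speaker data would only approximately satisfy that assumption, which is why the number of basis vectors retained becomes a tuning choice.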