Fast speaker adaptation of large vocabulary continuous density HMM speech recognizer using a basis transform approach

Bibliographic details
Main authors:	Boulis, C.; Digalakis, V.
Format:	Conference proceeding
Language:	English
Subjects:	Equations; Hidden Markov models; Maximum likelihood estimation; Maximum likelihood linear regression; Prototypes; Speech processing; Speech recognition; Transforms; Vectors; Vocabulary

Description
Abstract:	Maximum likelihood transformation-adaptation techniques have proven successful, but it is believed that faster convergence to speaker-dependent (SD) performance can be achieved if some form of a priori knowledge is incorporated into the adaptation process. In this paper, instead of estimating one linear transform per class of models for each new speaker, we transform the speaker-independent (SI) models using multiple linear transforms and a weight vector. To reduce the number of adaptation parameters, the multiple linear transforms are generated from training speakers, and the adaptation parameters consist of a single weight vector per class. This can be seen as incorporating a priori knowledge into the estimation process. Experiments conducted on the Swedish-language Spoken Language Translator database using SRI's DECIPHER™ system show that the new method outperforms maximum likelihood linear regression on very limited adaptation data.
ISSN:	1520-6149 2379-190X
DOI:	10.1109/ICASSP.2000.859128
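
Illustrative sketch (not taken from the paper): the abstract describes adapting the SI models with several linear transforms and one weight vector per class. The snippet below shows one plausible reading of that idea. It assumes the transforms are affine maps acting on Gaussian mean vectors and are combined as a weighted sum; both are assumptions, since the record does not give the exact formulation. The names `affine`, `adapt_mean`, `basis`, and `weights`, and the toy numbers, are hypothetical.

```python
from typing import List, Sequence, Tuple

Vector = List[float]
Matrix = List[List[float]]


def affine(A: Matrix, b: Vector, mu: Vector) -> Vector:
    """Return A @ mu + b for a single basis transform."""
    return [sum(a_ij * m_j for a_ij, m_j in zip(row, mu)) + b_i
            for row, b_i in zip(A, b)]


def adapt_mean(mu: Vector,
               basis: Sequence[Tuple[Matrix, Vector]],
               weights: Sequence[float]) -> Vector:
    """Combine basis-transformed means with a weight vector.

    Assumed form (not confirmed by the source):
        adapted = sum_k weights[k] * (A_k @ mu + b_k)
    """
    if len(basis) != len(weights):
        raise ValueError("need exactly one weight per basis transform")
    adapted = [0.0] * len(mu)
    for (A, b), w in zip(basis, weights):
        transformed = affine(A, b, mu)
        adapted = [acc + w * t for acc, t in zip(adapted, transformed)]
    return adapted


# Toy 2-D example with two hypothetical basis transforms.
identity = ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
scale_shift = ([[2.0, 0.0], [0.0, 2.0]], [1.0, -1.0])
si_mean = [1.0, 3.0]  # made-up speaker-independent mean

adapted_mean = adapt_mean(si_mean, [identity, scale_shift], [0.75, 0.25])
print(adapted_mean)
```

Under this reading, the basis transforms would be fixed after training, so only the short weight vector needs to be estimated for a new speaker. That is far fewer free parameters than a full transform matrix, which is consistent with the abstract's claim of better behavior on very limited adaptation data. How the weights are estimated is not covered by this sketch.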