Fast speaker adaptation of large vocabulary continuous density HMM speech recognizer using a basis transform approach


Bibliographic details
Main authors: Boulis, C., Digalakis, V.
Format: Conference proceeding
Language: English
Description
Summary: Maximum likelihood transformation-adaptation techniques have proven successful, but it is believed that faster convergence to speaker-dependent (SD) performance can be achieved if some form of a priori knowledge is incorporated in the adaptation process. In this paper, instead of estimating one linear transform per class of models for each new speaker, we transform the speaker-independent (SI) models using multiple linear transforms and a weight vector. To reduce the number of adaptation parameters, the multiple linear transforms are generated from training speakers, and the adaptation parameters consist of a single weight vector per class. This can be seen as incorporating a priori knowledge into the estimation process. Experiments conducted on the Spoken Language Translator database in the Swedish language, using SRI's DECIPHER™ system, show that the new method outperforms maximum likelihood linear regression on very limited adaptation data.
ISSN: 1520-6149, 2379-190X
DOI: 10.1109/ICASSP.2000.859128
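
Illustration: the summary describes adapting SI models with several linear transforms combined through one weight vector per class. The sketch below shows one plausible reading of that idea for a single Gaussian mean. It is not taken from the paper: the affine form A_k * mu + b_k, the weighted sum, and the names apply_affine and adapt_mean are all assumptions made for illustration, and the toy numbers are invented.

```python
def apply_affine(A, b, mu):
    """Return A * mu + b, with A given as a list of rows."""
    return [sum(a_ij * m_j for a_ij, m_j in zip(row, mu)) + b_i
            for row, b_i in zip(A, b)]


def adapt_mean(mu_si, basis, weights):
    """Adapt one SI mean with a weighted sum of basis transforms.

    Assumed form (not confirmed by the abstract):
        mu_adapted = sum_k w_k * (A_k * mu_si + b_k)
    The basis transforms (A_k, b_k) would come from training speakers;
    only the weights w_k are estimated for the new speaker.
    """
    if len(basis) != len(weights):
        raise ValueError("one weight per basis transform is required")
    adapted = [0.0] * len(mu_si)
    for (A, b), w in zip(basis, weights):
        transformed = apply_affine(A, b, mu_si)
        adapted = [acc + w * t for acc, t in zip(adapted, transformed)]
    return adapted


# Toy 2-dimensional example with two hypothetical basis transforms.
basis = [
    ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),   # identity: leaves the mean unchanged
    ([[2.0, 0.0], [0.0, 2.0]], [1.0, -1.0]),  # scale by 2, then shift
]
mu_si = [1.0, 3.0]
weights = [0.75, 0.25]
adapted = adapt_mean(mu_si, basis, weights)
print(adapted)
```

Under this reading, a new speaker contributes only K weights per class rather than a full transform matrix, which is consistent with the summary's point that the method needs fewer adaptation parameters than estimating one linear transform per class.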