Joint maximum a posteriori adaptation of transformation and HMM parameters

Bibliographic Details
Published in:	IEEE Transactions on Speech and Audio Processing 2001-05, Vol.9 (4), p.417-428
Main Authors:	Siohan, O., Chesta, C., Chin-Hui Lee
Format:	Article
Language:	English
Subjects:	Acoustic testing; Adaptation; Adaptation model; Applied sciences; Artificial intelligence; Automatic speech recognition; Channels; Computer science, control theory, systems; Estimates; Exact sciences and technology; Hidden Markov models; Information, signal and communications theory; Laboratories; Loudspeakers; Mathematical models; Maximum likelihood estimation; Parameter estimation; Performance enhancement; Robustness; Signal processing; Speech; Speech and sound recognition and synthesis. Linguistics; Speech processing; Speech recognition; Statistics; Studies; Telecommunications and information theory; Transformations

Description
Abstract:	Model adaptation techniques are an efficient way to reduce the mismatch that typically occurs between the training and test condition of any speech recognizer. Adaptation techniques can usually be divided into two families of approaches. On one hand, direct model adaptation attempts to directly reestimate the model parameters, for example using MAP adaptation. Since direct adaptation only reestimates model parameters of the corresponding units appearing in the adaptation data, a large amount of such data is needed to observe any significant improvement in performance. However, nice asymptotic properties are usually observed, meaning that the performance improves as the amount of adaptation data increases. On the other hand, indirect model adaptation applies a general transformation on some clusters of model parameters. Because each individual model is transformed, the approach is quite effective when a small amount of adaptation data is available. However, as the amount of adaptation data increases, the performance improvement quickly saturates. We propose to jointly estimate model parameters and transformation parameters using a single estimation criterion based on Bayesian statistics. We show that by providing a prior distribution for the model parameters and the transformation parameters, it is possible to jointly estimate these two sets of parameters using maximum a posteriori estimation (MAP). Experimental evaluation on nonnative speaker and channel adaptation illustrates the effectiveness of the proposed approach.
ISSN:	1063-6676 2329-9290 1558-2353 2329-9304
DOI:	10.1109/89.917687
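
Illustrative note: the contrast the abstract draws between direct and indirect adaptation can be sketched on scalar Gaussian means. This is a toy illustration under stated assumptions, not the joint estimator proposed in the article: the function names, the prior weight `tau`, the unit-variance assumption, and the example data are all hypothetical.

```python
# Toy sketch only: scalar Gaussian means with hypothetical names and data.
# It is NOT the joint MAP estimator described in the article.

def map_mean(prior_mean, tau, frames):
    """Direct adaptation: MAP re-estimate of a single Gaussian mean.

    tau is an assumed prior weight. With no frames, the prior mean is returned.
    """
    n = len(frames)
    return (tau * prior_mean + sum(frames)) / (tau + n)


def shared_bias(means, frames_by_unit):
    """Indirect adaptation: one bias shared by all means in a cluster.

    The bias is the average offset between each observed frame and the mean
    of the unit it is assigned to (unit variances are assumed).
    """
    offsets = [x - means[u] for u, frames in frames_by_unit.items() for x in frames]
    return sum(offsets) / len(offsets) if offsets else 0.0


means = {"a": 0.0, "b": 1.0, "c": 2.0}   # unadapted means for three units
data = {"a": [1.0, 1.2, 0.8]}            # adaptation data covers unit "a" only

# Direct: only unit "a" moves; units "b" and "c" keep their prior means.
direct = {u: map_mean(m, tau=2.0, frames=data.get(u, [])) for u, m in means.items()}

# Indirect: every mean is shifted by the shared bias, including unseen units.
bias = shared_bias(means, data)
indirect = {u: m + bias for u, m in means.items()}
```

In this sketch the direct update leaves units without adaptation data unchanged, while the shared bias moves every unit at once, which mirrors the small-data advantage of transformation-based adaptation that the abstract describes.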