Joint maximum a posteriori adaptation of transformation and HMM parameters
Published in: | IEEE Transactions on Speech and Audio Processing, 2001-05, Vol. 9 (4), p. 417-428 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | eng |
Subjects: | |
Abstract: | Model adaptation techniques are an efficient way to reduce the mismatch that typically occurs between the training and test conditions of any speech recognizer. Adaptation techniques can usually be divided into two families of approaches. On one hand, direct model adaptation attempts to directly reestimate the model parameters, for example using MAP adaptation. Since direct adaptation reestimates only the parameters of units that appear in the adaptation data, a large amount of such data is needed to observe any significant improvement in performance. However, good asymptotic properties are usually observed, meaning that performance keeps improving as the amount of adaptation data increases. On the other hand, indirect model adaptation applies a general transformation to clusters of model parameters. Because each individual model is transformed, the approach is quite effective when only a small amount of adaptation data is available. However, as the amount of adaptation data increases, the performance improvement quickly saturates. We propose to jointly estimate model parameters and transformation parameters using a single estimation criterion based on Bayesian statistics. We show that by providing a prior distribution for the model parameters and the transformation parameters, it is possible to jointly estimate these two sets of parameters using maximum a posteriori (MAP) estimation. Experimental evaluation on nonnative speaker and channel adaptation illustrates the effectiveness of the proposed approach. |
---|---|
ISSN: | 1063-6676 2329-9290 1558-2353 2329-9304 |
DOI: | 10.1109/89.917687 |