Dictionary-based discriminative HMM parameter estimation for continuous speech recognition systems

Bibliographic details
Main authors:	Willett, D., Neukirchen, C., Rottland, J.
Format:	Conference proceeding
Language:	English
Subjects:	Applied sciences; Density functional theory; Dictionaries; Exact sciences and technology; Frequency estimation; Hidden Markov models; Information, signal and communications theory; Maximum likelihood estimation; Mutual information; Parameter estimation; Robustness; Signal processing; Speech processing; Speech processing and communication systems; Speech recognition; State estimation; Telecommunications and information theory

Description
Abstract:	Estimating HMM parameters has always been a major issue in the design of speech recognition systems. Discriminative objectives such as maximum mutual information (MMI) or minimum classification error (MCE) have proved superior to the common maximum likelihood estimation (MLE) in cases where robust estimation of the probability density functions (PDFs) is not possible. Determining the overall likelihood of an acoustic observation is the most crucial point of MMI parameter estimation when applied to continuous speech systems. In contrast to the common approaches, which estimate the overall likelihood of the training observations by evaluating the most confusable sentences or by applying global state frequencies, this paper suggests a dictionary analysis to obtain estimates of the dictionary-based risk of mixing up two HMM states. These estimates are used to estimate the observations' likelihood and to control the discriminative MMI training procedure. Results on a monophone semi-continuous HMM (SCHMM) speech recognition system are presented that demonstrate the practicability of the new approach.
ISSN:	1520-6149, 2379-190X
DOI:	10.1109/ICASSP.1997.596238
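
Note on the MMI criterion named in the abstract: in its usual textbook form, the criterion for one training utterance is the log posterior of the reference transcription, that is, the joint log score of the reference minus the log of the summed scores of all competing hypotheses. That sum is the "overall likelihood" of the observation that the abstract calls the most crucial point. The sketch below illustrates only this generic form. It is not the dictionary-based estimate proposed in the paper, and the function names and the idea of passing an explicit list of competing hypotheses are assumptions made for illustration.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def mmi_criterion(joint_log_scores, reference_index):
    """Generic MMI criterion for a single utterance (illustrative only).

    joint_log_scores: log p(X | W) + log P(W) for each hypothesis W in a
        hypothetical, explicitly enumerated set of competing transcriptions
        (the reference must be included in this set).
    reference_index: position of the reference transcription in that list.

    Returns log P(W_ref | X), which is at most 0 and approaches 0 as the
    reference dominates all competitors.
    """
    numerator = joint_log_scores[reference_index]
    # Denominator: overall log-likelihood of the observation, obtained here
    # by summing over the enumerated hypotheses. The paper replaces this
    # step with a dictionary-based estimate, which is not reproduced here.
    denominator = logsumexp(joint_log_scores)
    return numerator - denominator
```

With two equally scored hypotheses the criterion equals log(0.5). Raising the reference score relative to its competitor moves it toward 0, which is the direction discriminative training pushes the parameters.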