Viterbi Approximation of Latent Words Language Models for Automatic Speech Recognition
Published in: Journal of Information Processing, 2019, Vol. 27, pp. 168-176
Main authors: , , , ,
Format: Article
Language: English
Subjects:
Online access: Full text
Abstract: This paper presents a Viterbi approximation of latent words language models (LWLMs) for automatic speech recognition (ASR). LWLMs are effective against data sparseness because of their soft-decision clustering structure and Bayesian modeling, so they perform robustly across multiple ASR tasks. Unfortunately, applying an LWLM to ASR is difficult because of its computational complexity. In our previous work, we implemented an n-gram approximation of the LWLM for ASR by sampling words according to its stochastic process and training word n-gram LMs. However, that approach cannot take into account the latent word sequence behind a recognition hypothesis. Our solution is the Viterbi approximation, which simultaneously decodes both the recognition hypothesis and the latent word sequence. The Viterbi approximation is implemented as two-pass ASR decoding in which the latent word sequence is estimated from a decoded recognition hypothesis using Gibbs sampling. Experiments show the effectiveness of the Viterbi approximation in an n-best rescoring framework. In addition, we investigate the relationship between the n-gram approximation and the Viterbi approximation.
ISSN: 1882-6652
DOI: 10.2197/ipsjjip.27.168
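To make the two-pass idea in the abstract more concrete, below is a minimal Python sketch of Viterbi-style n-best rescoring with a toy LWLM: for each first-pass hypothesis, a latent word sequence is estimated by Gibbs sampling, and the hypothesis is reranked by the joint score of the word sequence and its latent words. Everything here (the two-word latent vocabulary, the TRANS/EMIT tables, the bigram latent structure, and the function names) is a hypothetical simplification for illustration, not the paper's implementation; the actual system uses a Bayesian LWLM on real ASR n-best lists, and the LM score would normally be combined with acoustic and other LM scores.

```python
import math
import random

# Toy LWLM: every observed word w_t is emitted from a latent word h_t,
# and latent words follow a bigram process. All probabilities below are
# made-up illustrative values, not parameters from the paper.
LATENT_VOCAB = ["food", "drink"]
TRANS = {  # P(h_t | h_{t-1}); "<s>" marks the sentence start
    "<s>":   {"food": 0.6, "drink": 0.4},
    "food":  {"food": 0.7, "drink": 0.3},
    "drink": {"food": 0.4, "drink": 0.6},
}
EMIT = {  # P(w_t | h_t)
    "food":  {"pizza": 0.5, "bread": 0.4, "cola": 0.1},
    "drink": {"pizza": 0.1, "bread": 0.1, "cola": 0.8},
}


def joint_log_prob(words, latents):
    """log P(W, H): latent bigram transitions times emissions."""
    lp, prev = 0.0, "<s>"
    for w, h in zip(words, latents):
        lp += math.log(TRANS[prev][h]) + math.log(EMIT[h][w])
        prev = h
    return lp


def gibbs_latent_sequence(words, iters=50, seed=0):
    """Estimate a latent word sequence behind `words` by Gibbs sampling:
    resample each h_t from P(h_t | h_{t-1}, h_{t+1}, w_t)."""
    rng = random.Random(seed)
    latents = [rng.choice(LATENT_VOCAB) for _ in words]
    for _ in range(iters):
        for t, w in enumerate(words):
            prev = latents[t - 1] if t > 0 else "<s>"
            probs = []
            for h in LATENT_VOCAB:
                p = TRANS[prev][h] * EMIT[h][w]
                if t + 1 < len(words):
                    p *= TRANS[h][latents[t + 1]]  # successor transition
                probs.append(p)
            r, acc = rng.random() * sum(probs), 0.0
            for h, p in zip(LATENT_VOCAB, probs):
                acc += p
                if r <= acc:
                    latents[t] = h
                    break
    return latents


def rescore_nbest(nbest):
    """Second pass: decode a latent word sequence for each first-pass
    hypothesis and rerank hypotheses by the joint score log P(W, H)."""
    scored = []
    for words in nbest:
        latents = gibbs_latent_sequence(words)
        scored.append((joint_log_prob(words, latents), words, latents))
    return sorted(scored, key=lambda s: s[0], reverse=True)


if __name__ == "__main__":
    # Hypothetical first-pass n-best list of recognition hypotheses.
    nbest = [["pizza", "cola"], ["pizza", "bread"], ["cola", "cola"]]
    for score, words, latents in rescore_nbest(nbest):
        print(f"{score:8.3f}  {' '.join(words):14s} latent: {' '.join(latents)}")
```

This also shows the contrast with the n-gram approximation described in the abstract: there, words sampled from the LWLM are used offline to train an ordinary n-gram LM, whereas the Viterbi approximation keeps the latent word sequence explicit and decodes it jointly with each hypothesis at rescoring time.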