Progressive early decision of speech recognition results by comparing most likely word sequences

Bibliographic details
Published in: Systems and Computers in Japan 2003-12, Vol. 34 (14), p. 73-82
Main authors: Imai, Toru; Tanaka, Hideki; Ando, Akio; Isono, Haruo
Format: Article
Language: English
Description
Summary: The most likely word sequence determined at the end of an utterance constitutes an optimal recognition result in continuous speech recognition for the entire utterance. However, depending on the application, the delay from the utterance to the determination of the recognition result may pose a practical problem, and progressive early decision of recognition results during an utterance becomes necessary. With a one-pass search algorithm, progressive early decision of the recognition result is possible by detecting past sole paths during the search, but no effective early decision scheme is available for multiple-pass search. Thus, a scheme for progressive early decision of recognition results is proposed that successively compares the most likely word sequence obtained during an utterance with the past most likely word sequences, and it is applied to a one-pass decoder and a two-pass decoder. The proposed scheme attempts to shorten the delays associated with word decisions while limiting the degradation of the recognition rate by controlling the word decision margin and the interval for obtaining the most likely word sequences. In speech recognition experiments on broadcast news, the proposed scheme achieved an average word decision delay equal to that of the past sole path detection method in a one-pass decoder without significantly degrading the word recognition accuracy, and was able to progressively decide recognition results with an average word decision delay of about 0.5 seconds in a two-pass decoder. © 2003 Wiley Periodicals, Inc. Syst Comp Jpn, 34(14): 73–82, 2003; Published online in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/scj.10193
ISSN: 0882-1666, 1520-684X
DOI: 10.1002/scj.10193
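
Illustrative sketch: The summary describes deciding words early by comparing the current most likely word sequence with the previous one, while holding back a margin of trailing words. The Python sketch below shows one way that idea could look. It is not the authors' implementation: the class name `ProgressiveDecider`, the methods `update` and `finish`, the prefix-agreement rule, and the way the margin is applied are all assumptions made for illustration, since the summary does not give the exact decision rule. The interval for obtaining the most likely word sequences corresponds here simply to how often the caller invokes `update`.

```python
from typing import List


class ProgressiveDecider:
    """Hypothetical early-decision helper (assumed design, not from the article).

    A word is decided once it lies in the common prefix of the current and
    the previous most likely word sequence, excluding the last `margin`
    words of the current sequence.
    """

    def __init__(self, margin: int = 1) -> None:
        self.margin = margin            # trailing words that are held back
        self.previous: List[str] = []   # most likely sequence from the last call
        self.decided: List[str] = []    # words already output; never revised

    def update(self, hypothesis: List[str]) -> List[str]:
        """Take the current most likely sequence; return newly decided words."""
        # Length of the common prefix of the current and previous hypotheses.
        agree = 0
        limit = min(len(hypothesis), len(self.previous))
        while agree < limit and hypothesis[agree] == self.previous[agree]:
            agree += 1

        # Hold back the last `margin` words of the current hypothesis.
        end = min(agree, len(hypothesis) - self.margin)
        start = len(self.decided)

        # Only words beyond the already decided prefix can be newly decided.
        newly = hypothesis[start:end] if end > start else []
        self.decided.extend(newly)
        self.previous = list(hypothesis)
        return newly

    def finish(self, hypothesis: List[str]) -> List[str]:
        """At the end of the utterance, output every remaining word."""
        newly = hypothesis[len(self.decided):]
        self.decided.extend(newly)
        return newly
```

In this sketch, a larger `margin` or less frequent calls to `update` would delay decisions but make it less likely that a decided word is later contradicted by a changed hypothesis, which mirrors the trade-off between decision delay and recognition accuracy that the summary mentions.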