Pronunciation modeling using a finite-state transducer representation

The MIT summit speech recognition system models pronunciation using a phonemic baseform dictionary along with rewrite rules for modeling phonological variation and multi-word reductions. Each pronunciation component is encoded within a finite-state transducer (FST) representation whose transition we...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Veröffentlicht in:	Speech communication 2005-06, Vol.46 (2), p.189-203
Hauptverfasser:	Hazen, Timothy J., Hetherington, I. Lee, Shu, Han, Livescu, Karen
Format:	Artikel
Sprache:	eng
Schlagworte:	Applied sciences Exact sciences and technology Information, signal and communications theory Signal processing Speech processing Telecommunications and information theory
Online-Zugang:	Volltext
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	The MIT summit speech recognition system models pronunciation using a phonemic baseform dictionary along with rewrite rules for modeling phonological variation and multi-word reductions. Each pronunciation component is encoded within a finite-state transducer (FST) representation whose transition weights can be trained using an EM algorithm for finite-state networks. This paper explains the modeling approach we use and the details of its realization. We demonstrate the benefits and weaknesses of the approach both conceptually and empirically using the recognizer for our jupiter weather information system. Our experiments demonstrate that the use of phonological rewrite rules within our system achieves word error rate reductions between 4% and 9% over different test sets when compared against a system using no phonological rewrite rules.
ISSN:	0167-6393 1872-7182
DOI:	10.1016/j.specom.2005.03.004