Word/sub-word lattices decomposition and combination for speech recognition
This paper presents the benefit of using multiple lexical units in the post-processing stage of an ASR system. Since the use of sub-word units can reduce the high out-of-vocabulary rate and improve the lack of text resources in statistical language modeling, we propose several methods to decompose,...
Gespeichert in:
Hauptverfasser: | , , , |
---|---|
Format: | Tagungsbericht |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | This paper presents the benefit of using multiple lexical units in the post-processing stage of an ASR system. Since the use of sub-word units can reduce the high out-of-vocabulary rate and improve the lack of text resources in statistical language modeling, we propose several methods to decompose, normalize and combine word and sub-word lattices generated from different ASR systems. By using a sub-word information table, every word in a lattice can be decomposed into sub-word units. These decomposed lattices can be combined into a common lattice in order to generate a confusion network. This lattices combination scheme results in an absolute syllable error rate reduction of about 1.4% over the sentence MAP baseline method for a Vietnamese ASR task. By comparing with the N-best lists combination and voting method, the proposed method works better. |
---|---|
ISSN: | 1520-6149 2379-190X |
DOI: | 10.1109/ICASSP.2008.4518611 |