Grapheme-to-Phoneme Conversion of Arabic Numeral Expressions for Embedded TTS Systems
Published in: | IEEE Transactions on Audio, Speech, and Language Processing, 2007-01, Vol. 15 (1), p. 296-309 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | English |
Abstract: | Despite the increasing need for accuracy, current text-to-speech (TTS) systems still perform poorly at generating the correct pronunciation of Arabic numerals, because numerals are highly ambiguous and admit several readings. In this paper, we propose a mini-transliteration system for Arabic numeral expressions (ANEs) that efficiently and correctly converts ANEs found in Korean text into phonemes for embedded TTS systems. To build the grapheme-to-phoneme rules, we derived the components of ANEs and investigated their pattern and arithmetic features on the basis of an analyzed corpus. A word sense disambiguation method based on the lexical hierarchies in KorLex 1.0 was developed to resolve ambiguities caused by the homographic components of ANEs. Our system minimizes memory use by 1) separating the morphological analysis module from the transliteration system, 2) compacting the lexicon, and 3) removing named entities. It reduced processing time dramatically without any serious loss of accuracy and achieved an accuracy of 97.2%-98.3%, which was 21.4%-22.5% higher than that of the baseline and 5.5%-19.5% higher than that of current commercial Korean TTS systems. |
ISSN: | 1558-7916 2329-9290 1558-7924 2329-9304 |
DOI: | 10.1109/TASL.2006.876761 |
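
The ambiguity the abstract refers to can be illustrated with a small sketch. In Korean, the same digits are read with Sino-Korean numerals in some contexts (for example before 분, "minute") and with native Korean numerals in others (for example before 시, "hour"). The Python below is only an illustration under stated assumptions, not the system described in the paper: the names `sino_korean`, `transliterate`, `NATIVE`, and `NATIVE_COUNTERS`, and the tiny counter table, are hypothetical, and a fixed lookup stands in for the KorLex-based word sense disambiguation. It outputs Hangul spellings rather than phoneme strings.

```python
import re

# Sino-Korean digit and unit readings.
SINO_DIGITS = ["", "일", "이", "삼", "사", "오", "육", "칠", "팔", "구"]
SMALL_UNITS = ["", "십", "백", "천"]   # 10^0 .. 10^3 inside a four-digit group
LARGE_UNITS = ["", "만", "억", "조"]   # 10^0, 10^4, 10^8, 10^12

# Native Korean prenominal forms; hypothetical table limited to 1..12.
NATIVE = {1: "한", 2: "두", 3: "세", 4: "네", 5: "다섯", 6: "여섯",
          7: "일곱", 8: "여덟", 9: "아홉", 10: "열", 11: "열한", 12: "열두"}

# Assumption: a toy set of counters that take the native reading.
NATIVE_COUNTERS = {"시", "개", "명"}
PATTERN = re.compile(r"(\d+)(시|분|개|명|년|월|원)?")


def _read_group(n: int) -> str:
    """Read a value in 1..9999 with Sino-Korean numerals."""
    parts = []
    for power in (3, 2, 1, 0):
        digit = (n // 10 ** power) % 10
        if digit == 0:
            continue
        if digit == 1 and power > 0:
            parts.append(SMALL_UNITS[power])      # 10 is 십, not 일십
        else:
            parts.append(SINO_DIGITS[digit] + SMALL_UNITS[power])
    return "".join(parts)


def sino_korean(n: int) -> str:
    """Sino-Korean reading of an integer in 0 .. 10**16 - 1."""
    if not 0 <= n < 10 ** 16:
        raise ValueError("value out of supported range")
    if n == 0:
        return "영"
    groups = []
    index = 0
    while n > 0:
        n, group = divmod(n, 10000)
        if group:
            reading = _read_group(group)
            if index == 1 and group == 1:
                reading = ""                      # 10000 is read 만
            groups.append(reading + LARGE_UNITS[index])
        index += 1
    return "".join(reversed(groups))


def transliterate(text: str) -> str:
    """Replace digit runs with a reading chosen by the following counter."""
    def replace(match: re.Match) -> str:
        value = int(match.group(1))
        counter = match.group(2) or ""
        if counter in NATIVE_COUNTERS and value in NATIVE:
            return NATIVE[value] + counter
        # Simplification: everything else gets the Sino-Korean reading.
        return sino_korean(value) + counter
    return PATTERN.sub(replace, text)


print(transliterate("3시 30분"))
print(transliterate("2007년 1월"))
```

A lookup like this ignores digit-by-digit readings (telephone numbers), irregular forms such as the month names for 6월 and 10월, and counters that are homographs of other words, which is the kind of case the abstract says requires sense disambiguation.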