Decoupling Word-Pair Distance and Co-occurrence Information for Effective Long History Context Language Modeling

Bibliographic Details
Published in: IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2015-07, Vol. 23 (7), p. 1221-1232
Authors: Tze Yuang Chong, Rafael E. Banchs, Eng Siong Chng, Haizhou Li
Format: Article
Language: English
Description
Abstract: In this paper, we propose the use of distance and co-occurrence information of word-pairs to improve language modeling. We have empirically shown that, for history-context sizes of up to ten words, the extracted information about distance and co-occurrence complements the n-gram language model well, for which learning long-history contexts is inherently difficult. Evaluated on the Wall Street Journal and the Switchboard corpora, our proposed model reduces the trigram model perplexity by up to 11.2% and 6.5%, respectively. As compared to the distant bigram model and the trigger model, our proposed model offers a more effective manner of capturing far-context information, as verified in terms of perplexity and computational efficiency, i.e., fewer free parameters to be fine-tuned. Experiments using the proposed model for speech recognition, text classification and word prediction tasks showed improved performance.
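The record does not reproduce the paper's equations, so the following is only a minimal sketch of the general idea as the abstract describes it: collecting co-occurrence counts for word pairs while keeping the distance between the two words as a separate dimension, over a history window of up to ten words. All names here (MAX_DIST, pair_stats, collect_pair_statistics, pair_score) are illustrative, and the scoring function is a toy stand-in; the abstract indicates the actual model is combined with (and evaluated against) a trigram model.

```python
from collections import defaultdict

MAX_DIST = 10  # history-context size reported in the abstract

# pair_stats[(trigger, target)][d]: how often `target` appeared exactly
# d positions after `trigger`; distance and co-occurrence information
# stay decoupled rather than being collapsed into a single count.
pair_stats = defaultdict(lambda: defaultdict(int))

def collect_pair_statistics(corpus):
    """Accumulate distance-indexed word-pair counts over a corpus."""
    for sentence in corpus:
        for j, target in enumerate(sentence):
            for d in range(1, MAX_DIST + 1):
                i = j - d
                if i < 0:
                    break
                pair_stats[(sentence[i], target)][d] += 1

def pair_score(history, target):
    """Toy relative-frequency score for `target` given the last MAX_DIST
    words; a real model would smooth these estimates and interpolate
    them with an n-gram probability rather than average them."""
    window = history[-MAX_DIST:]
    score = 0.0
    for d, trigger in enumerate(reversed(window), start=1):
        counts = pair_stats.get((trigger, target))
        if counts:
            score += counts[d] / sum(counts.values())
    return score / max(len(window), 1)

corpus = [["the", "stock", "market", "fell", "sharply", "today"]]
collect_pair_statistics(corpus)
print(pair_score(["the", "stock", "market", "fell", "sharply"], "today"))
```

Keeping a per-distance count table like this is what allows the distance profile of a pair to be modeled separately from its overall co-occurrence frequency, which is the decoupling the title refers to.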
ISSN: 2329-9290
eISSN: 2329-9304
DOI: 10.1109/TASLP.2015.2425223