How to utilize syllable distribution patterns as the input of LSTM for Korean morphological analysis
Published in: | Pattern Recognition Letters, 2019-04, Vol. 120, pp. 39-45 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Full text |
Abstract: | This paper proposes the use of syllable distribution patterns as deep learning inputs for morphological analysis. The proposed syllable distribution pattern comprises two parts: a distributed syllable embedding vector and a morpheme syllable-level distribution pattern. As a learning method, we utilize bidirectional long short-term memory with a conditional random field layer (Bi-LSTM-CRF) for Korean part-of-speech tagging tasks. After syllable-level outputs are generated by Bi-LSTM-CRF, a morpheme restoration process is performed utilizing pre-analyzed dictionaries that were automatically created from a training corpus. Experimental results reveal outstanding performance for the proposed method with an F1-score of 98.65%. |
---|---|
ISSN: | 0167-8655 1872-7344 |
DOI: | 10.1016/j.patrec.2018.12.019 |
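
The two-part input described in the abstract can be illustrated with a minimal sketch. The abstract does not define the distribution pattern precisely, so this sketch assumes it is the relative frequency of each tag observed for a syllable in the training corpus. The toy corpus, the BIO-style tag names, the uniform fallback for unseen syllables, and the function names `build_patterns` and `make_input` are all hypothetical and not taken from the paper.

```python
from collections import Counter, defaultdict

# Toy training data: (syllable, tag) pairs.
# The BIO-style POS labels are hypothetical, chosen only for illustration.
TRAIN = [
    ("학", "B-NNG"), ("교", "I-NNG"), ("에", "B-JKB"),
    ("학", "B-NNG"), ("생", "I-NNG"), ("이", "B-JKS"),
    ("이", "B-VCP"),
]


def build_patterns(pairs):
    """Return (sorted tag list, {syllable: relative tag frequencies})."""
    tags = sorted({tag for _, tag in pairs})
    counts = defaultdict(Counter)
    for syllable, tag in pairs:
        counts[syllable][tag] += 1
    patterns = {}
    for syllable, counter in counts.items():
        total = sum(counter.values())
        # One entry per tag, in the same order as `tags`.
        patterns[syllable] = [counter[t] / total for t in tags]
    return tags, patterns


def make_input(syllable, embedding, patterns, n_tags):
    """Concatenate a syllable embedding with its distribution pattern.

    Unseen syllables receive a uniform pattern (an assumption of this sketch).
    """
    pattern = patterns.get(syllable, [1.0 / n_tags] * n_tags)
    return list(embedding) + pattern


tags, patterns = build_patterns(TRAIN)
# The 2-dimensional embedding is a placeholder; a real model would learn it.
vector = make_input("이", [0.1, -0.2], patterns, len(tags))
```

In a full system, one such vector per syllable would be fed to the Bi-LSTM-CRF tagger. The tagger itself and the dictionary-based morpheme restoration step are outside the scope of this sketch.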