Switch-LSTMs for Multi-Criteria Chinese Word Segmentation

Multi-criteria Chinese word segmentation is a promising but challenging task, which exploits several different segmentation criteria and mines their common underlying knowledge. In this paper, we propose a flexible multi-criteria learning for Chinese word segmentation. Usually, a segmentation criter...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:arXiv.org 2018-12
Hauptverfasser: Gong, Jingjing, Chen, Xinchi, Gui, Tao, Qiu, Xipeng
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Multi-criteria Chinese word segmentation is a promising but challenging task, which exploits several different segmentation criteria and mines their common underlying knowledge. In this paper, we propose a flexible multi-criteria learning for Chinese word segmentation. Usually, a segmentation criterion could be decomposed into multiple sub-criteria, which are shareable with other segmentation criteria. The process of word segmentation is a routing among these sub-criteria. From this perspective, we present Switch-LSTMs to segment words, which consist of several long short-term memory neural networks (LSTM), and a switcher to automatically switch the routing among these LSTMs. With these auto-switched LSTMs, our model provides a more flexible solution for multi-criteria CWS, which is also easy to transfer the learned knowledge to new criteria. Experiments show that our model obtains significant improvements on eight corpora with heterogeneous segmentation criteria, compared to the previous method and single-criterion learning.
ISSN:2331-8422