Tibetan word segmentation information processing method and system, storage medium, terminal and application

Bibliographic details
Main authors: CHENG GUOGEN, LIU QINGMIN
Format: Patent
Language: chi; eng
Description
Summary: The invention belongs to the technical field of information processing and discloses a Tibetan word segmentation information processing method and system, a storage medium, a terminal, and an application. In the method, a word segmentation corpus is learned through word vectors, a convolutional neural network, and a conditional random field; Tibetan word boundary rules are generated; and word segmentation of Tibetan text is finally achieved. The system comprises a word vector preprocessing module, a model structure building module, a word vector training module, and a word vector training stop judgment module. For Tibetan, an artificial neural network and deep learning are used to solve the segmentation problem: the boundary of a word is predicted by learning Tibetan word vectors and applying a convolutional neural network (CNN) model and a conditional random field (CRF); and matching a character sequence in the senten
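
The summary names a CNN and a CRF for predicting word boundaries but the record does not give the details. As a rough illustration only, the sketch below shows how a CRF-style decoding step could turn per-syllable scores into word boundaries. Everything specific here is an assumption, not taken from the patent: the B/M/E/S tag set, the transition table `TRANS`, the helper names `viterbi` and `tags_to_words`, and the hand-picked emission scores that stand in for the output of a trained CNN. Syllables are assumed to be pre-split at the tsheg (U+0F0B).

```python
# Illustrative sketch only: tag set, scores, and helper names are assumptions,
# not the method disclosed in the patent.
TAGS = ["B", "M", "E", "S"]  # begin, middle, end of a word; single-syllable word
NEG = float("-inf")

# Hypothetical transition scores: 0.0 for legal tag pairs, -inf for illegal ones.
ALLOWED = {("B", "M"), ("B", "E"), ("M", "M"), ("M", "E"),
           ("E", "B"), ("E", "S"), ("S", "B"), ("S", "S")}
TRANS = {(a, b): (0.0 if (a, b) in ALLOWED else NEG) for a in TAGS for b in TAGS}


def viterbi(emissions):
    """Return the highest-scoring legal tag sequence for per-syllable scores."""
    # The first syllable can only start a word (B) or be a whole word (S).
    score = {t: (emissions[0][t] if t in ("B", "S") else NEG) for t in TAGS}
    back = []
    for em in emissions[1:]:
        new_score, pointers = {}, {}
        for cur in TAGS:
            prev = max(TAGS, key=lambda p: score[p] + TRANS[(p, cur)])
            new_score[cur] = score[prev] + TRANS[(prev, cur)] + em[cur]
            pointers[cur] = prev
        score = new_score
        back.append(pointers)
    # The last syllable must close a word (E or S).
    last = max(("E", "S"), key=lambda t: score[t])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


def tags_to_words(syllables, tags):
    """Group syllables into words; a word closes on every E or S tag."""
    words, current = [], []
    for syl, tag in zip(syllables, tags):
        current.append(syl)
        if tag in ("E", "S"):
            words.append("".join(current))
            current = []
    if current:  # keep any trailing syllables that were never closed
        words.append("".join(current))
    return words


if __name__ == "__main__":
    # Toy input: four syllables with made-up scores favouring B E B E.
    syllables = ["བཀྲ་", "ཤིས་", "བདེ་", "ལེགས"]
    emissions = [
        {"B": 2.0, "M": 0.0, "E": 0.0, "S": 1.0},
        {"B": 0.0, "M": 0.5, "E": 2.0, "S": 0.0},
        {"B": 2.0, "M": 0.0, "E": 0.0, "S": 1.0},
        {"B": 0.0, "M": 0.0, "E": 2.0, "S": 1.0},
    ]
    print(tags_to_words(syllables, viterbi(emissions)))
```

In a trained system the emission scores would come from the network and the transition scores would be learned rather than fixed at 0 or negative infinity; the hard constraints used here only keep the toy decoder from producing impossible tag sequences.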