Text classification method and device


Bibliographic details
Main authors: DUAN HUANZHONG, LU ZHENG
Format: Patent
Languages: Chinese; English
Description
Summary: The invention discloses a text classification method and device. The method comprises the following steps: training a word vector model on the unlabeled text in a corpus to obtain a target word vector model; using the target word vector model to expand a preset keyword corresponding to a specified classification category into a phrase set for that category; training a separate classifier on the corpus for each phrase in the phrase set, so that each phrase has its own target classifier; verifying the classification accuracy of each phrase's target classifier against a preset validation set, and selecting as target phrases those phrases whose classification accuracy satisfies a first set condition; and, according to the target phrases contained in each piece of text in the corpus, selecting the text that meets a second set
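The keyword expansion and phrase selection steps can be illustrated with a minimal Python sketch. It is not the patented implementation, and everything in it is an assumption made for illustration: the toy two-dimensional `vectors` stand in for a trained word vector model, expansion is done by cosine similarity, the per-phrase classifier is replaced by a trivial rule that predicts the category whenever the phrase occurs in the text, and the "first set condition" is taken to be a minimum accuracy `min_accuracy`. The final selection step is left out because the summary is cut off before it is fully stated.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_keyword(keyword, vectors, top_n=2):
    """Return the keyword followed by its top_n nearest words (hypothetical helper)."""
    target = vectors[keyword]
    scored = [
        (cosine(target, vec), word)
        for word, vec in vectors.items()
        if word != keyword
    ]
    scored.sort(reverse=True)
    return [keyword] + [word for _, word in scored[:top_n]]


def phrase_accuracy(phrase, validation_set):
    """Accuracy of a stand-in classifier: predict the category iff the phrase occurs."""
    correct = sum(
        (phrase in text.split()) == label for text, label in validation_set
    )
    return correct / len(validation_set)


def select_target_phrases(phrases, validation_set, min_accuracy=0.7):
    """Keep phrases whose validation accuracy meets the assumed threshold."""
    return [p for p in phrases if phrase_accuracy(p, validation_set) >= min_accuracy]


# Toy stand-in for a word vector model trained on unlabeled text (assumed values).
vectors = {
    "refund": [1.0, 0.0],
    "reimburse": [0.9, 0.1],
    "money": [0.6, 0.6],
    "weather": [0.0, 1.0],
}

# Toy validation set: (text, belongs_to_category).
validation_set = [
    ("please refund my order", True),
    ("reimburse the refund today", True),
    ("money weather is nice", False),
    ("I need my money", False),
]

phrases = expand_keyword("refund", vectors, top_n=2)
targets = select_target_phrases(phrases, validation_set)
print(phrases)  # candidate phrase set for the category
print(targets)  # phrases that pass the accuracy check
```

In this toy run "money" is close enough to "refund" to enter the phrase set but is dropped by the validation check, which is the filtering role the summary assigns to the accuracy verification.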