Rapid text classification method and device
Format: | Patent |
---|---|
Language: | Chinese; English |
Abstract: | The invention provides a rapid text classification method and device based on a topic model combined with linear discrimination. A topic model built from bag-of-words and word-frequency vectors, PCA, linear discrimination, and similarity calculation is combined with linear discrimination to quickly and accurately identify the handling department to which new appeal data belongs. In the method, the obtained historical appeal data is first preprocessed, mainly by missing-value cleaning and data standardization. The method comprises the following steps: grouping the standardized data by the handling department to which each record actually belongs; extracting feature words from each department's grouped data using jieba word segmentation; constructing bag-of-words and word-frequency vectors by a statistical method; training the data of each department and the overall data by using a PCA method based on ... |
---|
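
The first steps named in the abstract above (missing-value cleaning, grouping by handling department, word-frequency vectors, and a similarity calculation) can be illustrated with a minimal sketch. It is not the patented method: the abstract is truncated before the PCA and linear discrimination steps are fully described, so both are omitted here. Whitespace splitting stands in for jieba word segmentation, cosine similarity is an assumed choice of similarity measure, and the function names and sample records are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Stand-in for jieba word segmentation (assumption): lowercase, split on whitespace.
    return text.lower().split()


def build_department_vectors(records):
    """Group (department, text) records and count word frequencies per department."""
    counts = defaultdict(Counter)
    for department, text in records:
        if not text or not text.strip():
            continue  # missing-value cleaning: skip empty appeals
        counts[department].update(tokenize(text))
    return dict(counts)


def cosine_similarity(a, b):
    # Cosine similarity between two sparse word-frequency vectors.
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(text, department_vectors):
    # Assign a new appeal to the department with the most similar vector.
    query = Counter(tokenize(text))
    return max(
        department_vectors,
        key=lambda d: cosine_similarity(query, department_vectors[d]),
    )


# Hypothetical historical appeal data: (handling department, appeal text).
history = [
    ("water", "burst pipe flooding the street"),
    ("water", "no tap water since morning"),
    ("traffic", "broken traffic light at the crossing"),
    ("traffic", "illegal parking blocks the street"),
    ("water", ""),
]
vectors = build_department_vectors(history)
print(classify("tap water pipe leaking", vectors))
```

In a fuller version, the per-department vectors would be projected with PCA and passed to a linear discriminant before the similarity step, as the abstract indicates.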