A Fast Algorithm for Chinese Text Categorization Based on Key Tree

Bibliographic details
Published in: Applied Mechanics and Materials, 2011-06, Vol. 58-60, pp. 1106-1112
Main authors: Liu, Xin; Liu, Ren Ren; He, Wen Jing
Format: Article
Language: English
Description
Summary: A fast algorithm for Chinese text categorization is proposed. The basic idea is as follows. First, a dictionary of weighted keywords is constructed and stored as a key tree. Then, using a hash function and the longest-match-first principle, the strings in the documents are mapped to the dictionary. Next, the weights of the successfully matched keywords are summed. Finally, the maximum sum is taken as the classification result. The algorithm avoids the difficulty of Chinese word segmentation and its influence on the accuracy of the result. Theoretical analysis and experimental results indicate that the algorithm achieves higher accuracy and time efficiency, and that its overall performance reaches the level of current mainstream techniques.
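
The steps named in the summary (key tree, hash lookup, longest match first, weight sum, maximum) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the record does not specify the hash function, the weighting scheme, or the dictionary contents, so this sketch assumes a character trie whose child lookup uses Python's hash-based dict, per-category weights stored at keyword ends, and hypothetical toy keywords, categories, and weights.

```python
from collections import defaultdict


class KeyTreeNode:
    """One node of the key tree (a character trie)."""

    def __init__(self):
        self.children = {}   # char -> KeyTreeNode; dict lookup is hash-based
        self.weights = None  # {category: weight} if a keyword ends here


def insert(root, keyword, category, weight):
    """Add a weighted keyword for one category to the key tree."""
    node = root
    for ch in keyword:
        node = node.children.setdefault(ch, KeyTreeNode())
    if node.weights is None:
        node.weights = {}
    node.weights[category] = weight


def longest_match(root, text, start):
    """Return (length, weights) of the longest keyword beginning at `start`."""
    node = root
    best_len, best_weights = 0, None
    for i in range(start, len(text)):
        node = node.children.get(text[i])
        if node is None:
            break
        if node.weights is not None:
            best_len, best_weights = i - start + 1, node.weights
    return best_len, best_weights


def classify(root, text):
    """Sum matched keyword weights per category; return the top category."""
    scores = defaultdict(float)
    pos = 0
    while pos < len(text):
        length, weights = longest_match(root, text, pos)
        if length:
            for category, weight in weights.items():
                scores[category] += weight
            pos += length  # skip past the matched keyword
        else:
            pos += 1       # no keyword starts here; advance one character
    return max(scores, key=scores.get) if scores else None


# Hypothetical toy dictionary (keywords, categories, weights are assumptions).
root = KeyTreeNode()
for kw, cat, w in [("足球", "sports", 2.0), ("足球比赛", "sports", 3.0),
                   ("比赛", "sports", 1.0), ("股票", "finance", 2.5)]:
    insert(root, kw, cat, w)

print(classify(root, "昨天的足球比赛很精彩"))
```

Because the text is scanned character by character against the tree, no separate word segmentation pass is needed; in the toy input, the longer keyword "足球比赛" is preferred over its prefix "足球".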
ISSN: 1660-9336, 1662-7482
DOI:10.4028/www.scientific.net/AMM.58-60.1106