A Fast Algorithm for Chinese Text Categorization Based on Key Tree
Published in: | Applied Mechanics and Materials 2011-06, Vol.58-60, p.1106-1112 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | English |
Online access: | Full text |
Summary: | A fast algorithm is proposed for Chinese text categorization. The basic idea is as follows: first, a dictionary of weighted keywords is constructed and stored in a key tree; then, using a hash function and the principle of giving priority to the longest term match, the strings in the documents are mapped to the dictionary. Next, the sum of the weights of the successfully matched keywords is calculated. Finally, the maximum is taken as the classification result. The algorithm avoids the difficulty of Chinese word segmentation and its influence on the accuracy of the result. Theoretical analysis and experimental results indicate that the algorithm achieves higher accuracy and time efficiency, with overall performance reaching the level of current mainstream techniques. |
---|---|
ISSN: | 1660-9336 1662-7482 |
DOI: | 10.4028/www.scientific.net/AMM.58-60.1106 |
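
The pipeline outlined in the summary (weighted keyword dictionary in a key tree, longest-match-first lookup, per-category weight sums, maximum as the result) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `KeyTree`, `insert`, `longest_match`, and `classify`, the example keywords, and the weights are all hypothetical. The summary does not specify the hash function, so a Python `dict` per node stands in for hashed child lookup, and summing weights separately per category is an assumption about how the maximum is taken.

```python
from collections import defaultdict


class KeyTree:
    """Trie ("key tree") whose terminal nodes hold per-category keyword weights."""

    def __init__(self):
        self.children = {}   # dict lookup stands in for the hash function (assumption)
        self.weights = None  # {category: weight} if a keyword ends at this node

    def insert(self, keyword, category, weight):
        """Add a keyword with its weight for one category."""
        node = self
        for ch in keyword:
            node = node.children.setdefault(ch, KeyTree())
        if node.weights is None:
            node.weights = {}
        node.weights[category] = weight

    def longest_match(self, text, start):
        """Return (length, weights) of the longest keyword beginning at `start`,
        or (0, None) if no keyword begins there."""
        node, best_len, best_weights = self, 0, None
        for i in range(start, len(text)):
            node = node.children.get(text[i])
            if node is None:
                break
            if node.weights is not None:
                best_len, best_weights = i - start + 1, node.weights
        return best_len, best_weights


def classify(text, tree):
    """Scan the raw character stream (no word segmentation), sum matched
    keyword weights per category, and return the highest-scoring category."""
    scores = defaultdict(float)
    pos = 0
    while pos < len(text):
        length, weights = tree.longest_match(text, pos)
        if length == 0:
            pos += 1          # no keyword starts here; advance one character
            continue
        for category, weight in weights.items():
            scores[category] += weight
        pos += length         # skip past the matched keyword
    return max(scores, key=scores.get) if scores else None


# Hypothetical dictionary entries, for illustration only.
tree = KeyTree()
tree.insert("足球", "sports", 1.0)
tree.insert("足球比赛", "sports", 2.0)
tree.insert("股票", "finance", 1.5)

label = classify("今天的足球比赛很精彩", tree)
```

Because the scan always prefers the longest keyword starting at the current position, "足球比赛" is matched as one term rather than as "足球" followed by unmatched characters, which is how a dictionary-driven approach of this kind can sidestep a separate word segmentation step.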