Research on Text Clustering Based on Concept Weight

Bibliographic details
Main authors: Yuqin Li, Xueqiang Lv, Yufang Liu, Shuicai Shi
Format: Conference proceedings
Language: English
Description
Summary: By studying how feature-word weights are calculated in texts and how semantic similarity between words is measured, we propose a method for calculating feature-word weights based on concept weight. The method addresses two problems in the text vector space model: semantic association among text features and high dimensionality. It reduces both the semantic loss of the feature set and the dimension of the text vector, which improves the text vector space model and, in turn, the quality of text clustering. Experimental results show that the method is feasible: in the final evaluation, the FI index value of concept-weight-based text clustering is about 22 percentage points higher than that of clustering without concept weight.
DOI: 10.1109/ICGEC.2010.64
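
The idea in the summary, replacing word-level features with concept-level ones so that related words share a single dimension, can be illustrated with a minimal sketch. This is not the authors' algorithm, which this record does not specify. The function name `concept_weights`, the `word_to_concept` mapping, and the toy weights are all hypothetical, and simple summation of weights is an assumed aggregation rule.

```python
from collections import defaultdict


def concept_weights(term_weights, word_to_concept):
    """Fold per-word weights into per-concept weights.

    Assumption: words mapped to the same concept simply have their
    weights summed; words without a mapping stay as their own feature.
    """
    weights = defaultdict(float)
    for word, weight in term_weights.items():
        concept = word_to_concept.get(word, word)
        weights[concept] += weight
    return dict(weights)


# Hypothetical word weights for one document (e.g. from TF-IDF).
doc = {"car": 0.5, "automobile": 0.25, "road": 0.25}

# Hypothetical mapping from semantically similar words to a shared concept.
mapping = {"car": "vehicle", "automobile": "vehicle"}

vec = concept_weights(doc, mapping)
print(vec)
```

In this toy case the three word features collapse into two concept features, which is the kind of dimension reduction the summary describes, while the weight of the merged words is preserved rather than discarded.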