A linguistic feature based text clustering method

Bibliographic details
Main authors: Kansheng Shi, Lemin Li, Jie He, Haitao Liu, Naitong Zhang, Wentao Song
Format: Conference proceeding
Language: English
Description
Summary: The traditional K-means algorithm is sensitive to its initial points and easily falls into a local optimum. To avoid this flaw, an improved K-means text clustering method, WIKTCM, is proposed. The new method introduces a new way of selecting the initial centers and accounts for the contribution that features of different parts of speech make to the text. In addition, the impact of outliers is considered. Experimental results show that the new method produces better clustering results.
ISSN: 2376-5933, 2376-595X
DOI: 10.1109/CCIS.2011.6045042
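
The summary names two ideas that can be illustrated in a few lines: choosing the initial centers deterministically rather than at random, and weighting terms by their part of speech. The sketch below is not the WIKTCM method itself, whose details this record does not give. It swaps in a simple farthest-first seeding and a hypothetical weight table (POS_WEIGHTS, with made-up values), and it leaves out outlier handling entirely. All function names are assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical part-of-speech weights; the record gives no actual values.
POS_WEIGHTS = {"NOUN": 1.0, "VERB": 0.6, "ADJ": 0.4}
DEFAULT_WEIGHT = 0.2  # assumed weight for any other part of speech


def vectorize(doc, vocab):
    """Turn a list of (token, pos) pairs into a unit-length weighted count vector."""
    counts = Counter()
    for token, pos in doc:
        counts[token] += POS_WEIGHTS.get(pos, DEFAULT_WEIGHT)
    vec = [counts.get(term, 0.0) for term in vocab]
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def farthest_first_centers(points, k):
    """Deterministic seeding (a stand-in, not the paper's selection rule):
    start at the point nearest the global mean, then repeatedly add the
    point that is farthest from its closest already-chosen center."""
    dim = len(points[0])
    mean = [sum(p[i] for p in points) / len(points) for i in range(dim)]
    centers = [min(points, key=lambda p: dist(p, mean))]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(dist(p, c) for c in centers))
        )
    return centers


def kmeans(points, k, iters=20):
    """Plain K-means (Lloyd iterations) started from the seeding above."""
    centers = farthest_first_centers(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [
            min(range(k), key=lambda j: dist(p, centers[j])) for p in points
        ]
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


# Toy corpus: two finance documents and two sports documents, pre-tagged.
docs = [
    [("stock", "NOUN"), ("market", "NOUN"), ("rises", "VERB")],
    [("market", "NOUN"), ("stock", "NOUN"), ("falls", "VERB")],
    [("football", "NOUN"), ("team", "NOUN"), ("wins", "VERB")],
    [("team", "NOUN"), ("football", "NOUN"), ("loses", "VERB")],
]
vocab = sorted({token for doc in docs for token, _ in doc})
points = [vectorize(doc, vocab) for doc in docs]
labels, centers = kmeans(points, k=2)
print(labels)
```

On this toy corpus the two finance documents end up with one label and the two sports documents with the other, because the shared nouns carry more weight than the differing verbs. Farthest-first seeding tends to pick outliers as centers, which is one reason a real method would need the kind of outlier treatment the summary mentions.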