An Initialization Method Based on Hybrid Distance for k -Means Algorithm

The traditional -means algorithm has been widely used as a simple and efficient clustering method. However, the performance of this algorithm is highly dependent on the selection of initial cluster centers. Therefore, the method adopted for choosing initial cluster centers is extremely important. In...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Neural computation 2017-11, Vol.29 (11), p.3094-3117
Hauptverfasser: Yang, Jie, Ma, Yan, Zhang, Xiangfen, Li, Shunbao, Zhang, Yuping
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:The traditional -means algorithm has been widely used as a simple and efficient clustering method. However, the performance of this algorithm is highly dependent on the selection of initial cluster centers. Therefore, the method adopted for choosing initial cluster centers is extremely important. In this letter, we redefine the density of points according to the number of its neighbors, as well as the distance between points and their neighbors. In addition, we define a new distance measure that considers both Euclidean distance and density. Based on that, we propose an algorithm for selecting initial cluster centers that can dynamically adjust the weighting parameter. Furthermore, we propose a new internal clustering validation measure, the clustering validation index based on the neighbors (CVN), which can be exploited to select the optimal result among multiple clustering results. Experimental results show that the proposed algorithm outperforms existing initialization methods on real-world data sets and demonstrates the adaptability of the proposed algorithm to data sets with various characteristics.
ISSN:0899-7667
1530-888X
DOI:10.1162/neco_a_01014