James–Stein shrinkage to improve k-means cluster analysis

Bibliographic details
Published in: Computational statistics & data analysis 2010-09, Vol. 54 (9), p. 2113-2127
Main authors: Gao, Jinxin; Hitchcock, David B.
Format: Article
Language: English
Description
Summary: We study a general algorithm to improve the accuracy in cluster analysis that employs the James–Stein shrinkage effect in k-means clustering. We shrink the centroids of clusters toward the overall mean of all data using a James–Stein-type adjustment, and then the James–Stein shrinkage estimators act as the new centroids in the next clustering iteration until convergence. We compare the shrinkage results to the traditional k-means method. A Monte Carlo simulation shows that the magnitude of the improvement depends on the within-cluster variance and especially on the effective dimension of the covariance matrix. Using the Rand index, we demonstrate that accuracy increases significantly in simulated data and in a real data example.
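
The iteration described in the summary can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not give their exact adjustment, so the code below assumes the classical positive-part James–Stein form with multiplier p - 2, a spherical covariance whose per-coordinate variance is estimated by pooling within-cluster sums of squares, and the overall mean treated as a fixed shrinkage target. The paper's "effective dimension" of the covariance matrix is not modeled. The names `js_shrink`, `js_kmeans`, and `rand_index` are hypothetical helpers introduced only for this illustration.

```python
import numpy as np

def js_shrink(centroid, target, n_points, sigma2):
    """Positive-part James-Stein shrinkage of one centroid toward `target`.

    Assumption: spherical noise, so the centroid has variance
    sigma2 / n_points in each of its p coordinates.
    """
    p = centroid.shape[0]
    diff = centroid - target
    dist2 = float(diff @ diff)
    if dist2 == 0.0 or n_points == 0:
        return target.copy()
    factor = max(0.0, 1.0 - (p - 2) * (sigma2 / n_points) / dist2)
    return target + factor * diff

def js_kmeans(X, k, n_iter=100, seed=0):
    """k-means in which each updated centroid is shrunk toward the overall mean."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    overall_mean = X.mean(axis=0)
    # Initialise with k distinct data points (an arbitrary choice for this sketch).
    centroids = X[rng.choice(n, size=k, replace=False)].copy()
    labels = np.full(n, -1)
    for _ in range(n_iter):
        # Assignment step: nearest centroid in squared Euclidean distance.
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        new_labels = d2.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments stable, so the centroids are stable too
        labels = new_labels
        # Raw cluster means and pooled within-cluster sum of squares.
        means = centroids.copy()  # an empty cluster keeps its old centroid
        sse = 0.0
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:
                means[j] = members.mean(axis=0)
                sse += ((members - means[j]) ** 2).sum()
        sigma2 = sse / (max(n - k, 1) * p)  # per-coordinate variance estimate
        # Shrinkage step: the shrunken means act as the next centroids.
        for j in range(k):
            n_j = int((labels == j).sum())
            if n_j > 0:
                centroids[j] = js_shrink(means[j], overall_mean, n_j, sigma2)
    return labels, centroids

def rand_index(a, b):
    """Fraction of point pairs on which two labelings agree (same/different cluster)."""
    a, b = np.asarray(a), np.asarray(b)
    same_a = a[:, None] == a[None, :]
    same_b = b[:, None] == b[None, :]
    iu = np.triu_indices(len(a), k=1)  # each unordered pair once
    return float((same_a[iu] == same_b[iu]).mean())
```

With labels from `js_kmeans(X, k)` and a known reference partition, `rand_index(truth, labels)` gives a score in [0, 1], where 1 means the two partitions agree on every pair of points. Comparing that score against a plain k-means run on the same data mirrors the kind of comparison the summary reports, though results from this sketch should not be read as reproducing the paper's findings.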
ISSN: 0167-9473, 1872-7352
DOI: 10.1016/j.csda.2010.03.018