Accelerating a Lloyd-Type k-Means Clustering Algorithm with Summable Lower Bounds in a Lower-Dimensional Space

This paper presents an efficient acceleration algorithm for Lloyd-type k-means clustering, which is suitable to a large-scale and high-dimensional data set with potentially numerous classes. The algorithm employs a novel projection-based filter (PRJ) to avoid unnecessary distance calculations, resul...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:IEICE Transactions on Information and Systems 2018/11/01, Vol.E101.D(11), pp.2773-2783
Hauptverfasser: AOYAMA, Kazuo, SAITO, Kazumi, IKEDA, Tetsuo
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:This paper presents an efficient acceleration algorithm for Lloyd-type k-means clustering, which is suitable to a large-scale and high-dimensional data set with potentially numerous classes. The algorithm employs a novel projection-based filter (PRJ) to avoid unnecessary distance calculations, resulting in high-speed performance keeping the same results as a standard Lloyd's algorithm. The PRJ exploits a summable lower bound on a squared distance defined in a lower-dimensional space to which data points are projected. The summable lower bound can make the bound tighter dynamically by incremental addition of components in the lower-dimensional space within each iteration although the existing lower bounds used in other acceleration algorithms work only once as a fixed filter. Experimental results on large-scale and high-dimensional real image data sets demonstrate that the proposed algorithm works at high speed and with low memory consumption when large k values are given, compared with the state-of-the-art algorithms.
ISSN:0916-8532
1745-1361
DOI:10.1587/transinf.2017EDP7392