A two-phase K-means algorithm for large datasets
Published in: | Proceedings of the Institution of Mechanical Engineers. Part C, Journal of mechanical engineering science, 2004-10, Vol.218 (10), p.1269-1273 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Full text |
Summary: | Abstract
One of the drawbacks of the K-means algorithm is the need for several iterations over datasets before it converges on a solution. Therefore, its application is limited to relatively small datasets. This paper presents a scalable version of the K-means algorithm that employs a buffering technique. The new algorithm, Two-Phase K-means, can robustly find a good solution in only one iteration. |
---|---|
ISSN: | 0954-4062 2041-2983 |
DOI: | 10.1243/0954406042369008 |
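
The abstract names a buffering technique and a single iteration over the data but gives no algorithmic detail. The following Python sketch is therefore only one plausible reading, not the authors' published method: it assumes that phase one compresses each buffer into a few weighted local centroids and that phase two clusters those summaries in memory. The names `two_phase_kmeans`, `weighted_kmeans`, `buffer_size`, and `local_k` are hypothetical and do not come from the paper.

```python
import random


def _sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def weighted_kmeans(points, weights, k, iters=20, seed=0):
    """Lloyd-style K-means on weighted points that fit in memory.

    Returns (centers, cluster_weights). Hypothetical helper.
    """
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    dim = len(points[0])
    totals = [0.0] * k
    for _ in range(iters):
        sums = [[0.0] * dim for _ in range(k)]
        totals = [0.0] * k
        for p, w in zip(points, weights):
            # Assign the point to its nearest current center.
            j = min(range(k), key=lambda c: _sq_dist(p, centers[c]))
            totals[j] += w
            for d in range(dim):
                sums[j][d] += w * p[d]
        for j in range(k):
            if totals[j] > 0:  # leave empty clusters where they are
                centers[j] = [s / totals[j] for s in sums[j]]
    return centers, totals


def two_phase_kmeans(stream, k, buffer_size=1000, local_k=None, seed=0):
    """Cluster a stream while reading each point exactly once (assumed design)."""
    local_k = local_k if local_k is not None else 4 * k  # assumed default
    summaries, summary_weights, buffer = [], [], []

    def flush():
        # Phase 1 (assumed): compress the buffer into weighted local centroids.
        if not buffer:
            return
        kk = min(local_k, len(buffer))
        local_centers, local_totals = weighted_kmeans(
            buffer, [1.0] * len(buffer), kk, seed=seed)
        for c, t in zip(local_centers, local_totals):
            if t > 0:
                summaries.append(c)
                summary_weights.append(t)
        buffer.clear()

    for p in stream:
        buffer.append(tuple(p))
        if len(buffer) == buffer_size:
            flush()
    flush()  # handle a final, partially filled buffer

    # Phase 2 (assumed): cluster the much smaller set of weighted summaries.
    centers, _ = weighted_kmeans(
        summaries, summary_weights, min(k, len(summaries)), seed=seed)
    return centers


# Usage on synthetic data: two well-separated 2-D blobs, streamed once.
data_rng = random.Random(1)


def point_stream(n=1000):
    for _ in range(n):
        cx = 0.0 if data_rng.random() < 0.5 else 10.0
        yield (data_rng.gauss(cx, 0.5), data_rng.gauss(cx, 0.5))


centers = two_phase_kmeans(point_stream(), k=2, buffer_size=100)
```

In this sketch the repeated Lloyd iterations run only over one buffer or over the summaries, so the full dataset is never held in memory or re-read. Whether this matches the paper's buffering scheme cannot be determined from the abstract alone.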