A study of density-grid based clustering algorithms on data streams

Bibliographic details
Main authors: Amini, A., Teh Ying Wah, Saybani, M. R., Yazdi, S. R. A. S.
Format: Conference proceeding
Language: English
Description
Summary: Clustering data streams has attracted many researchers as applications that generate data streams have become more popular. Several distance-based clustering algorithms have been introduced for data streams, but they cannot find clusters of arbitrary shape and cannot handle outliers. Density-based clustering algorithms can both find arbitrarily shaped clusters and deal with noise in the data. In density-based clustering, dense areas of objects in the data space are considered clusters, which are separated by low-density areas. Another group of clustering methods for data streams is grid-based clustering, in which the data space is quantized into a finite number of cells that form a grid structure, and clustering is performed on the grid. Grid-based clustering maps the infinite number of data records in a data stream to a finite number of grid cells. In this paper we review the grid-based clustering algorithms that use density-based algorithms or the density concept for clustering. We call them density-grid clustering algorithms. We explore the algorithms in detail, along with their merits and limitations. The algorithms are also summarized in a table based on their important features. In addition, we discuss how well the algorithms address the challenging issues in clustering data streams.
DOI: 10.1109/FSKD.2011.6019867
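
Illustrative sketch (not taken from the paper): the summary describes quantizing the data space into cells and treating dense cells as cluster material. The Python below is a minimal sketch of that general idea, assuming a fixed cell size, a simple count threshold for "dense", and merging of adjacent dense cells. The names `grid_cell`, `density_grid_clusters`, `cell_size`, and `min_points` are hypothetical, and the sketch omits features that streaming algorithms typically need, such as density decay over time.

```python
from collections import defaultdict, deque
from itertools import product


def grid_cell(point, cell_size):
    """Map a point to the integer coordinates of the grid cell containing it."""
    return tuple(int(x // cell_size) for x in point)


def density_grid_clusters(stream, cell_size=1.0, min_points=3):
    """Label dense grid cells; adjacent dense cells share a cluster label.

    Assumption: a cell is "dense" when it holds at least `min_points` records.
    Returns a dict mapping dense cell coordinates to a cluster label.
    """
    # One pass over the stream: only per-cell counts are kept, not the records.
    counts = defaultdict(int)
    for point in stream:
        counts[grid_cell(point, cell_size)] += 1

    dense = {cell for cell, n in counts.items() if n >= min_points}

    # Breadth-first search joins dense cells that touch (including diagonals).
    labels = {}
    next_label = 0
    for start in sorted(dense):
        if start in labels:
            continue
        labels[start] = next_label
        queue = deque([start])
        while queue:
            cell = queue.popleft()
            for offset in product((-1, 0, 1), repeat=len(cell)):
                neighbour = tuple(c + o for c, o in zip(cell, offset))
                if neighbour in dense and neighbour not in labels:
                    labels[neighbour] = next_label
                    queue.append(neighbour)
        next_label += 1
    return labels


points = [
    (0.1, 0.1), (0.2, 0.3), (0.4, 0.9),   # cell (0, 0)
    (1.1, 0.2), (1.5, 0.5), (1.9, 0.1),   # cell (1, 0), adjacent to (0, 0)
    (5.2, 5.1), (5.5, 5.5), (5.8, 5.9),   # cell (5, 5), isolated dense cell
    (9.0, 0.5),                           # cell (9, 0), sparse, treated as noise
]
labels = density_grid_clusters(points, cell_size=1.0, min_points=3)
```

In this sketch, memory grows with the number of occupied cells rather than with the number of records, which is the property the summary attributes to grid-based clustering. Sparse cells receive no label, which is one simple way to treat outliers as noise.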