Clustering based approach for incomplete data streams processing
Published in: | Journal of intelligent & fuzzy systems 2020-01, Vol.38 (3), p.3213-3227 |
---|---|
Main authors: | , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | Recent applications such as sensor networks generate continuous and dynamic data streams. Data streams are often gathered from multiple data sources and are partly incomplete. Clustering such data is constrained by the incompleteness of the data, the data distribution, and the continuous nature of data streams. Ignoring missing values when clustering incomplete data, especially at high missing rates, degrades clustering performance. Traditional clustering is applied to the whole data set without accounting for data distribution. This paper presents an efficient framework called Fuzzy c-means clustering for Incomplete Data streams (FID) that works adaptively with incomplete data streams, even at high missing rates. The proposed FID estimates missing values based on the corresponding nearest-neighbors' intervals. To overcome the data stream clustering problems mentioned above, the continuous clustering mechanism is adopted and extended to handle incomplete data streams accurately. Experimental results on two different data sets demonstrate the efficiency of the proposed FID compared with alternative approaches. |
---|---|
ISSN: | 1064-1246, 1875-8967 |
DOI: | 10.3233/JIFS-191184 |
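
The abstract states that FID estimates missing values from the corresponding nearest-neighbors' intervals but gives no further detail. The following Python sketch illustrates one plausible reading of that idea and is not the authors' algorithm. The function name `nn_interval_impute`, the parameter `k`, the use of Euclidean distance over the observed attributes only, the restriction of neighbors to complete rows of the same chunk, and the choice of the interval midpoint as the estimate are all assumptions made for illustration.

```python
import numpy as np


def nn_interval_impute(chunk, k=3):
    """Fill NaNs in one stream chunk using nearest-neighbor intervals.

    Assumptions (not taken from the paper): neighbors come from the
    complete rows of the same chunk, distance is Euclidean over the
    attributes the incomplete row actually has, and the midpoint of
    the neighbors' [min, max] interval is used as the estimate.
    """
    chunk = np.asarray(chunk, dtype=float)
    filled = chunk.copy()
    complete = chunk[~np.isnan(chunk).any(axis=1)]
    if len(complete) == 0:
        raise ValueError("chunk has no complete rows to borrow from")
    k = min(k, len(complete))

    for i, row in enumerate(chunk):
        missing = np.isnan(row)
        if not missing.any():
            continue
        observed = ~missing
        # Partial distance: compare only on attributes this row has.
        diffs = complete[:, observed] - row[observed]
        dists = np.sqrt((diffs ** 2).sum(axis=1))
        # Stable sort keeps the neighbor choice deterministic on ties.
        neighbors = complete[np.argsort(dists, kind="stable")[:k]]
        # Interval spanned by the neighbors on each missing attribute.
        lower = neighbors[:, missing].min(axis=0)
        upper = neighbors[:, missing].max(axis=0)
        filled[i, missing] = (lower + upper) / 2.0
    return filled


chunk = [[1.0, 10.0], [2.0, 20.0], [9.0, 90.0], [1.5, float("nan")]]
print(nn_interval_impute(chunk, k=2))
```

In this toy chunk the incomplete row is closest to the first two rows on its observed attribute, so its missing value is filled with 15.0, the midpoint of the interval [10, 20]. A clustering step such as fuzzy c-means would then run on the filled chunk; that step is omitted here.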