Self-adaption neighborhood density clustering method for mixed data stream with concept drift

Bibliographic details
Published in: Engineering Applications of Artificial Intelligence, 2020-03, Vol. 89, p. 103451, Article 103451
Main authors: Xu, Shuliang; Feng, Lin; Liu, Shenglan; Qiao, Hong
Format: Article
Language: English
Description
Abstract: Clustering analysis is an important data mining method for data streams. This paper proposes a self-adaption neighborhood density clustering method for mixed data streams. The method uses a significant metric criterion to convert categorical attribute values to numeric values, and then reduces the dimension of the data with a nonlinear dimensionality reduction method. In the clustering method, each point is evaluated by its neighborhood density. Once k is determined according to rough set theory, the k points with maximum mutual distance are selected from the data set. In addition, a new similarity measure based on neighborhood entropy is presented. Each data point is assigned to its nearest cluster, and the algorithm adaptively adjusts the cluster center points according to the clustering error. The experimental results show that the proposed method obtains better clustering results than the comparison algorithms on most data sets and that it is effective for data stream clustering.
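The abstract gives no formulas, so the following Python sketch only illustrates two of the steps it names: choosing k mutually distant points and assigning each point to its nearest cluster. It is not the authors' algorithm. The greedy farthest-point heuristic, the Euclidean distance, and the function names `pick_spread_seeds` and `assign_to_nearest` are all assumptions made for illustration. The rough-set choice of k, the neighborhood density evaluation, the neighborhood-entropy similarity, and the error-driven center adjustment are not reproduced here.

```python
import math


def euclidean(a, b):
    """Plain Euclidean distance (an assumption; the paper defines its own similarity)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pick_spread_seeds(points, k):
    """Greedy farthest-point heuristic for picking k mutually distant points.

    Hypothetical stand-in for the paper's selection step: it approximates
    "maximum mutual distance" rather than solving it exactly.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    dim = len(points[0])
    centroid = [sum(p[d] for p in points) / len(points) for d in range(dim)]
    # Start from the point farthest from the overall centroid.
    seeds = [max(points, key=lambda p: euclidean(p, centroid))]
    while len(seeds) < k:
        # Next seed: the point whose nearest already-chosen seed is farthest away.
        seeds.append(max(points, key=lambda p: min(euclidean(p, s) for s in seeds)))
    return seeds


def assign_to_nearest(points, centers):
    """Return, for each point, the index of its nearest center."""
    return [
        min(range(len(centers)), key=lambda i: euclidean(p, centers[i]))
        for p in points
    ]


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (10, 10), (10, 11), (20, 0)]
    centers = pick_spread_seeds(data, k=3)
    print(centers)
    print(assign_to_nearest(data, centers))
```

In a streaming setting this would be run once per incoming data block, with k supplied by whatever model-selection step is in use; here k is simply passed in by the caller.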
ISSN: 0952-1976, 1873-6769
DOI: 10.1016/j.engappai.2019.103451