An improved clustering algorithm and its application in IoT data analysis
Published in: | Computer networks (Amsterdam, Netherlands : 1999), 2019-08, Vol.159, p.63-72 |
---|---|
Main authors: | , , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | With the spread of the Internet of Things (IoT), data volumes are exploding. Data analysis is the foundation of IoT-based applications, and clustering is an important tool for data analysis. In clustering, determining the number of clusters is an important issue; the number can be either specified manually or determined automatically. Manual methods have many disadvantages, whereas automatic methods have distinct advantages, and their critical task is to design an appropriate algorithm for updating the number of clusters. Although much research has been done, most existing methods are either ineffective or cannot guarantee unique clustering results and good clustering accuracy. Meanwhile, considering that IoT-based applications always involve both numerical and non-numerical data, and that treating all non-numerical data in the same way is impractical, we further classify the non-numerical attributes according to their nature and explore a corresponding similarity metric for each class. On this basis, an algorithm for determining the initial cluster centers is put forward, using the dissimilarities and densities of the data objects. An improved clustering algorithm is then designed based on a revised inter-cluster entropy for mixed data. Experiments on three datasets from the University of California at Irvine (UCI) show that the improved clustering algorithm is a deterministic clustering algorithm with good performance. |
---|---|
ISSN: | 1389-1286 1872-7069 |
DOI: | 10.1016/j.comnet.2019.04.022 |
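
The abstract describes choosing initial cluster centers from the dissimilarities and densities of mixed (numerical and non-numerical) data objects, but this record does not give the paper's actual metrics. As a rough illustration only, the sketch below assumes a Gower-style dissimilarity (range-normalised difference for numerical attributes, simple 0/1 mismatch for all others) and a density proxy of one minus the mean dissimilarity to the other objects. The function names `mixed_dissimilarity` and `initial_centers`, the scoring rule, and the tie-breaking by index are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence, Set


def mixed_dissimilarity(a: Sequence, b: Sequence,
                        numeric_idx: Set[int],
                        ranges: Dict[int, float]) -> float:
    """Gower-style dissimilarity in [0, 1] (an assumption, not the paper's metric)."""
    total = 0.0
    for i, (x, y) in enumerate(zip(a, b)):
        if i in numeric_idx:
            # Numerical attribute: absolute difference scaled by the attribute range.
            total += abs(x - y) / ranges[i] if ranges[i] > 0 else 0.0
        else:
            # Non-numerical attribute: 0 if equal, 1 otherwise.
            total += 0.0 if x == y else 1.0
    return total / len(a)


def initial_centers(data: Sequence[Sequence],
                    numeric_idx: Set[int], k: int) -> List[int]:
    """Pick k initial center indices deterministically (hypothetical rule)."""
    n = len(data)
    if n < 2 or not 1 <= k <= n:
        raise ValueError("need at least 2 objects and 1 <= k <= len(data)")
    ranges = {i: max(r[i] for r in data) - min(r[i] for r in data)
              for i in numeric_idx}
    # Full pairwise dissimilarity matrix.
    d = [[mixed_dissimilarity(data[p], data[q], numeric_idx, ranges)
          for q in range(n)] for p in range(n)]
    # Mean dissimilarity to the other objects; a small value means a dense region.
    mean_d = [sum(row) / (n - 1) for row in d]
    # First center: the densest object, ties broken by the lowest index.
    centers = [min(range(n), key=lambda p: (mean_d[p], p))]
    while len(centers) < k:
        candidates = [p for p in range(n) if p not in centers]
        # Next center: high density and far from every center chosen so far.
        centers.append(max(
            candidates,
            key=lambda p: ((1.0 - mean_d[p]) * min(d[p][c] for c in centers), -p),
        ))
    return centers


if __name__ == "__main__":
    # Toy records: (sensor reading, device type); values are made up.
    sample = [(1.0, "a"), (1.2, "a"), (0.9, "a"),
              (9.0, "b"), (9.5, "b"), (8.8, "b")]
    print(initial_centers(sample, numeric_idx={0}, k=2))
```

Because no random seeding is involved, repeated runs on the same data return the same centers, which mirrors the deterministic behaviour the abstract claims for the improved algorithm. The sketch treats every non-numerical attribute identically, whereas the paper argues for separate metrics per attribute type.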