K-modes Algorithm Based on Rough Set and Information Entropy

Bibliographic details
Published in: Journal of physics. Conference series 2021-02, Vol.1754 (1), p.12239
Main authors: Xingyu, Gong; Ke, Cao; Pengtao, Jia; Shangfu, Gong
Format: Article
Language: English
Description
Summary: The traditional K-modes algorithm is susceptible to interference from redundant attributes, and it uses only 0-1 matching to define the distance between the attribute values of two objects, without fully considering the influence of each categorical attribute on the clustering result. To overcome these shortcomings, this paper proposes an improved K-modes clustering algorithm based on rough sets and information entropy. To handle the large number of redundant attributes in clustering data, the paper first uses a rough-set attribute reduction algorithm to eliminate redundant attributes and determine the importance of each attribute, then combines this with information gain to determine the weight of each attribute. Finally, it compares the traditional and improved algorithms on five data sets from the UCI Machine Learning Repository, including Soybean-Small and Zoo. The experimental results show that the improved algorithm achieves higher clustering efficiency and accuracy than the traditional algorithm.
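
For orientation, the difference between 0-1 matching and an attribute-weighted distance can be sketched as follows. This is a minimal illustration only: the record does not give the paper's formulas, so the function names and the entropy-based weighting in `entropy_weights` are assumptions made here, not the authors' method, and the rough-set reduction step is not shown.

```python
from collections import Counter
from math import log2


def matching_distance(x, y):
    """Classic K-modes 0-1 matching: count the attributes whose values differ."""
    return sum(a != b for a, b in zip(x, y))


def weighted_matching_distance(x, y, weights):
    """Weighted variant: each mismatch contributes its attribute's weight."""
    return sum(w for a, b, w in zip(x, y, weights) if a != b)


def attribute_entropy(column):
    """Shannon entropy (in bits) of one categorical column."""
    n = len(column)
    return -sum((c / n) * log2(c / n) for c in Counter(column).values())


def entropy_weights(rows):
    """Hypothetical weighting (an assumption, not the paper's formula):
    per-attribute entropy, normalised so the weights sum to 1."""
    columns = list(zip(*rows))
    entropies = [attribute_entropy(col) for col in columns]
    total = sum(entropies)
    if total == 0:
        # Every attribute is constant: fall back to uniform weights.
        return [1 / len(columns)] * len(columns)
    return [h / total for h in entropies]
```

With weights of this kind, a mismatch on an attribute that never varies contributes nothing to the distance, whereas plain 0-1 matching counts every mismatch equally.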
ISSN: 1742-6588, 1742-6596
DOI: 10.1088/1742-6596/1754/1/012239