Clustering and feature selection via PSO algorithm

Bibliographic details
Main authors: Javani, M., Faez, K., Aghlmandi, D.
Format: Conference proceeding
Language: eng
Description
Summary: Clustering is one of the most popular techniques for data analysis. In this paper, we propose a new method for simultaneous clustering and feature selection using multi-objective particle swarm optimization (PSO). Different features may have different importance in different contexts: some may be irrelevant, and some may be misleading for clustering. We therefore assign a weight to each feature and omit the features whose weight falls below a threshold value that the algorithm itself produces automatically. Evolutionary algorithms are among the best-known techniques for clustering, but clustering algorithms based on them have two main problems. First, they are slow; second, they depend on the shape of the clusters and mostly work well only on a specific dataset. To address the first problem and increase the speed of the algorithm, we use two local searches to improve the cluster centers and to estimate the threshold value. To address the second problem, we evaluate the clustering with a multi-objective fitness function that combines two validation criteria: a newly proposed KMPBM validation criterion and the Conn validation criterion. Because these two criteria are based on compactness and connectedness, they can work independently of the shape of the clusters. Experiments on three synthetic datasets and three real datasets show that the proposed algorithm clusters independently of cluster shape and can achieve good accuracy on datasets of any shape.
DOI:10.1109/AISP.2011.5960988
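
Illustrative note: the feature-weighting idea in the summary (drop features whose weight is below a threshold, then cluster on the rest) can be sketched roughly as follows. This is a minimal sketch and not the authors' implementation. The function names, the weighted squared Euclidean distance, and the example values are assumptions made for illustration. In the paper the weights and the threshold come from the PSO and local searches, whereas here they are simply passed in.

```python
# Hypothetical helper names; weights and threshold are given as inputs here,
# whereas the paper's algorithm is described as producing them itself.

def select_features(weights, threshold):
    """Return the indices of features whose weight is at least the threshold."""
    return [j for j, w in enumerate(weights) if w >= threshold]


def assign_clusters(points, centers, weights, threshold):
    """Assign each point to its nearest center using only the kept features.

    Distance is a weighted squared Euclidean distance (an assumption; the
    abstract does not state which distance the authors use).
    """
    keep = select_features(weights, threshold)
    labels = []
    for p in points:
        dists = [
            sum(weights[j] * (p[j] - c[j]) ** 2 for j in keep)
            for c in centers
        ]
        labels.append(dists.index(min(dists)))
    return labels, keep


# Toy data: the third feature is noise and carries a low weight.
points = [[0.0, 0.0, 5.0], [0.1, 0.2, -3.0], [4.0, 4.0, 1.0], [4.2, 3.9, -6.0]]
centers = [[0.0, 0.0, 0.0], [4.0, 4.0, 0.0]]
weights = [0.9, 0.8, 0.05]

labels, keep = assign_clusters(points, centers, weights, threshold=0.1)
print(keep)    # indices of the retained features
print(labels)  # cluster index for each point
```

With these toy values the low-weight third feature is discarded, so the assignment depends only on the first two features.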