A Hybrid Approach from Ant Colony Optimization and K-nearest Neighbor for Classifying Datasets Using Selected Features

Bibliographic Details
Published in: Informatica (Ljubljana) 2017-12, Vol.41 (4), p.495-506
Main authors: El Houby, Enas M F; Yassin, Nisreen I R; Omran, Shaimaa
Format: Article
Language: English
Description
Abstract: This paper presents an Ant Colony Optimization (ACO) approach for feature selection. The challenge in the feature selection problem is the large search space that arises from redundant or irrelevant features, which degrade classifier performance. The proposed approach aims to minimize the subset of features used in classification and maximize the classification accuracy. It uses several groups of ants; each group selects candidate features using different criteria. The ACO approach applies to the datasets a fitness function composed of a heuristic value component and a pheromone value component. The heuristic information is represented by the Class-Separability (CS) value of the candidate feature. The pheromone value is calculated from the classification accuracy obtained by adding the candidate feature. K-Nearest Neighbor is used as the classifier. Sequential forward feature selection is applied: features are selected sequentially from the most highly recommended ones until the accuracy is enhanced. The proposed approach is applied to different medical datasets, yielding promising results and findings. The classification accuracy increases with the selected features for the different datasets. The selected features that achieved the best accuracy for each dataset are given.
ISSN: 0350-5596, 1854-3871
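
The selection loop summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it is a deterministic, single-ant simplification, and everything the abstract does not specify is an assumption. In particular, the Fisher-style form of the class-separability score, the leave-one-out k-NN evaluation, the exponents `alpha` and `beta`, and the rule of stopping once accuracy no longer improves are all hypothetical choices, as are the function names.

```python
from collections import Counter

def class_separability(X, y, j):
    """Assumed heuristic: between-class over within-class variance of feature j."""
    values = [row[j] for row in X]
    overall = sum(values) / len(values)
    between = within = 0.0
    for label in set(y):
        group = [row[j] for row, t in zip(X, y) if t == label]
        mean = sum(group) / len(group)
        between += len(group) * (mean - overall) ** 2
        within += sum((v - mean) ** 2 for v in group)
    return between / (within + 1e-12)  # small constant avoids division by zero

def knn_accuracy(X, y, features, k=1):
    """Leave-one-out accuracy of k-NN restricted to the given feature indices."""
    correct = 0
    for i, row in enumerate(X):
        # Squared Euclidean distance to every other sample, paired with its label.
        dists = sorted(
            (sum((row[f] - other[f]) ** 2 for f in features), y[n])
            for n, other in enumerate(X) if n != i
        )
        votes = Counter(label for _, label in dists[:k])
        correct += votes.most_common(1)[0][0] == y[i]
    return correct / len(X)

def select_features(X, y, k=1, alpha=1.0, beta=1.0):
    """Greedy forward selection scored by pheromone**alpha * heuristic**beta."""
    n_features = len(X[0])
    heuristic = [class_separability(X, y, j) for j in range(n_features)]
    selected, best_acc = [], 0.0
    remaining = set(range(n_features))
    while remaining:
        # Pheromone of a candidate: accuracy obtained when it joins the current subset.
        pheromone = {j: knn_accuracy(X, y, selected + [j], k) for j in remaining}
        best = max(sorted(remaining),
                   key=lambda j: pheromone[j] ** alpha * heuristic[j] ** beta)
        if pheromone[best] <= best_acc:
            break  # assumed stopping rule: no further accuracy gain
        selected.append(best)
        best_acc = pheromone[best]
        remaining.remove(best)
    return selected, best_acc

# Toy data (made up for illustration): feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 5.0], [0.1, 1.0], [0.2, 4.0], [1.0, 1.2], [1.1, 4.8], [1.2, 3.9]]
y = [0, 0, 0, 1, 1, 1]
selected, accuracy = select_features(X, y)
```

On this toy data the sketch keeps only feature 0. A fuller ACO version would sample candidates probabilistically across several groups of ants and update pheromone over iterations, which this simplification omits.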