A Hybrid Feature Selection Optimization Model for High Dimension Data Classification

Bibliographic Details
Published in: IEEE Access 2021, Vol. 9, p. 42884-42895
Main authors: Qaraad, Mohammed; Amjad, Souad; Manhrawy, Ibrahim I. M.; Fathi, Hanaa; Hassan, Bayoumi Ali; Kafrawy, Passent El
Format: Article
Language: English
Description
Summary: Feature selection is an NP-hard combinatorial problem in which the number of possible feature subsets grows exponentially with the number of features. For high-dimensional data, the goal of feature selection is to find the smallest feature subset that is still the most informative. In this paper, we propose a hybrid feature selection optimization model for cancer classification, called ENSVM. The model is based on the Elastic Net (EN) method, which regularizes and selects variables, applied here to gene selection in genomic microarray data. We applied three optimization techniques, namely Social Ski-Driver (SSD), Randomized SearchCV (RS), and Elastic NetCV (ENCV), to tune the Elastic Net, and used a traditional Support Vector Machine (SVM) for classification. To evaluate the model, we compared the results of applying ENSVM to seven genomic microarray datasets with those of the SSD-SVM model and of an SVM with an RBF kernel and no feature selection. The comparison showed that ENSVM is effective at selecting a feature subset that maximizes classification performance. Accordingly, minimizing the number of features is important for both performance and accuracy when analyzing high-dimensional data. Moreover, the ENSVM model outperforms the SSD-SVM model.
ISSN:2169-3536
DOI:10.1109/ACCESS.2021.3065341
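
The pipeline outlined in the summary (Elastic Net feature selection, tuned by randomized search, feeding an RBF-kernel SVM) might be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' implementation: the synthetic dataset, the candidate values for `alpha` and `l1_ratio`, the selection threshold, and the train/test split are all assumptions chosen for the example, and the Social Ski-Driver optimizer is not shown.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Assumption: a synthetic stand-in for microarray data (few samples, many features).
X, y = make_classification(
    n_samples=120, n_features=500, n_informative=10, n_redundant=0, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Elastic Net acts as the feature selector: features whose absolute
# coefficient falls below the threshold are dropped before the SVM sees them.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("select", SelectFromModel(ElasticNet(max_iter=10_000), threshold=1e-5)),
    ("svm", SVC(kernel="rbf")),
])

# Assumption: illustrative candidate values, not taken from the paper.
param_distributions = {
    "select__estimator__alpha": [0.01, 0.02, 0.05, 0.1],
    "select__estimator__l1_ratio": [0.3, 0.5, 0.7],
}

# Randomized search samples a few Elastic Net settings and keeps the one
# with the best cross-validated SVM accuracy.
search = RandomizedSearchCV(
    pipeline, param_distributions, n_iter=5, cv=3, random_state=0
)
search.fit(X_train, y_train)

n_selected = int(search.best_estimator_.named_steps["select"].get_support().sum())
selected_accuracy = search.score(X_test, y_test)

# Baseline for comparison: RBF-kernel SVM on all features, no selection.
baseline = Pipeline([("scale", StandardScaler()), ("svm", SVC(kernel="rbf"))])
baseline_accuracy = baseline.fit(X_train, y_train).score(X_test, y_test)

print(f"features kept: {n_selected} of {X.shape[1]}")
print(f"accuracy with selection: {selected_accuracy:.3f}")
print(f"accuracy without selection: {baseline_accuracy:.3f}")
```

Placing the selector inside the pipeline means it is refit on each cross-validation fold, so held-out samples do not influence which features are kept. Results on synthetic data say nothing about the accuracies reported in the article.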