Hybrid feature selection using micro genetic algorithm on microarray gene expression data

Bibliographic details
Published in: Journal of intelligent & fuzzy systems 2019-01, Vol. 36 (3), p. 2241-2246
Main authors: Pragadeesh, C., Jeyaraj, Rohana, Siranjeevi, K., Abishek, R., Jeyakumar, G.
Format: Article
Language: English
Online access: Full text
Description
Abstract: Research has shown that DNA microarray data containing gene expression profiles are potentially excellent diagnostic tools in the medical industry. A persistent problem with accessible microarray datasets is that the number of samples is much smaller than the number of features. To extract accurate information from such a dataset, one must therefore use a robust technique. Feature selection (FS) has proved to be an effective way to discard irrelevant and noisy data. In FS, relevant features are picked, which results in commendable classification accuracy. This paper proposes a model that employs a compound hybrid feature selection technique (Filter + Wrapper) to classify microarray cancer data. First, a filter method, Information Gain (IG), is used to eliminate redundant features that will not contribute significantly to the final classification. Next, an evolutionary computing technique, the micro Genetic Algorithm (mGA), is employed to find the best minimal subset of required features. The selected features are then classified with a traditional Support Vector Classifier and cross-validated to obtain high classification accuracy using a minimal number of features. Adding mGA significantly reduces the complexity of the model compared with existing models that use other feature selection algorithms.
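
The filter-then-wrapper pipeline described in the abstract can be illustrated with a minimal sketch in pure Python. This is not the authors' implementation, and several parts are assumptions made for illustration: expression values are assumed to be already discretized, a leave-one-out 1-nearest-neighbour classifier stands in for the cross-validated Support Vector Classifier, the micro GA restarts after a fixed number of inner steps instead of testing for convergence, and all function names and parameter values (population of 5, size penalty of 0.01) are hypothetical.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """IG of one discrete feature: H(labels) - H(labels | feature)."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def ig_filter(X, y, k):
    """Filter step: indices of the k columns of X with the highest IG."""
    scores = [information_gain([row[j] for row in X], y)
              for j in range(len(X[0]))]
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]


def loo_accuracy(X, y, cols):
    """Leave-one-out 1-NN accuracy on the given columns.
    Assumption: a cheap stand-in for cross-validated SVC accuracy."""
    if not cols:
        return 0.0
    hits = 0
    for i in range(len(X)):
        nearest = min((k for k in range(len(X)) if k != i),
                      key=lambda k: sum((X[i][j] - X[k][j]) ** 2 for j in cols))
        hits += y[nearest] == y[i]
    return hits / len(X)


def micro_ga(n_bits, fitness, pop_size=5, generations=30, inner_steps=10, seed=0):
    """Micro GA sketch: tiny population, elitism, crossover only, and a
    restart around the best individual in place of mutation."""
    rng = random.Random(seed)

    def fresh():
        return [rng.randint(0, 1) for _ in range(n_bits)]

    best = fresh()
    for _ in range(generations):
        # Restart: keep the best individual, refill the rest at random.
        pop = [best] + [fresh() for _ in range(pop_size - 1)]
        for _ in range(inner_steps):
            pop.sort(key=fitness, reverse=True)
            children = [pop[0]]  # elitism
            while len(children) < pop_size:
                a = max(rng.sample(pop, 2), key=fitness)  # binary tournament
                b = max(rng.sample(pop, 2), key=fitness)
                # Uniform crossover between the two tournament winners.
                children.append([p if rng.random() < 0.5 else q
                                 for p, q in zip(a, b)])
            pop = children
        best = max(pop + [best], key=fitness)
    return best


def select_features(X, y, k, penalty=0.01, seed=0):
    """Wrapper step: search subsets of the k IG survivors with the micro GA."""
    kept = ig_filter(X, y, k)

    def fitness(mask):
        cols = [kept[i] for i, bit in enumerate(mask) if bit]
        return loo_accuracy(X, y, cols) - penalty * len(cols)

    mask = micro_ga(len(kept), fitness, seed=seed)
    return [kept[i] for i, bit in enumerate(mask) if bit]


if __name__ == "__main__":
    # Toy data: 4 samples, 3 discretized "genes"; only column 0 tracks the label.
    X = [[0, 1, 0], [0, 0, 0], [1, 1, 0], [1, 0, 0]]
    y = [0, 0, 1, 1]
    print(select_features(X, y, k=2))
```

The fitness function rewards accuracy and subtracts a small penalty per selected feature, which is one simple way to express the goal of high accuracy with a minimal feature subset. On real microarray data, the stand-in classifier would be replaced by a cross-validated SVC.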
ISSN:1064-1246
1875-8967
DOI:10.3233/JIFS-169935