A biobjective feature selection algorithm for large omics datasets

Detailed description

Bibliographic details
Published in: Expert Systems, 2018-08, Vol. 35 (4), p. n/a
Main authors: Cavique, Luís; Mendes, Armando B.; Martiniano, Hugo F.M.C.; Correia, Luís
Format: Article
Language: English
Online access: Full text
Description
Summary: Feature selection is a central technique in data mining when dimensionality reduction is needed. Its performance is measured by predictive accuracy and by the comprehensibility of the result. Consistency-based methods are a significant category of feature selection research; they substantially improve the comprehensibility of the result by applying the parsimony principle. In this work, the biobjective version of the algorithm logical analysis of inconsistent data is applied to large volumes of data. To handle hundreds of thousands of attributes, a heuristic decomposition uses parallel processing to solve a set covering problem, together with a cross-validation technique. Each biobjective solution reports the reduced number of features and the accuracy. The algorithm is applied to omics datasets with genome-like characteristics from patients with rare diseases.
ISSN: 0266-4720, 1468-0394
DOI: 10.1111/exsy.12301
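
The summary describes feature selection posed as a set covering problem solved heuristically. The following is a minimal sketch of that general idea only, not the authors' algorithm: it assumes discrete-valued features, builds one constraint per pair of observations with different class labels, and uses a plain greedy heuristic. The function name greedy_feature_cover and the toy data are hypothetical, and the decomposition, parallel processing, and cross-validation mentioned in the summary are not shown.

```python
from itertools import combinations


def greedy_feature_cover(rows, labels):
    """Greedily pick feature indices so that every pair of rows with
    different labels differs on at least one picked feature.

    Illustrative assumption: features are discrete and compared by equality.
    """
    n_features = len(rows[0])

    # One constraint per pair of differently labelled rows:
    # the set of features on which the two rows disagree.
    uncovered = []
    for i, j in combinations(range(len(rows)), 2):
        if labels[i] != labels[j]:
            diff = {f for f in range(n_features) if rows[i][f] != rows[j][f]}
            if diff:
                uncovered.append(diff)
            # An empty diff means identical rows with different labels
            # (an inconsistency); this sketch simply skips such pairs.

    selected = []
    while uncovered:
        # Count how many open constraints each feature would satisfy.
        counts = [sum(f in d for d in uncovered) for f in range(n_features)]
        best = max(range(n_features), key=counts.__getitem__)
        selected.append(best)
        uncovered = [d for d in uncovered if best not in d]
    return selected


# Toy data: feature 0 alone separates the two classes.
rows = [(0, 0, 1), (0, 1, 1), (1, 0, 0), (1, 1, 0)]
labels = [0, 0, 1, 1]
print(greedy_feature_cover(rows, labels))
```

The greedy rule favours parsimony by taking, at each step, the feature that resolves the most remaining class-distinguishing pairs. A biobjective treatment would additionally track accuracy for each candidate feature subset, which this sketch leaves out.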