A new filter-based gene selection approach in the DNA microarray domain

Bibliographic details
Published in: Expert Systems with Applications, 2024-04, Vol. 240, p. 122504, Article 122504
Main authors: Ouaderhman, Tayeb; Chamlal, Hasna; Janane, Fatima Zahra
Format: Article
Language: English
Description
Summary: The high dimensionality of data hinders the learning ability of machine learning algorithms. Feature selection techniques can be used to reduce dimensionality, which is an important step in processing high-dimensional data. Feature selection addresses this problem by removing irrelevant and redundant information, which can improve learning models, reduce computation time, and improve learning accuracy. In this paper, a novel filter for feature selection in mixed-attribute datasets is proposed. The independent attributes are mixed, or heterogeneous, in the sense that both numerical and categorical attribute types may appear together in the same dataset. Based on preordonnance theory, we use a new concept to quantify the relevance and redundancy of features even when the data are heterogeneous (mixed-type). The technique for order preference by similarity to the ideal solution (TOPSIS), a well-known multicriteria decision-making method, is used as a weighting and informative feature selection filter. To assess the effectiveness of the proposed method, several experiments on both simulated and real data are performed, including a comparison with other well-known filter methods. The experimental results show that, in most cases, the method yields results competitive with those of the other methods.
• Introducing new criteria for relevance and complementarity.
• Dealing with mixed-attribute datasets without requiring a preprocessing step.
• Using the multicriteria decision method TOPSIS to score the explanatory features.
• Investigating the performance of the proposed filter on high-dimensional data using simulated and real datasets.
ISSN: 0957-4174, 1873-6793
DOI: 10.1016/j.eswa.2023.122504
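
The summary names TOPSIS as the scoring step but this record does not give the paper's formulas. As a rough illustration only, the sketch below implements the generic textbook TOPSIS procedure (vector normalization, weighting, distances to the ideal and anti-ideal points, closeness coefficient). The function name topsis_scores, the two criteria columns (a relevance score to maximize and a redundancy score to minimize), the equal weights, and the numbers are all hypothetical; the preordonnance-based criteria of the paper are not reproduced here.

```python
import math


def topsis_scores(criteria, weights, benefit):
    """Generic TOPSIS closeness scores (not the paper's exact procedure).

    criteria: one row per candidate feature, one column per criterion.
    weights:  one non-negative weight per criterion.
    benefit:  True where larger is better, False where smaller is better.
    """
    n_crit = len(weights)
    total = sum(weights)
    w = [x / total for x in weights]

    # Vector-normalize each criterion column; guard against all-zero columns.
    norms = [
        math.sqrt(sum(row[j] ** 2 for row in criteria)) or 1.0
        for j in range(n_crit)
    ]
    v = [[w[j] * row[j] / norms[j] for j in range(n_crit)] for row in criteria]

    # Ideal and anti-ideal points, taken column by column.
    cols = list(zip(*v))
    ideal = [max(c) if b else min(c) for c, b in zip(cols, benefit)]
    anti = [min(c) if b else max(c) for c, b in zip(cols, benefit)]

    scores = []
    for row in v:
        d_pos = math.dist(row, ideal)  # distance to the ideal point
        d_neg = math.dist(row, anti)   # distance to the anti-ideal point
        denom = d_pos + d_neg
        # Closeness coefficient in [0, 1]; 0.5 if all candidates coincide.
        scores.append(d_neg / denom if denom > 0 else 0.5)
    return scores


# Hypothetical criteria: column 0 = relevance (benefit), column 1 = redundancy (cost).
example = [[0.9, 0.1], [0.5, 0.5], [0.1, 0.9]]
scores = topsis_scores(example, weights=[1, 1], benefit=[True, False])
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

Under these assumptions, a filter would keep the top-ranked features in ranking; how many to keep is a separate choice that the summary does not specify.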