Ensemble Feature Selection in Binary Machine Learning Classification: A Novel Application of the Evaluation Based on Distance from Average Solution (EDAS) Method

Bibliographic details
Published in: Mathematical Problems in Engineering, 2022-09, Vol. 2022, p. 1-13
Main authors: Abellana, Dharyll Prince M., Roxas, Robert R., Lao, Demelo M., Mayol, Paula E., Lee, Sanghyuk
Format: Article
Language: English
Description
Abstract: Combining filters in an ensemble to improve feature selection (FS) performance is a growing field in the literature. Current techniques, however, rely on approaches that suffer from drawbacks such as sensitivity to skewed distributions, among others. To address this gap, this paper investigates the applicability of multiple criteria decision-making (MCDM) in ensemble feature selection. It adopts the Evaluation based on Distance from Average Solution (EDAS) method because many of its elements are familiar to the feature selection community. An experiment was performed on six datasets, which serve as levels of the blocking factor, and a negative control group (i.e., no feature selection) was used as the baseline for the proposed algorithm. Results show that the proposed ensemble FS algorithm was able to reduce the dataset without compromising the performance of the classifier. The findings contribute to the literature in several ways. First, the paper is one of the few works to demonstrate how MCDM can be used in feature selection with promising results. Second, it is one of the few works to demonstrate the significance of including datasets as levels of a blocking factor when performing significance testing. Finally, it is the first to demonstrate the applicability of EDAS as an ensemble FS algorithm. As such, the findings could spark the cross-fertilization of feature selection and MCDM.
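To illustrate the idea in the abstract, here is a minimal sketch of how EDAS could aggregate the scores of several filters into one feature ranking. It follows the standard EDAS steps (average solution, positive and negative distances from average, weighted sums, normalization, appraisal score) and is not taken from the paper, whose exact formulation this record does not give. The names `edas_scores` and `filter_scores`, the equal weights, and the example values are all hypothetical. The sketch also assumes that every filter score is a "higher is better" criterion with a positive column average.

```python
import numpy as np


def edas_scores(scores, weights=None):
    """Return one EDAS appraisal score per row (feature).

    scores  : array of shape (n_features, n_filters); each column holds one
              filter's relevance scores (assumed benefit-type, positive mean).
    weights : optional criterion weights; equal weights are assumed if omitted.
    """
    X = np.asarray(scores, dtype=float)
    n_filters = X.shape[1]
    if weights is None:
        weights = np.full(n_filters, 1.0 / n_filters)
    w = np.asarray(weights, dtype=float)

    av = X.mean(axis=0)                    # average solution per filter
    pda = np.maximum(0.0, X - av) / av     # positive distance from average
    nda = np.maximum(0.0, av - X) / av     # negative distance from average

    sp = pda @ w                           # weighted sum of positive distances
    sn = nda @ w                           # weighted sum of negative distances

    # Normalize; guard against division by zero when all rows are identical.
    nsp = sp / sp.max() if sp.max() > 0 else np.zeros_like(sp)
    nsn = 1.0 - (sn / sn.max() if sn.max() > 0 else np.zeros_like(sn))

    return (nsp + nsn) / 2.0               # appraisal score, higher is better


# Hypothetical scores: 3 features (rows) rated by 3 filters (columns).
filter_scores = np.array([
    [0.9, 0.8, 0.7],
    [0.1, 0.2, 0.1],
    [0.5, 0.6, 0.4],
])
appraisal = edas_scores(filter_scores)
ranking = np.argsort(-appraisal)           # feature indices, best first
```

In an ensemble FS setting, the top-ranked features would be kept and the rest discarded. How many to keep and how to weight the filters are design choices that this sketch leaves open.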
ISSN: 1024-123X, 1563-5147
DOI: 10.1155/2022/4126536