A new correlation-based approach for ensemble selection in random forests

PurposeEnsemble methods have been widely used in the field of pattern recognition due to the difficulty of finding a single classifier that performs well on a wide variety of problems. Despite the effectiveness of these techniques, studies have shown that ensemble methods generate a large number of...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:International journal of intelligent computing and cybernetics 2021-04, Vol.14 (2), p.251-268
Hauptverfasser: El Habib Daho, Mostafa, Settouti, Nesma, Bechar, Mohammed El Amine, Boublenza, Amina, Chikh, Mohammed Amine
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:PurposeEnsemble methods have been widely used in the field of pattern recognition due to the difficulty of finding a single classifier that performs well on a wide variety of problems. Despite the effectiveness of these techniques, studies have shown that ensemble methods generate a large number of hypotheses and that contain redundant classifiers in most cases. Several works proposed in the state of the art attempt to reduce all hypotheses without affecting performance.Design/methodology/approachIn this work, the authors are proposing a pruning method that takes into consideration the correlation between classifiers/classes and each classifier with the rest of the set. The authors have used the random forest algorithm as trees-based ensemble classifiers and the pruning was made by a technique inspired by the CFS (correlation feature selection) algorithm.FindingsThe proposed method CES (correlation-based Ensemble Selection) was evaluated on ten datasets from the UCI machine learning repository, and the performances were compared to six ensemble pruning techniques. The results showed that our proposed pruning method selects a small ensemble in a smaller amount of time while improving classification rates compared to the state-of-the-art methods.Originality/valueCES is a new ordering-based method that uses the CFS algorithm. CES selects, in a short time, a small sub-ensemble that outperforms results obtained from the whole forest and the other state-of-the-art techniques used in this study.
ISSN:1756-378X
1756-3798
DOI:10.1108/IJICC-10-2020-0147