Improved multiclass feature selection via list combination
Published in: | Expert systems with applications 2017-12, Vol.88, p.205-216 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | • We introduce new SVM-RFE feature selection methods for multiclass problems. • We use binary decomposition followed by strategies to combine lists of features. • We discuss statistical approaches and voting theory methods. • One-vs-One methods give better results than One-vs-All methods. • The new K-First method is the most effective in selecting relevant features.
Feature selection is a crucial machine learning technique aimed at reducing the dimensionality of the input space. By discarding useless or redundant variables, it not only improves model performance but also makes the model easier to interpret. The well-known Support Vector Machines–Recursive Feature Elimination (SVM-RFE) algorithm provides good performance with moderate computational effort, in particular for wide datasets. When using SVM-RFE on a multiclass classification problem, the usual strategy is to decompose it into a series of binary problems and to generate an importance statistic for each feature on each binary problem. These importances are then averaged over the set of binary problems to produce a single value per feature, which is used for ranking. In some cases, however, this procedure can lead to poor selection. In this paper we discuss six new strategies, based on list combination, designed to yield improved selections starting from the importances given by the binary problems. We evaluate them on artificial and real-world datasets, using both One-vs-One (OVO) and One-vs-All (OVA) decompositions. Our results suggest that the OVO decomposition is the most effective for feature selection on multiclass problems. We also find that in most situations the new K-First strategy finds better subsets of features than the traditional weight-averaging approach. |
---|---|
ISSN: | 0957-4174 1873-6793 |
DOI: | 10.1016/j.eswa.2017.06.043 |
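
The baseline described in the abstract (one importance value per feature on each binary problem, averaged across problems to rank the features) can be sketched as follows. This is a minimal illustration only, not the paper's code: the importance values are made up, the helper names `ovo_pairs`, `average_importances` and `rank_features` are hypothetical, and the importances are assumed to come from something like the squared weights of a linear SVM trained on each class pair. The six list-combination strategies proposed in the paper, including K-First, are not reproduced here.

```python
from itertools import combinations


def ovo_pairs(classes):
    """All unordered class pairs used by a One-vs-One decomposition."""
    return list(combinations(sorted(classes), 2))


def average_importances(importances_by_problem):
    """Average each feature's importance over all binary problems."""
    problems = list(importances_by_problem.values())
    n_features = len(problems[0])
    return [
        sum(p[j] for p in problems) / len(problems)
        for j in range(n_features)
    ]


def rank_features(scores):
    """Feature indices sorted from most to least important."""
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)


# Made-up importances (assumption: e.g. squared linear-SVM weights) for
# 4 features on the 3 binary problems of a 3-class OVO decomposition.
importances = {
    (0, 1): [0.9, 0.1, 0.0, 0.2],
    (0, 2): [0.8, 0.2, 0.1, 0.1],
    (1, 2): [0.0, 0.1, 0.7, 0.3],
}

avg = average_importances(importances)
ranking = rank_features(avg)
print(ranking)
```

In a recursive elimination loop, the lowest-ranked feature (or a batch of them) would be dropped, the binary classifiers retrained on the remaining features, and the ranking recomputed until the desired subset size is reached.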