Informative variable identifier: Expanding interpretability in feature selection


Bibliographic Details
Published in: Pattern Recognition, 2020-02, Vol. 98, p. 107077, Article 107077
Authors: Muñoz-Romero, Sergio; Gorostiaga, Arantza; Soguero-Ruiz, Cristina; Mora-Jiménez, Inmaculada; Rojo-Álvarez, José Luis
Format: Article
Language: English
Description
Highlights:
• Interpretability of the solution is provided by a novel feature selection algorithm.
• Relevant, redundant, and non-informative input variables are identified.
• Analysis of weights learned by resampling clarifies the relations among variables.
• Both the interpretability of the results and the classification performance are improved.

Abstract: There is nowadays an increasing interest in discovering relationships among input variables (also called features) from data to provide better interpretability, which yields more confidence in the solution and provides novel insights into the nature of the problem at hand. We propose a novel feature selection method, called Informative Variable Identifier (IVI), capable of identifying the informative variables and their relationships. It transforms the input-variable space distribution into a coefficient-feature space using existing linear classifiers or a more efficient weight generator that we also propose, the Covariance Multiplication Estimator (CME). Informative features and their relationships are determined by analyzing the joint distribution of these coefficients with resampling techniques. IVI and CME select the informative variables and then pass them on to any linear or nonlinear classifier. Experiments show that the proposed approach can outperform state-of-the-art algorithms in terms of feature identification capabilities, and even in classification performance when subsequent classifiers are used.
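The general idea sketched in the abstract (fit a linear classifier on many resamples, collect the learned weight vectors, and keep the features whose weights are consistently away from zero) can be illustrated with a minimal Python sketch. This is only an illustration of the resampling-and-weight-analysis concept, assuming scikit-learn's LogisticRegression as the weight generator; it is not the paper's IVI or CME implementation, and the function names (resampled_weights, informative_mask) and the zero-excluding percentile rule are illustrative assumptions.

    # Illustrative sketch only; not the IVI/CME code from the paper.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.utils import resample

    def resampled_weights(X, y, n_resamples=200, seed=0):
        # Fit a linear classifier on stratified bootstrap resamples and
        # stack the learned weight vectors (one row per resample).
        rng = np.random.RandomState(seed)
        weights = np.empty((n_resamples, X.shape[1]))
        for b in range(n_resamples):
            Xb, yb = resample(X, y, stratify=y, random_state=rng)
            clf = LogisticRegression(max_iter=1000).fit(Xb, yb)
            weights[b] = clf.coef_.ravel()
        return weights

    def informative_mask(weights, alpha=0.05):
        # Flag features whose bootstrap weight interval excludes zero.
        lo = np.percentile(weights, 100 * alpha / 2, axis=0)
        hi = np.percentile(weights, 100 * (1 - alpha / 2), axis=0)
        return (lo > 0) | (hi < 0)

    # Usage (hypothetical binary-classification data X, y):
    # W = resampled_weights(X, y)
    # X_selected = X[:, informative_mask(W)]   # pass X_selected to any classifier

In this sketch the joint behavior of the resampled weights is reduced to per-feature intervals; the paper's method analyzes the joint distribution of the coefficients, so this is a simplification of that step.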
ISSN: 0031-3203
eISSN: 1873-5142
DOI: 10.1016/j.patcog.2019.107077