A Mutual Information estimator for continuous and discrete variables applied to Feature Selection and Classification problems

Bibliographic details
Published in: International journal of computational intelligence systems, 2016-01, Vol. 9 (4), p. 726-733
Main authors: Coelho, Frederico; Braga, Antonio P.; Verleysen, Michel
Format: Article
Language: English
Online access: Full text
Description
Abstract: Mutual information is now widely used in pattern recognition and feature selection problems. It can serve both as a measure of redundancy between features and as a measure of dependency for evaluating the relevance of each feature. Since the marginal densities of real datasets are usually not known in advance, mutual information must be estimated. The literature offers mutual information estimators designed specifically for continuous or for discrete variables; however, most real problems are composed of a mixture of both. Using one of these estimators on mixed continuous and discrete variables therefore entails some implicit loss of information. This paper presents a new estimator that is able to deal with a mixed set of variables. Experiments with synthetic and real datasets show that the method yields reliable results in such circumstances.
ISSN: 1875-6891, 1875-6883
DOI: 10.1080/18756891.2016.1204120