Disturbing Neighbors Ensembles of Trees for Imbalanced Data

Disturbing Neighbors (DN) is a method for generating classifier ensembles. Moreover, it can be combined with any other ensemble method, generally improving the results. This paper considers the application of these ensembles to imbalanced data: classification problems where the class proportions are...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Hauptverfasser: Rodriguez, J. J., Diez-Pastor, J. F., Maudes, J., Garcia-Osorio, C.
Format: Tagungsbericht
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Disturbing Neighbors (DN) is a method for generating classifier ensembles. Moreover, it can be combined with any other ensemble method, generally improving the results. This paper considers the application of these ensembles to imbalanced data: classification problems where the class proportions are significantly different. DN ensembles are compared and combined with Bagging, using three tree methods as base classifiers: conventional decision trees (C4.5), Hellinger distance decision trees (HDDT) -- a method designed for imbalance data -- and model trees (M5P) -- trees with linear models at the leaves -- . The methods are compared using two collections of imbalanced datasets, with 20 and 66 datasets, respectively. The best results are obtained combining Bagging and DN, using conventional decision trees.
DOI:10.1109/ICMLA.2012.181