Boosting and measuring the performance of ensembles for a successful database marketing


Detailed Description

Bibliographic Details
Published in: Expert Systems with Applications, 2009-03, Vol. 36 (2), p. 2161-2176
Author: Kim, YongSeog
Format: Article
Language: English
Online Access: Full text
Description

Summary: This paper provides insights into the advantages and disadvantages of two ensemble models: ensembles based on sampling and ensembles based on feature selection. Experimental results confirm that both ensemble methods produce robust ensembles and significantly improve the prediction performance of single classifiers, at the cost of interpretability and additional computing resources. In particular, classifiers that utilize prior class distributions, such as the support vector machine and the naive Bayes classifier, benefit only marginally from ensembles, while higher-variance classifiers such as neural networks and tree learners make strong ensembles. Further, when feature selection is used to create ensembles, there appears to be an optimal ratio of selected input variables that maximizes ensemble performance while minimizing computational cost. Finally, we show that most evaluation methods become useless when models are compared on data sets with very skewed class distributions.
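The two ensemble families the abstract contrasts can be illustrated with a minimal scikit-learn sketch. This is not the paper's experimental setup: the synthetic data, the choice of a decision-tree base learner, and the 0.5 feature-selection ratio are all illustrative assumptions. Bagging resamples rows (sampling-based ensemble), while the random-subspace variant gives each member a random subset of input variables (feature-selection ensemble); the last lines show why plain accuracy misleads on skewed classes, since a majority-class predictor already scores near the majority rate.

```python
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.ensemble import BaggingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the marketing data (hypothetical, not the paper's set).
X, y = make_classification(n_samples=1000, n_features=20, n_informative=8,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Sampling-based ensemble: each tree trains on a bootstrap sample of the rows.
sampling = BaggingClassifier(DecisionTreeClassifier(), n_estimators=25,
                             bootstrap=True, random_state=0).fit(X_tr, y_tr)

# Feature-selection ensemble (random subspace): each tree sees a random
# subset of input variables; max_features plays the role of the selection
# ratio the abstract says can be tuned to trade performance against cost.
subspace = BaggingClassifier(DecisionTreeClassifier(), n_estimators=25,
                             bootstrap=False, max_features=0.5,
                             random_state=0).fit(X_tr, y_tr)

single = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)

for name, model in [("single tree", single),
                    ("sampling ensemble", sampling),
                    ("feature-subset ensemble", subspace)]:
    print(name, round(accuracy_score(y_te, model.predict(X_te)), 3))

# Skewed classes: a classifier that always predicts the majority class
# already scores near 0.95 accuracy, so accuracy alone says little here.
Xs, ys = make_classification(n_samples=1000, n_features=20, weights=[0.95],
                             random_state=0)
majority = DummyClassifier(strategy="most_frequent").fit(Xs, ys)
print("majority-class accuracy:",
      round(accuracy_score(ys, majority.predict(Xs)), 3))
```

With fixed random seeds both ensembles typically match or beat the single tree on the held-out split, while the majority-class baseline's high accuracy on the skewed set illustrates the paper's point about evaluation under class imbalance.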
ISSN: 0957-4174, 1873-6793
DOI: 10.1016/j.eswa.2007.12.036