The Generalization Paradox of Ensembles

Bibliographic details
Published in: Journal of computational and graphical statistics, 2003-12, Vol. 12 (4), p. 853-864
Main author: Elder, John F
Format: Article
Language: English
Description
Abstract: Ensemble models, built by methods such as bagging, boosting, and Bayesian model averaging, appear dauntingly complex, yet tend to strongly outperform their component models on new data. Doesn't this violate "Occam's razor," the widespread belief that "the simpler of competing alternatives is preferred"? We argue no: if complexity is measured by function rather than form (for example, according to generalized degrees of freedom, GDF), the razor's role is restored. On a two-dimensional decision tree problem, bagging several trees is shown to actually have less GDF complexity than a single component tree, removing the generalization paradox of ensembles.
ISSN: 1061-8600, 1537-2715
DOI: 10.1198/1061860032733
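
The abstract's idea of measuring complexity "by function rather than form" can be illustrated with a small sketch. The record does not describe the paper's procedure, so everything below is an assumption: it uses a perturbation-style estimate of GDF (add small noise to the responses, refit, and sum the per-observation sensitivities of the fitted values), a hand-rolled regression tree, and a hypothetical two-dimensional synthetic data set. The names `build`, `predict`, `fit_tree`, `fit_bagged`, `estimate_gdf`, and `demo`, and all parameter values, are illustrative only.

```python
import random


def build(X, y, idx, depth, min_leaf):
    """Greedy regression tree on rows `idx`; a leaf is a float, a node is a tuple."""
    leaf = sum(y[i] for i in idx) / len(idx)
    if depth == 0 or len(idx) < 2 * min_leaf:
        return leaf
    best = None
    for f in range(len(X[0])):
        order = sorted(idx, key=lambda i: X[i][f])
        for k in range(min_leaf, len(order) - min_leaf + 1):
            if X[order[k - 1]][f] == X[order[k]][f]:
                continue  # cannot split between equal feature values
            left, right = order[:k], order[k:]
            ml = sum(y[i] for i in left) / len(left)
            mr = sum(y[i] for i in right) / len(right)
            sse = (sum((y[i] - ml) ** 2 for i in left)
                   + sum((y[i] - mr) ** 2 for i in right))
            if best is None or sse < best[0]:
                best = (sse, f, X[order[k - 1]][f], left, right)
    if best is None:
        return leaf
    _, f, thr, left, right = best
    return (f, thr, build(X, y, left, depth - 1, min_leaf),
            build(X, y, right, depth - 1, min_leaf))


def predict(tree, x):
    """Walk from the root to a leaf; rows with x[f] <= thr go to the first child."""
    while isinstance(tree, tuple):
        f, thr, lo, hi = tree
        tree = lo if x[f] <= thr else hi
    return tree


def fit_tree(X, y, depth=3, min_leaf=3, idx=None):
    """Fit one tree (optionally on a resampled index list); return a predictor."""
    rows = list(range(len(y))) if idx is None else idx
    tree = build(X, y, rows, depth, min_leaf)
    return lambda x: predict(tree, x)


def fit_bagged(X, y, rng, n_trees=10, depth=3, min_leaf=3):
    """Bagging: average trees fitted to bootstrap resamples of the rows."""
    n = len(y)
    models = [fit_tree(X, y, depth, min_leaf,
                       [rng.randrange(n) for _ in range(n)])
              for _ in range(n_trees)]
    return lambda x: sum(m(x) for m in models) / n_trees


def estimate_gdf(fit, X, y, rng, n_perturb=25, tau=0.25):
    """Perturbation estimate of GDF (assumed procedure, not taken from the record).

    For each observation, regress its fitted value on the noise added to its
    own response across perturbations; GDF is the sum of those slopes.
    """
    n = len(y)
    deltas, fits = [], []
    for _ in range(n_perturb):
        d = [rng.gauss(0.0, tau) for _ in range(n)]
        model = fit(X, [yi + di for yi, di in zip(y, d)])
        deltas.append(d)
        fits.append([model(x) for x in X])
    gdf = 0.0
    for i in range(n):
        di = [d[i] for d in deltas]
        fi = [f[i] for f in fits]
        dbar, fbar = sum(di) / n_perturb, sum(fi) / n_perturb
        sxy = sum((a - dbar) * (b - fbar) for a, b in zip(di, fi))
        sxx = sum((a - dbar) ** 2 for a in di)
        gdf += sxy / sxx
    return gdf


def demo(seed=0):
    """Compare estimated GDF of one tree and a bagged ensemble on toy 2-D data."""
    rng = random.Random(seed)
    X = [[rng.random(), rng.random()] for _ in range(40)]
    # Hypothetical target: a step in each coordinate plus Gaussian noise.
    y = [float(a > 0.5) + float(b > 0.5) + rng.gauss(0.0, 0.3) for a, b in X]
    single = estimate_gdf(fit_tree, X, y, rng)
    bagged = estimate_gdf(lambda X_, y_: fit_bagged(X_, y_, rng), X, y, rng)
    print(f"estimated GDF, single tree:  {single:.2f}")
    print(f"estimated GDF, bagged trees: {bagged:.2f}")


if __name__ == "__main__":
    demo()
```

Two sanity checks help when reading the estimate: a model that ignores the responses should score about 0, and one that memorizes every response should score about the number of observations. Whether the bagged ensemble scores below the single tree in this toy run depends on the seed, the noise scale `tau`, and the tree settings, so the sketch illustrates the measurement rather than reproducing the paper's result.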