The Generalization Paradox of Ensembles
Published in: | Journal of computational and graphical statistics 2003-12, Vol.12 (4), p.853-864 |
---|---|
Main author: | |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | Ensemble models, built by methods such as bagging, boosting, and Bayesian model averaging, appear dauntingly complex, yet tend to strongly outperform their component models on new data. Doesn't this violate "Occam's razor", the widespread belief that "the simpler of competing alternatives is preferred"? We argue no: if complexity is measured by function rather than form (for example, according to generalized degrees of freedom, GDF), the razor's role is restored. On a two-dimensional decision tree problem, bagging several trees is shown to actually have less GDF complexity than a single component tree, removing the generalization paradox of ensembles. |
---|---|
ISSN: | 1061-8600 1537-2715 |
DOI: | 10.1198/1061860032733 |
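
The summary's idea of measuring complexity "by function rather than form" can be illustrated with a small perturbation experiment. The sketch below is not the article's procedure: it assumes a simple Monte Carlo estimator in which the responses are perturbed with Gaussian noise, the model is refit, and GDF is taken as the summed sensitivity of each fitted value to its own perturbed response. All names (`fit_stump`, `single_stump`, `bagged_stumps`, `estimate_gdf`), the one-dimensional toy data, the one-split "tree", and the defaults `n_reps`, `tau`, and `n_bags` are hypothetical choices made for illustration.

```python
import random


def fit_stump(xs, ys):
    """Fit a one-split regression tree on 1-D inputs and return a predictor."""
    pairs = sorted(zip(xs, ys))
    n = len(pairs)
    best = None  # (sse, threshold, left_mean, right_mean)
    for k in range(1, n):
        if pairs[k - 1][0] == pairs[k][0]:
            continue  # no split point between equal x values
        left = [y for _, y in pairs[:k]]
        right = [y for _, y in pairs[k:]]
        lm = sum(left) / len(left)
        rm = sum(right) / len(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, (pairs[k - 1][0] + pairs[k][0]) / 2, lm, rm)
    if best is None:  # all x identical: fall back to the overall mean
        m = sum(ys) / n
        return lambda x: m
    _, thr, lm, rm = best
    return lambda x: lm if x < thr else rm


def mean_model(xs, ys, rng):
    """Constant fit; its GDF should be close to 1."""
    m = sum(ys) / len(ys)
    return [m for _ in xs]


def identity_model(xs, ys, rng):
    """Interpolating fit; its GDF equals the number of points."""
    return list(ys)


def single_stump(xs, ys, rng):
    f = fit_stump(xs, ys)
    return [f(x) for x in xs]


def bagged_stumps(xs, ys, rng, n_bags=10):
    """Average of stumps fit to bootstrap resamples (bagging)."""
    n = len(xs)
    preds = [0.0] * n
    for _ in range(n_bags):
        idx = [rng.randrange(n) for _ in range(n)]
        f = fit_stump([xs[i] for i in idx], [ys[i] for i in idx])
        for j, x in enumerate(xs):
            preds[j] += f(x) / n_bags
    return preds


def estimate_gdf(model, xs, ys, n_reps=200, tau=0.5, rng=None):
    """Assumed estimator: sum over points of the regression slope of the
    fitted value on the perturbed response, across noisy refits."""
    rng = rng or random.Random(0)
    n = len(ys)
    perturbed, fitted = [], []
    for _ in range(n_reps):
        yp = [y + rng.gauss(0.0, tau) for y in ys]
        perturbed.append(yp)
        fitted.append(model(xs, yp, rng))
    total = 0.0
    for i in range(n):
        u = [p[i] for p in perturbed]
        v = [f[i] for f in fitted]
        ubar = sum(u) / n_reps
        vbar = sum(v) / n_reps
        sxx = sum((a - ubar) * (a - ubar) for a in u)
        sxy = sum((a - ubar) * (b - vbar) for a, b in zip(u, v))
        total += sxy / sxx
    return total


if __name__ == "__main__":
    rng = random.Random(1)
    xs = [i / 20 for i in range(20)]
    ys = [(1.0 if x > 0.5 else 0.0) + rng.gauss(0.0, 0.3) for x in xs]
    for name, model in [("single stump", single_stump), ("bagged stumps", bagged_stumps)]:
        print(name, round(estimate_gdf(model, xs, ys, rng=rng), 2))
```

The two reference models give a sanity check on the estimator: the constant fit should come out near 1 and the interpolating fit at exactly the number of points. Whether the bagged estimate falls below the single-stump estimate in this toy setting depends on the data, the noise scale `tau`, and the seed, so the sketch should not be read as reproducing the article's two-dimensional result.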