Extending AIC to best subset regression
Published in: Computational Statistics, 2018-06, Vol. 33(2), pp. 787-806
Main authors: , ,
Format: Article
Language: English
Subjects:
Online access: Full text
Abstract: The Akaike information criterion (AIC) is routinely used for model selection in best subset regression. The standard AIC, however, generally under-penalizes model complexity in the best subset regression setting, potentially leading to grossly overfit models. Recently, Zhang and Cavanaugh (Comput Stat 31(2):643–669, 2015) made significant progress towards addressing this problem by introducing an effective multistage model selection procedure. In this paper, we present a rigorous and coherent conceptual framework for extending AIC to best subset regression. A new model selection algorithm derived from our framework possesses well-understood and desirable asymptotic properties and consistently outperforms the procedure of Zhang and Cavanaugh in simulation studies. It provides an effective tool for combating the pervasive overfitting that detrimentally impacts best subset regression analysis, so that the selected models contain fewer irrelevant predictors and predict future observations more accurately.
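As a point of reference for the procedure being critiqued, a minimal NumPy sketch of standard AIC-scored best subset regression follows. This is not the authors' new algorithm and is not code from the paper; the Gaussian AIC form n·log(RSS/n) + 2(k+1) and all function names are illustrative assumptions:

```python
import itertools

import numpy as np


def aic_ols(X, y):
    """Gaussian AIC for an OLS fit: n * log(RSS / n) + 2 * (k + 1),
    where k counts the regression coefficients (intercept included)
    and the +1 counts the error variance."""
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ beta) ** 2)
    return n * np.log(rss / n) + 2 * (k + 1)


def best_subset_aic(X, y):
    """Exhaustive best-subset search scored by the standard AIC.
    The same fixed penalty is applied to each of the 2**p candidate
    models, so the criterion ignores the selection step itself, which
    is the source of the under-penalization the paper addresses."""
    n, p = X.shape
    best_score, best_subset = np.inf, ()
    for size in range(p + 1):
        for subset in itertools.combinations(range(p), size):
            Xs = np.column_stack([np.ones(n)] + [X[:, j] for j in subset])
            score = aic_ols(Xs, y)
            if score < best_score:
                best_score, best_subset = score, subset
    return best_score, best_subset


# Illustrative data: only the first two of eight predictors matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8))
y = X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=100)
aic, chosen = best_subset_aic(X, y)
print(chosen)  # may include spurious columns beyond {0, 1}
```

Because the minimum is taken over all 2^p candidates, the realized criterion for the winning subset is biased downward, so the fixed per-model penalty tends to admit irrelevant predictors; this is the overfitting that the extended criterion described in the abstract is designed to combat.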
ISSN: 0943-4062, 1613-9658
DOI: 10.1007/s00180-018-0797-8