Asymptotic Unbiasedness of the Permutation Importance Measure in Random Forest Models
Main authors: | , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | Variable selection in sparse regression models is an important task as
applications ranging from biomedical research to econometrics have shown.
Especially for higher dimensional regression problems, for which the link
function between response and covariates cannot be directly detected, the
selection of informative variables is challenging. Under these circumstances,
the Random Forest method is a helpful tool to predict new outcomes while
delivering measures for variable selection. One common approach is to use
the permutation importance. Because of its intuitive idea and flexible usage, it is
important to explore the circumstances under which the permutation importance based
on Random Forest correctly indicates informative covariates. Regarding the
latter, we deliver theoretical guarantees for the validity of the permutation
importance measure under specific assumptions and prove its (asymptotic)
unbiasedness. An extensive simulation study verifies our findings. |
---|---|
DOI: | 10.48550/arxiv.1912.03306 |
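
The permutation importance named in the summary can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: depth-1 regression trees (stumps) stand in for fully grown trees, the simulated data and every function name (`fit_stump`, `oob_permutation_importance`, and so on) are assumptions made for illustration, and the importance is taken to be the average increase in out-of-bag mean squared error after shuffling one covariate.

```python
import random

def fit_stump(X, y, features, rng, n_thresholds=8):
    """Fit a depth-1 regression tree using only the candidate features."""
    mean_y = sum(y) / len(y)
    # Fallback: constant prediction (threshold +inf sends every row left).
    best = (sum((v - mean_y) ** 2 for v in y), features[0], float("inf"), mean_y, mean_y)
    for j in features:
        # Try a few randomly chosen observed values as split points.
        for t in rng.sample([row[j] for row in X], n_thresholds):
            left = [v for row, v in zip(X, y) if row[j] <= t]
            right = [v for row, v in zip(X, y) if row[j] > t]
            if not left or not right:
                continue
            ml, mr = sum(left) / len(left), sum(right) / len(right)
            sse = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
            if sse < best[0]:
                best = (sse, j, t, ml, mr)
    return best[1:]  # (feature, threshold, left mean, right mean)

def predict(stump, row):
    j, t, ml, mr = stump
    return ml if row[j] <= t else mr

def mse(stump, X, y):
    return sum((v - predict(stump, row)) ** 2 for row, v in zip(X, y)) / len(y)

def oob_permutation_importance(X, y, n_trees=40, mtry=2, seed=0):
    """Average increase in out-of-bag MSE after permuting each covariate."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    totals, used = [0.0] * p, 0
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        in_bag = set(idx)
        oob = [i for i in range(n) if i not in in_bag]
        if not oob:
            continue
        stump = fit_stump([X[i] for i in idx], [y[i] for i in idx],
                          rng.sample(range(p), mtry), rng)
        X_oob, y_oob = [X[i] for i in oob], [y[i] for i in oob]
        base = mse(stump, X_oob, y_oob)
        for j in range(p):
            col = [row[j] for row in X_oob]
            rng.shuffle(col)  # break the link between covariate j and the response
            X_perm = [row[:j] + [c] + row[j + 1:] for row, c in zip(X_oob, col)]
            totals[j] += mse(stump, X_perm, y_oob) - base
        used += 1
    return [t / used for t in totals]

# Simulated sparse model (an assumption): only covariate 0 is informative.
data_rng = random.Random(1)
X = [[data_rng.random() for _ in range(3)] for _ in range(150)]
y = [3.0 * row[0] + data_rng.gauss(0.0, 0.5) for row in X]
importance = oob_permutation_importance(X, y)
print(importance)
```

In this toy setting the importance of the informative covariate should be clearly positive, while the importances of the two noise covariates should scatter around zero, which is the behaviour the summary describes as correctly indicating informative covariates.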