Variable selection for multiply-imputed data with application to dioxin exposure study
Multiple imputation (MI) is a commonly used technique for handling missing data in large‐scale medical and public health studies. However, variable selection on multiply‐imputed data remains an important and longstanding statistical problem. If a variable selection method is applied to each imputed...
Gespeichert in:
Veröffentlicht in: | Statistics in medicine 2013-09, Vol.32 (21), p.3646-3659 |
---|---|
Hauptverfasser: | , |
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | Multiple imputation (MI) is a commonly used technique for handling missing data in large‐scale medical and public health studies. However, variable selection on multiply‐imputed data remains an important and longstanding statistical problem. If a variable selection method is applied to each imputed dataset separately, it may select different variables for different imputed datasets, which makes it difficult to interpret the final model or draw scientific conclusions. In this paper, we propose a novel multiple imputation‐least absolute shrinkage and selection operator (MI‐LASSO) variable selection method as an extension of the least absolute shrinkage and selection operator (LASSO) method to multiply‐imputed data. The MI‐LASSO method treats the estimated regression coefficients of the same variable across all imputed datasets as a group and applies the group LASSO penalty to yield a consistent variable selection across multiple‐imputed datasets. We use a simulation study to demonstrate the advantage of the MI‐LASSO method compared with the alternatives. We also apply the MI‐LASSO method to the University of Michigan Dioxin Exposure Study to identify important circumstances and exposure factors that are associated with human serum dioxin concentration in Midland, Michigan. Copyright © 2013 John Wiley & Sons, Ltd. |
---|---|
ISSN: | 0277-6715 1097-0258 |
DOI: | 10.1002/sim.5783 |