Strategies for imputing missing covariates in accelerated failure time models

Bibliographic details
Published in: Statistics in medicine 2018-10, Vol.37 (24), p.3417-3436
Main authors: Qi, Lihong, Wang, Ying-Fang, Chen, Rongqi, Siddique, Juned, Robbins, John, He, Yulei
Format: Article
Language: English
Description
Summary: Missing covariates often occur in biomedical studies with survival outcomes. Multiple imputation via chained equations (MICE) is a semi-parametric and flexible approach that imputes multivariate data by a series of conditional models, one for each incomplete variable. When applying MICE, practitioners tend to specify the conditional models in a simple fashion largely dictated by the software, which could lead to suboptimal results. Practical guidelines for specifying appropriate conditional models in MICE are lacking. Motivated by a study of time to hip fractures in the Women's Health Initiative Observational Study using accelerated failure time models, we propose and experiment with some rationales leading to appropriate MICE specifications. This strategy starts with specifying a joint model for the variables involved. We first derive the conditional distribution of each variable under the joint model, then approximate these conditional distributions to the extent that they can be characterized by commonly used regression models. We propose to fit separate models to impute incomplete variables by the failure status, which is key to generating appropriate MICE specifications for survival outcomes. The proposed strategy can be conveniently implemented with all available imputation software that uses fully conditional specifications. Our simulation results show that some commonly used simple MICE specifications can produce suboptimal results, while those based on the proposed strategy appear to perform well and be robust to model misspecification. Hence, we warn against a mechanical use of MICE and suggest careful modeling of the conditional distributions of variables to ensure proper performance.
ISSN: 0277-6715, 1097-0258
DOI: 10.1002/sim.7809
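
The idea of fitting separate imputation models by failure status can be illustrated with a minimal sketch. This is not the authors' implementation: the simulated log-normal accelerated failure time data, the single incomplete covariate, the normal linear imputation model in log time, and the function names draw_linear_imputations and impute_by_failure_status are all assumptions made for illustration. A full MICE run would cycle through several incomplete variables and include the other covariates in each conditional model.

```python
import numpy as np


def draw_linear_imputations(X_obs, y_obs, X_mis, rng):
    """Draw imputations from a Bayesian normal linear regression of y on X.

    Assumed imputation model (illustrative only): y | X ~ N(X beta, sigma^2)
    with a noninformative prior, so parameters are drawn before imputing.
    """
    n, p = X_obs.shape
    XtX_inv = np.linalg.inv(X_obs.T @ X_obs)
    beta_hat = XtX_inv @ X_obs.T @ y_obs
    resid = y_obs - X_obs @ beta_hat
    # Draw sigma^2 from its scaled inverse chi-square posterior,
    # then beta given sigma^2 from its normal posterior.
    sigma2 = resid @ resid / rng.chisquare(n - p)
    beta = rng.multivariate_normal(beta_hat, sigma2 * XtX_inv)
    noise = rng.normal(0.0, np.sqrt(sigma2), size=X_mis.shape[0])
    return X_mis @ beta + noise


def impute_by_failure_status(x, log_t, delta, rng):
    """Impute missing x (NaN) using a separate regression on log time
    within each failure-status stratum (delta = 1 event, 0 censored)."""
    x_imp = x.copy()
    for status in (0, 1):
        in_stratum = delta == status
        obs = in_stratum & ~np.isnan(x)
        mis = in_stratum & np.isnan(x)
        if not mis.any():
            continue
        X_obs = np.column_stack([np.ones(obs.sum()), log_t[obs]])
        X_mis = np.column_stack([np.ones(mis.sum()), log_t[mis]])
        x_imp[mis] = draw_linear_imputations(X_obs, x[obs], X_mis, rng)
    return x_imp


# Simulated example (all values hypothetical).
rng = np.random.default_rng(0)
n = 500
x_full = rng.normal(size=n)
log_event = 1.0 + 0.5 * x_full + rng.normal(scale=0.5, size=n)  # log-normal AFT
log_cens = rng.normal(loc=1.5, scale=0.5, size=n)
delta = (log_event <= log_cens).astype(int)   # failure status
log_t = np.minimum(log_event, log_cens)       # observed log time

x = x_full.copy()
x[rng.random(n) < 0.3] = np.nan               # about 30% missing completely at random

# Five imputed copies of the covariate, as in multiple imputation.
imputations = [impute_by_failure_status(x, log_t, delta, rng) for _ in range(5)]
```

Each imputed copy would then be used to fit the accelerated failure time model, with the estimates combined by Rubin's rules. Stratifying on the failure status lets the relationship between the covariate and the observed time differ between events and censored observations, which a single pooled regression would not allow.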