A robust imputation method for missing responses and covariates in sample selection models

Bibliographic details
Published in: Statistical methods in medical research 2019-01, Vol.28 (1), p.102-116
Main authors: Ogundimu, Emmanuel O; Collins, Gary S
Format: Article
Language: English
Online access: Full text
Description
Summary: Sample selection arises when the outcome of interest is partially observed in a study. Although sophisticated statistical methods in the parametric and non-parametric framework have been proposed to solve this problem, it remains unclear how to deal with selectively missing covariate data using simple multiple imputation techniques, especially in the absence of exclusion restrictions and under deviations from normality. Motivated by the 2003–2004 NHANES data, where previous authors have studied the effect of socio-economic status on blood pressure with missing data on the income variable, we propose the use of a robust imputation technique based on the selection-t sample selection model. The imputation method, which is developed within the frequentist framework, is compared with competing alternatives in a simulation study. The results indicate that the robust alternative is not susceptible to the absence of exclusion restrictions (a property inherited from the parent selection-t model) and performs better than models based on the normal assumption even when the data are generated from the normal distribution. Applications to missing outcome and covariate data further corroborate the robustness properties of the proposed method. We implemented the proposed approach within the MICE environment in R Statistical Software.
ISSN: 0962-2802, 1477-0334
DOI: 10.1177/0962280217715663