Data-Driven Covariate Selection for Confounding Adjustment by Focusing on the Stability of the Effect Estimator
Published in: | Psychological Methods 2024-10, Vol. 29 (5), p. 947-966 |
---|---|
Main authors: | , |
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Full text |
Abstract: | Valid inference of cause-and-effect relations in observational studies necessitates adjusting for common causes of the focal predictor (i.e., treatment) and the outcome. When such common causes, henceforth termed confounders, remain unadjusted for, they generate spurious correlations that lead to biased causal effect estimates. But routine adjustment for all available covariates, when only a subset are truly confounders, is known to yield potentially inefficient and unstable estimators. In this article, we introduce a data-driven confounder selection strategy that focuses on stable estimation of the treatment effect. The approach exploits the causal knowledge that after adjusting for confounders to eliminate all confounding biases, adding any remaining non-confounding covariates associated with only treatment or outcome, but not both, should not systematically change the effect estimator. The strategy proceeds in two steps. First, we prioritize covariates for adjustment by probing how strongly each covariate is associated with treatment and outcome. Next, we gauge the stability of the effect estimator by evaluating its trajectory adjusting for different covariate subsets. The smallest subset that yields a stable effect estimate is then selected. Thus, the strategy offers direct insight into the (in)sensitivity of the effect estimator to the chosen covariates for adjustment. The ability to correctly select confounders and yield valid causal inferences following data-driven covariate selection is evaluated empirically using extensive simulation studies. Furthermore, we compare the introduced method empirically with routine variable selection methods. Finally, we demonstrate the procedure using two publicly available real-world datasets. A step-by-step practical guide with user-friendly R functions is included.
Translational Abstract
A central goal of psychological research is to understand cause-and-effect relations. Thoughtfully designed and meticulously conducted randomized experiments are the gold standard for examining the causal impact of an intervention, or treatment, on an outcome. But such randomized studies are often practically unfeasible for ethical and logistical reasons. In such contexts, observational or nonexperimental studies where individuals are exposed to nonrandomized treatments become the only viable option for causal inference. However, the statistical correlation between a nonrandomized treatment and an outcome can be due to noncausal |
---|---|
ISSN: | 1082-989X, 1939-1463 |
DOI: | 10.1037/met0000564 |
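
The two-step strategy described in the abstract (prioritize covariates, then follow the trajectory of the effect estimate across nested adjustment sets) can be illustrated with a rough sketch. This is not the authors' procedure or their R functions. The priority score (the smaller of a covariate's absolute correlations with treatment and with outcome), the fixed tolerance `tol`, the linear OLS estimator, and the simulated data are all assumptions made only for illustration.

```python
import numpy as np

def ols_effect(y, a, X):
    """Coefficient of treatment `a` in an OLS regression of y on [1, a, X]."""
    design = np.column_stack([np.ones(len(y)), a, X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta[1]

def prioritize(y, a, X):
    """Order covariates by an assumed score: min(|cor with a|, |cor with y|)."""
    scores = [
        min(abs(np.corrcoef(X[:, j], a)[0, 1]), abs(np.corrcoef(X[:, j], y)[0, 1]))
        for j in range(X.shape[1])
    ]
    return np.argsort(scores)[::-1]  # strongest joint association first

def effect_trajectory(y, a, X, order):
    """Effect estimates adjusting for the first k prioritized covariates, k = 0..p."""
    return np.array(
        [ols_effect(y, a, X[:, order[:k]]) for k in range(len(order) + 1)]
    )

def smallest_stable_subset(traj, tol):
    """Smallest k whose estimate stays within `tol` of every later estimate."""
    for k in range(len(traj)):
        if np.all(np.abs(traj[k:] - traj[k]) <= tol):
            return k  # always reached by the last k at the latest

# Hypothetical simulated data: columns 0 and 1 are confounders, column 2
# affects only treatment, column 3 only outcome, column 4 is pure noise.
rng = np.random.default_rng(0)
n = 20000
X = rng.normal(size=(n, 5))
a = 0.8 * X[:, 0] + 0.8 * X[:, 1] + 0.8 * X[:, 2] + rng.normal(size=n)
y = 1.0 * a + X[:, 0] + X[:, 1] + 0.8 * X[:, 3] + rng.normal(size=n)

order = prioritize(y, a, X)
traj = effect_trajectory(y, a, X, order)
k = smallest_stable_subset(traj, tol=0.05)
print("priority order:", order)
print("trajectory:", np.round(traj, 3))
print("selected covariates:", order[:k])
```

In this simulated setting the true effect is 1.0, so the trajectory should level off once both confounders are adjusted for. A fixed tolerance is a crude stand-in for a formal stability criterion and ignores sampling variability in the estimates.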