Double machine learning for sample selection models
Main authors: | , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Summary: | This paper considers the evaluation of discretely distributed treatments when
outcomes are only observed for a subpopulation due to sample selection or
outcome attrition. For identification, we combine a selection-on-observables
assumption for treatment assignment with either selection-on-observables or
instrumental variable assumptions concerning the outcome attrition/sample
selection process. We also consider dynamic confounding, meaning that
covariates that jointly affect sample selection and the outcome may (at least
partly) be influenced by the treatment. To control in a data-driven way for a
potentially high-dimensional set of pre- and/or post-treatment covariates, we
adapt the double machine learning framework for treatment evaluation to sample
selection problems. We make use of (a) Neyman-orthogonal, doubly robust, and
efficient score functions, which imply the robustness of treatment effect
estimation to moderate regularization biases in the machine learning-based
estimation of the outcome, treatment, or sample selection models and (b) sample
splitting (or cross-fitting) to prevent overfitting bias. We demonstrate that
the proposed estimators are asymptotically normal and root-n consistent under
specific regularity conditions concerning the machine learners and investigate
their finite sample properties in a simulation study. We also apply our
proposed methodology to the Job Corps data for evaluating the effect of
training on hourly wages, which are only observed conditional on employment. The
estimator is available in the causalweight package for the statistical software
R. |
---|---|
DOI: | 10.48550/arxiv.2012.00745 |
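
The summary describes two ingredients: a doubly robust score and cross-fitting. The following is a minimal Python sketch of how they might combine in the simplest case the summary mentions (a binary treatment, with selection on observables for both treatment assignment and sample selection). It is not the causalweight implementation. The simulated data, all function names, and the cell-mean "learners" (which stand in for machine learners and work here only because the single covariate is binary) are assumptions made for illustration. The score used for a treatment level d is mu(d,X) + 1{D=d} * S * (Y - mu(d,X)) / (p_d(X) * pi(d,X)), where p_d(X) = Pr(D=d|X), pi(d,X) = Pr(S=1|D=d,X), and mu(d,X) = E[Y|D=d,S=1,X].

```python
import random


def simulate(n, seed=0):
    """Hypothetical data: binary X, treatment D, selection S, outcome Y (true ATE = 1)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0
        d = 1 if rng.random() < 0.3 + 0.4 * x else 0            # treatment depends on X
        s = 1 if rng.random() < 0.5 + 0.2 * d + 0.2 * x else 0  # selection depends on D and X
        y = 1.0 * d + 1.0 * x + rng.gauss(0.0, 1.0)
        rows.append((x, d, s, y if s == 1 else None))           # Y is observed only if S = 1
    return rows


def fit_nuisances(train):
    """Cell means standing in for ML estimates of the three nuisance functions."""
    p, pi, mu = {}, {}, {}
    for x in (0, 1):
        in_x = [r for r in train if r[0] == x]
        for d in (0, 1):
            in_xd = [r for r in in_x if r[1] == d]
            sel = [r for r in in_xd if r[2] == 1]
            p[(d, x)] = len(in_xd) / len(in_x)               # Pr(D=d | X=x)
            pi[(d, x)] = len(sel) / len(in_xd)               # Pr(S=1 | D=d, X=x)
            mu[(d, x)] = sum(r[3] for r in sel) / len(sel)   # E[Y | D=d, S=1, X=x]
    return p, pi, mu


def score(row, d, p, pi, mu):
    """Doubly robust score for the mean potential outcome under treatment level d."""
    x, d_i, s, y = row
    m = mu[(d, x)]
    if d_i == d and s == 1:
        return m + (y - m) / (p[(d, x)] * pi[(d, x)])
    return m


def dml_ate(rows, k=3, seed=1):
    """Cross-fitting: nuisances are fitted on k-1 folds, scores evaluated on the held-out fold."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    folds = [idx[j::k] for j in range(k)]
    total = 0.0
    for j in range(k):
        held_out = set(folds[j])
        train = [rows[i] for i in idx if i not in held_out]
        p, pi, mu = fit_nuisances(train)
        for i in folds[j]:
            total += score(rows[i], 1, p, pi, mu) - score(rows[i], 0, p, pi, mu)
    return total / len(rows)


def naive_difference(rows):
    """Difference in observed mean outcomes, ignoring confounding and selection."""
    y1 = [r[3] for r in rows if r[1] == 1 and r[2] == 1]
    y0 = [r[3] for r in rows if r[1] == 0 and r[2] == 1]
    return sum(y1) / len(y1) - sum(y0) / len(y0)


rows = simulate(20000)
ate_hat = dml_ate(rows)
naive = naive_difference(rows)
```

Printing `ate_hat` and `naive` side by side should show the cross-fitted estimate landing near the simulated effect of 1, while the naive comparison of observed outcomes is pulled upward because treated and selected units more often have X = 1. In an applied setting, the cell means would be replaced by machine learners such as lasso or random forests, and trimming of small propensity products would typically be added.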