Semiparametric methods for response-selective and missing data problems in regression
Published in: | Journal of the Royal Statistical Society. Series B, Statistical Methodology, 1999-01, Vol.61 (2), p.413-438 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | Suppose that data are generated according to the model f(y ∣ x; θ) g(x), where y is a response and x are covariates. We derive and compare semiparametric likelihood and pseudo-likelihood methods for estimating θ for situations in which units generated are not fully observed and in which it is impossible or undesirable to model the covariate distribution. The probability that a unit is fully observed may depend on y, and there may be a subset of covariates which is observed only for a subsample of individuals. Our key assumptions are that the probability that a unit has missing data depends only on which of a finite number of strata that (y, x) belongs to and that the stratum membership is observed for every unit. Applications include case-control studies in epidemiology, field reliability studies and broad classes of missing data and measurement error problems. Our results make fully efficient estimation of θ feasible, and they generalize and provide insight into a variety of methods that have been proposed for specific problems. |
---|---|
ISSN: | 1369-7412 1467-9868 |
DOI: | 10.1111/1467-9868.00185 |
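
The setting described in the summary can be illustrated with a small simulation. The sketch below is not taken from the article: it uses a simple inverse-probability-weighted pseudo-likelihood, which is one commonly used estimator of this general kind and is generally not fully efficient, rather than the semiparametric likelihood estimators the authors derive. The logistic form of f(y ∣ x; θ), the two strata defined by y, the selection probabilities, and all names (`fit_weighted_logistic`, `p_select`, and so on) are assumptions made for illustration.

```python
import numpy as np


def fit_weighted_logistic(x, y, w, n_iter=25):
    """Maximise sum_i w_i * log f(y_i | x_i; theta) for a logistic model
    with intercept and slope, using Newton's method (hypothetical helper)."""
    X = np.column_stack([np.ones_like(x), x])
    theta = np.zeros(2)
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-X @ theta))
        score = X.T @ (w * (y - mu))                      # weighted score
        info = (X * (w * mu * (1.0 - mu))[:, None]).T @ X  # weighted information
        theta = theta + np.linalg.solve(info, score)
    return theta


rng = np.random.default_rng(0)
n = 200_000
theta_true = np.array([-3.0, 1.0])  # assumed values; the response is rare

# Generate units from f(y | x; theta) g(x); g is never modelled by the estimator.
x = rng.normal(size=n)
prob = 1.0 / (1.0 + np.exp(-(theta_true[0] + theta_true[1] * x)))
y = rng.binomial(1, prob)

# Stratum membership is known for every unit; here the strata depend on y only.
stratum = y
p_select = np.array([0.05, 1.0])  # assumed: 5% of non-cases, all cases fully observed
observed = rng.random(n) < p_select[stratum]

# Fit on fully observed units, weighting each by 1 / (its selection probability).
weights = 1.0 / p_select[stratum[observed]]
theta_hat = fit_weighted_logistic(x[observed], y[observed], weights)
print(theta_hat)
```

Only the fully observed units enter the fit, and the covariate distribution g(x) is never specified, which mirrors the situation the summary describes. Fitting the same model without the weights would distort the intercept, because cases are heavily over-represented among the fully observed units.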