Semiparametric methods for response-selective and missing data problems in regression

Bibliographic details
Published in: Journal of the Royal Statistical Society. Series B, Statistical methodology, 1999-01, Vol.61 (2), p.413-438
Main authors: Lawless, J. F.; Kalbfleisch, J. D.; Wild, C. J.
Format: Article
Language: English
Online access: Full text
Description
Summary: Suppose that data are generated according to the model f(y ∣ x; θ) g(x), where y is a response and x are covariates. We derive and compare semiparametric likelihood and pseudo-likelihood methods for estimating θ for situations in which units generated are not fully observed and in which it is impossible or undesirable to model the covariate distribution. The probability that a unit is fully observed may depend on y, and there may be a subset of covariates which is observed only for a subsample of individuals. Our key assumptions are that the probability that a unit has missing data depends only on which of a finite number of strata that (y, x) belongs to and that the stratum membership is observed for every unit. Applications include case-control studies in epidemiology, field reliability studies and broad classes of missing data and measurement error problems. Our results make fully efficient estimation of θ feasible, and they generalize and provide insight into a variety of methods that have been proposed for specific problems.
ISSN: 1369-7412, 1467-9868
DOI: 10.1111/1467-9868.00185