Improving estimation and prediction in linear regression incorporating external information from an established reduced model

Bibliographic details
Published in: Statistics in medicine, 2018-04, Vol. 37 (9), p. 1515-1530
Main authors: Cheng, Wenting; Taylor, Jeremy M. G.; Vokonas, Pantel S.; Park, Sung Kyun; Mukherjee, Bhramar
Format: Article
Language: English
Description
Summary: We consider a situation where rich historical data are available, from a large study, for the coefficients and their standard errors in a linear regression model describing the association between a continuous outcome variable Y and a set of predictors X. We would like to use this summary information to improve inference in an expanded model of interest, Y given X and B. The additional variable B is a new biomarker, measured on a small number of subjects in a new dataset. We formulate the problem in an inferential framework where the historical information is translated into nonlinear constraints on the parameter space, and we propose both frequentist and Bayesian solutions to this problem. We show that a Bayesian transformation approach proposed by Gunn and Dunson is a simple and effective computational method for conducting approximate Bayesian inference in this constrained parameter problem. Simulation results comparing these methods indicate that historical information on E(Y|X) can improve the efficiency of estimation and enhance the predictive power of the regression model of interest, E(Y|X,B). We illustrate our methodology by enhancing a published prediction model for bone lead levels in terms of blood lead and other covariates, with a new biomarker defined through a genetic risk score.
ISSN: 0277-6715, 1097-0258
DOI: 10.1002/sim.7600
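
To make the "nonlinear constraints" mentioned in the summary concrete, here is a minimal sketch under stated assumptions. It assumes a linear working model E(B|X) = theta0 + thetaX·X, so that averaging B out of E(Y|X,B) = gamma0 + gammaX·X + gammaB·B gives E(Y|X) with coefficients beta = gamma + gammaB·theta, a relation that is nonlinear in (gammaB, theta). The function name constrained_fit, the simulated data, and the choice to treat the historical coefficients as known without error are all illustrative assumptions. This plug-in estimator is not the constrained maximum likelihood or the Gunn and Dunson transformation approach described in the article.

```python
import numpy as np

def constrained_fit(X, B, Y, beta_hist):
    """Plug-in constrained estimate of E(Y|X,B) = g0 + X @ gX + gB * B.

    beta_hist holds the historical coefficients (intercept first) of the
    reduced model E(Y|X). ASSUMPTION: they are treated as exact, ignoring
    their standard errors.
    """
    n = X.shape[0]
    X1 = np.column_stack([np.ones(n), X])  # design matrix with intercept
    # Step 1: working model E(B|X) = X1 @ theta, fitted by OLS on the new data.
    theta, *_ = np.linalg.lstsq(X1, B, rcond=None)
    # Step 2: under the constraint, Y - X1 @ beta = gB * (B - X1 @ theta) + error,
    # so gB is a one-parameter least-squares slope through the origin.
    r_y = Y - X1 @ beta_hist
    r_b = B - X1 @ theta
    gamma_b = float(r_b @ r_y / (r_b @ r_b))
    # Step 3: recover intercept and X coefficients from beta = gamma + gB * theta.
    gamma_x = beta_hist - gamma_b * theta
    return gamma_x, gamma_b, theta

# Simulated illustration (all numbers are made up for the example).
rng = np.random.default_rng(0)
n, p = 50, 2
theta_true = np.array([0.5, 1.0, -0.5])   # E(B|X) coefficients
gamma_true = np.array([1.0, 2.0, -1.0])   # intercept and X coefficients in E(Y|X,B)
gamma_b_true = 1.5                        # coefficient of the new biomarker B
beta_hist = gamma_true + gamma_b_true * theta_true  # implied reduced-model coefficients

X = rng.normal(size=(n, p))
X1 = np.column_stack([np.ones(n), X])
B = X1 @ theta_true + rng.normal(size=n)
Y = X1 @ gamma_true + gamma_b_true * B + rng.normal(size=n)

gamma_x, gamma_b, theta = constrained_fit(X, B, Y, beta_hist)
print("gamma_x:", gamma_x, "gamma_b:", gamma_b)
```

By construction the returned estimates satisfy the constraint exactly, which is how the external information on E(Y|X) enters the fit. A fuller treatment would also propagate the uncertainty in the historical coefficients, as the frequentist and Bayesian methods in the article are designed to do.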