Combining case-control studies for identifiability and efficiency improvement in logistic regression

Can two separate case-control studies, one about Hepatitis disease and the other about Fibrosis, for example, be combined together? It would be hugely beneficial if two or more separately conducted case-control studies, even for entirely irrelevant purposes, can be merged together with a unified ana...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:arXiv.org 2021-06
Hauptverfasser: Tang, Wenlu, Lin, Yuanyuan, Dai, Linlin, Chen, Kani
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Can two separate case-control studies, one about Hepatitis disease and the other about Fibrosis, for example, be combined together? It would be hugely beneficial if two or more separately conducted case-control studies, even for entirely irrelevant purposes, can be merged together with a unified analysis that produces better statistical properties, e.g., more accurate estimation of parameters. In this paper, we show that, when using the popular logistic regression model, the combined/integrative analysis produces a more accurate estimation of the slope parameters than the single case-control study. It is known that, in a single logistic case-control study, the intercept is not identifiable, contrary to prospective studies. In combined case-control studies, however, the intercepts are proved to be identifiable under mild conditions. The resulting maximum likelihood estimates of the intercepts and slopes are proved to be consistent and asymptotically normal, with asymptotic variances achieving the semiparametric efficiency lower bound.
ISSN:2331-8422