Improving reproducibility by using high-throughput observational studies with empirical calibration

Bibliographic details
Published in:	Philosophical transactions of the Royal Society of London. Series A: Mathematical, physical, and engineering sciences, 2018-09, Vol.376 (2128), p.20170356-20170356
Main authors:	Schuemie, Martijn J.; Ryan, Patrick B.; Hripcsak, George; Madigan, David; Suchard, Marc A.
Format:	Article
Language:	English
Subjects:	Best practice; Calibration; Medical research; Medicine; Observational Research; Observational studies; Publication Bias; Reproducibility
Online access:	Full text

Description
Abstract:	Concerns over reproducibility in science extend to research using existing healthcare data; many observational studies investigating the same topic produce conflicting results, even when using the same data. To address this problem, we propose a paradigm shift. The current paradigm centres on generating one estimate at a time using a unique study design with unknown reliability and publishing (or not) one estimate at a time. The new paradigm advocates for high-throughput observational studies using consistent and standardized methods, allowing evaluation, calibration and unbiased dissemination to generate a more reliable and complete evidence base. We demonstrate this new paradigm by comparing all depression treatments for a set of outcomes, producing 17 718 hazard ratios, each using methodology on par with current best practice. We furthermore include control hypotheses to evaluate and calibrate our evidence generation process. Results show good transitivity and consistency between databases, and agree with four out of the five findings from clinical trials. The distribution of effect size estimates reported in the literature reveals an absence of small or null effects, with a sharp cut-off at p = 0.05. No such phenomena were observed in our results, suggesting more complete and more reliable evidence. This article is part of a discussion meeting issue 'The growing ubiquity of algorithms in society: implications, impacts and innovations'.
ISSN:	1364-503X 1471-2962
DOI:	10.1098/rsta.2017.0356