Validity of Privacy-Protecting Analytical Methods That Use Only Aggregate-Level Information to Conduct Multivariable-Adjusted Analysis in Distributed Data Networks

Bibliographic Details
Published in:	American journal of epidemiology 2019-04, Vol.188 (4), p.709-723
Main authors:	Li, Xiaojuan; Fireman, Bruce H; Curtis, Jeffrey R; Arterburn, David E; Fisher, David P; Moyneur, Érick; Gallagher, Mia; Raebel, Marsha A; Nowell, W Benjamin; Lagreid, Lindsay; Toh, Sengwee
Format:	Article
Language:	English
Subjects:	Analytical methods; Confidentiality - standards; Data Aggregation; Data analysis; Empirical analysis; Epidemiologic Research Design; Epidemiology; Health risks; Humans; Information Dissemination - methods; Information Services; Matching; Meta-analysis; Multivariate Analysis; Networks; Practice of Epidemiology; Privacy; Probabilistic methods; Propensity Score; Risk; Weighting
Online access:	Full text

Description
Summary:	Distributed data networks enable large-scale epidemiologic studies, but protecting privacy while adequately adjusting for a large number of covariates continues to pose methodological challenges. Using 2 empirical examples within a 3-site distributed data network, we tested combinations of 3 aggregate-level data-sharing approaches (risk-set, summary-table, and effect-estimate), 4 confounding adjustment methods (matching, stratification, inverse probability weighting, and matching weighting), and 2 summary scores (propensity score and disease risk score) for binary and time-to-event outcomes. We assessed the performance of combinations of these data-sharing and adjustment methods by comparing their results with results from the corresponding pooled individual-level data analysis (reference analysis). For both types of outcomes, the method combinations examined yielded results identical or comparable to the reference results in most scenarios. Within each data-sharing approach, comparability between aggregate- and individual-level data analysis depended on adjustment method; for example, risk-set data-sharing with matched or stratified analysis of summary scores produced identical results, while weighted analysis showed some discrepancies. Across the adjustment methods examined, risk-set data-sharing generally performed better, while summary-table and effect-estimate data-sharing more often produced discrepancies in settings with rare outcomes and small sample sizes. Valid multivariable-adjusted analysis can be performed in distributed data networks without sharing of individual-level data.
ISSN:	0002-9262, 1476-6256
DOI:	10.1093/aje/kwy265
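
Illustrative sketch: the summary names effect-estimate data-sharing as one of the three aggregate-level approaches. One common way to combine such site-level results is fixed-effect inverse-variance pooling, where each site shares only an adjusted log effect estimate and its standard error. The Python sketch below assumes that pooling rule; this record does not state how the authors implemented the approach. The function names and the three site values are hypothetical and are not results from the article.

```python
import math


def pool_fixed_effect(estimates):
    """Pool (log_effect, standard_error) pairs by inverse-variance weighting.

    Each pair is the only information a site shares: an adjusted log effect
    estimate (for example a log hazard ratio) and its standard error.
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total = sum(weights)
    pooled_log = sum(w * b for w, (b, _) in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled_log, pooled_se


def to_ratio_with_ci(pooled_log, pooled_se, z=1.96):
    """Back-transform to the ratio scale with an approximate 95% interval."""
    return (
        math.exp(pooled_log),
        math.exp(pooled_log - z * pooled_se),
        math.exp(pooled_log + z * pooled_se),
    )


# Hypothetical site-level inputs for a 3-site network (NOT from the article).
site_estimates = [
    (math.log(0.80), 0.20),
    (math.log(0.90), 0.10),
    (math.log(1.10), 0.25),
]

pooled_log, pooled_se = pool_fixed_effect(site_estimates)
ratio, lower, upper = to_ratio_with_ci(pooled_log, pooled_se)
print(f"pooled ratio {ratio:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Because each site's weight is the inverse of its variance, small sites with rare outcomes contribute imprecise estimates, which is consistent with the summary's remark that effect-estimate data-sharing more often produced discrepancies in those settings.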