Application of a data continuity prediction algorithm to an electronic health record‐based pharmacoepidemiology study
Published in: | Journal of evaluation in clinical practice 2024-06, Vol.30 (4), p.716-725 |
---|---|
Main authors: | , , , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | Background and Objectives
Use of algorithms to identify patients with high data continuity in electronic health records (EHRs) may increase study validity. Practical experience with this approach remains limited.
Methods
We developed and validated four algorithms to identify patients with high data continuity in an EHR‐based data source. Selected algorithms were then applied to a pharmacoepidemiologic study comparing rates of COVID‐19 hospitalization in patients exposed to insulin versus noninsulin antidiabetic drugs.
Results
A model using a short list of five EHR‐derived variables performed as well as more complex models in distinguishing high‐ from low‐data‐continuity patients. Higher data continuity was associated with more accurate ascertainment of key variables. In the pharmacoepidemiologic study, patients with higher data continuity had higher observed rates of the COVID‐19 outcome; among these patients, insulin use showed a large unadjusted association with the outcome but no association after propensity score adjustment.
Discussion
We found that a simple, portable algorithm to predict data continuity gave comparable performance to more complex methods. Use of the algorithm significantly impacted the results of an empirical study, with evidence of more valid results at higher levels of data continuity.
Plain Language Summary
When doing research using electronic health records, one big problem is that sometimes information is missing or not consistently recorded. There are methods (algorithms) that can help identify which patients' records are more complete and reliable. However, it's not clear how much these methods actually improve the results of studies about how drugs affect populations (pharmacoepidemiologic studies). This manuscript demonstrates that using a straightforward and easy‐to‐use method seems to make the results of such a drug study more accurate. |
---|---|
ISSN: | 1356-1294 1365-2753 |
DOI: | 10.1111/jep.14002 |