Detecting Opioid Use Disorder in Health Claims Data With Positive Unlabeled Learning

Bibliographic details
Published in: IEEE Journal of Biomedical and Health Informatics, 2024-12, p. 1-8
Main authors: Kumar, Praveen, Moomtaheen, Fariha, Malec, Scott A., Yang, Jeremy J., Bologa, Cristian G., Schneider, Kristan A., Zhu, Yiliang, Tohen, Mauricio, Villarreal, Gerardo, Perkins, Douglas J., Fielstein, Elliot M., Davis, Sharon E., Matheny, Michael E., Lambert, Christophe G.
Format: Article
Language: English
Online access: Full text
Description
Summary: Accurate detection and prevalence estimation of behavioral health conditions, such as opioid use disorder (OUD), are crucial for identifying at-risk individuals, determining treatment needs, monitoring prevention and intervention efforts, and recruiting treatment-naive participants for clinical trials. The availability of extensive health data, combined with advancements in machine learning (ML) frameworks, has enabled researchers to employ various ML techniques to predict or identify OUD within patient health data. Ideally, we could directly estimate the prevalence, that is, the proportion of a population with a condition over time. However, underdiagnosis and undercoding of conditions in patient health records make it challenging to determine the true prevalence of these conditions and to identify at-risk patients with less severe conditions, who are more likely to be missed. Consequently, patients without a diagnosis may include both positive and negative examples for a given condition. Treating all undiagnosed (uncoded) patients as negative when applying ML methods can introduce bias into models, affecting their predictive power. To address this issue, we employed Positive Unlabeled Learning Selected Not At Random (PULSNAR), a Positive and Unlabeled (PU) learning technique, to estimate the probability of a given patient having OUD during a time window and the overall population prevalence of OUD. In a sample of 3,342,044 commercially insured US patients with at least one opioid prescription filled, PULSNAR estimated a cumulative OUD prevalence of 5.08% over a 2-5 year observation period, compared to the 1.35% of patients with a recorded OUD diagnosis, with 73.5% of cases not diagnosed/coded. The prevalence estimates provided by PULSNAR are consistent with those reported in other studies.
ISSN: 2168-2194, 2168-2208
DOI: 10.1109/JBHI.2024.3515805
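
Note: the relationship between the three headline figures in the summary (estimated prevalence, recorded-diagnosis prevalence, and the share of cases not diagnosed or coded) can be checked with simple arithmetic. The following is a minimal sketch, not the authors' code: the function names `undiagnosed_share` and `implied_count` are hypothetical, and the inputs are the rounded percentages quoted in the summary, so the result can differ slightly from a figure computed on unrounded values.

```python
# Illustrative arithmetic only; names are hypothetical and inputs are the
# rounded figures quoted in the summary, not the study's underlying data.

COHORT_SIZE = 3_342_044   # patients with at least one opioid prescription filled
ESTIMATED_PCT = 5.08      # estimated cumulative OUD prevalence (percent)
DIAGNOSED_PCT = 1.35      # patients with a recorded OUD diagnosis (percent)


def undiagnosed_share(estimated_pct: float, diagnosed_pct: float) -> float:
    """Fraction of estimated cases that have no recorded diagnosis."""
    if estimated_pct <= 0 or not 0 <= diagnosed_pct <= estimated_pct:
        raise ValueError("expected 0 <= diagnosed_pct <= estimated_pct, estimated_pct > 0")
    return (estimated_pct - diagnosed_pct) / estimated_pct


def implied_count(cohort_size: int, pct: float) -> int:
    """Approximate number of patients that a percentage of the cohort represents."""
    return round(cohort_size * pct / 100)


if __name__ == "__main__":
    share = undiagnosed_share(ESTIMATED_PCT, DIAGNOSED_PCT)
    print(f"Undiagnosed share of estimated cases: {share:.1%}")
    print(f"Implied estimated cases: {implied_count(COHORT_SIZE, ESTIMATED_PCT):,}")
    print(f"Implied diagnosed cases: {implied_count(COHORT_SIZE, DIAGNOSED_PCT):,}")
```

With the rounded inputs the share comes out at roughly 0.734, close to the 73.5% stated in the summary; the small gap is presumably due to rounding of the published percentages.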