Natural Language Processing for the Ascertainment and Phenotyping of Left Ventricular Hypertrophy and Hypertrophic Cardiomyopathy on Echocardiogram Reports

Bibliographic details
Published in: The American Journal of Cardiology 2023-11, Vol. 206, p. 247-253
Main authors: Berman, Adam N., Ginder, Curtis, Sporn, Zachary A., Tanguturi, Varsha, Hidrue, Michael K., Shirkey, Linnea B., Zhao, Yunong, Blankstein, Ron, Turchin, Alexander, Wasfy, Jason H.
Format: Article
Language: English
Description
Abstract: Extracting and accurately phenotyping electronic health documentation is critical for medical research and clinical care. We sought to develop a highly accurate and open-source natural language processing (NLP) module to ascertain and phenotype left ventricular hypertrophy (LVH) and hypertrophic cardiomyopathy (HCM) diagnoses from echocardiogram reports within a diverse hospital network. After the initial development on 17,250 echocardiogram reports, 700 unique reports from 6 hospitals were randomly selected from data repositories within the Mass General Brigham healthcare system and manually adjudicated by physicians for 10 subtypes of LVH and diagnoses of HCM. Using an open-source NLP system, the module was formally tested on 300 training set reports and validated on 400 reports. The sensitivity, specificity, positive predictive value, and negative predictive value were calculated to assess the discriminative accuracy of the NLP module. The NLP module demonstrated robust performance across the 10 LVH subtypes, with the overall sensitivity and specificity exceeding 96%. In addition, the NLP module demonstrated excellent performance in detecting HCM diagnoses, with sensitivity and specificity exceeding 93%. In conclusion, we designed a highly accurate NLP module to determine the presence of LVH and HCM on echocardiogram reports. Our work demonstrates the feasibility and accuracy of NLP to detect diagnoses on imaging reports, even when described in free text. This module has been placed in the public domain to advance research, trial recruitment, and population health management for patients with LVH-associated conditions.
ISSN: 0002-9149; 1879-1913
DOI: 10.1016/j.amjcard.2023.08.109
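
The abstract names four measures used to assess the NLP module against physician adjudication: sensitivity, specificity, positive predictive value, and negative predictive value. As a minimal sketch of how such measures are conventionally derived from binary labels, the Python below uses the standard textbook definitions. It is not the authors' code; the names `ConfusionCounts`, `count_outcomes`, and `diagnostic_metrics` are hypothetical, and the counts in the usage note are invented for illustration, not taken from the study.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass(frozen=True)
class ConfusionCounts:
    """Cells of a 2x2 table: module output versus manual adjudication."""
    tp: int  # module positive, adjudicated positive
    fp: int  # module positive, adjudicated negative
    tn: int  # module negative, adjudicated negative
    fn: int  # module negative, adjudicated positive


def count_outcomes(predicted: Sequence[bool],
                   adjudicated: Sequence[bool]) -> ConfusionCounts:
    """Tally the 2x2 table from paired per-report labels."""
    if len(predicted) != len(adjudicated):
        raise ValueError("label sequences must have the same length")
    tp = fp = tn = fn = 0
    for pred, truth in zip(predicted, adjudicated):
        if pred and truth:
            tp += 1
        elif pred and not truth:
            fp += 1
        elif not pred and not truth:
            tn += 1
        else:
            fn += 1
    return ConfusionCounts(tp=tp, fp=fp, tn=tn, fn=fn)


def _ratio(numerator: int, denominator: int) -> float:
    """Return NaN instead of raising when a table margin is empty."""
    return numerator / denominator if denominator else float("nan")


def diagnostic_metrics(c: ConfusionCounts) -> Dict[str, float]:
    """Standard definitions of the four measures named in the abstract."""
    return {
        "sensitivity": _ratio(c.tp, c.tp + c.fn),
        "specificity": _ratio(c.tn, c.tn + c.fp),
        "ppv": _ratio(c.tp, c.tp + c.fp),
        "npv": _ratio(c.tn, c.tn + c.fn),
    }
```

For example, with the invented counts `ConfusionCounts(tp=90, fp=20, tn=180, fn=10)`, `diagnostic_metrics` gives a sensitivity of 90/100 and a specificity of 180/200. In a study with several subtypes, one such table would presumably be built per subtype, although the record does not say how the overall figures were aggregated.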