Co-occurrence of medical conditions: Exposing patterns through probabilistic topic modeling of SNOMED codes

Bibliographic details
Published in: Journal of Biomedical Informatics 2018-06, Vol. 82, p. 31-40
Main authors: Bhattacharya, Moumita; Jurkovitz, Claudine; Shatkay, Hagit
Format: Article
Language: English
Description
Summary:
• Identify patterns of co-occurring medical conditions in kidney patients.
• Topic modeling is used in a non-traditional way on >13,000 patient records.
• A topic is characterized by a few highly probable and unique disease codes.
• Most conditions grouped in a topic are known to co-occur in the medical literature.
• Topics also expose indirect associations that have hitherto gone unreported.

Patients with multiple co-occurring health conditions often face aggravated complications and less favorable outcomes. Co-occurring conditions are especially prevalent among individuals suffering from kidney disease, an increasingly widespread condition affecting 13% of the general population in the US. This study aims to identify and characterize patterns of co-occurring medical conditions in patients, using a probabilistic framework. Specifically, we apply topic modeling in a non-traditional way to find associations across the SNOMED-CT codes assigned and recorded in the EHRs of >13,000 patients diagnosed with kidney disease. Unlike most prior work on topic modeling, we apply the method to codes rather than to natural language. Moreover, we quantitatively evaluate the topics, assessing their tightness and distinctiveness, and also assess the medical validity of our results. Our experiments show that each topic is succinctly characterized by a few highly probable and unique disease codes, indicating that the topics are tight. Furthermore, the inter-topic distance between each pair of topics is typically high, illustrating distinctiveness. Last, most coded conditions grouped together within a topic are indeed reported to co-occur in the medical literature. Notably, our results uncover a few indirect associations among conditions that have hitherto not been reported as correlated in the medical literature.
ISSN:1532-0464
1532-0480
DOI:10.1016/j.jbi.2018.04.008
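
As a rough illustration of two ideas in the summary, the sketch below treats a patient's recorded codes as a bag-of-codes "document" and measures how distinct two topics are. It is not the authors' implementation: the summary does not name the distance measure, so Jensen-Shannon divergence is an assumption here, and the function names, code labels (`CODE_A` and so on), and probabilities are hypothetical placeholders rather than real SNOMED-CT codes or results.

```python
import math
from collections import Counter


def bag_of_codes(patient_codes):
    """Treat one patient's list of diagnosis codes as a 'document':
    return a mapping from code to the number of times it was recorded."""
    return Counter(patient_codes)


def js_divergence(p, q):
    """Jensen-Shannon divergence (log base 2) between two topics.

    p and q map code -> probability; a code missing from a topic is
    treated as having probability 0. The result lies in [0, 1].
    """
    codes = set(p) | set(q)
    # Mixture distribution m = (p + q) / 2
    m = {c: 0.5 * (p.get(c, 0.0) + q.get(c, 0.0)) for c in codes}

    def kl(a):
        # Kullback-Leibler divergence KL(a || m); terms with a(c) = 0 contribute 0
        return sum(
            a[c] * math.log2(a[c] / m[c])
            for c in codes
            if a.get(c, 0.0) > 0.0
        )

    return 0.5 * kl(p) + 0.5 * kl(q)


# Hypothetical topics over placeholder codes (illustrative values only)
topic_a = {"CODE_A": 0.7, "CODE_B": 0.3}
topic_b = {"CODE_C": 0.6, "CODE_D": 0.4}

patient = bag_of_codes(["CODE_A", "CODE_B", "CODE_A"])
distance = js_divergence(topic_a, topic_b)  # topics share no codes
```

Under this assumed measure, two topics that share no codes sit at the maximum distance of 1, while identical topics sit at 0, which matches the intuition of "distinctive" topics described in the summary.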