Leveraging unstructured electronic medical record notes to derive population-specific suicide risk models

Bibliographic details
Published in:	Psychiatry Research, 2022-09, Vol. 315, Article 114703
Main authors:	Levis, Maxwell; Levy, Joshua; Dufort, Vincent; Gobbel, Glenn T.; Watts, Bradley V.; Shiner, Brian
Format:	Article
Language:	English
Subjects:	Electronic medical records; Natural language processing; Suicide prediction; Suicide prevention
Online access:	Full text

Description
Summary:
• Establishes a method to leverage electronic medical record notes to derive a suicide prediction metric.
• Uses natural language processing to analyze a representative sample of VA patients, including all those who died by suicide in 2015 and 2016.
• The derived metric has strong predictive accuracy, correctly predicting 74% of those who went on to die by suicide.

Electronic medical record (EMR)-based suicide risk prediction methods typically rely on analysis of structured variables such as demographics, visit history, and prescription data. Leveraging unstructured EMR notes may improve predictive accuracy by allowing access to nuanced clinical information. We utilized natural language processing (NLP) to analyze a large EMR note corpus to develop a data-driven suicide risk prediction model. We developed a matched case-control sample of U.S. Department of Veterans Affairs (VA) patients in 2015 and 2016. We randomly matched each case (all patients who died by suicide in that interval, n = 5029) with five controls (patients who remained alive). We processed the note corpus using NLP methods and applied machine-learning classification algorithms to the output. We calculated area under the curve (AUC) and risk tiers to determine predictive accuracy. NLP-derived models demonstrated strong predictive accuracy. Patients who scored within the top 10% of the risk model accounted for up to 29% of suicide decedents. The NLP-derived model compares favorably to other leading prediction methods. Our approach is highly implementable, requiring only access to text data and open-source software. Additional studies should evaluate ensemble models incorporating NLP-derived information alongside more typical structured variables.
ISSN:	0165-1781 1872-7123
DOI:	10.1016/j.psychres.2022.114703
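
The summary above mentions three evaluation ingredients: random 1:5 matching of cases to controls, an AUC, and the share of decedents captured in the top risk tier. The following is a minimal Python sketch of those three calculations, not the authors' implementation. The function names, the patient identifiers, and the toy scores are all hypothetical, and the NLP feature extraction and classifier training steps are not shown because the record does not describe them in detail.

```python
import random


def match_controls(case_ids, control_pool, ratio=5, seed=0):
    """Randomly assign `ratio` distinct controls to each case, without reuse."""
    rng = random.Random(seed)
    pool = list(control_pool)
    if len(pool) < ratio * len(case_ids):
        raise ValueError("control pool too small for the requested ratio")
    rng.shuffle(pool)
    return {case: pool[i * ratio:(i + 1) * ratio]
            for i, case in enumerate(case_ids)}


def auc(scores, labels):
    """AUC as the chance that a random case outscores a random control.

    Ties between a case and a control count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def top_tier_capture(scores, labels, fraction=0.10):
    """Share of all cases whose score lies in the top `fraction` of patients."""
    ranked = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    k = max(1, int(len(ranked) * fraction))
    return sum(y for _, y in ranked[:k]) / sum(labels)


# Hypothetical toy data: 2 cases, 12 candidate controls, 10 scored patients.
matched = match_controls(["c1", "c2"], [f"k{i}" for i in range(12)])
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_labels = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]  # 1 = case, 0 = control

print(auc(toy_scores, toy_labels))               # 0.9375
print(top_tier_capture(toy_scores, toy_labels))  # 0.5
```

In a matched sample, the top-tier figure is computed relative to the sampled cases and controls, so it would need care before being read as a population-level rate.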