Using natural language processing to improve suicide classification requires consideration of race


Bibliographic details
Published in: Suicide & life-threatening behavior 2022-08, Vol.52 (4), p.782-791
Main authors: Rahman, Nusrat; Mozer, Reagan; McHugh, R. Kathryn; Rockett, Ian R. H.; Chow, Clifton M.; Vaughan, Gregory
Format: Article
Language: English
Description
Summary: Objectives: To improve the accuracy of classification of deaths of undetermined intent and to examine racial differences in misclassification. Methods: We used natural language processing and statistical text analysis on restricted-access case narratives of suicides, homicides, and undetermined deaths in 37 states collected from the National Violent Death Reporting System (NVDRS) (2017). We fit separate race-specific classification models to predict suicide among undetermined cases using data from known homicide cases (true negatives) and known suicide cases (true positives). Results: A classifier trained on an all-race dataset predicts less than half of these cases as suicide. Importantly, our analysis yields an estimated suicide rate for the Black population comparable with the typical detection rate for the White population, indicating that misclassification excess is endemic for Black suicide. This problem may be mitigated by using race-specific data. Our findings, based on the statistical text analysis, also reveal systematic differences in the phrases identified as most predictive of suicide. Conclusions: This study highlights the need to understand the reasons underlying suicide rate differences and for further testing of strategies to reduce misclassification, particularly among people of color.
ISSN: 0363-0234, 1943-278X
DOI: 10.1111/sltb.12862