Underlying cause of death identification from death certificates using reverse coding to text and a NLP based deep learning approach


Bibliographic details
Published in: Informatics in Medicine Unlocked, 2020, Vol. 21, p. 100456, Article 100456
Main authors: Della Mea, Vincenzo; Popescu, Mihai Horia; Roitero, Kevin
Format: Article
Language: English
Online access: Full text
Description
Summary: The identification of the underlying cause of death is a matter of primary importance and one of the most challenging issues in healthcare policy making. The World Health Organisation provides guidelines for death certificate coding using the ICD-10 classification. The guidelines can be applied manually, but some coding support systems implement them to simplify the coding work. Nevertheless, countries differ in the level and quality of death certificate registration. In this work we propose an effective supervised model, based on Natural Language Processing algorithms, that aims to correctly classify the underlying cause of death from death certificates. In our study we compared tabular representations of the death certificate, including the hierarchical path of each condition in the classification, with a novel representation that translates conditions expressed as ICD-10 codes back to their standard titles. Our experimental evaluation, after training on 10.5 million certificates, reached 99.03% accuracy, which currently outperforms state-of-the-art systems. To assess practical applicability, we studied performance by classification chapter and found that accuracy is low only for chapters that include very rare causes of death. Finally, to show the robustness of our model, we leverage the model confidence to help identify death certificates for which manual coding is needed.
Highlights:
• We trained FNNs and NLP models with 10 M death certificates and tested on over 3 M.
• The Underlying Cause of Death can be predicted from ICD-10 coded death certificates.
• Reverse coding of ICD-10 codes back to text further improves effectiveness.
• The trained model is robust to small variations in ICD-10 and in mortality coding rules.
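The two ideas named in the summary, reverse coding of ICD-10 codes to text and using model confidence to route certificates to manual coding, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the three-entry title table, the function names `reverse_code` and `needs_manual_coding`, the separator, and the 0.9 threshold are all assumptions made for illustration.

```python
# Illustrative excerpt only: a real system would load the complete ICD-10 title list.
ICD10_TITLES = {
    "I21": "Acute myocardial infarction",
    "I10": "Essential (primary) hypertension",
    "J18": "Pneumonia, organism unspecified",
}


def reverse_code(codes):
    """Translate ICD-10 codes back to their standard titles as one text string.

    Codes missing from the table are passed through unchanged (an assumption).
    """
    return " ; ".join(ICD10_TITLES.get(code, code) for code in codes)


def needs_manual_coding(class_probabilities, threshold=0.9):
    """Flag a certificate when the model's top class probability is below threshold.

    The threshold value is hypothetical, not taken from the article.
    """
    return max(class_probabilities) < threshold


certificate = ["I21", "I10"]
text_input = reverse_code(certificate)          # text that a language model could consume
flagged = needs_manual_coding([0.55, 0.30, 0.15])  # low confidence, so flagged for review
```

In this sketch the text produced by `reverse_code` stands in for the input to a text classifier, and `needs_manual_coding` shows one simple way a confidence score could separate automatically coded certificates from those sent to a human coder.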
ISSN:2352-9148
DOI:10.1016/j.imu.2020.100456