A Knowledge Distillation Ensemble Framework for Predicting Short- and Long-Term Hospitalization Outcomes From Electronic Health Records Data

Bibliographic details
Published in:	IEEE Journal of Biomedical and Health Informatics, 2022-01, Vol. 26 (1), pp. 423-435
Main authors:	Ibrahim, Zina M.; Bean, Daniel; Searle, Thomas; Qian, Linglong; Wu, Honghan; Shek, Anthony; Kraljevic, Zeljko; Galloway, James; Norton, Sam; Teo, James T; Dobson, Richard JB
Format:	Article
Language:	English
Subjects:	Biological system modeling; Case studies; Clinical Outcome Prediction; Data models; Decision making; Distillation; Electronic Health Records; Electronic medical records; Ensemble Learning; Gradient Boost; Hospitalization; Hospitals; Humans; Imbalanced time-series; Learning algorithms; Length of Stay; Long Short Term Memory networks (LSTM); Machine Learning; Mortality; Mortality Prediction; Outlier Detection; Oxygen; Physiology; Prediction models; Predictive models; Recall; Representations; Resource management; Retrospective Studies; ROC Curve; Stacked Ensemble; Time series
Online access:	Full text

Description
Abstract:	The ability to perform accurate prognosis is crucial for proactive clinical decision making, informed resource management and personalised care. Existing outcome prediction models suffer from low recall of infrequent positive outcomes. We present a highly scalable and robust machine learning framework to automatically predict adversity, represented by mortality and ICU admission and readmission, from time series of vital signs and laboratory results obtained within the first 24 hours of hospital admission. The stacked ensemble platform comprises two components: a) an unsupervised LSTM Autoencoder that learns an optimal representation of the time series, using it to differentiate the less frequent patterns which conclude with an adverse event from the majority patterns that do not, and b) a gradient boosting model, which relies on the constructed representation to refine prediction by incorporating static features. The model is used to assess a patient's risk of adversity and provides visual justifications of its prediction. Results of three case studies show that the model outperforms existing platforms in ICU and general ward settings, achieving average Precision-Recall Areas Under the Curve (PR-AUCs) of 0.891 (95% CI: 0.878-0.939) for mortality and 0.908 (95% CI: 0.870-0.935) in predicting ICU admission and readmission.
ISSN:	2168-2194, 2168-2208
DOI:	10.1109/JBHI.2021.3089287
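
Note on the reported metric: the abstract evaluates the model by PR-AUC, a measure that suits rare positive outcomes because it ignores true negatives. The record does not say how the authors computed the area, so the following is only a minimal sketch of one common estimator, average precision, written in plain Python. The function name pr_auc and the toy labels and scores are hypothetical and are not taken from the paper.

```python
def pr_auc(y_true, y_score):
    """Average-precision estimate of the area under the precision-recall curve.

    Assumption: binary labels (1 = adverse outcome) and no tied scores.
    This is an illustrative estimator, not the paper's evaluation code.
    """
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("PR-AUC is undefined without positive examples")

    # Rank patients from highest to lowest predicted risk.
    ranked = sorted(zip(y_score, y_true), key=lambda pair: pair[0], reverse=True)

    true_positives = 0
    area = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            true_positives += 1
            # Each positive raises recall by 1/n_pos; weight that step
            # by the precision observed at this rank.
            area += (true_positives / rank) / n_pos
    return area


# Hypothetical toy cohort: 2 adverse outcomes among 10 patients.
labels = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
scores = [0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10, 0.05]
toy_pr_auc = pr_auc(labels, scores)
```

For an uninformative ranking, the precision-recall baseline is roughly the prevalence of the positive class (0.2 in this toy cohort), which is why PR-AUC is more demanding than ROC-AUC when adverse outcomes are infrequent.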