Using interpretability approaches to update “black-box” clinical prediction models: an external validation study in nephrology

Bibliographic Details
Published in: Artificial intelligence in medicine, 2021-01, Vol. 111, p. 101982, Article 101982
Main Authors: da Cruz, Harry Freitas, Pfahringer, Boris, Martensen, Tom, Schneider, Frederic, Meyer, Alexander, Böttinger, Erwin, Schapranow, Matthieu-P.
Format: Article
Language: English
Subjects:
Online Access: Full text
Description
Summary:
• There is a dearth of external validation studies, i.e., studies in a different setting, in clinical predictive modeling.
• Machine learning-based prediction models are especially prone to poor generalization in validation studies.
• Interpretability methods help shed light on model performance in external validation.
• Knowledge distilled via interpretability methods helps update models toward simpler, potentially more generalizable ones.

Despite advances in machine learning-based clinical prediction models, only a few such models are actually deployed in clinical contexts. Among other reasons, this is due to a lack of validation studies. In this paper, we present and discuss the validation results of a machine learning model for the prediction of acute kidney injury in cardiac surgery patients, initially developed on the MIMIC-III dataset, when applied to an external cohort of an American research hospital. To help account for the performance differences observed, we used interpretability methods based on feature importance, which allowed experts to scrutinize model behavior at both the global and local level, making it possible to gain further insights into why the model did not behave as expected on the validation cohort. The knowledge gleaned upon derivation can potentially assist model updating during validation toward simpler, more generalizable models. We argue that practitioners should consider interpretability methods as a further tool to help explain performance differences and inform model update in validation studies.
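The "interpretability methods based on feature importance" mentioned in the abstract can be illustrated with permutation importance, one common global feature-importance technique. The sketch below is a hypothetical, self-contained toy: the synthetic data and the thresholding "model" are assumptions for illustration only, not the study's actual AKI model or cohort.

```python
import numpy as np

# Synthetic toy data: 3 "clinical" features; only feature 0 drives the label.
rng = np.random.default_rng(0)
n = 500
X = rng.normal(size=(n, 3))
y = (X[:, 0] + 0.1 * rng.normal(size=n) > 0).astype(int)

def accuracy(X, y):
    """Score a fixed toy 'model' that simply thresholds feature 0."""
    pred = (X[:, 0] > 0).astype(int)
    return (pred == y).mean()

# Permutation importance: drop in score after shuffling one feature,
# which breaks that feature's link to the outcome.
baseline = accuracy(X, y)
importances = []
for j in range(X.shape[1]):
    Xp = X.copy()
    Xp[:, j] = rng.permutation(Xp[:, j])
    importances.append(baseline - accuracy(Xp, y))

# Feature 0 should show a large drop; features 1 and 2 near zero.
```

In a validation setting such as the one described, comparing importance profiles like these between the derivation and external cohorts is one way to surface why a model underperforms on new data.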
ISSN: 0933-3657
1873-2860
DOI: 10.1016/j.artmed.2020.101982