How does the model make predictions? A systematic literature review on the explainability power of machine learning in healthcare

Bibliographic Details
Published in: Artificial intelligence in medicine, 2023-09, Vol. 143, Article 102616
Main Authors: Allgaier, Johannes, Mulansky, Lena, Draelos, Rachel Lea, Pryss, Rüdiger
Format: Article
Language: English
Online Access: Full text
Description
Summary: Medical use cases for machine learning (ML) are growing exponentially. The first hospitals are already using ML systems as decision support systems in their daily routine. At the same time, most ML systems are still opaque, and it is not clear how these systems arrive at their predictions. In this paper, we provide a brief overview of the taxonomy of explainability methods and review popular methods. In addition, we conduct a systematic literature search on PubMed to investigate which explainable artificial intelligence (XAI) methods are used in 450 specific medical supervised ML use cases, how the use of XAI methods has developed recently, and how the precision of describing ML pipelines has evolved over the past 20 years. A large fraction of publications with ML use cases do not use XAI methods at all to explain ML predictions. However, when XAI methods are used, open-source and model-agnostic explanation methods are more commonly used, with SHapley Additive exPlanations (SHAP) and Gradient Class Activation Mapping (Grad-CAM) leading the way for tabular and image data, respectively. ML pipelines have been described in increasing detail and uniformity in recent years. However, the willingness to share data and code has stagnated at about one-quarter. XAI methods are mainly used when their application requires little effort. The homogenization of reports in ML use cases facilitates the comparability of work and should be advanced in the coming years. Experts who can mediate between the worlds of informatics and medicine will become increasingly in demand when using ML systems due to the high complexity of the domain.
•We estimate that only 16 % of the reported explainability methods could be understood by patients.
•The distribution of data types in explainable ML applications is 51 % tabular, 32 % image, 3 % text, and 0 % audio.
•The quality of machine learning pipeline descriptions has increased in recent years, with greater homogeneity.
•The data and code sharing ratio has stagnated at about one-quarter.
•The most popular explainability methods are SHAP, LIME, and Grad-CAM.
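For orientation, the sketch below illustrates the kind of open-source, tabular-data SHAP workflow the review identifies as most common. It is not taken from the reviewed publications; the dataset and random-forest model are illustrative assumptions, and the code assumes the `shap` and `scikit-learn` packages are installed.

```python
# Minimal illustrative sketch: explaining a tabular ML model with SHAP.
# Dataset and model choice are assumptions for demonstration only.
import shap
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor

# Toy tabular data and a fitted tree-ensemble model.
X, y = load_diabetes(return_X_y=True, as_frame=True)
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# TreeExplainer computes SHAP values for tree ensembles; each value is one
# feature's contribution to one individual prediction.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X.iloc[:100])

# Summary plot ranks features by their mean absolute contribution,
# giving a global view built from local explanations.
shap.summary_plot(shap_values, X.iloc[:100])
```

Because SHAP only needs access to the trained model and the input features, it is model-agnostic in the sense discussed in the abstract, which is one reason such methods dominate when explainability is reported at all.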
ISSN: 0933-3657, 1873-2860
DOI: 10.1016/j.artmed.2023.102616