Can physician judgment enhance model trustworthiness? A case study on predicting pathological lymph nodes in rectal cancer

Bibliographic details
Published in: Artificial Intelligence in Medicine, 2024-08, Vol. 154, p. 102929, Article 102929
Authors: Kobayashi, Kazuma, Takamizawa, Yasuyuki, Miyake, Mototaka, Ito, Sono, Gu, Lin, Nakatsuka, Tatsuya, Akagi, Yu, Harada, Tatsuya, Kanemitsu, Yukihide, Hamamoto, Ryuji
Format: Article
Language: English
Online access: Full text
Description
Abstract: Explainability is key to enhancing the trustworthiness of artificial intelligence in medicine. However, there exists a significant gap between physicians’ expectations for model explainability and the actual behavior of these models. This gap arises from the absence of a consensus on a physician-centered evaluation framework, which is needed to quantitatively assess the practical benefits that effective explainability should offer practitioners. Here, we hypothesize that superior attention maps, as a mechanism of model explanation, should align with the information that physicians focus on, potentially reducing prediction uncertainty and increasing model reliability. We employed a multimodal transformer to predict lymph node metastasis of rectal cancer using clinical data and magnetic resonance imaging. We explored how well attention maps, visualized through a state-of-the-art technique, can achieve agreement with physician understanding. Subsequently, we compared two distinct approaches for estimating uncertainty: a standalone estimation using only the variance of the prediction probability, and a human-in-the-loop estimation that considers both the variance of the prediction probability and the quantified agreement. Our findings revealed no significant advantage of the human-in-the-loop approach over the standalone one. In conclusion, this case study did not confirm the anticipated benefit of the explanation in enhancing model reliability. Superficial explanations could do more harm than good by misleading physicians into relying on uncertain predictions, suggesting that the current state of attention mechanisms should not be overestimated in the context of model explainability.

Highlights:
• Good model explanations should align with physician focus and yield low prediction uncertainty.
• We evaluated attention maps for lymph node metastasis prediction in rectal cancer.
• The expected benefit of physician agreement was not observed in the study.
• Attention mechanisms may confuse practitioners due to their superficial clarity.
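The abstract contrasts a standalone uncertainty estimate (variance of the prediction probability) with a human-in-the-loop estimate that also incorporates the quantified agreement between attention maps and physician focus. The sketch below illustrates that comparison only in outline; how the probability samples are generated (e.g., Monte Carlo dropout or an ensemble), how agreement is scored, and the linear weighting used to combine the two terms are assumptions for illustration, not the authors' exact formulation.

```python
# Minimal sketch of the two uncertainty-estimation approaches described in the
# abstract. Sampling mechanism, agreement scoring, and the weighted combination
# are hypothetical choices, not taken from the paper.
import numpy as np

def standalone_uncertainty(prob_samples: np.ndarray) -> float:
    """Variance of the predicted probability across stochastic forward passes
    (e.g., MC dropout or an ensemble -- assumed here)."""
    return float(np.var(prob_samples))

def human_in_the_loop_uncertainty(prob_samples: np.ndarray,
                                  agreement: float,
                                  weight: float = 0.5) -> float:
    """Combine predictive variance with (1 - agreement), where `agreement` in
    [0, 1] quantifies how well the attention map matches physician focus.
    The linear weighting is an illustrative assumption."""
    return weight * float(np.var(prob_samples)) + (1.0 - weight) * (1.0 - agreement)

# Example: five stochastic predictions for one case and an agreement score of 0.7.
probs = np.array([0.62, 0.58, 0.71, 0.66, 0.60])
print(standalone_uncertainty(probs))               # variance only
print(human_in_the_loop_uncertainty(probs, 0.7))   # variance + agreement term
```

Under this kind of scheme, a case is deferred to a physician when the chosen uncertainty score exceeds a threshold; the study's finding was that adding the agreement term did not yield a significant advantage over the variance-only estimate.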
ISSN: 0933-3657
eISSN: 1873-2860
DOI: 10.1016/j.artmed.2024.102929