Evolutionary Strategies Enable Systematic and Reliable Uncertainty Quantification: A Proof-of-Concept Pilot Study on Resting-State Functional MRI Language Lateralization

Bibliographic details
Published in: Journal of imaging informatics in medicine 2024-07
Main authors: Stember, Joseph N, Dishner, Katharine, Jenabi, Mehrnaz, Pasquini, Luca, Peck, Kyung K, Saha, Atin, Shah, Akash, O'Malley, Bernard, Ilica, Ahmet Turan, Kelly, Lori, Arevalo-Perez, Julio, Hatzoglou, Vaios, Holodny, Andrei, Shalu, Hrithwik
Format: Article
Language: English
Online access: Full text
Description
Summary: Reliable and trustworthy artificial intelligence (AI), particularly in high-stakes medical diagnoses, necessitates effective uncertainty quantification (UQ). Existing UQ methods using model ensembles often introduce invalid variability or computational complexity, rendering them impractical and ineffective in clinical workflows. We propose a UQ approach based on deep neuroevolution (DNE), a data-efficient optimization strategy. Our goal is to replicate trends observed in expert-based UQ. We focused on language lateralization maps from resting-state functional MRI (rs-fMRI). Fifty rs-fMRI maps were divided into training/testing (30:20) sets, representing two labels: "left-dominant" and "co-dominant." DNE facilitated acquiring an ensemble of 100 models with high training and testing set accuracy. Model uncertainty was derived from distribution entropies over the 100 model predictions. Expert reviewers provided user-based uncertainties for comparison. Model (epistemic) and user-based (aleatoric) uncertainties were consistent in the independent and identically distributed (IID) testing set, mainly indicating low uncertainty. In a mostly out-of-distribution (OOD) holdout set, both model and user-based entropies correlated but displayed a bimodal distribution, with one peak representing low and another high uncertainty. We also found a statistically significant positive correlation between epistemic and aleatoric uncertainties. DNE-based UQ effectively mirrored user-based uncertainties, particularly highlighting increased uncertainty in OOD images. We conclude that DNE-based UQ correlates with expert assessments, making it reliable for our use case and potentially for other radiology applications.
ISSN:2948-2933
DOI:10.1007/s10278-024-01188-6
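
Illustrative note: the summary states that model uncertainty was derived from distribution entropies over the 100 model predictions. A minimal sketch of that idea follows, assuming each ensemble member casts one hard vote per image and that entropy is measured in bits; the function name `ensemble_entropy`, the vote lists, and the log base are illustrative assumptions, not details taken from the article.

```python
import math
from collections import Counter

# The two labels named in the summary.
LABELS = ("left-dominant", "co-dominant")


def ensemble_entropy(predictions, labels=LABELS):
    """Shannon entropy (bits) of the label distribution over ensemble votes.

    Assumption: each model contributes one hard label per image. The
    article may define its entropy differently (e.g. another log base).
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    unknown = set(predictions) - set(labels)
    if unknown:
        raise ValueError(f"unexpected labels: {sorted(unknown)}")

    counts = Counter(predictions)
    total = len(predictions)
    entropy = 0.0
    for label in labels:
        p = counts[label] / total
        if p > 0:  # treat 0 * log(0) as 0
            entropy -= p * math.log2(p)
    return entropy


# Hypothetical vote patterns for a 100-model ensemble on a single image.
unanimous = ["left-dominant"] * 100
skewed = ["left-dominant"] * 90 + ["co-dominant"] * 10
split = ["left-dominant"] * 50 + ["co-dominant"] * 50

for name, votes in (("unanimous", unanimous), ("skewed", skewed), ("split", split)):
    print(f"{name}: {ensemble_entropy(votes):.3f} bits")
```

With two labels the entropy ranges from 0 bits (all models agree) to 1 bit (an even split), which matches the low-uncertainty and high-uncertainty peaks the summary describes for the holdout set.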