Assessing the Trustworthiness of Saliency Maps for Localizing Abnormalities in Medical Imaging

Bibliographic details
Published in: Radiology. Artificial intelligence 2021-11, Vol.3 (6), p.e200267-e200267
Main authors: Arun, Nishanth; Gaw, Nathan; Singh, Praveer; Chang, Ken; Aggarwal, Mehak; Chen, Bryan; Hoebel, Katharina; Gupta, Sharut; Patel, Jay; Gidwani, Mishka; Adebayo, Julius; Li, Matthew D; Kalpathy-Cramer, Jayashree
Format: Article
Language: English
Online access: Full text
Description
Abstract: To evaluate the trustworthiness of saliency maps for abnormality localization in medical imaging. Using two large publicly available radiology datasets (Society for Imaging Informatics in Medicine-American College of Radiology Pneumothorax Segmentation dataset and Radiological Society of North America Pneumonia Detection Challenge dataset), the performance of eight commonly used saliency map techniques was quantified with regard to localization utility (segmentation and detection), sensitivity to model weight randomization, repeatability, and reproducibility. Their performance was compared with that of baseline methods and localization network architectures, using area under the precision-recall curve (AUPRC) and structural similarity index measure (SSIM) as metrics. All eight saliency map techniques failed at least one of the criteria and were inferior in performance compared with localization networks. For pneumothorax segmentation, the AUPRC ranged from 0.024 to 0.224, while a U-Net achieved a significantly superior AUPRC of 0.404 (P < .005). For pneumonia detection, the AUPRC ranged from 0.160 to 0.519, while a RetinaNet achieved a significantly superior AUPRC of 0.596 (
ISSN:2638-6100
DOI:10.1148/ryai.2021200267
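
The abstract names AUPRC and SSIM as its two metrics. As a rough illustration only, and not the authors' code, the sketch below computes a step-wise AUPRC (average precision) for saliency scores against a binary ground-truth mask, and a simplified SSIM between two saliency maps. The function names `average_precision` and `global_ssim` are hypothetical. The SSIM here uses a single global window over flattened maps, which is an assumption made for brevity; standard SSIM averages over local sliding windows and will generally give different values.

```python
def average_precision(scores, labels):
    """Step-wise area under the precision-recall curve.

    scores: per-pixel saliency values (higher = more salient).
    labels: per-pixel ground truth, 1 for abnormal and 0 for normal.
    Tied scores are processed in sort order (a simplification).
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive pixel")
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    true_pos = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            true_pos += 1
            total += true_pos / rank  # precision at this recall step
    return total / n_pos


def global_ssim(x, y, data_range=1.0):
    """SSIM over one global window (assumption: no sliding windows).

    x, y: flattened saliency maps of equal length.
    data_range: difference between the largest and smallest possible value.
    """
    if len(x) != len(y) or not x:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    c1 = (0.01 * data_range) ** 2  # conventional stabilizing constants
    c2 = (0.03 * data_range) ** 2
    numerator = (2 * mean_x * mean_y + c1) * (2 * cov + c2)
    denominator = (mean_x ** 2 + mean_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


# Toy example: four pixels, two of them abnormal.
saliency = [0.9, 0.8, 0.7, 0.6]
mask = [1, 0, 1, 0]
ap = average_precision(saliency, mask)
self_similarity = global_ssim(saliency, saliency)
```

In a setup like the one the abstract describes, a score such as `ap` would plausibly measure localization utility against annotations, while an SSIM between two maps would measure how much a map changes, for example after model weights are randomized or a model is retrained.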