External Evaluation of a Mammography-based Deep Learning Model for Predicting Breast Cancer in an Ethnically Diverse Population


Bibliographic details
Published in: Radiology. Artificial intelligence 2023-11, Vol.5 (6), p.e220299-e220299
Main authors: Omoleye, Olasubomi J, Woodard, Anna E, Howard, Frederick M, Zhao, Fangyuan, Yoshimatsu, Toshio F, Zheng, Yonglan, Pearson, Alexander T, Levental, Maksim, Aribisala, Benjamin S, Kulkarni, Kirti, Karczmar, Gregory S, Olopade, Olufunmilayo I, Abe, Hiroyuki, Huo, Dezheng
Format: Article
Language: English
Online access: Full text
Description
Summary: To externally evaluate a mammography-based deep learning (DL) model (Mirai) in a high-risk racially diverse population and compare its performance with other mammographic measures. A total of 6435 screening mammograms in 2096 female patients (median age, 56.4 years ± 11.2 [SD]) enrolled in a hospital-based case-control study from 2006 to 2020 were retrospectively evaluated. Pathologically confirmed breast cancer was the primary outcome. Mirai scores were the primary predictors. Breast density and Breast Imaging Reporting and Data System (BI-RADS) assessment categories were comparative predictors. Performance was evaluated using area under the receiver operating characteristic curve (AUC) and concordance index analyses. Mirai achieved 1- and 5-year AUCs of 0.71 (95% CI: 0.68, 0.74) and 0.65 (95% CI: 0.64, 0.67), respectively. One-year AUCs for nondense versus dense breasts were 0.72 versus 0.58 (P = .10). There was no evidence of a difference in near-term discrimination performance between BI-RADS and Mirai (1-year AUC, 0.73 vs 0.68; P = .34). For longer-term prediction (2-5 years), Mirai outperformed BI-RADS assessment (5-year AUC, 0.63 vs 0.54; P < .001). Using only images of the unaffected breast reduced the discriminatory performance of the DL model (P < .001 at all time points), suggesting that its predictions are likely dependent on the detection of ipsilateral premalignant patterns. A mammography DL model showed good performance in a high-risk external dataset enriched for African American patients, benign breast disease, and mutation carriers, and study findings suggest that the model performance is likely driven by the detection of precancerous changes.
Keywords: Breast, Cancer, Computer Applications, Convolutional Neural Network, Deep Learning Algorithms, Informatics, Epidemiology, Machine Learning, Mammography, Oncology, Radiomics.
© RSNA, 2023. See also commentary by Kontos and Kalpathy-Cramer in this issue.
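The summary reports discrimination as AUC values. As a minimal sketch of what that statistic measures, and not the study's own analysis code, the AUC can be computed as the probability that a randomly chosen case receives a higher risk score than a randomly chosen control, with ties counted as one half. The function name `roc_auc` and the toy scores and labels below are hypothetical illustrations, not data from the article.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Return the AUC via pairwise comparison of cases and controls.

    labels: 1 = case (cancer), 0 = control. Ties between a case score
    and a control score contribute 0.5, matching the Mann-Whitney
    formulation of the AUC.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")

    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical toy risk scores (assumption: higher score = higher predicted risk).
toy_scores = [0.10, 0.40, 0.35, 0.80, 0.65, 0.20]
toy_labels = [0, 0, 1, 1, 1, 0]

if __name__ == "__main__":
    print(f"toy AUC = {roc_auc(toy_scores, toy_labels):.3f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of cases from controls. The pairwise loop is quadratic in sample size, which is acceptable for a small illustration but would normally be replaced by a rank-based computation on larger data.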
ISSN: 2638-6100
DOI: 10.1148/ryai.220299