AI performance by mammographic density in a retrospective cohort study of 99,489 participants in BreastScreen Norway


Bibliographic details
Published in: European radiology 2024-10, Vol.34 (10), p.6298-6308
Main authors: Bergan, Marie Burns, Larsen, Marthe, Moshina, Nataliia, Bartsch, Hauke, Koch, Henrik Wethe, Aase, Hildegunn Siv, Satybaldinov, Zhanbolat, Haldorsen, Ingfrid Helene Salvesen, Lee, Christoph I., Hofvind, Solveig
Format: Article
Language: English
Description
Summary:
Objective: To explore the ability of artificial intelligence (AI) to classify breast cancer by mammographic density in an organized screening program.
Materials and method: We included information about 99,489 examinations from 74,941 women who participated in BreastScreen Norway, 2013–2019. All examinations were analyzed with an AI system that assigned a malignancy risk score (AI score) from 1 (lowest) to 10 (highest) for each examination. Mammographic density was classified into Volpara density grade (VDG), VDG1–4; VDG1 indicated fatty and VDG4 extremely dense breasts. Screen-detected and interval cancers with an AI score of 1–10 were stratified by VDG.
Results: We found 10,406 (10.5% of the total) examinations to have an AI risk score of 10, of which 6.7% (704/10,406) were breast cancers. These cancers represented 89.7% (617/688) of the screen-detected and 44.6% (87/195) of the interval cancers. 20.3% (20,178/99,489) of the examinations were classified as VDG1 and 6.1% (6047/99,489) as VDG4. For screen-detected cancers, 84.0% (68/81, 95% CI, 74.1–91.2) had an AI score of 10 for VDG1, 88.9% (328/369, 95% CI, 85.2–91.9) for VDG2, 92.5% (185/200, 95% CI, 87.9–95.7) for VDG3, and 94.7% (36/38, 95% CI, 82.3–99.4) for VDG4. For interval cancers, the percentages with an AI score of 10 were 33.3% (3/9, 95% CI, 7.5–70.1) for VDG1 and 48.0% (12/25, 95% CI, 27.8–68.7) for VDG4.
Conclusion: The tested AI system performed well according to cancer detection across all density categories, especially for extremely dense breasts. The highest proportion of screen-detected cancers with an AI score of 10 was observed for women classified as VDG4.
Clinical relevance statement: Our study demonstrates that AI can correctly classify the majority of screen-detected and about half of the interval breast cancers, regardless of breast density.
Key Points:
• Mammographic density is important to consider in the evaluation of artificial intelligence in mammographic screening.
• Given a threshold representing about 10% of those with the highest malignancy risk score by an AI system, we found an increasing percentage of cancers with increasing mammographic density.
• Artificial intelligence risk score and mammographic density combined may help triage examinations to reduce workload for radiologists.
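The percentages and 95% confidence intervals reported in the Results can be rechecked from the counts given in the summary. The sketch below is not the authors' code. The summary does not name the interval method, so the exact (Clopper-Pearson) binomial interval is an assumption here, and the function names `binom_cdf` and `clopper_pearson` are illustrative. Only the Python standard library is used.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve(f, target: float, increasing: bool, iters: int = 100) -> float:
    """Bisection on [0, 1] for f(p) = target, where f is monotone in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if (f(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k: int, n: int, alpha: float = 0.05):
    """Exact two-sided (1 - alpha) interval for a proportion k/n.

    Assumed method: the summary reports 95% CIs without naming one.
    """
    # Lower bound: P(X >= k | p) = alpha / 2, which increases with p.
    lower = 0.0 if k == 0 else _solve(
        lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2, increasing=True
    )
    # Upper bound: P(X <= k | p) = alpha / 2, which decreases with p.
    upper = 1.0 if k == n else _solve(
        lambda p: binom_cdf(k, n, p), alpha / 2, increasing=False
    )
    return lower, upper


# Counts of cancers with an AI score of 10, copied from the summary.
counts = {
    "screen-detected VDG1": (68, 81),
    "screen-detected VDG2": (328, 369),
    "screen-detected VDG3": (185, 200),
    "screen-detected VDG4": (36, 38),
    "interval VDG1": (3, 9),
    "interval VDG4": (12, 25),
}

for label, (k, n) in counts.items():
    low, high = clopper_pearson(k, n)
    print(f"{label}: {100 * k / n:.1f}% ({k}/{n}), "
          f"95% CI {100 * low:.1f} to {100 * high:.1f}")
```

The small interval-cancer groups (3/9 and 12/25) give wide intervals, which is why the differences across density grades for interval cancers should be read with caution.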
ISSN:1432-1084
0938-7994
DOI:10.1007/s00330-024-10681-z