Evaluating a Foundation Artificial Intelligence Model for Glaucoma Detection Using Color Fundus Photographs


Bibliographic details
Published in: Ophthalmology science (Online), 2025-01, Vol. 5 (1), p. 100623, Article 100623
Main authors: Chuter, Benton; Huynh, Justin; Hallaj, Shahin; Walker, Evan; Liebmann, Jeffrey M.; Fazio, Massimo A.; Girkin, Christopher A.; Weinreb, Robert N.; Christopher, Mark; Zangwill, Linda M.
Format: Article
Language: English
Description
Summary: This study evaluated RETFound, a foundation artificial intelligence model, on a diverse clinical research dataset to assess its accuracy in detecting glaucoma from optic disc photographs. Accuracy was evaluated across race, age, glaucoma severity, number of training cycles (epochs), and training dataset sample size. The study design was an evaluation of a diagnostic technology. The study included 9787 color fundus photographs (CFPs) from 2329 participants of diverse race (White [73.4%], Black [13.6%], and other [13%]), disease severity (21.8% mild glaucoma, 7.2% moderate or advanced glaucoma, 60.3% not glaucoma, and 10.7% unreported), and age (48.8% 60 years) from the Diagnostic Innovations in Glaucoma Study and the African Descent and Glaucoma Evaluation Study. All fundus photographs were graded as "Glaucomatous" or "Non-glaucomatous." RETFound, a self-supervised learning model, was used to perform binary glaucoma classification. Its diagnostic accuracy was tested iteratively across different combinations of dataset sample size (50–2000 optic disc photographs), training cycles (5–50), and study subpopulations stratified by severity of glaucoma, age, and race. The main outcome measure was diagnostic accuracy, measured as the area under the receiver operating characteristic curve (AUC), for classifying CFPs as "Glaucomatous" or "Non-glaucomatous." Performance increased with larger training datasets and more training cycles, improving from 50 training images and 5 epochs (AUC: 0.52) to 2000 training images and 50 epochs (AUC: 0.86), with diminishing gains once the training set reached approximately 500 and 1000 images (AUC of 0.82 and 0.83, respectively). Performance was consistent across race and age for all training size and cycle number combinations: Black (AUC = 0.87) vs. other (AUC = 0.86), and >60 years (AUC = 0.84) vs.
ISSN:2666-9145
DOI:10.1016/j.xops.2024.100623
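
The summary reports AUC separately for subgroups such as race and age. As a rough illustration of what such a stratified comparison involves, the sketch below computes a rank-based AUC (the probability that a randomly chosen glaucomatous image scores higher than a randomly chosen non-glaucomatous one) within each subgroup. It is not the authors' evaluation code: the function names `auc` and `auc_by_group`, the record fields, and the toy scores are all hypothetical and chosen only for this example.

```python
from collections import defaultdict


def auc(labels, scores):
    """Rank-based AUC: fraction of (positive, negative) pairs in which the
    positive example has the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auc_by_group(records, group_key):
    """Split records by a subgroup field and compute AUC within each subgroup."""
    grouped = defaultdict(lambda: ([], []))
    for r in records:
        labels, scores = grouped[r[group_key]]
        labels.append(r["label"])
        scores.append(r["score"])
    return {g: auc(labels, scores) for g, (labels, scores) in grouped.items()}


# Toy data, invented for illustration only (label 1 = "Glaucomatous").
records = [
    {"label": 1, "score": 0.9, "age_group": "older"},
    {"label": 0, "score": 0.2, "age_group": "older"},
    {"label": 1, "score": 0.4, "age_group": "older"},
    {"label": 0, "score": 0.6, "age_group": "older"},
    {"label": 1, "score": 0.8, "age_group": "younger"},
    {"label": 0, "score": 0.3, "age_group": "younger"},
    {"label": 0, "score": 0.8, "age_group": "younger"},
]

result = auc_by_group(records, "age_group")
print(result)
```

The pairwise loop is quadratic in the number of images, which is fine for a small illustration; for a dataset of thousands of photographs, a library routine such as scikit-learn's `roc_auc_score` would be the more practical choice.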