Towards improved fundus disease detection using Swin Transformers

Bibliographic Details
Published in: Multimedia Tools and Applications, 2024-02, Vol. 83 (32), pp. 78125-78159
Main Authors: Jawad, M Abdul; Khursheed, Farida; Nawaz, Shah; Mir, A. H.
Format: Article
Language: English
Online Access: Full text
Description
Abstract: Ocular diseases can have debilitating consequences for visual acuity if left untreated, necessitating early and accurate diagnosis to improve patients' quality of life. Although contemporary clinical fundus screening is a cost-effective method for detecting ocular abnormalities, it is time-intensive owing to limited resources and a shortage of expert ophthalmologists. While computer-aided detection, including traditional machine learning and deep learning, has been employed for improved prognosis from fundus images, conventional deep learning models often face challenges due to limited global modeling ability, inducing bias and suboptimal performance on unbalanced datasets. Most current studies on ocular disease detection focus on cataract detection or diabetic retinopathy severity prediction, leaving a myriad of vision-impairing conditions unexplored; minimal research has been conducted on deep models for identifying diverse ocular abnormalities from fundus images, with limited success. This study leveraged four Swin Transformer models (Swin-T, Swin-S, Swin-B, and Swin-L) for detecting several significant ocular diseases (including cataracts, hypertensive retinopathy, diabetic retinopathy, myopia, and age-related macular degeneration) from fundus images of the ODIR dataset. Swin Transformer models, which confine self-attention to local windows while enabling cross-window interactions, demonstrated superior performance and computational efficiency. Assessed across three specific ODIR test sets using AUC, F1-score, Kappa score, and a composite metric averaging these three (referred to as the final score), all Swin models exceeded the performance metric scores documented in contemporary studies. The Swin-L model, in particular, achieved final scores of 0.8501, 0.8211, and 0.8616 on the Off-site, On-site, and Balanced ODIR test sets, respectively. External validation on a Retina dataset further substantiated the generalizability of the Swin models, which reported final scores of 0.9058 (Swin-T), 0.92907 (Swin-S), 0.95917 (Swin-B), and 0.97042 (Swin-L). The results, corroborated by statistical analysis, underline the consistent and stable performance of Swin models across varied datasets, emphasizing their potential as reliable tools for multi-ocular disease detection from fundus images and thereby aiding early diagnosis and intervention of ocular diseases.
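
As context for the windowed-attention design mentioned in the abstract, the following is a minimal sketch (not the authors' code) of fine-tuning a pretrained Swin-T for multi-label fundus disease classification. It assumes torchvision's Swin implementation; the eight-class ODIR-style label set and the hyperparameters are illustrative assumptions.

import torch
import torch.nn as nn
from torchvision.models import swin_t, Swin_T_Weights

NUM_CLASSES = 8  # assumption: ODIR-style labels (normal + seven disease categories)

# Load an ImageNet-pretrained Swin-T and replace its classification head.
model = swin_t(weights=Swin_T_Weights.IMAGENET1K_V1)
model.head = nn.Linear(model.head.in_features, NUM_CLASSES)

criterion = nn.BCEWithLogitsLoss()  # multi-label: independent sigmoid per class
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

def train_step(images: torch.Tensor, labels: torch.Tensor) -> float:
    """One optimization step on a batch of fundus images (labels are multi-hot)."""
    optimizer.zero_grad()
    logits = model(images)                    # shape: (batch, NUM_CLASSES)
    loss = criterion(logits, labels.float())
    loss.backward()
    optimizer.step()
    return loss.item()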
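
The composite "final score" described in the abstract is the average of AUC, F1-score, and Kappa. Below is a minimal sketch of that computation, assuming scikit-learn and flattened multi-label ground-truth/probability matrices in the style of the ODIR challenge evaluation; the paper's exact averaging conventions are an assumption here.

import numpy as np
from sklearn.metrics import cohen_kappa_score, f1_score, roc_auc_score

def final_score(y_true: np.ndarray, y_prob: np.ndarray, thr: float = 0.5) -> float:
    """Average of Cohen's kappa, F1, and AUC over flattened multi-label matrices.

    y_true: (n_samples, n_classes) multi-hot ground truth
    y_prob: (n_samples, n_classes) predicted probabilities
    """
    gt, pr = y_true.flatten(), y_prob.flatten()
    kappa = cohen_kappa_score(gt, pr > thr)
    f1 = f1_score(gt, pr > thr, average="micro")
    auc = roc_auc_score(gt, pr)
    return (kappa + f1 + auc) / 3.0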
ISSN: 1380-7501 (print), 1573-7721 (electronic)
DOI: 10.1007/s11042-024-18627-9