Towards improved fundus disease detection using Swin Transformers


Detailed Description

Saved in:
Bibliographic Details
Published in: Multimedia tools and applications 2024-02, Vol.83 (32), p.78125-78159
Main Authors: Jawad, M Abdul, Khursheed, Farida, Nawaz, Shah, Mir, A. H.
Format: Article
Language: English
Subjects:
Online Access: Full text
description Ocular diseases can have debilitating consequences for visual acuity if left untreated, necessitating early and accurate diagnosis to improve patients' quality of life. Although contemporary clinical screening of the fundus is a cost-effective method for detecting ocular abnormalities, it is time-intensive owing to limited resources and a shortage of expert ophthalmologists. While computer-aided detection, including traditional machine learning and deep learning, has been employed to enhance prognosis from fundus images, conventional deep learning models often struggle with limited global modeling ability, inducing bias and suboptimal performance on unbalanced datasets. Most current studies on ocular disease detection focus on cataract detection or diabetic retinopathy severity prediction, leaving a myriad of vision-impairing conditions unexplored; minimal research has applied deep models to identifying diverse ocular abnormalities from fundus images, with limited success. This study leveraged four Swin Transformer models (Swin-T, Swin-S, Swin-B, and Swin-L) to detect several significant ocular diseases (Cataracts, Hypertensive Retinopathy, Diabetic Retinopathy, Myopia, and Age-Related Macular Degeneration) from fundus images of the ODIR dataset. Swin Transformer models, which confine self-attention to local windows while enabling cross-window interactions, demonstrated superior performance and computational efficiency. Assessed on three specific ODIR test sets using AUC, F1-score, Kappa score, and a composite metric averaging these three (referred to as the final score), all Swin models achieved higher performance metric scores than those documented in contemporary studies. The Swin-L model, in particular, achieved final scores of 0.8501, 0.8211, and 0.8616 on the Off-site, On-site, and Balanced ODIR test sets, respectively.
External validation on a Retina dataset further substantiated the generalizability of the Swin models, which reported final scores of 0.9058 (Swin-T), 0.92907 (Swin-S), 0.95917 (Swin-B), and 0.97042 (Swin-L). The results, corroborated by statistical analysis, underline the consistent and stable performance of Swin models across varied datasets, emphasizing their potential as reliable tools for multi-ocular disease detection from fundus images, thereby aiding early diagnosis and intervention of ocular abnormalities.
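The "final score" used throughout the abstract is defined there as the average of AUC, F1-score, and Kappa score. A minimal sketch of that composite metric (the function name and the example input values are illustrative, not taken from the paper):

```python
def final_score(auc: float, f1: float, kappa: float) -> float:
    """Composite metric described in the abstract: the arithmetic
    mean of AUC, F1-score, and Kappa score."""
    return (auc + f1 + kappa) / 3.0

# Illustrative per-metric values only (the paper reports the composite,
# e.g. 0.8501 for Swin-L on the Off-site ODIR test set):
print(round(final_score(0.93, 0.86, 0.76), 4))  # → 0.85
```

In practice the three component metrics would come from a library such as scikit-learn (`roc_auc_score`, `f1_score`, `cohen_kappa_score`); averaging them into one number makes it easier to rank models that trade off ranking quality against agreement-based measures.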
doi_str_mv 10.1007/s11042-024-18627-9
identifier ISSN: 1573-7721
ispartof Multimedia tools and applications, 2024-02, Vol.83 (32), p.78125-78159
issn 1380-7501 (print)
1573-7721 (electronic)
language eng
recordid cdi_proquest_journals_3100673810
source SpringerLink Journals - AutoHoldings
subjects Abnormalities
Age related diseases
Cataracts
Computer Communication Networks
Computer Science
Data Structures and Information Theory
Datasets
Deep learning
Diabetes
Diabetic retinopathy
Diagnosis
Disease
Eye diseases
Image enhancement
Machine learning
Medical imaging
Multimedia Information Systems
Prognosis
Special Purpose and Application-Based Systems
Statistical analysis
Test sets
Track 2: Medical Applications of Multimedia
Transformers
Visual acuity
title Towards improved fundus disease detection using Swin Transformers
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-06T14%3A54%3A47IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Towards%20improved%20fundus%20disease%20detection%20using%20Swin%20Transformers&rft.jtitle=Multimedia%20tools%20and%20applications&rft.au=Jawad,%20M%20Abdul&rft.date=2024-02-27&rft.volume=83&rft.issue=32&rft.spage=78125&rft.epage=78159&rft.pages=78125-78159&rft.issn=1573-7721&rft.eissn=1573-7721&rft_id=info:doi/10.1007/s11042-024-18627-9&rft_dat=%3Cproquest_cross%3E3100673810%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3100673810&rft_id=info:pmid/&rfr_iscdi=true