Performance of deep neural network-based artificial intelligence method in diabetic retinopathy screening: a systematic review and meta-analysis of diagnostic test accuracy

Bibliographic details
Published in:	European journal of endocrinology 2020-07, Vol.183 (1), p.41-49
Main authors:	Wang, Shirui; Zhang, Yuelun; Lei, Shubin; Zhu, Huijuan; Li, Jianqiang; Wang, Qing; Yang, Jijiang; Chen, Shi; Pan, Hui
Format:	Article
Language:	English
Subjects:	Accuracy; Artificial Intelligence; Clinical Study; Diabetes; Diabetes mellitus; Diabetic retinopathy; Diabetic Retinopathy - diagnosis; Diabetic Retinopathy - pathology; Diagnostic tests; Edema; Endocrinology & Metabolism; Fundus Oculi; Humans; Life Sciences & Biomedicine; Macular Edema - diagnosis; Mass Screening - methods; Meta-analysis; Neural networks; Neural Networks, Computer; Retinopathy; Science & Technology; Sensitivity and Specificity; Statistical analysis
Online access:	Full text

Description
Summary:	Objective: Automatic screening systems based on neural networks have been used to detect diabetic retinopathy (DR). However, there has been no quantitative synthesis of the performance of these methods. We aimed to estimate the sensitivity and specificity of neural networks in DR grading. Methods: Medline, Embase, IEEE Xplore, and Cochrane Library were searched up to 23 July 2019. Studies were included if they evaluated the performance of neural networks in detecting moderate or worse DR or diabetic macular edema from retinal fundus images, with ophthalmologists' judgment as the reference standard. Two reviewers extracted data independently. Risk of bias of eligible studies was assessed using the QUADAS-2 tool. Results: Twenty-four studies involving 235,235 subjects were included. Quantitative random-effects meta-analysis using the Rutter and Gatsonis hierarchical summary receiver operating characteristic (HSROC) model revealed a pooled sensitivity of 91.9% (95% CI: 89.6% to 94.3%) and a pooled specificity of 91.3% (95% CI: 89.0% to 93.5%). Subgroup analyses and meta-regression found no statistically significant differences in diagnostic accuracy between studies with different image resolutions, training-set sample sizes, convolutional neural network architectures, or diagnostic criteria, so these factors did not explain the heterogeneity. Conclusions: State-of-the-art neural networks could effectively detect clinically significant DR. To further improve the diagnostic accuracy of neural networks, researchers might need to develop new algorithms rather than simply enlarge training sets or optimize image quality.
ISSN:	0804-4643 1479-683X
DOI:	10.1530/EJE-19-0968
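
Note on the accuracy measures: the summary reports pooled sensitivity and specificity with 95% confidence intervals. As a rough illustration of what those two quantities mean for a single study, the sketch below computes them from a 2x2 table of algorithm results against the reference standard. The counts are hypothetical, the function names are illustrative, and the Wilson score interval is used here only as a common choice for a single proportion. This is not the HSROC model used in the review, which pools sensitivity and specificity jointly across studies.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half, centre + half


def sensitivity_specificity(tp, fp, fn, tn):
    """Return point estimates and intervals from a 2x2 diagnostic table.

    tp/fn: diseased eyes flagged / missed by the algorithm.
    tn/fp: healthy eyes cleared / wrongly flagged by the algorithm.
    """
    sens = tp / (tp + fn)  # share of diseased cases detected
    spec = tn / (tn + fp)  # share of healthy cases correctly cleared
    return {
        "sensitivity": (sens, wilson_interval(tp, tp + fn)),
        "specificity": (spec, wilson_interval(tn, tn + fp)),
    }


# Hypothetical counts for one study; not taken from the review.
result = sensitivity_specificity(tp=90, fp=45, fn=10, tn=855)
for name, (estimate, (low, high)) in result.items():
    print(f"{name}: {estimate:.3f} (95% CI {low:.3f} to {high:.3f})")
```

With these made-up counts the point estimates are 0.900 for sensitivity and 0.950 for specificity. A meta-analysis such as the one summarized above combines many such per-study tables while accounting for between-study variation and the trade-off between the two measures.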