Attention-based multi-scale feature fusion network for myopia grading using optical coherence tomography images
Published in: | The Visual Computer 2024-09, Vol.40 (9), p.6627-6638 |
---|---|
Main authors: | , , , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | Myopia is a serious threat to eye health and can even cause blindness. It is important to grade myopia and carry out targeted intervention. Various studies now use deep learning models based on optical coherence tomography (OCT) images to screen for high myopia. However, because the regions of interest (ROIs) of pre-myopia and low myopia on OCT images are relatively small, it is difficult to use OCT images for detailed myopia grading, and few studies have attempted it. To address these problems, we propose a novel attention-based multi-scale feature fusion network named AMFF for myopia grading using OCT images. The proposed AMFF consists mainly of five modules: a pre-trained vision transformer (ViT) module, a multi-scale convolutional module, an attention feature fusion module, an Avg-TopK pooling module and a fully connected (FC) classifier. First, unsupervised pre-training of the ViT on the training set allows it to extract better feature maps. Second, multi-scale convolutional layers extract feature maps at several scales, covering a wider range of receptive fields and yielding scale-invariant features. Third, the feature maps of different scales are fused through channel attention and spatial attention to obtain more meaningful features. Finally, the most prominent features are obtained as the weighted average of the highest activation values of each channel and are then passed to a fully connected layer to classify myopia. Extensive experiments show that the proposed model outperforms other state-of-the-art myopia grading models. |
---|---|
ISSN: | 0178-2789 1432-2315 |
DOI: | 10.1007/s00371-023-03189-y |
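
The abstract describes the Avg-TopK pooling step only in words: for each channel, take a weighted average of the highest activation values. The following is a minimal sketch of that idea in plain Python, not the authors' implementation. The function name `avg_topk_pool`, the parameter `k`, and the optional `weights` argument are illustrative assumptions; this record does not state the value of k or how the weights are chosen, so uniform weights are used by default.

```python
from typing import List, Optional, Sequence

Channel = List[List[float]]  # one H x W activation map


def avg_topk_pool(
    feature_map: Sequence[Channel],
    k: int,
    weights: Optional[Sequence[float]] = None,
) -> List[float]:
    """Pool each channel to one value: a weighted average of its k largest activations.

    Assumption: weights[0] applies to the largest activation, weights[1] to the
    second largest, and so on. With weights=None the average is uniform.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    if weights is None:
        weights = [1.0] * k  # uniform weighting (assumed default)
    if len(weights) != k:
        raise ValueError("weights must have exactly k entries")
    total = float(sum(weights))
    if total == 0.0:
        raise ValueError("weights must not sum to zero")

    pooled = []
    for channel in feature_map:
        # Flatten the H x W map into a single list of activations.
        flat = [value for row in channel for value in row]
        if k > len(flat):
            raise ValueError("k exceeds the number of activations in a channel")
        # Keep the k highest activations, largest first.
        top = sorted(flat, reverse=True)[:k]
        # Normalised weighted average of those activations.
        pooled.append(sum(w * v for w, v in zip(weights, top)) / total)
    return pooled


# Two channels of size 2 x 2.
fmap = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, 0.0], [10.0, -2.0]],
]
uniform = avg_topk_pool(fmap, k=2)
weighted = avg_topk_pool(fmap, k=2, weights=[3.0, 1.0])
```

With k = 1 this reduces to global max pooling, and with k equal to the number of activations and uniform weights it reduces to global average pooling, which is one way to read the "Avg-TopK" name. The pooled vector has one entry per channel and would then feed the fully connected classifier.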