ELANet: Effective Lightweight Attention-Guided Network for Real-Time Semantic Segmentation

Deep neural networks have greatly facilitated the applications of semantic segmentation. However, most of the existing neural networks bring massive calculations with lots of model parameters for achieving a higher precision, which is unaffordable for resource-constrained edge devices. To achieve an...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Veröffentlicht in:	Neural processing letters 2023-10, Vol.55 (5), p.6425-6442
Hauptverfasser:	Yi, Qingming, Dai, Guoshuai, Shi, Min, Huang, Zunkai, Luo, Aiwen
Format:	Artikel
Sprache:	eng
Schlagworte:	Accuracy Artificial Intelligence Artificial neural networks Complex Systems Computational Intelligence Computer Science Convolution Datasets Lightweight Neural networks Parameters Real time Semantic segmentation Semantics
Online-Zugang:	Volltext
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	Deep neural networks have greatly facilitated the applications of semantic segmentation. However, most of the existing neural networks bring massive calculations with lots of model parameters for achieving a higher precision, which is unaffordable for resource-constrained edge devices. To achieve an appropriate trade-off between computing efficiency and segmentation accuracy, we proposed an effective lightweight attention-guided network (ELANet) for real-time semantic segmentation based on an asymmetrical encoder–decoder framework in this work. In the encoding phase, we combined atrous convolution and depth-wise convolution to design two types of effective context guidance blocks to learn contextual semantic information. A refined feature fusion module with a dual attention-guided fusion (DAF) unit was developed in the decoder to exploit different levels of features. Without any pretraining, we estimated the performance of multi-attention ELANet with extensive experiments on the Cityscapes dataset with an input resolution of 512 × 1024, resulting in 75.4% mIoU and 83 FPS inference speed with only 0.76 M parameters and 10.34 GFLOPs on a single 3090 GPU. The code is publicly available at https://github.com/DGS666/ELANet .
ISSN:	1370-4621 1573-773X
DOI:	10.1007/s11063-023-11145-z