Multilayer feature fusion and attention-based network for crops and weeds segmentation

Bibliographic details
Published in: Journal of plant diseases and protection (2006) 2022-12, Vol.129 (6), p.1475-1489
Main authors: Wang, Haoyu; Song, Haiyu; Wu, Haiyan; Zhang, Zhiqiang; Deng, Shengchun; Feng, Xiaoqing; Chen, Yanhong
Format: Article
Language: English
Online access: Full text
Description
Summary: Distinguishing weeds from crops is a critical challenge in agriculture. Existing agricultural semantic segmentation networks simply combine low-level and high-level features at the encoder and decoder stages to improve performance. However, a simple fusion of low-level and high-level features may not be effective because of the gap in semantics and spatial resolution between them. Hence, this paper proposes a novel dual attention network (DA-Net), based on branch attention blocks in the encoding stage and spatial attention blocks in the decoding stage, to bridge the gap between low-level and high-level features. Our method first adds a branch selection module at the residual connection between the encoder and decoder, enabling low-level features to adaptively select higher-level features for fusion. Then, a cascaded convolution block utilizing asymmetric convolution is constructed, which expands the receptive field without increasing the computational burden or the number of parameters. We design a spatial attention block in the fusion stage to capture rich contextual dependencies. Finally, we construct a novel block named densely channel fusion, which utilizes a sub-pixel layer to encode most channel information into spatial information. The experimental results demonstrate that DA-Net is superior to ExFuse, DeepLabv3+, and PSPNet on three public datasets, with each added component significantly affecting the overall performance.
ISSN: 1861-3829, 1861-3837
DOI: 10.1007/s41348-022-00663-y
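
Two building blocks named in the summary, asymmetric convolution and the sub-pixel layer, can be illustrated with a minimal, dependency-free Python sketch. This is not the authors' implementation. The record does not give kernel sizes, channel widths, or the upscale factor, so the values used here (3 x 3 kernels, 64 channels, factor 2) and all function names are assumptions chosen for illustration. The first part compares the weight count of a square kernel with a k x 1 plus 1 x k pair covering the same receptive field; the second rearranges channels into spatial positions, following the channel ordering commonly used for pixel shuffle.

```python
# Illustrative sketch only; not the DA-Net implementation.
# Kernel size, channel counts and upscale factor are assumed values.


def conv_params(k_h, k_w, c_in, c_out):
    """Weight count of a k_h x k_w convolution layer, ignoring biases."""
    return k_h * k_w * c_in * c_out


def asymmetric_pair_params(k, c_in, c_out):
    """Weights of a k x 1 convolution followed by a 1 x k convolution.

    Assumes the intermediate feature map has c_out channels. Stacked,
    the two layers cover the same k x k receptive field as one square kernel.
    """
    return conv_params(k, 1, c_in, c_out) + conv_params(1, k, c_out, c_out)


def pixel_shuffle(x, r):
    """Sub-pixel rearrangement of nested lists.

    x has shape [C * r * r][H][W]; the result has shape [C][H * r][W * r].
    Each group of r * r input channels fills one r x r block of output pixels.
    """
    channels_in = len(x)
    if channels_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r * r")
    height, width = len(x[0]), len(x[0][0])
    c_out = channels_in // (r * r)
    out = [[[0] * (width * r) for _ in range(height * r)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(r):          # row offset inside the r x r block
            for j in range(r):      # column offset inside the r x r block
                src = x[c * r * r + i * r + j]
                for row in range(height):
                    for col in range(width):
                        out[c][row * r + i][col * r + j] = src[row][col]
    return out


if __name__ == "__main__":
    print(conv_params(3, 3, 64, 64))             # 36864
    print(asymmetric_pair_params(3, 64, 64))     # 24576
    print(pixel_shuffle([[[1]], [[2]], [[3]], [[4]]], 2))  # [[[1, 2], [3, 4]]]
```

Under these assumed sizes the asymmetric pair needs two thirds of the weights of the square kernel, and four single-pixel channels become one 2 x 2 map, which is the sense in which channel information is encoded as spatial information.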