FCAU-Net for the Semantic Segmentation of Fine-Resolution Remotely Sensed Images

The semantic segmentation of fine-resolution remotely sensed images is an urgent issue in satellite image processing. Solving this problem can help overcome various obstacles in urban planning, land cover classification, and environmental protection, paving the way for scene-level landscape pattern...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Remote sensing (Basel, Switzerland) Switzerland), 2022-01, Vol.14 (1), p.215
Hauptverfasser: Niu, Xuerui, Zeng, Qiaolin, Luo, Xiaobo, Chen, Liangfu
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:The semantic segmentation of fine-resolution remotely sensed images is an urgent issue in satellite image processing. Solving this problem can help overcome various obstacles in urban planning, land cover classification, and environmental protection, paving the way for scene-level landscape pattern analysis and decision making. Encoder-decoder structures based on attention mechanisms have been frequently used for fine-resolution image segmentation. In this paper, we incorporate a coordinate attention (CA) mechanism, adopt an asymmetric convolution block (ACB), and design a refinement fusion block (RFB), forming a network named the fusion coordinate and asymmetry-based U-Net (FCAU-Net). Furthermore, we propose novel convolutional neural network (CNN) architecture to fully capture long-term dependencies and fine-grained details in fine-resolution remotely sensed imagery. This approach has the following advantages: (1) the CA mechanism embeds position information into a channel attention mechanism to enhance the feature representations produced by the network while effectively capturing position information and channel relationships; (2) the ACB enhances the feature representation ability of the standard convolution layer and captures and refines the feature information in each layer of the encoder; and (3) the RFB effectively integrates low-level spatial information and high-level abstract features to eliminate background noise when extracting feature information, reduces the fitting residuals of the fused features, and improves the ability of the network to capture information flows. Extensive experiments conducted on two public datasets (ZY-3 and DeepGlobe) demonstrate the effectiveness of the FCAU-Net. The proposed FCAU-Net transcends U-Net, Attention U-Net, the pyramid scene parsing network (PSPNet), DeepLab v3+, the multistage attention residual U-Net (MAResU-Net), MACU-Net, and the Transformer U-Net (TransUNet). Specifically, the FCAU-Net achieves a 97.97% (95.05%) pixel accuracy (PA), a 98.53% (91.27%) mean PA (mPA), a 95.17% (85.54%) mean intersection over union (mIoU), and a 96.07% (90.74%) frequency-weighted IoU (FWIoU) on the ZY-3 (DeepGlobe) dataset.
ISSN:2072-4292
2072-4292
DOI:10.3390/rs14010215