Contour-enhanced Visual State-Space Model for Remote Sensing Image Classification
Published in: | IEEE Transactions on Geoscience and Remote Sensing, 2024-12, p.1-1 |
---|---|
Main authors: | , , , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | Accurate classification of remote sensing images can quickly identify various geographical features, which is important for planning, utilizing, and protecting natural resources. Recently, the visual Mamba model, as an extension of the vision transformer, has attracted widespread attention owing to its global receptive field and linear complexity. However, the self-attention mechanism of visual transformers can lead to feature collapse in the deep layers, resulting in the disappearance of low-level visual features. In remote sensing images, low-level features, especially luminance gradient features, can help discern object boundaries and contour information. This benefits accurate image classification but has not been fully leveraged. To make full use of contour information and explore the impact of handcrafted low-level features on the deep layers of the model, this study proposes G-VMamba, a contour-enhanced Mamba model based on VMamba. The core novelty of G-VMamba lies in its contour enhancement module. First, two separate paths extract adaptive luminance gradients and multidimensional convolutional features at each network layer. Subsequently, the features are combined to impose the constraints of low-level features on the deeper layers of the network. Remote sensing image classification experiments were conducted to evaluate the model's performance, and the results demonstrate the superior performance of G-VMamba in classification tasks. An analysis of class activation maps across different categories shows that G-VMamba focuses more than models such as VMamba on image regions with significant color (or luminance) changes, highlighting the efficacy of contour enhancement. The code will be available at https://github.com/yanliyue/Contour-enhanced-Visual-State-Space-Model. |
---|---|
ISSN: | 0196-2892, 1558-0644 |
DOI: | 10.1109/TGRS.2024.3520635 |
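
The luminance gradient features mentioned in the summary can be illustrated with a minimal sketch. This is not the G-VMamba contour enhancement module, whose adaptive gradient path is not described in this record. It assumes a fixed central-difference operator and Rec. 601 luma weights, and the function names `luminance` and `gradient_magnitude` are hypothetical.

```python
def luminance(rgb):
    """Rec. 601 luma from an (r, g, b) triple (assumed weighting)."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def gradient_magnitude(image):
    """Per-pixel luminance gradient magnitude via central differences.

    `image` is a list of rows, each a list of (r, g, b) tuples.
    Border pixels reuse their nearest in-bounds neighbour.
    """
    h, w = len(image), len(image[0])
    luma = [[luminance(px) for px in row] for row in image]
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            left = luma[y][max(x - 1, 0)]
            right = luma[y][min(x + 1, w - 1)]
            up = luma[max(y - 1, 0)][x]
            down = luma[min(y + 1, h - 1)][x]
            gx = (right - left) / 2.0  # horizontal change
            gy = (down - up) / 2.0     # vertical change
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out


# Toy 4x4 image: dark left half, bright right half.
dark, bright = (0, 0, 0), (255, 255, 255)
toy = [[dark, dark, bright, bright] for _ in range(4)]
grad = gradient_magnitude(toy)
```

In the toy image, the gradient is large only in the two columns adjacent to the dark-to-bright boundary and zero in the flat regions, which is the kind of contour cue the summary describes as useful for discerning object boundaries.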