SVAFormer: Integrating Random and Hierarchical Spectral View Attention for Hyperspectral Image Classification


Detailed description

Bibliographic details
Published in: IEEE Transactions on Geoscience and Remote Sensing, 2024, Vol. 62, p. 1-13
Main authors: Chen, Ning; Huang, Zhou; Yue, Xia; Liu, Anfeng; Lu, Meiyun; Yue, Jun; Fang, Leyuan
Format: Article
Language: English
Description
Abstract: Recently, hyperspectral image (HSI) classification methods based on Transformers have developed rapidly. However, these methods still face challenges in handling the widely varying scales and diverse spatial distribution patterns commonly found in HSIs. To address these issues, this article proposes a simple yet novel HSI classification framework named the spectral view attention Transformer (SVAFormer). Built on the Transformer mechanism, this framework enhances the integration of spectral and spatial features by allowing the spectral token, corresponding to the pixel to be classified, to access spatial neighborhood information from multiple perspectives and levels. Specifically, the framework employs random masking techniques to provide spectral tokens with spatial neighborhood information from different viewpoints, enabling the model to handle diverse land-cover distribution patterns. Additionally, the framework introduces a spectral token-aware pooling layer between adjacent Transformer blocks, which preserves the central role of spectral tokens while progressively expanding the spatial scale represented by each token. This reduces the Transformer's focus on spatially fragmented information and enables spectral tokens to concentrate on spatial neighborhood information at various levels and scales. The key characteristic of this framework is its ability to effectively handle land-cover features of different scales and shapes by strengthening the fusion of spectral and spatial characteristics. Experimental results on multiple public datasets demonstrate that our framework outperforms previous state-of-the-art methods. For the sake of reproducibility, the source code of SVAFormer will be publicly available at https://github.com/chenning0115/SVAFormer.
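Illustrative note: the abstract names two mechanisms, random masking of the spatial neighborhood and a pooling step that keeps the spectral token of the center pixel intact. The sketch below is one possible reading of those two ideas in plain Python and is not taken from the SVAFormer source code. The function names `random_views` and `pool_keep_center`, the 2x2 average pooling, the fixed number of kept neighbors, and the toy 4x4 patch are all assumptions made for illustration.

```python
import random


def random_views(num_tokens, center, num_views, num_keep, seed=0):
    """Build boolean keep-masks, one per view (hypothetical helper).

    Each view keeps a different random subset of neighbor tokens.
    The center (spectral) token is always kept.
    """
    rng = random.Random(seed)
    neighbors = [i for i in range(num_tokens) if i != center]
    views = []
    for _ in range(num_views):
        kept = set(rng.sample(neighbors, num_keep))
        kept.add(center)
        views.append([i in kept for i in range(num_tokens)])
    return views


def pool_keep_center(grid, center_token):
    """2x2 average-pool a square grid of token vectors (hypothetical helper).

    The center token is passed through unchanged, so only the spatial
    tokens grow in the area they represent. A trailing odd row or
    column is dropped in this simplified version.
    """
    size = len(grid)
    pooled = []
    for r in range(0, size - 1, 2):
        row = []
        for c in range(0, size - 1, 2):
            block = [grid[r][c], grid[r][c + 1],
                     grid[r + 1][c], grid[r + 1][c + 1]]
            # Average the four token vectors channel by channel.
            row.append([sum(vals) / 4.0 for vals in zip(*block)])
        pooled.append(row)
    return pooled, center_token


# Toy 4x4 patch; each token is a 2-channel vector [flat_index, 1.0].
patch = 4
grid = [[[float(r * patch + c), 1.0] for c in range(patch)]
        for r in range(patch)]
center_index = 2 * patch + 2          # token at row 2, column 2
center_token = grid[2][2]

masks = random_views(patch * patch, center_index, num_views=3, num_keep=8)
pooled, center_out = pool_keep_center(grid, center_token)
```

In this reading, each mask would restrict which neighbor tokens the center token can attend to in one view, and the pooling step would be applied between Transformer blocks so that later blocks see fewer, coarser spatial tokens. How SVAFormer actually combines views and pools tokens should be checked against the article and the linked repository.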
ISSN: 0196-2892, 1558-0644
DOI: 10.1109/TGRS.2024.3509478