MSTNet: A Multilevel Spectral–Spatial Transformer Network for Hyperspectral Image Classification

Bibliographic Details
Published in: IEEE Transactions on Geoscience and Remote Sensing, 2022, Vol. 60, pp. 1-13
Main authors: Yu, Haoyang; Xu, Zhen; Zheng, Ke; Hong, Danfeng; Yang, Hao; Song, Meiping
Format: Article
Language: English
Description
Abstract: Convolutional neural networks (CNNs) have been widely used in hyperspectral image classification (HSIC). Although current CNN-based methods achieve good performance, they still face a series of challenges: the receptive field is limited, information is lost in down-sampling layers, and deep networks consume substantial computing resources. To overcome these problems, we propose a multilevel spectral–spatial transformer network (MSTNet) for HSIC. MSTNet is an image-based classification framework that is efficient and straightforward. Within this framework, we design a self-attentive encoder. First, HSIs are processed into sequences. Meanwhile, a learned positional embedding (PE) is added to integrate spatial information. Then, a pure transformer encoder (TE) is employed to learn feature representations. Finally, the multilevel features are processed by decoders to generate classification results at the original image size. Experimental results on three real hyperspectral datasets demonstrate the efficiency of the proposed method in comparison with related CNN-based methods.
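The pipeline described in the abstract (pixels flattened to a sequence, a learned positional embedding added, then a transformer encoder attending over the sequence) can be sketched as follows. This is a minimal illustrative toy, not the authors' MSTNet implementation: the function names are hypothetical, the positional embedding is a random stand-in for a trained one, and the attention step uses identity query/key/value projections in place of learned ones.

```python
import math
import random

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def self_attention(seq, d):
    # Toy single-head scaled dot-product self-attention over a token
    # sequence (identity Q/K/V projections stand in for learned ones).
    out = []
    for q in seq:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in seq]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, seq)) for j in range(d)])
    return out

def mstnet_sketch(hsi, d):
    # hsi: an H x W grid of d-dimensional spectral pixel vectors.
    # Step 1: flatten the image into a pixel sequence.
    seq = [px for row in hsi for px in row]
    # Step 2: add a positional embedding to inject spatial information
    # (random vectors here; in MSTNet this embedding is learned).
    pos = [[random.gauss(0.0, 0.02) for _ in range(d)] for _ in seq]
    seq = [[s + p for s, p in zip(tok, pe)] for tok, pe in zip(seq, pos)]
    # Step 3: one transformer-encoder-style attention pass over the sequence.
    return self_attention(seq, d)
```

In the actual network, this encoder is stacked and its multilevel features are passed to decoders that upsample back to the original image size; the sketch only shows the sequence-plus-attention core.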
ISSN: 0196-2892, 1558-0644
DOI: 10.1109/TGRS.2022.3186400