Global and pyramid convolutional neural network with hybrid attention mechanism for hyperspectral image classification

Bibliographic details
Published in:	Geocarto international 2023-12, Vol.38 (1)
Main authors:	Wu, Linfeng; Wang, Huajun
Format:	Article
Language:	English
Keywords:	attention mechanism; global convolution; hyperspectral image classification; multiscale convolution; neural network
Online access:	Full text

Description
Abstract:	Convolutional neural networks (CNNs) have shown impressive results in hyperspectral image (HSI) classification. However, they still face limitations that can reduce their effectiveness. The kernels in standard convolution have fixed spatial sizes and spectral depths and therefore cannot capture global semantic features from HSI, and existing methods for extracting multiscale information are inadequate. To address these limitations, we propose a global and pyramid convolutional network with a hybrid attention mechanism (GPHANet) for HSI classification. GPHANet adopts a two-branch architecture to extract both local and global hyperspectral features, leveraging the strengths of each to enhance classification performance. The local branch employs dynamic pyramid convolution, which customizes the parameters of convolutional kernels at multiple scales based on the input image. This design enables the model to extract multiscale feature information accurately and comprehensively, capturing subtle variations in the HSI. The global branch uses circular convolution with a global receptive field, so that each position in the convolutional layers gathers information from the entire input space. This allows the model to learn global contextual features, improving its understanding of the overall structure and semantics of the HSI. After extracting deep global and local features, we apply a compact hybrid attention mechanism (HAM) to capture long-range dependencies along both the spatial and spectral dimensions, yielding a more discriminative feature representation. On the Pavia University, Salinas, and Hong Hu datasets, the proposed model achieved overall accuracies of 99.22%, 98.74%, and 96.80%, respectively, using limited training samples. These results are markedly better than those of the state-of-the-art methods.
ISSN:	1010-6049, 1752-0762
DOI:	10.1080/10106049.2023.2226112
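
Illustrative note on the abstract: the global branch is described as using circular convolution with a global receptive field. The following is a minimal NumPy sketch of that general idea only, not the authors' GPHANet layer, whose details the abstract does not give. It assumes a single-channel 2D feature map and a kernel of the same size as the input; the function names `circular_conv2d_fft` and `circular_conv2d_direct` are hypothetical and chosen for this example.

```python
import numpy as np


def circular_conv2d_fft(x, k):
    """Circular (wrap-around) 2D convolution of x with a same-sized kernel k.

    Uses the convolution theorem: a pointwise product in the Fourier
    domain corresponds to circular convolution in the spatial domain.
    """
    if x.shape != k.shape:
        raise ValueError("this sketch assumes the kernel matches the input size")
    return np.real(np.fft.ifft2(np.fft.fft2(x) * np.fft.fft2(k)))


def circular_conv2d_direct(x, k):
    """Reference implementation: sum of circularly shifted copies of x."""
    out = np.zeros(x.shape, dtype=float)
    height, width = k.shape
    for m in range(height):
        for n in range(width):
            # np.roll wraps around the borders, so every kernel tap
            # contributes at every output position.
            out += k[m, n] * np.roll(x, shift=(m, n), axis=(0, 1))
    return out


rng = np.random.default_rng(0)
x = rng.standard_normal((5, 7))  # toy single-channel feature map
k = rng.standard_normal((5, 7))  # kernel as large as the input (assumption)

y_fft = circular_conv2d_fft(x, k)
y_direct = circular_conv2d_direct(x, k)

# Feeding a single non-zero input pixel shows the global receptive field:
# the response reproduces the whole kernel, so every output position
# depends on that one pixel.
delta = np.zeros((5, 7))
delta[0, 0] = 1.0
y_delta = circular_conv2d_fft(delta, k)
```

Because the kernel spans the whole input and the borders wrap around, each output value is a weighted sum over all input positions, which is what "global receptive field" means in this sketch. A trainable version would learn `k` per channel; that part is omitted here.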