Remote Sensing Scene Classification Using Sparse Representation-Based Framework With Deep Feature Fusion

Scene classification of high-resolution remote sensing (RS) images has attracted increasing attentions due to its vital role in a wide range of applications. Convolutional neural networks (CNNs) have recently been applied on many computer vision tasks and have significantly boosted the performance i...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:IEEE journal of selected topics in applied earth observations and remote sensing 2021, Vol.14, p.5867-5878
Hauptverfasser: Mei, Shaohui, Yan, Keli, Ma, Mingyang, Chen, Xiaoning, Zhang, Shun, Du, Qian
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Scene classification of high-resolution remote sensing (RS) images has attracted increasing attentions due to its vital role in a wide range of applications. Convolutional neural networks (CNNs) have recently been applied on many computer vision tasks and have significantly boosted the performance including imagery scene classification, object detection, and so on. However, the classification performance heavily relies on the features that can accurately represent the scene of images, thus, how to fully explore the feature learning ability of CNNs is of crucial importance for scene classification. Another problem in CNNs is that it requires a large number of labeled samples, which is impractical in RS image processing. To address these problems, a novel sparse representation-based framework for small-sample-size RS scene classification with deep feature fusion is proposed. Specially, multilevel features are first extracted from different layers of CNNs to fully exploit the feature learning ability of CNNs. Note that the existing well-trained CNNs, e.g., AlexNet, VGGNet, and ResNet50, are used for feature extraction, in which no labeled samples is required. Then, sparse representation-based classification is designed to fuse the multilevel features, which is especially effective when only a small number of training samples are available. Experimental results over two benchmark datasets, e.g., UC-Merced and WHU-RS19, demonstrated that the proposed method can effectively fuse different levels of features learned in CNNs, and clearly outperform several state-of-the-art methods especially with limited training samples.
ISSN:1939-1404
2151-1535
DOI:10.1109/JSTARS.2021.3084441