Multi-crop Fusion Strategy Based on Prototype Assignment for Remote Sensing Image Scene Classification

Bibliographic Details
Published in:	IEEE Transactions on Geoscience and Remote Sensing, 2022, p. 1-1
Main authors:	Ma, Siteng; Hou, Biao; Guo, Xianpeng; Li, Zhihao; Wu, Zitong; Wang, Shuang; Jiao, Licheng
Format:	Article
Language:	English
Subjects:	clustering idea; Codes; fusion strategy; interpretability; multiple views; Predictive models; prototype assignment; Prototypes; Remote sensing; Self-supervised learning; Task analysis; Training

Description
Abstract:	The gap between self-supervised visual representation learning and supervised learning is gradually closing. Self-supervised learning does not rely on a large amount of labeled data and reduces the loss of human-labeled information. Compared with natural images, remote sensing images require rich samples and annotation by human experts. Moreover, many algorithms have poor interpretability and produce unconvincing results. Therefore, this paper proposes a self-supervised method based on prototype assignment. A pretext task is designed so that the network maps features to prototypes during learning, swaps the codes corresponding to the obtained features, combines them with the features from another data augmentation, and then optimizes the network. The prototypes are introduced to explain the clustering idea embodied in the whole process. Because remote sensing images are rich in scene information, we introduce multiple views with different resolutions to capture more detailed information from the images. Finally, if the data augmentation method is not powerful enough, the network can easily overfit, which prevents it from learning subtle differences and detailed information. To address this shortcoming, we propose a fusion strategy that flattens the decision boundary of the framework so that the model can also learn the soft similarity between sample pairs. We name the whole framework MFPC. In extensive experiments on three common remote sensing image datasets (UCMerced, AID, and NWPU45), MFPC achieves a maximum improvement of 4.3% over some existing self-supervised algorithms.
ISSN:	0196-2892
DOI:	10.1109/TGRS.2022.3216831
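
The pretext task summarized in the abstract (map features to prototypes, compute a code for each view, then swap the codes between two augmented views) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not say how the codes are computed, so the Sinkhorn-style balancing step, the temperature values, and all function and variable names below are assumptions made for illustration. The multi-resolution views and the fusion strategy are left out because the abstract gives no detail on them.

```python
import numpy as np


def l2_normalize(x, axis=1, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def log_softmax(x, axis=1):
    """Numerically stable log-softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=axis, keepdims=True))


def sinkhorn_codes(scores, n_iters=3, epsilon=0.05):
    """Turn prototype scores (batch x K) into soft codes (batch x K).

    Assumption: codes come from a few Sinkhorn-style normalization steps,
    which spread the batch roughly evenly over the prototypes.
    """
    q = np.exp((scores - scores.max()) / epsilon).T  # shape: K x batch
    q /= q.sum()
    n_protos, batch = q.shape
    for _ in range(n_iters):
        q /= q.sum(axis=1, keepdims=True)  # equal mass per prototype
        q /= n_protos
        q /= q.sum(axis=0, keepdims=True)  # equal mass per sample
        q /= batch
    return (q * batch).T  # each row sums to 1


def swapped_prediction_loss(z_a, z_b, prototypes, temperature=0.1):
    """Cross-entropy where the code of one view supervises the other view."""
    z_a, z_b = l2_normalize(z_a), l2_normalize(z_b)
    c = l2_normalize(prototypes)
    s_a, s_b = z_a @ c.T, z_b @ c.T  # feature-to-prototype scores
    q_a, q_b = sinkhorn_codes(s_a), sinkhorn_codes(s_b)
    # The swap: the code from view a is the target for view b, and vice versa.
    loss_ab = -(q_a * log_softmax(s_b / temperature)).sum(axis=1).mean()
    loss_ba = -(q_b * log_softmax(s_a / temperature)).sum(axis=1).mean()
    return 0.5 * (loss_ab + loss_ba)


# Hypothetical usage: random vectors stand in for encoder outputs of two views.
rng = np.random.default_rng(0)
z_a = rng.normal(size=(8, 16))               # features of view a
z_b = z_a + 0.1 * rng.normal(size=(8, 16))   # features of view b (perturbed)
prototypes = rng.normal(size=(4, 16))        # 4 prototype vectors
loss = swapped_prediction_loss(z_a, z_b, prototypes)
codes = sinkhorn_codes(l2_normalize(z_a) @ l2_normalize(prototypes).T)
```

In an actual training loop the loss would be backpropagated through the encoder and the prototypes, with the codes treated as fixed targets. The sketch only evaluates the forward computation.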