MISGNet: A Multilevel Intertemporal Semantic Guidance Network for Remote Sensing Images Change Detection

Bibliographic details
Published in: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2024-11, pp. 1-14
Main authors: Cui, Binge; Liu, Chenglong; Li, Haojie; Yu, Jianzhi
Format: Article
Language: English
Online access: Full text
Description
Abstract: The precise identification of semantic changes in remote sensing images is of great significance for urban planning and disaster assessment. Nevertheless, current change detection models are inadequate at modeling semantic interactions between pairs of temporal images, which leads to poor identification of identical semantic targets that exhibit distinct features. In this paper, we propose a Multilevel Intertemporal Semantic Guidance Network (MISGNet) that effectively derives representations of semantic changes. Multilevel features of the bitemporal images are first extracted with a transformer feature extractor. These features are then bidirectionally enhanced by a Semantic Guidance Module (SGM) to obtain more comprehensive semantic representations. Specifically, to obtain object-level semantic representations, the land cover objects in the multilevel features of the bitemporal images are soft clustered and mapped to a graph space, with each vertex representing an object. Bidirectional semantic enhancement is then achieved through intertemporal non-local operations, which strengthen the semantic representations of the bitemporal images. Moreover, a Multilevel Difference Aggregation Module (MDAM), built on pixel-wise addition and pixel-wise multiplication, summarizes differences across feature levels and highlights semantic changes. Extensive experiments on three publicly available datasets show that our model outperforms alternative methods across various evaluation metrics.
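The abstract's pipeline (soft clustering of pixel features into graph vertices, intertemporal non-local enhancement, and difference aggregation via pixel-wise addition and multiplication) can be sketched in NumPy. This is a minimal illustration under stated assumptions, not the paper's implementation: the projection matrix, attention form, and the exact add/multiply combination in `difference_aggregation` are hypothetical stand-ins chosen to match the abstract's description.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def soft_cluster(feats, proj):
    """Soft-assign N pixel features (N, C) to K graph vertices.
    proj (C, K) stands in for a learned projection; returns the
    assignment (N, K, rows sum to 1) and vertex features (K, C)."""
    assign = softmax(feats @ proj, axis=-1)
    vertices = assign.T @ feats          # each vertex = weighted sum of pixels
    return assign, vertices

def intertemporal_nonlocal(feats, other_vertices):
    """Enhance one image's pixel features with the other image's graph
    vertices via scaled dot-product attention plus a residual connection
    (one plausible reading of the SGM's non-local operation)."""
    c = feats.shape[1]
    attn = softmax(feats @ other_vertices.T / np.sqrt(c), axis=-1)
    return feats + attn @ other_vertices

def difference_aggregation(f1, f2):
    """Combine pixel-wise addition and multiplication to highlight changed
    regions (a hypothetical instance of the MDAM's add/multiply fusion)."""
    return (f1 + f2) * np.abs(f1 - f2)

rng = np.random.default_rng(0)
N, C, K = 64, 32, 8                      # pixels, channels, graph vertices
x1 = rng.standard_normal((N, C))         # features from image at time 1
x2 = rng.standard_normal((N, C))         # features from image at time 2
proj = rng.standard_normal((C, K))       # shared (assumed) projection

a1, v1 = soft_cluster(x1, proj)
a2, v2 = soft_cluster(x2, proj)
x1_enh = intertemporal_nonlocal(x1, v2)  # image 1 guided by image 2's objects
x2_enh = intertemporal_nonlocal(x2, v1)  # and vice versa (bidirectional)
diff = difference_aggregation(x1_enh, x2_enh)
```

A change head would then classify `diff` per pixel; repeating this at each feature level gives the "multilevel" aggregation the abstract describes.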
ISSN: 1939-1404
DOI: 10.1109/JSTARS.2024.3508692