Unified building change detection pre-training method with masked semantic annotations
Published in: | International journal of applied earth observation and geoinformation 2023-06, Vol.120, p.103346, Article 103346 |
---|---|
Main authors: | , , , , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | Building change detection (CD) from remote sensing images plays a vital role in urban development, and deep learning models have attracted attention for their potential to perform CD tasks automatically. However, most methods still face challenges, such as the costly and time-consuming construction of CD datasets and the severe imbalance between positive and negative samples, which prevents loss functions from working as intended during training. Inspired by the excellent performance of weak supervision on these problems, a unified change detection pre-training paradigm is proposed to accomplish the CD task with a small number of samples and to improve the inference accuracy of building change detection. The key points of the proposed method are as follows. First, the pre-training paradigm detects building changes using pseudo-labels generated from widely available semantic segmentation datasets. Second, a balanced sample distribution is ensured by semantically masking building areas and controlling the proportion of those areas. Third, multi-task networks for simultaneous building extraction and change detection are used in the unified paradigm, because semantic information can serve as an effective supervision signal that assists CD training and counters the adverse effect of pseudo-labels on the algorithm's ability to converge. Experiments were performed on three challenging datasets. For the aerial WHU-CD and satellite Gaofen Challenge-CD datasets, our pre-trained weights, generated with pseudo-bitemporal samples, were fine-tuned on subsets containing different proportions of ground truth.
Notably, fine-tuning with our pre-trained weights on 10% of the ground truth achieved an intersection over union (IoU) comparable to that obtained with 100% of the CD ground truth without our pre-trained weights, and fine-tuning on 30% of the ground truth with our pre-trained weights achieved an even higher IoU. The experiments demonstrated that, with a small number of samples, our method outperformed conventional supervised learning methods. With 100% of the ground truth, our pre-trained weights yielded an IoU that substantially exceeds that of conventional supervised learning methods. The results of experiments conducted on the LEVIR-CD dat |
---|---|
ISSN: | 1569-8432 1872-826X |
DOI: | 10.1016/j.jag.2023.103346 |
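
The abstract describes building pseudo-bitemporal samples by masking building areas in a semantic segmentation dataset and controlling the proportion of masked areas. The following is a minimal sketch of that idea, not the authors' implementation. The function name `make_pseudo_pair`, the instance-id input format, the `target_ratio` parameter, and the mean-background fill are all assumptions made for illustration.

```python
import numpy as np


def make_pseudo_pair(image, instances, target_ratio=0.5, rng=None):
    """Build one pseudo-bitemporal sample from a single labelled image.

    image:     (H, W, C) array.
    instances: (H, W) int array, 0 = background, k > 0 = id of a building
               (assumed input format, not taken from the paper).
    Returns (earlier image, later image, change label, building label).
    """
    rng = np.random.default_rng() if rng is None else rng
    building = instances > 0
    ids = np.unique(instances[building])
    rng.shuffle(ids)  # visit buildings in random order

    total = building.sum()
    change = np.zeros_like(building)
    # Mask whole buildings until the masked share of building pixels
    # reaches target_ratio (hypothetical balancing rule).
    for k in ids:
        if change.sum() / total >= target_ratio:
            break
        change |= instances == k

    # Simulate "no building" by filling masked pixels with the mean
    # background colour (placeholder fill, chosen for simplicity).
    later = image.copy()
    if change.any():
        background = image[~building]
        if background.size:
            fill = background.mean(axis=0)
        else:
            fill = image.mean(axis=(0, 1))
        later[change] = fill.astype(image.dtype)

    return image, later, change.astype(np.uint8), building.astype(np.uint8)
```

The returned change label serves as the pseudo-label for the CD branch, and the building label could supervise a building extraction branch in a multi-task setup. Masking whole buildings instead of random pixels keeps the simulated changes object-shaped.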