Multilevel Unsupervised Domain Adaptation for Single-Stage Object Detection in Remote Sensing Images

Bibliographic details
Published in: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2024, Vol. 17, p. 19420-19435
Main authors: Luo, Sihao; Ma, Li; Yang, Xiaoquan; Du, Qian
Format: Article
Language: English
Description
Abstract: We propose a novel multilevel unsupervised domain adaptive framework for single-stage object detection in remote sensing images. Our framework combines pixel-level adaptation with feature-level adaptation in a progressive learning scheme. Pixel-level adaptation usually suffers from the imperfect translation problem with respect to local region deformation. To address this problem, we introduce a semantically important region-attentive pixel-level domain adaptation based on a CycleGAN-like translation design, which incorporates an attention module and a learnable normalization function to facilitate shape transformation and image style transfer across domains. Moreover, to adapt a single-stage detector while removing the need for explicit local features, we introduce an attention-guided multiscale feature-level domain adaptation, which employs multiple domain discriminators at different scales to perform multiscale feature alignment for objects of different sizes. This alignment process is guided from global to local by a self-attention mechanism that allows the model to gradually recognize local regions. Experimental results on several remote sensing datasets demonstrate the validity of the proposed framework. Compared with the baseline detector trained on the source dataset, our approach consistently improves detection performance on the target dataset by 9.1%-16.0% mAP and achieves state-of-the-art results across various datasets.
ISSN: 1939-1404, 2151-1535
DOI: 10.1109/JSTARS.2024.3479224
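
Illustrative sketch: The abstract mentions multiple domain discriminators at different feature scales, with alignment guided by an attention mechanism. The record does not give the loss formulation, so the following is only a minimal sketch of one plausible reading, not the authors' method. It assumes each scale's discriminator outputs a per-location probability that the feature comes from the target domain, and that an attention map supplies non-negative per-location weights. The function names, the binary cross-entropy objective, and the plain averaging over scales are all assumptions made for illustration.

```python
import math


def attention_weighted_domain_loss(disc_probs, attention, domain_label):
    """Attention-weighted binary cross-entropy for one feature scale.

    Hypothetical helper (not from the paper):
      disc_probs   -- per-location discriminator outputs, P(target domain)
      attention    -- per-location non-negative weights
      domain_label -- 0 for a source image, 1 for a target image
    """
    eps = 1e-7  # clip probabilities to keep log() finite
    total_weight = sum(attention)
    if total_weight <= 0:
        raise ValueError("attention weights must sum to a positive value")
    loss = 0.0
    for p, a in zip(disc_probs, attention):
        p = min(max(p, eps), 1.0 - eps)
        bce = -(domain_label * math.log(p)
                + (1 - domain_label) * math.log(1.0 - p))
        loss += a * bce
    # Normalise so the result is a weighted mean over locations.
    return loss / total_weight


def multiscale_domain_loss(per_scale_probs, per_scale_attention, domain_label):
    """Average the attention-weighted loss over all scales (assumed equal weighting)."""
    losses = [
        attention_weighted_domain_loss(p, a, domain_label)
        for p, a in zip(per_scale_probs, per_scale_attention)
    ]
    return sum(losses) / len(losses)


# Two scales with flattened spatial maps: a coarse one (2 locations)
# and a coarser one (1 location), for a target-domain image.
probs = [[0.9, 0.1], [0.5]]
attn = [[1.0, 0.0], [1.0]]
loss = multiscale_domain_loss(probs, attn, domain_label=1)
```

In this toy setup the attention map at the first scale places all weight on one location, so only that location contributes to the alignment loss there. In an actual adversarial training loop the discriminators would minimise such a loss while the detector backbone is trained to increase it, typically through a gradient reversal layer; that part is omitted here.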