Progressive Critical Region Transfer for Cross-Domain Visual Object Detection

Bibliographic details
Published in: IEEE Transactions on Intelligent Transportation Systems, 2024-08, Vol. 25 (8), pp. 9427-9441
Main authors: Wang, Xiaowei; Jiang, Peiwen; Li, Yang; Hu, Manjiang; Gao, Ming; Cao, Dongpu; Ding, Rongjun
Format: Article
Language: eng
Description
Summary: Well-trained visual object detectors are generally confronted with a severe performance decline when deployed in a novel driving scenario due to the impact of domain shift. Despite excellent improvements in unsupervised domain adaptive object detection achieved by adversarial training, those approaches fail to capture the transfer core underlying the holistic scenes. To solve this problem, we propose a progressive critical region transfer framework for cross-domain visual object detection. Specifically, we exploit a potential foreground mining (PFM) module and a semantic-specific RoI aggregation (SRA) module to improve the robustness of the cross-domain detection framework. Upon the critical regions in the broad sense, the PFM module first highlights the foreground regions by reweighting the hierarchical feature maps in sequence, and then modifies location biases at the downstream position of the backbone network for more accurate upstream predictions. Deep into the critical regions in the narrow sense, the SRA module concentrates on establishing an appropriate matching between batch-wise RoIs and all semantic centers, and further strengthens the aggregation of cross-domain identical semantic with the complement of context references. Together these modules are obligated to transform the adaptation importance from the whole scope to the latent foreground areas, and afterward to the informative regions of interest along the detection pipeline. Experiments show that our progressive critical region transfer framework achieves a state-of-the-art performance in adverse weather, camera configuration, and complicated scene adaptation, which outperforms the baselines by 19.4%, 5.0%, and 6.1%, respectively.
ISSN: 1524-9050, 1558-0016
DOI: 10.1109/TITS.2024.3382841
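
Illustrative note: the summary describes the SRA module as matching batch-wise RoIs to semantic centers and then aggregating features that share the same semantics. The record gives no implementation details, so the following is only a minimal sketch of that general idea, not the authors' method. The function names (`match_rois_to_centers`, `update_centers`), the use of cosine similarity for matching, and the moving-average update with a `momentum` parameter are all assumptions introduced here for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def match_rois_to_centers(roi_feats, centers):
    """Assign each RoI feature to the index of its most similar center.

    Assumption: similarity is cosine similarity; the source does not say
    which matching criterion the SRA module uses.
    """
    return [
        max(range(len(centers)), key=lambda k: cosine(feat, centers[k]))
        for feat in roi_feats
    ]


def update_centers(roi_feats, assignments, centers, momentum=0.9):
    """Pull each center toward the mean of the RoI features matched to it.

    Assumption: an exponential moving average is used; centers that receive
    no RoIs in this batch are left unchanged.
    """
    updated = []
    for k, center in enumerate(centers):
        members = [f for f, a in zip(roi_feats, assignments) if a == k]
        if not members:
            updated.append(list(center))
            continue
        mean = [sum(col) / len(members) for col in zip(*members)]
        updated.append(
            [momentum * c + (1.0 - momentum) * m for c, m in zip(center, mean)]
        )
    return updated


# Toy batch: three 2-D RoI features and two semantic centers (made-up values).
centers = [[1.0, 0.0], [0.0, 1.0]]
rois = [[0.9, 0.1], [0.2, 0.8], [1.0, 0.3]]
assignments = match_rois_to_centers(rois, centers)
new_centers = update_centers(rois, assignments, centers, momentum=0.5)
```

In this toy batch the first and third RoIs are matched to center 0 and the second to center 1. In a cross-domain setting, source and target RoIs would presumably be matched against a shared set of centers, but how the paper handles that is not stated in this record.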