Cross-domain learning using optimized pseudo labels: toward adaptive car detection in different weather conditions and urban cities

Bibliographic details
Published in: Neural computing & applications 2022-03, Vol.34 (6), p.4519-4529
Main authors: Wang, Ke; Zhang, Lianhua; Xia, Qin; Pu, Liang; Chen, Junlan
Format: Article
Language: English
Description
Abstract: Object detection based on convolutional neural networks usually assumes that training and test data share the same distribution, an assumption that does not always hold in real-world applications. For autonomous vehicles, the driving scene (target domain) consists of unconstrained road environments that cannot all be observed in the training data (source domain), which leads to a sharp drop in detector accuracy. In this paper, we propose a domain adaptation framework based on pseudo-labels to address this domain shift. First, pseudo-labels for the target domain images are generated by the baseline detector (BD) and corrected by our data optimization module. Then, the hard samples in each image are labeled based on the optimized pseudo-labels. An adaptive sampling module samples target domain data according to the number of hard samples per image, so that more effective data are selected. Finally, a modified knowledge distillation loss is applied in the retraining module, and we investigate two ways of assigning soft labels to the target domain training examples when retraining the detector. We evaluate the average precision of our approach on various source/target domain pairs and demonstrate that the framework improves the average precision of the BD by more than 10% in multiple domain adaptation scenarios on the Cityscapes, KITTI, and Apollo datasets.
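The sampling and soft-label steps described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors' method: the abstract does not say how hard samples are identified, how sampling weights are computed, or how the knowledge distillation loss is modified. The sketch treats a pseudo-labeled detection as "hard" when its confidence falls in a hypothetical uncertain band, weights each image by one plus its hard-sample count, and uses a plain binary cross-entropy against a soft target as a stand-in for the loss.

```python
import math
import random

# Hypothetical confidence band for "hard" detections; the abstract gives no values.
LOW, HIGH = 0.3, 0.7


def count_hard_samples(scores, low=LOW, high=HIGH):
    """Count detections whose confidence lies in the uncertain band [low, high)."""
    return sum(1 for s in scores if low <= s < high)


def sampling_weights(images):
    """Weight each image by 1 + its hard-sample count, normalized to sum to 1.

    This weighting rule is an assumption chosen for illustration.
    """
    raw = {name: 1 + count_hard_samples(scores) for name, scores in images.items()}
    total = sum(raw.values())
    return {name: w / total for name, w in raw.items()}


def sample_images(images, k, seed=0):
    """Draw k image names with replacement, favoring images with more hard samples."""
    rng = random.Random(seed)
    weights = sampling_weights(images)
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=k)


def soft_label_bce(pred, soft_target, eps=1e-7):
    """Binary cross-entropy against a soft target in [0, 1].

    A stand-in for the paper's modified distillation loss, which is not specified.
    """
    p = min(max(pred, eps), 1 - eps)
    return -(soft_target * math.log(p) + (1 - soft_target) * math.log(1 - p))


# Made-up pseudo-label confidences per target domain image (illustrative only).
pseudo_labels = {
    "img_a": [0.95, 0.91],
    "img_b": [0.55, 0.40, 0.88],
    "img_c": [0.35, 0.62, 0.50, 0.97],
}

weights = sampling_weights(pseudo_labels)
batch = sample_images(pseudo_labels, k=4)
```

With these made-up scores, the raw weights are 1, 3, and 4, so img_c is the most likely image to be drawn for retraining. A real pipeline would replace the confidence lists with detector outputs and the scalar loss with a per-box loss in the training framework.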
ISSN: 0941-0643, 1433-3058
DOI: 10.1007/s00521-021-06609-z