From Less to More: Progressive Generalized Zero-Shot Detection With Curriculum Learning

Bibliographic details
Published in: IEEE Transactions on Intelligent Transportation Systems, 2022-10, Vol. 23 (10), p. 19016-19029
Main authors: Liu, Jingren; Chen, Yi; Liu, Huajun; Zhang, Haofeng; Zhang, Yudong
Format: Article
Language: English
Description
Summary: Object detection, one of the most important environment perception tasks for traffic safety in intelligent transportation systems, has been widely investigated recently. However, most research focuses on the fully supervised scenario, which inevitably leads to model failure on unseen objects. With the continuous development of Zero-Shot Learning (ZSL) models, Generalized Zero-Shot Detection (GZSD) has attracted great attention due to its ability to detect unseen objects. Many researchers map the detected visual features to semantic attributes and then separate seen and unseen domains during inference. However, they have ignored that generative methods generally achieve higher performance than these visual-semantic mapping methods, as previous GZSL methods have confirmed. To fill the gap in generative methods for GZSD, we propose using curriculum learning to generate more precise unseen visual features. Building on the excellent performance of WGAN-based methods in sample synthesis, we use semantics to generate visual features for unseen domains. In addition, we adopt part of the idea of meta-learning to progressively correct the generator and better mitigate the domain shift problem during generation. With these ideas, combined with the excellent detection ability of Faster-RCNN, we can detect both seen and unseen bounding boxes and classify them accurately. Extensive experimental results on two popular datasets, MSCOCO and KITTI, show that our proposed method outperforms the state-of-the-art methods.
ISSN: 1524-9050, 1558-0016
DOI: 10.1109/TITS.2022.3151073