Cut-in maneuver detection with self-supervised contrastive video representation learning

The detection of the maneuvers of the surrounding vehicles is important for autonomous vehicles to act accordingly to avoid possible accidents. This study proposes a framework based on contrastive representation learning to detect potentially dangerous cut-in maneuvers that can happen in front of th...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Veröffentlicht in:	Signal, image and video processing image and video processing, 2023-09, Vol.17 (6), p.2915-2923
Hauptverfasser:	Nalcakan, Yagiz, Bastanlar, Yalin
Format:	Artikel
Sprache:	eng
Schlagworte:	Coders Computer Imaging Computer Science Datasets Image Processing and Computer Vision Learning Maneuvers Multimedia Information Systems Original Paper Pattern Recognition and Graphics Representations Signal,Image and Speech Processing Vehicles Video Vision
Online-Zugang:	Volltext
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	The detection of the maneuvers of the surrounding vehicles is important for autonomous vehicles to act accordingly to avoid possible accidents. This study proposes a framework based on contrastive representation learning to detect potentially dangerous cut-in maneuvers that can happen in front of the ego vehicle. First, the encoder network is trained in a self-supervised fashion with contrastive loss where two augmented videos of the same video clip stay close to each other in the embedding space, while augmentations from different videos stay far apart. Since no maneuver labeling is required in this step, a relatively large dataset can be used. After this self-supervised training, the encoder is fine-tuned with our cut-in/lane-pass labeled datasets. Instead of using original video frames, we simplified the scene by highlighting surrounding vehicles and ego-lane. We have investigated the use of several classification heads, augmentation types, and scene simplification alternatives. The most successful model outperforms the best fully supervised model by ∼ 2% with an accuracy of 92.52%.
ISSN:	1863-1703 1863-1711
DOI:	10.1007/s11760-023-02512-3