Optimizing Convolution Operations for YOLOv4-based Object Detection on GPU

Bibliographic details
Published in: ITM Web of Conferences 2024-01, Vol. 69, p. 4008
Main authors: Guerrouj, Fatima Zahra; Rodríguez Flórez, Sergio; El Ouardi, Abdelhafid; Abouzahir, Mohamed; Ramzi, Mustapha
Format: Article
Language: English
Description
Summary: Real-time object detection is crucial for autonomous vehicles, and YOLO (You Only Look Once) algorithms have demonstrated their effectiveness for this purpose. This study examines the performance of YOLOv4 [3] for real-time object detection on an embedded architecture. We focus on optimizing the computationally intensive convolution operations by employing the cuDNN library to achieve efficient inference. The evaluation assesses critical performance metrics, including object detection accuracy in terms of Mean Average Precision (mAP) and inference latency on the embedded architecture. We conduct a comparative analysis using the publicly available KITTI [7] database. The reported results establish a benchmark between the parallelized YOLOv4 model and the baseline implementation, assessing the advantages of cuDNN acceleration for real-time object detection on resource-constrained devices.
ISSN: 2271-2097, 2431-7578
DOI: 10.1051/itmconf/20246904008