Deep learning-enhanced environment perception for autonomous driving: MDNet with CSP-DarkNet53

Bibliographic Details
Published in: Pattern Recognition, 2025-04, Vol. 160, p. 111174, Article 111174
Authors: Guo, Xuyao; Jiang, Feng; Chen, Quanzhen; Wang, Yuxuan; Sha, Kaiyue; Chen, Jing
Format: Article
Language: English
Online access: Full text
Description
Summary: Implementing environmental perception in intelligent vehicles is a crucial application, but parallel processing of numerous algorithms on the vehicle side is complex, and their integration remains a critical challenge. To address this problem, this paper proposes the Multitask Detection Network (MDNet), a multitask detection algorithm built on Cross Stage Partial Networks with a Darknet53 Backbone (CSP-DarkNet53), a backbone with high feature-extraction capability. MDNet simultaneously detects vehicles, pedestrians, traffic lights, traffic signs, and bicycles, as well as lane lines. MDNet achieves strong results in multitask scenarios through an architecture consisting of a Feature Extraction Module, Target-level Branches, and Pixel-level Branches. The Feature Extraction Module introduces an improved CSPPF structure that extracts features more efficiently for all three tasks, strengthening MDNet's capacity. The target-level branch introduces PFPN, which fuses features from the backbone network, and the pixel-level branch uses a primary feature fusion network and an enhanced C2F_Faster method to detect lane lines more precisely. Together, these designs significantly improve MDNet's performance in complex environments. The algorithm was tested on the Berkeley DeepDrive 100K (BDD100K) and Cityscapes datasets, where it identified traffic targets and lane lines in numerous challenging settings, achieving a 9.8% improvement in detection mAP across all three tasks relative to You Only Look Once for Panoptic Driving Perception (YOLOP, a multitask detection network), an 8.9% improvement in IoU, and a 22.1% improvement in accuracy. It runs at 46 FPS, which better meets the requirements of practical applications.
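The shared-backbone, multi-branch layout described in the abstract can be sketched as follows. This is a minimal structural illustration only: the module names (CSPPF, PFPN, C2F_Faster) come from the abstract, but their internals below are toy placeholders, not the paper's actual convolutional implementations.

```python
# Structural sketch of a shared-backbone multitask detector in the spirit
# of MDNet. One backbone pass feeds both a target-level (detection) branch
# and a pixel-level (lane-line segmentation) branch -- the core idea behind
# running several perception tasks in parallel without duplicated compute.
# All "feature maps" here are illustrative strings, not real tensors.

def feature_extraction_module(image):
    """Shared backbone (CSP-DarkNet53 with the improved CSPPF in the paper).

    Returns multi-scale features consumed by both branch types.
    """
    # Placeholder: pretend we produce feature maps at strides 8/16/32.
    return {"P3": f"feat8({image})",
            "P4": f"feat16({image})",
            "P5": f"feat32({image})"}

def target_level_branch(features):
    """Detection branch (PFPN in the paper): fuses backbone features and
    predicts boxes for vehicles, pedestrians, lights, signs, and bicycles."""
    fused = "+".join(features[k] for k in ("P3", "P4", "P5"))
    return {"boxes": f"detect({fused})"}

def pixel_level_branch(features):
    """Segmentation branch (feature fusion + C2F_Faster in the paper):
    predicts a per-pixel lane-line mask from the finest feature map."""
    return {"lane_mask": f"segment({features['P3']})"}

def mdnet_forward(image):
    # Single shared backbone pass; both branches reuse the same features.
    feats = feature_extraction_module(image)
    return {**target_level_branch(feats), **pixel_level_branch(feats)}

outputs = mdnet_forward("frame_0")
print(sorted(outputs))  # ['boxes', 'lane_mask']
```

The design point the sketch illustrates is that integration cost stays low: adding a task means adding a branch over the existing shared features, rather than running another full network on the vehicle side.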
ISSN: 0031-3203
DOI: 10.1016/j.patcog.2024.111174