MDFOaNet: A Novel Multi-Modal Pedestrian Detection Network Based on Multi-Scale Image Dynamic Feature Optimization and Attention Mapping


Bibliographic Details
Published in: IEEE Transactions on Intelligent Transportation Systems, 2025-01, Vol. 26 (1), p. 268-282
Main authors: Hao, Shuai; Li, Jiahao; Sun, Xizi; Ma, Xu; An, Beiyi; He, Tian
Format: Article
Language: English
Description
Summary: To address the susceptibility of traditional pedestrian detection methods to random interference from the external environment and their insufficient use of pedestrian feature information, a novel multi-modal pedestrian detection network called MDFOaNet is proposed. The network consists of two key components: a marine predator-based multi-scale image fusion module and a pedestrian detection module with an enhanced target visual accuracy attention model. In the fusion module, a contrast-based layered enhancement method for infrared images and a sharpness-based enhancement method for visible images are proposed to address blurred pedestrian features. Moreover, to control the trade-off between fusion sub-layers, a dynamic image reconstruction model that relies on adaptive optimization based on marine predators is designed. In the pedestrian detection module, to focus on the main information in the image and ignore irrelevant information, an EVAM attention model is designed within the YOLOv5s detection framework, which improves the saliency of pedestrian targets and suppresses background interference. Experimental results show that, compared with nine typical algorithms, the proposed algorithm achieves accurate detection of multi-scale targets in complex environments and is significantly superior to the compared detection algorithms on both subjective and objective evaluation indicators. The mAP and recall of the proposed network reach 88.9% and 87.5%, respectively.
ISSN: 1524-9050, 1558-0016
DOI: 10.1109/TITS.2024.3483892