DM-YOLOX aerial object detection method with intensive attention mechanism

Bibliographic details
Published in: The Journal of Supercomputing 2024, Vol. 80 (9), p. 12790-12812
Main authors: Li, Xiangyu; Wang, Fengping; Wang, Wei; Han, Yanjiang; Zhang, Jianyang
Format: Article
Language: English
Description
Summary: In aerial image detection, background interference, occlusion, and the presence of many small objects make feature extraction difficult and lower detection accuracy. This paper proposes DM-YOLOX, an aerial object detection method with an intensive attention mechanism. First, the approach incorporates coordinate attention (CA) and dense connections into the backbone network, enabling adaptive channel weighting throughout feature extraction. This enhances significant features while suppressing less relevant ones, which strengthens the network’s capacity to represent object features and ensures that key features are retained and reinforced. Second, a multibranch extraction (MBE) module is incorporated into the feature fusion network to improve its ability to extract multi-scale feature information from images with extensive coverage, thereby improving detection accuracy and efficiency for small- and medium-sized objects in complex scenes. Finally, using SIoU instead of IoU as the bounding box loss function addresses the mismatch between ground-truth and predicted boxes, which accelerates network convergence and improves performance during model training. Trained and tested on the VisDrone 2019 dataset, the method effectively detects small objects in complex environments. Compared to the baseline network, the DM-YOLOX model improves mAP by 2.7% and increases frames per second (FPS) by 8%.
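The summary contrasts IoU with SIoU as the bounding box loss but gives no formulas. As a rough illustration of the baseline only, the following minimal sketch computes plain IoU and the corresponding loss for axis-aligned boxes. It is not the paper's SIoU loss; the names `Box`, `iou`, and `iou_loss` and the `(x1, y1, x2, y2)` box layout are assumptions made for this example.

```python
from typing import Tuple

# Assumed box layout: (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Corners of the overlap rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def iou_loss(pred: Box, target: Box) -> float:
    """Plain IoU loss: 0 for a perfect match, 1 for no overlap."""
    return 1.0 - iou(pred, target)


if __name__ == "__main__":
    target = (0.0, 0.0, 1.0, 1.0)
    near_miss = (2.0, 2.0, 3.0, 3.0)
    far_miss = (50.0, 50.0, 51.0, 51.0)
    # Both predictions miss the target, so plain IoU loss rates them equally.
    print(iou_loss(near_miss, target), iou_loss(far_miss, target))
```

With plain IoU, any two non-overlapping predictions receive the same loss no matter how far they are from the target, so the loss carries no signal about which one is closer. Losses in the IoU family that add distance, angle, or shape terms are designed to address this kind of mismatch, which is presumably the motivation for the substitution described in the summary.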
ISSN: 0920-8542, 1573-0484
DOI: 10.1007/s11227-024-05944-x