DPH-YOLOv8: Improved YOLOv8 Based on Double Prediction Heads for the UAV Image Object Detection

Bibliographic details
Published in: IEEE Transactions on Geoscience and Remote Sensing, 2024, Vol. 62, p. 1-15
Main authors: Wang, Jian; Li, Xinqi; Chen, Jiafu; Zhou, Lihui; Guo, Linyang; He, Zihao; Zhou, Hao; Zhang, Zechen
Format: Article
Language: English
Description
Summary: Object detection on unmanned aerial vehicle (UAV) images has recently been an active research topic. However, object detection models designed for general scenarios struggle with UAV images because of two challenges: detecting small targets and handling complex image backgrounds. To address these two issues, we proposed DPH-YOLOv8, an enhanced version of YOLOv8 tailored for UAV scenarios. First, we reduced the number of prediction heads from three to two (double prediction heads, DPHs), cutting the model's parameters by 42.9% while improving the mAP for small targets. Second, we designed a tiny-path feature fusion (TP-Fusion) module to fuse richer detailed information, enabling the detector to accurately match targets of different sizes and shapes. Third, we introduced a coordinate attention (CA) module to help the model reduce the interference of complex background information and focus on detecting foreground targets. Finally, bottleneck modules were added before the prediction heads to enhance the extraction of small-target features. Extensive experimental results on both the VisDrone2021 and UAVDT benchmarks demonstrated that DPH-YOLOv8 not only improved the mAP by 4.5% on VisDrone2021 and 2.2% on UAVDT but also reduced localization error by 0.53%, confusion with objects by 0.69%, and confusion with background by 0.45%. These enhancements, along with the reduction in the model's parameters, make DPH-YOLOv8 more suitable for UAV scenarios.
ISSN: 0196-2892; 1558-0644
DOI: 10.1109/TGRS.2024.3487191