SCA-YOLO: a new small object detection model for UAV images

Bibliographic details
Published in: The Visual Computer, 2024-03, Vol. 40 (3), pp. 1787-1803
Main authors: Zeng, Shuang; Yang, Wenzhu; Jiao, Yanyan; Geng, Lei; Chen, Xinting
Format: Article
Language: English
Description
Abstract: Object detection in UAV (unmanned aerial vehicle) images is a crucial and challenging computer vision task. It is complicated by small, densely packed objects that occupy few pixels and whose features are difficult to extract. In this paper, we propose SCA-YOLO (spatial and coordinate attention enhancement YOLO), a multilayer feature fusion algorithm with hybrid attention mechanisms for small object detection. It uses the single-stage detector YOLOv5 as its base framework. First, a hybrid attention module incorporating coordinate attention is designed to enhance feature extraction for small objects. Second, because small objects are easily obscured by the complex backgrounds in UAV images, an improved SEB (simple and efficient bottleneck) module is designed to better separate foreground and background features. Third, a multilayer feature fusion structure concatenates shallow and deep feature maps along the channel dimension and enriches the semantic information of shallow features through horizontal skip connections. Experiments are conducted on the VisDrone2020 dataset, which contains a large number of small objects photographed by drones, with extended experiments on the DOTA and PASCAL VOC datasets. Comparative results indicate that the proposed method considerably improves small object detection accuracy on multiple benchmark datasets.
ISSN: 0178-2789; 1432-2315
DOI: 10.1007/s00371-023-02886-y
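
Illustrative note: the abstract describes fusing shallow and deep feature maps by concatenating them along the channel dimension. The following is a minimal NumPy sketch of that general idea only, not the authors' implementation. The function names, the (C, H, W) layout, the nearest-neighbour upsampling, and the tensor sizes are all assumptions chosen for illustration; the attention modules, the SEB module, and the skip connections of SCA-YOLO are not shown.

```python
import numpy as np


def upsample_nearest(x, factor):
    """Nearest-neighbour upsampling of a (C, H, W) feature map (hypothetical helper)."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)


def fuse_shallow_deep(shallow, deep):
    """Concatenate a shallow map with an upsampled deep map along the channel axis.

    Assumes the shallow map's height and width are the same integer
    multiple of the deep map's height and width.
    """
    _, h_s, w_s = shallow.shape
    _, h_d, w_d = deep.shape
    if h_s % h_d or w_s % w_d or h_s // h_d != w_s // w_d:
        raise ValueError("shallow size must be an integer multiple of deep size")
    deep_up = upsample_nearest(deep, h_s // h_d)
    # Channel concatenation: output has C_shallow + C_deep channels.
    return np.concatenate([shallow, deep_up], axis=0)


# Assumed example sizes: a high-resolution shallow map and a coarser deep map.
rng = np.random.default_rng(0)
shallow = rng.standard_normal((64, 80, 80))
deep = rng.standard_normal((128, 40, 40))
fused = fuse_shallow_deep(shallow, deep)
print(fused.shape)  # (192, 80, 80)
```

Concatenation keeps the fine spatial detail of the shallow map alongside the richer semantics of the deep map, which is why approaches of this kind are commonly applied to small objects; any layers that follow the concatenation are left out of this sketch.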