Adaptive feedback connection with a single‐level feature for object detection

Bibliographic details
Published in: IET Computer Vision 2022-12, Vol.16 (8), p.736-746
Main authors: Ruan, Zhongling; Cao, Jianzhong; Wang, Hao; Guo, Huinan; Yang, Xin
Format: Article
Language: English
Description
Summary: From the perspective of detector optimisation, detecting objects using only a one‐level feature cannot provide good performance across a wide range of scales. Various complex feature pyramid structures address this problem using a divide‐and‐conquer strategy and multi‐scale feature fusion. However, this requires many additional convolutional layers and fusion operations. To address the issue, a simple detection part is proposed, which includes three components: a one‐level feature map for detection, an encoder structure with a feedback connection, and a decoupled head. The redesigned encoder and decoupled head successfully address the performance decline caused by one‐level feature‐based detection. Moreover, the proposed method accelerates the convergence of the detector and achieves a faster inference time. Based on the optimised detection part, an adaptive feedback connection with a single‐level feature (AFS) is proposed for object detection. Experiments conducted on the MS COCO 2017 benchmark show that the proposed method achieves results comparable to those of its multi‐scale pyramid counterpart, You Only Look Once v4 (YOLOv4). In addition, AFS helps YOLOv4 achieve 44.9 mAP at 27 frames per second and converge 82 epochs earlier at an image size of 608×608, which represents a 42.1% improvement in convergence speed.
ISSN: 1751-9632, 1751-9640
DOI: 10.1049/cvi2.12121
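
The convergence figures in the summary can be sanity-checked with a little arithmetic. The record does not state the baseline epoch count, so the sketch below assumes (and this is only an assumption) that the 42.1% figure means "epochs saved divided by baseline epochs", rounded to one decimal place. The function name `implied_baseline_epochs` and the search limit of 1000 epochs are hypothetical choices made for illustration.

```python
def implied_baseline_epochs(epochs_saved: int, reported_pct: float) -> list[int]:
    """Find integer baseline epoch counts consistent with a reported saving.

    Assumption (not stated in the record): reported_pct is
    100 * epochs_saved / baseline, rounded to one decimal place.
    """
    candidates = []
    # The baseline must exceed the epochs saved; 1000 is an arbitrary upper limit.
    for baseline in range(epochs_saved + 1, 1000):
        pct = round(100 * epochs_saved / baseline, 1)
        if abs(pct - reported_pct) < 1e-9:
            candidates.append(baseline)
    return candidates


# Figures taken from the summary: 82 epochs earlier, 42.1% improvement.
baselines = implied_baseline_epochs(82, 42.1)
print(baselines)
```

Under that reading, any baseline returned here, minus 82, gives the implied number of epochs needed with AFS. If the paper defines convergence speed differently, this check does not apply.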