Multi-task feature-aligned head in one-stage object detection

Bibliographic details
Published in: Signal, image and video processing, 2023-06, Vol.17 (4), p.1345-1353
Main authors: Liu, Zeting, Shao, Mingwen, Sun, Yuantao, Peng, Zilu
Format: Article
Language: eng
Online access: Full text
Description
Summary: Existing one-stage detectors usually use two decoupled branches to optimize two subtasks, i.e., object localization and classification. However, this design paradigm leads to misalignment of spatial features because localization and classification are inconsistent. To mitigate this problem, we propose a simple plug-in AF-Head (Aligned Features) that can generate aligned features for each task. Our proposed AF-Head contains a Focus-Guided Feature Enhancement Module (FGM) and an Auxiliary Positioning Module (APM). Specifically, in our FGM, we propose a focus branch that jointly represents localization confidence and classification scores. Then, we combine the focus and classification branches to alleviate the gap between training and inference. In addition, APM generates more accurate offsets for the localization branch to align it with the classification branch. Moreover, we propose AF-Net based on the AF-Head. Extensive experiments on MS-COCO demonstrate that our AF-Head improves different state-of-the-art one-stage detectors by 0.7 to 1.7 AP. Notably, AF-Net with a standard ResNeXt-101-32x4d-DCN backbone achieves 49.2 AP on COCO test-dev.
ISSN: 1863-1703, 1863-1711
DOI: 10.1007/s11760-022-02342-9