AFGN: Attention Feature Guided Network for object detection in optical remote sensing image


Bibliographic details
Published in: Neurocomputing (Amsterdam), 2024-12, Vol. 610, p. 128527, Article 128527
Authors: Zhang, Ruiqing; Lei, Yinjie
Format: Article
Language: English
Online access: Full text
Description
Abstract: Object detection in optical remote sensing (RS) images is crucial for both military and civilian applications. However, a major challenge in RS object detection lies in the complexity of texture details within the images, which makes it difficult to accurately identify the objects. Currently, many deep-learning-based object detection methods focus primarily on network architecture and label assignment design. These methods often employ an end-to-end training approach, where the loss function directly constrains only the final output layer. This approach gives each module within the network a significant amount of freedom during optimization, which can hinder the network's ability to focus effectively on the object and limit detection accuracy. To address these limitations, this paper proposes a novel approach called the Attention Feature Guided Network (AFGN). In this approach, an Attention Feature Guided Branch (AFGB) is introduced during the training phase of the CNN-based end-to-end detection network. The AFGB provides additional shallow supervision outside the detector's output layer, guiding the backbone to focus effectively on objects amid complex backgrounds. Additionally, a new operation called Background Blur Mask (BBM) is proposed, which is embedded in the AFGB to achieve image-level attention. Experiments conducted on the DIOR dataset demonstrate the effectiveness and efficiency of the proposed method. Our method achieves an mAP (mean average precision) of 0.777, surpassing many state-of-the-art object detection methods.
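The abstract gives no implementation details of the AFGB or BBM, but the general idea of auxiliary shallow supervision with a background-blurred target can be illustrated. Below is a minimal PyTorch sketch, not the authors' code: the module names (AttentionGuideBranch, background_blur_mask), the reconstruction-style L1 loss, the Gaussian-blur mask construction, and the loss weight are all assumptions made for illustration.

# Minimal, hypothetical sketch of an auxiliary attention-guided branch.
# Every name, loss choice, and hyperparameter below is an assumption,
# not the paper's actual implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.transforms.functional as TF


def background_blur_mask(images, gt_boxes, kernel_size=21, sigma=5.0):
    """Assumed stand-in for the Background Blur Mask (BBM): blur the whole
    image, then paste the sharp pixels back inside the ground-truth boxes,
    yielding an image-level attention target."""
    blurred = TF.gaussian_blur(images, kernel_size=[kernel_size, kernel_size],
                               sigma=[sigma, sigma])
    target = blurred.clone()
    for b, boxes in enumerate(gt_boxes):            # boxes: (N, 4) in pixel coords
        for x1, y1, x2, y2 in boxes.round().long():
            target[b, :, y1:y2, x1:x2] = images[b, :, y1:y2, x1:x2]
    return target


class AttentionGuideBranch(nn.Module):
    """Auxiliary branch attached to a shallow backbone feature map; it
    reconstructs the background-blurred image so the backbone is pushed to
    keep object regions sharp. Used only during training and discarded at
    inference."""
    def __init__(self, in_channels, out_channels=3):
        super().__init__()
        self.head = nn.Sequential(
            nn.Conv2d(in_channels, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, out_channels, 1),
        )

    def forward(self, feat, images, gt_boxes):
        pred = self.head(feat)
        pred = F.interpolate(pred, size=images.shape[-2:],
                             mode="bilinear", align_corners=False)
        target = background_blur_mask(images, gt_boxes)
        return F.l1_loss(pred, target)               # shallow auxiliary loss


# Hypothetical usage inside a training step (backbone, detection_loss, and the
# 0.5 weight are placeholders, not values from the paper):
# feat_shallow = backbone.stage2(images)
# aux_loss = guide_branch(feat_shallow, images, gt_boxes)
# total_loss = detection_loss + 0.5 * aux_loss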
ISSN: 0925-2312
DOI: 10.1016/j.neucom.2024.128527