PR-Deformable DETR: DETR for Remote Sensing Object Detection
Published in: | IEEE Geoscience and Remote Sensing Letters 2024, Vol. 21, p. 1-5 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | eng |
Subjects: | |
Abstract: | Identifying objects in remote sensing images remains a critical challenge. However, remote sensing images typically encompass numerous small objects, significant variations in object sizes, and a dispersed distribution of objects, all of which pose challenges to the performance of existing object detectors. We present PR-Deformable DEtection Transformer (DETR), a novel model for remote sensing object detection to address these challenges. First, we introduce the tridirectional adaptive feature fusion pyramid network (TAFFPN) feature pyramid module to adaptively fuse data from diverse feature map layers, thereby enhancing the model's multiscale representation capability. Second, we propose the Res-Deformable Encoder, which integrates deformable encoders across different input scales via residual connections, generating feature vectors that capture rich semantic information of remote sensing objects. Last, we introduce the dynamic reference point module (DRPM) Decoder, which leverages 4-D reference points enriched with high-level (HL) feature priors to strengthen the model's object localization capabilities. Experimental results demonstrate that PR-Deformable DETR achieves state-of-the-art remote sensing object detection accuracy, achieving 88.3% mean average precision (mAP) on the NWPU VHR-10 dataset and 95.1% mAP on the RSOD dataset, with a corresponding 16% reduction in GFLOPs. These results satisfy the performance standards required for remote sensing object detection tasks. |
---|---|
ISSN: | 1545-598X 1558-0571 |
DOI: | 10.1109/LGRS.2024.3483217 |
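
The abstract mentions 4-D reference points in the decoder but does not define them. The sketch below assumes the convention common in DETR-family detectors, where a 4-D reference point is a normalized box (cx, cy, w, h) that each decoder layer refines by adding a predicted offset in inverse-sigmoid space. It illustrates that general idea only and is not the paper's DRPM; the function names `reference_point_to_box` and `refine_reference_point` are hypothetical.

```python
import math
from typing import Tuple

# Assumption: a 4-D reference point is a normalized (cx, cy, w, h) box in [0, 1].
Box = Tuple[float, float, float, float]


def inverse_sigmoid(x: float, eps: float = 1e-5) -> float:
    """Map a value in (0, 1) back to logit space, clamped for numerical safety."""
    x = min(max(x, eps), 1.0 - eps)
    return math.log(x / (1.0 - x))


def sigmoid(x: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def refine_reference_point(ref: Box, delta: Box) -> Box:
    """Refine a 4-D reference point by adding an offset in logit space.

    Working in logit space keeps the refined values inside (0, 1).
    """
    return tuple(sigmoid(inverse_sigmoid(r) + d) for r, d in zip(ref, delta))


def reference_point_to_box(ref: Box, img_w: int, img_h: int) -> Box:
    """Convert a normalized (cx, cy, w, h) point to pixel (x1, y1, x2, y2)."""
    cx, cy, w, h = ref
    return (
        (cx - w / 2) * img_w,
        (cy - h / 2) * img_h,
        (cx + w / 2) * img_w,
        (cy + h / 2) * img_h,
    )


if __name__ == "__main__":
    ref = (0.5, 0.5, 0.2, 0.4)
    # A hypothetical offset that shifts the box right and widens it slightly.
    refined = refine_reference_point(ref, (0.3, 0.0, 0.1, 0.0))
    print(reference_point_to_box(refined, img_w=100, img_h=200))
```

Carrying width and height alongside the center lets a decoder scale its attention region to the object, which matters when object sizes vary as widely as the abstract describes.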