SCNet: A Lightweight and Efficient Object Detection Network for Remote Sensing
Published in: | IEEE Geoscience and Remote Sensing Letters, 2024, Vol. 21, p. 1-5 |
---|---|
Main authors: | , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | Detecting small objects in remote sensing images is both meaningful and challenging, especially when existing object detection models are deployed on edge terminal devices with limited hardware resources. In this study, we present an efficient remote sensing object detection model named SCNet, based on the ultra-lightweight YOLOv5n (You Only Look Once). To address the significant loss of small-object features in the model's neck, we introduce the selective feature enhancement block (SFEB). The SFEB selectively processes the portion of feature maps that contributes more to semantic information extraction while retaining the remaining portion, which allows it to extract rich semantic information while preserving the detail information crucial for small object detection. Furthermore, we incorporate the contextual transformer block (CTB) at the junction of the neck and backbone, which enhances the model's ability to understand relationships and boundaries between objects and backgrounds by exploring contextual information in shallow-level feature maps. This improves the model's capability to detect challenging small and medium objects. Experimental results on the NWPU VHR-10 and DIOR datasets demonstrate the model's performance, with mean average precisions (mAPs) of 96.6% and 72.6%, respectively, at IoU = 0.5. The model runs at 487 frames/s with a batch size of 32 (FPS32) and requires only 4.6 giga floating-point operations (GFLOPs) and 1.8 million parameters. |
---|---|
ISSN: | 1545-598X 1558-0571 |
DOI: | 10.1109/LGRS.2023.3344937 |
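
The summary reports mAP at IoU = 0.5, meaning a predicted box counts as a correct detection only if its intersection over union with a ground-truth box reaches 0.5. The following is a minimal sketch of that matching criterion, not code from the paper. It assumes axis-aligned boxes given as (x_min, y_min, x_max, y_max), and the names `iou` and `is_match` are illustrative.

```python
from typing import Tuple

# Assumed box format: (x_min, y_min, x_max, y_max), axis-aligned.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Return the intersection over union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle (may be empty).
    ix_min = max(a[0], b[0])
    iy_min = max(a[1], b[1])
    ix_max = min(a[2], b[2])
    iy_max = min(a[3], b[3])

    # Clamp to zero when the boxes do not overlap.
    inter_w = max(0.0, ix_max - ix_min)
    inter_h = max(0.0, iy_max - iy_min)
    inter = inter_w * inter_h

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter

    # Guard against degenerate (zero-area) inputs.
    return inter / union if union > 0 else 0.0


def is_match(pred: Box, gt: Box, threshold: float = 0.5) -> bool:
    """Hypothetical helper: does the prediction meet the IoU threshold?"""
    return iou(pred, gt) >= threshold
```

For example, a 10 x 10 prediction shifted by half its width against a 10 x 10 ground-truth box overlaps in 50 of 150 total units of area, so its IoU is about 0.33 and it would not count as a match at the 0.5 threshold.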