Delve into balanced and accurate approaches for ship detection in aerial images
Published in: | Neural computing & applications 2022-09, Vol.34 (18), p.15293-15312 |
---|---|
Main authors: | , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | Ship detection in aerial images is more challenging than generic object detection, mainly for three reasons: aerial datasets are scarce and expensive to obtain, scale variation and small objects cause imbalance problems, and the particular perspectives of aerial images yield scarce feature representations. In this paper, we propose four methods for mitigating these problems. First, we use a virtual 3D engine to create scenes with ship objects and automatically annotate the collected images with bounding boxes, generating a synthetic ship detection dataset called unreal-ship. Second, we adopt a more balanced feature fusion structure, Balanced Feature Pyramids (BFP), which integrates and strengthens the features from each pyramid level to combine high-level and low-level information and reduce the imbalance between feature levels. Third, we design an efficient anchor generation structure, Guided Anchor, which uses semantic information to guide the generation of high-quality anchors. Last, we adopt an IoU-based sampling method to reduce the uneven distribution of examples' IoUs caused by random sampling. Our experiments show that ship detection benefits from adding the synthetic dataset as additional data. Without bells and whistles, BFP, Guided Anchor, and IoU-based sampling achieve 2.5, 1.8, and 1.3 points higher Average Precision (AP) than our baseline algorithm, respectively. Together, the three structures yield an average improvement of over 4 points AP. Extensive experiments show that our methods achieve state-of-the-art performance compared with other notable detection frameworks and perform well in our detection models with different backbone networks. |
---|---|
ISSN: | 0941-0643, 1433-3058 |
DOI: | 10.1007/s00521-021-06275-1 |
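
The abstract mentions an IoU-based sampling method that counters the uneven IoU distribution produced by random sampling. The record does not give the paper's exact procedure, so the following is only a minimal sketch of one common way to do this: split the IoU range of negative candidates into equal-width bins and draw the same number of samples from each bin. The function names, the bin count, and the 0.5 negative threshold are illustrative assumptions, not values taken from the paper.

```python
import random


def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def iou_balanced_sample(ious, num_samples, num_bins=3, max_iou=0.5, seed=0):
    """Pick negative candidates so that each IoU bin is equally represented.

    ious        : max IoU of each candidate box with any ground-truth box
    num_samples : total number of negatives to draw
    num_bins    : number of equal-width IoU bins (assumed value)
    max_iou     : candidates at or above this IoU are not negatives (assumed value)
    """
    rng = random.Random(seed)
    width = max_iou / num_bins
    bins = [[] for _ in range(num_bins)]
    for idx, value in enumerate(ious):
        if 0.0 <= value < max_iou:
            bins[min(int(value / width), num_bins - 1)].append(idx)

    per_bin = num_samples // num_bins
    chosen, leftover = [], []
    for members in bins:
        rng.shuffle(members)
        chosen.extend(members[:per_bin])
        leftover.extend(members[per_bin:])

    # If some bins were too small, top up from the remaining candidates.
    rng.shuffle(leftover)
    chosen.extend(leftover[: num_samples - len(chosen)])
    return sorted(chosen)
```

With purely random sampling, a pool dominated by near-zero IoU candidates would rarely yield the harder negatives that overlap a ship partially; drawing per bin keeps those in the sample regardless of how rare they are in the pool.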