MashFormer: A Novel Multiscale Aware Hybrid Detector for Remote Sensing Object Detection

Bibliographic details
Published in: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2023, Vol. 16, pp. 2753-2763
Main authors: Wang, Keyan; Bai, Feiyu; Li, Jiaojiao; Liu, Yajing; Li, Yunsong
Format: Article
Language: English
Description
Summary: Object detection is a critical and demanding task in the processing of satellite and airborne images. Targets in remote sensing imagery vary widely in size, and the backgrounds are complicated, which makes object detection extremely challenging. We address these issues in this article by introducing MashFormer, a multiscale-aware hybrid detector that integrates a convolutional neural network (CNN) with a transformer. Specifically, MashFormer employs a transformer block to complement the CNN-based feature extraction backbone, capturing relationships between long-range features and enhancing representational ability in scenes with complex backgrounds. To improve detection performance for multiscale objects, whose sizes vary greatly in remote sensing scenes, a multilevel feature aggregation component incorporating a cross-level feature alignment module is designed to alleviate the semantic discrepancy between features from shallow and deep layers. To verify the effectiveness of the proposed MashFormer, comparative experiments against other state-of-the-art methods are carried out on the publicly available High Resolution Remote Sensing Detection (HRRSD) and Northwestern Polytechnical University VHR-10 (NWPU VHR-10) datasets. The experimental results confirm the effectiveness of the proposed model, which achieves higher mean average precision than the other methods.
ISSN: 1939-1404, 2151-1535
DOI:10.1109/JSTARS.2023.3254047