SSOD-QCTR: Semi-Supervised Query Consistent Transformer for Optical Remote Sensing Image Object Detection

Bibliographic details
Published in: Remote Sensing (Basel, Switzerland), 2024-12, Vol. 16 (23), p. 4556
Main authors: Ma, Xinyu; Lv, Pengyuan; Gong, Xunqiang
Format: Article
Language: English
Description
Summary: This paper proposes a semi-supervised query consistent transformer for optical remote sensing image object detection (SSOD-QCTR). A detection transformer (DETR)-like model is adopted as the base network and trained under a teacher–student scheme. The proposed method makes three major contributions. First, to address the inaccurate pseudo-labels generated in the initial training epochs, a dynamic geometry-aware intersection over union (DGAIoU) loss function is proposed, which dynamically updates its weight coefficients according to the quality of the pseudo-labels in the current epoch. Second, an improved focal (IF) loss function is proposed, which handles category imbalance by decreasing the category probability coefficients of the major categories. Third, the random initialization of the object queries makes the correspondence between the outputs of the teacher and student models uncertain; to resolve this, a query consistency (QC)-based loss function is proposed, which imposes a consistency constraint on the outputs of the two models by feeding both the same regions of interest, extracted from the pseudo-labels, as the input object queries. Extensive experiments on two publicly available datasets, DIOR and HRRSD, demonstrate that SSOD-QCTR outperforms related methods, achieving mAPs of 65.28% and 81.73% on DIOR and HRRSD, respectively.
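Note: The summary mentions an improved focal (IF) loss but does not give its formula. As background only, the following is a minimal sketch of the standard binary focal loss on which such variants are typically built. It is not the paper's IF loss. The function names and the default values `alpha=0.25` and `gamma=2.0` are illustrative assumptions.

```python
import math


def cross_entropy(p, y, eps=1e-12):
    """Plain binary cross-entropy for one prediction (baseline for comparison)."""
    p_t = p if y == 1 else 1.0 - p          # probability assigned to the true class
    return -math.log(min(max(p_t, eps), 1.0))


def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-12):
    """Standard binary focal loss (NOT the paper's IF loss; defaults are assumptions).

    p: predicted probability of the positive class, in (0, 1)
    y: ground-truth label, 0 or 1
    """
    p_t = p if y == 1 else 1.0 - p          # probability assigned to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    p_t = min(max(p_t, eps), 1.0)           # clamp to avoid log(0)
    # (1 - p_t)^gamma shrinks the loss of well-classified (easy) examples
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy = focal_loss(0.95, 1)   # confident, correct prediction
hard = focal_loss(0.30, 1)   # poorly classified positive
print(f"easy={easy:.6f} hard={hard:.6f}")
```

The modulating factor reduces the contribution of easy, frequently seen examples, which is the general mechanism focal-style losses use against class imbalance. How the IF loss adjusts the category probability coefficients is described only in the full text.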
ISSN: 2072-4292
DOI: 10.3390/rs16234556