Adaptive Granularity-Fused Keypoint Detection for 6D Pose Estimation of Space Targets

Bibliographic Details
Published in:	Remote Sensing (Basel, Switzerland), 2024-11, Vol.16 (22), p.4138
Main authors:	Gu, Xu; Yang, Xi; Liu, Hong; Yang, Dong
Format:	Article
Language:	eng
Subjects:	6D pose estimation; Accuracy; Algorithms; Analysis; Clutter; Critical point; Datasets; Deep learning; Environmental conditions; Errors; Estimation; Information processing; keypoint detection; Machine learning; Methods; Neural networks; Pose estimation; Robustness; space target; Spatial discrimination learning; Target detection; Three dimensional models; Visual aspects; Visual tasks

Description
Abstract:	Estimating the 6D pose of a space target is difficult because of occlusions, changes in visual appearance, and background clutter. Accurate pose determination requires robust algorithms that handle these complexities while remaining reliable under varying environmental conditions. Conventional pose estimation for space targets proceeds in two stages: establishing 2D–3D correspondences using keypoint detection networks and 3D models, followed by pose estimation via the perspective-n-point algorithm. The accuracy of this process hinges on the initial keypoint detection, which is currently limited by predominantly single-scale detection techniques that fail to exploit sufficient information. To address these challenges, we propose an adaptive dual-stream aggregation network (ADSAN), which learns finer local representations and acquires abundant spatial and semantic information by merging features from both inter-layer and intra-layer perspectives through a multi-grained approach, consolidating features within individual layers and amplifying the interaction of features at distinct resolutions between layers. Furthermore, ADSAN implements a selective keypoint focus module (SKFM) to alleviate problems caused by partial occlusions and viewpoint changes. This mechanism places greater emphasis on the most challenging keypoints, so that the network prioritizes and optimizes its learning around these critical points. Benefiting from the finer and more robust information about space objects extracted by ADSAN and SKFM, our method surpasses the SOTA method PoET (rotation angle errors of 5.8° and 8.1°; normalized translation errors of 0.0351% and 0.0744%) by 0.5° and 0.9° and by 0.0084% and 0.0354%, achieving 5.3° and 7.2° in rotation angle error and 0.0267% and 0.0390% in normalized translation error on the Speed and SwissCube datasets, respectively.
ISSN:	2072-4292
DOI:	10.3390/rs16224138
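
Note on the reported metrics: the abstract quotes rotation angle errors and normalized translation errors but this record does not define them. The sketch below shows one common way such errors are computed, assuming the rotation error is the geodesic angle between the estimated and ground-truth rotation matrices and the translation error is normalized by the ground-truth distance. Both definitions, and the helper names `rotation_angle_error_deg`, `normalized_translation_error`, and `rot_z`, are illustrative assumptions and may differ from the article's exact formulation.

```python
import math


def rotation_angle_error_deg(R_est, R_gt):
    """Geodesic angle in degrees between two 3x3 rotation matrices.

    Assumed definition: angle = arccos((trace(R_gt^T R_est) - 1) / 2).
    """
    # trace(R_gt^T R_est) equals the sum of element-wise products.
    trace = sum(R_gt[i][j] * R_est[i][j] for i in range(3) for j in range(3))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))


def normalized_translation_error(t_est, t_gt):
    """||t_est - t_gt|| / ||t_gt|| (assumed normalization by ground-truth range)."""
    diff = math.sqrt(sum((a - b) ** 2 for a, b in zip(t_est, t_gt)))
    return diff / math.sqrt(sum(b ** 2 for b in t_gt))


def rot_z(deg):
    """Hypothetical helper: rotation about the z-axis, used only for the demo."""
    a = math.radians(deg)
    return [
        [math.cos(a), -math.sin(a), 0.0],
        [math.sin(a), math.cos(a), 0.0],
        [0.0, 0.0, 1.0],
    ]


if __name__ == "__main__":
    # Estimated pose is 5 degrees and 0.1 units away from the ground truth.
    print(rotation_angle_error_deg(rot_z(30.0), rot_z(25.0)))
    print(normalized_translation_error([0.0, 0.0, 10.1], [0.0, 0.0, 10.0]))
```

Clamping the cosine before `math.acos` avoids a domain error when rounding pushes the value slightly outside [-1, 1], which can happen when the two rotations are nearly identical.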