LA-Net: An End-to-End Category-Level Object Attitude Estimation Network Based on Multi-Scale Feature Fusion and an Attention Mechanism

Bibliographic details
Published in: Electronics (Basel), 2024-07, Vol. 13 (14), p. 2809
Main authors: Wang, Jing; Liu, Guohan; Guo, Cheng; Ma, Qianglong; Song, Wanying
Format: Article
Language: English
Description
Summary: In category-level object pose estimation, mitigating intra-class shape variation and improving pose estimation accuracy for complex objects remain challenging problems. To address them, this paper proposes a new network architecture, LA-Net, to efficiently ascertain object poses from features. First, we extend the 3D graph convolution network architecture by introducing the LS-Layer (Linear Connection Layer), which enables the network to acquire features from different layers and perform multi-scale feature fusion. Second, LA-Net employs a novel attention mechanism (PSA) and a Max-Pooling layer to extract local and global geometric information, which enhances the network's ability to perceive object poses. Finally, LA-Net recovers the rotation information of an object by decoupling the rotation mechanism. The experimental results show that LA-Net achieves much better accuracy in object pose estimation than the baseline method (HS-Pose). For objects with complex shapes in particular, its performance is 8.2% better on the 10°5 cm metric and 5% better on the 10°2 cm metric.
ISSN: 2079-9292
DOI: 10.3390/electronics13142809
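
Note on the metrics named in the summary: an "n° m cm" threshold such as 10°5 cm is commonly read as "a predicted pose counts as correct when its rotation error is at most n degrees and its translation error is at most m centimetres". The sketch below illustrates that reading only; it is not taken from the paper. The function names, the assumption that translations are given in metres, and the omission of any special handling for symmetric objects are all assumptions made for illustration.

```python
import numpy as np


def rotation_z(deg):
    """Hypothetical helper: 3x3 rotation about the z axis by `deg` degrees."""
    rad = np.radians(deg)
    c, s = np.cos(rad), np.sin(rad)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


def pose_errors(R_pred, t_pred, R_gt, t_gt):
    """Return (rotation error in degrees, translation error in cm).

    Assumes 3x3 rotation matrices and translations in metres.
    Symmetric objects are NOT treated specially in this sketch.
    """
    # Relative rotation between prediction and ground truth.
    R_rel = R_pred @ R_gt.T
    # Angle of a rotation matrix: trace(R) = 1 + 2*cos(theta).
    cos_theta = np.clip((np.trace(R_rel) - 1.0) / 2.0, -1.0, 1.0)
    rot_err_deg = float(np.degrees(np.arccos(cos_theta)))
    # Euclidean distance, converted from metres to centimetres.
    trans_err_cm = float(np.linalg.norm(np.asarray(t_pred) - np.asarray(t_gt)) * 100.0)
    return rot_err_deg, trans_err_cm


def within_threshold(R_pred, t_pred, R_gt, t_gt, max_deg=10.0, max_cm=5.0):
    """True when both errors fall inside the given 'n degrees, m cm' threshold."""
    rot_err_deg, trans_err_cm = pose_errors(R_pred, t_pred, R_gt, t_gt)
    return rot_err_deg <= max_deg and trans_err_cm <= max_cm


if __name__ == "__main__":
    R_gt, t_gt = np.eye(3), np.zeros(3)
    # Prediction off by 8 degrees of rotation and 3 cm of translation.
    R_pred, t_pred = rotation_z(8.0), np.array([0.03, 0.0, 0.0])
    print(within_threshold(R_pred, t_pred, R_gt, t_gt, 10.0, 5.0))
    print(within_threshold(R_pred, t_pred, R_gt, t_gt, 10.0, 2.0))
```

Under this reading, a prediction that is 8 degrees and 3 cm off would count as correct at 10°5 cm but not at 10°2 cm, which is why the stricter threshold is the harder one to improve on.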