PPE: Point position embedding for single object tracking in point clouds

Bibliographic Details
Published in: Electronics Letters 2023-08, Vol. 59 (15), p. n/a
Main authors: Su, Yuanzhi; Wang, Yuan-Gen; Wang, Weijia; Zhu, Guopu
Format: Article
Language: English
Online access: Full text
Description
Abstract: Existing 3D single object tracking methods primarily extract features from the global coordinates of point clouds, leaving their positional information underexploited. However, due to the unordered, sparse, and irregular nature of point clouds, effectively exploiting this positional information poses a significant challenge. In this letter, the network is explicitly reformulated by introducing a point position embedding module in conjunction with a self-attention coding module, replacing the use of global coordinate inputs. The proposed reformulation is further integrated into a top-performing model, M2-Track, and the resulting method is called Point Position Embedding (PPE). A comprehensive empirical analysis is performed on the KITTI and NuScenes datasets. Experimental results show that PPE surpasses M2-Track by a large margin in overall performance. In particular, on the challenging NuScenes dataset, the method attains the highest precision and success in all classes compared to state-of-the-art methods. The code is available at https://github.com/GZHU-DVL/PPE.
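
The abstract describes replacing raw global coordinate inputs with a learned point position embedding followed by self-attention encoding. For illustration only, the following is a minimal PyTorch sketch of that general idea; the module names, layer dimensions, and exact embedding design here are assumptions, not the authors' published architecture (see the linked repository for the actual code).

# Sketch, assuming a per-point MLP embedding of (x, y, z) coordinates
# followed by multi-head self-attention with a residual connection.
import torch
import torch.nn as nn

class PointPositionEmbedding(nn.Module):
    """Map raw (x, y, z) point coordinates to a learned embedding."""
    def __init__(self, embed_dim: int = 128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3, embed_dim),
            nn.ReLU(inplace=True),
            nn.Linear(embed_dim, embed_dim),
        )

    def forward(self, xyz: torch.Tensor) -> torch.Tensor:
        # xyz: (batch, num_points, 3) -> (batch, num_points, embed_dim)
        return self.mlp(xyz)

class SelfAttentionCoding(nn.Module):
    """Encode the embedded points with multi-head self-attention."""
    def __init__(self, embed_dim: int = 128, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(embed_dim)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        attended, _ = self.attn(feats, feats, feats)
        # Residual connection plus LayerNorm, a common transformer pattern.
        return self.norm(feats + attended)

if __name__ == "__main__":
    points = torch.randn(2, 1024, 3)          # two point clouds of 1024 points
    feats = PointPositionEmbedding()(points)  # positional embedding
    encoded = SelfAttentionCoding()(feats)    # self-attention encoding
    print(encoded.shape)                      # torch.Size([2, 1024, 128])

In this sketch the embedding MLP stands in for the point position embedding module and the attention block for the self-attention coding module; how these plug into M2-Track's tracking pipeline is not specified by the abstract.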
ISSN: 0013-5194, 1350-911X
DOI: 10.1049/ell2.12914