DPCN++: Differentiable Phase Correlation Network for Versatile Pose Registration

Bibliographic Details
Published in: IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023-12, Vol. 45 (12), pp. 14366-14384
Authors: Chen, Zexi; Liao, Yiyi; Du, Haozhe; Zhang, Haodong; Xu, Xuecheng; Lu, Haojian; Xiong, Rong; Wang, Yue
Format: Article
Language: English
Abstract
Pose registration is critical in vision and robotics. This article focuses on the challenging task of initialization-free pose registration up to 7DoF for homogeneous and heterogeneous measurements. While recent learning-based methods show promise using differentiable solvers, they either rely on heuristically defined correspondences or require initialization. Phase correlation seeks solutions in the spectral domain and is correspondence-free and initialization-free. Following this, we propose a differentiable phase-correlation solver and combine it with simple feature extraction networks, namely DPCN++. It performs registration for both homogeneous and heterogeneous inputs and generalizes well to unseen objects. Specifically, the feature extraction networks first learn dense feature grids from a pair of homogeneous/heterogeneous measurements. These feature grids are then transformed into a translation- and scale-invariant spectrum representation based on the Fourier transform and spherical radial aggregation, decoupling translation and scale from rotation. Next, the rotation, scale, and translation are independently and efficiently estimated in the spectrum, step by step. The entire pipeline is differentiable and trained end-to-end. We evaluate DPCN++ on a wide range of tasks across different input modalities, including 2D bird's-eye-view images, 3D object and scene measurements, and medical images. Experimental results demonstrate that DPCN++ outperforms both classical and learning-based baselines, especially on partially observed and heterogeneous measurements.
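The spectral-domain principle the paper builds on can be illustrated with classical 2D phase correlation for translation. The sketch below is a minimal NumPy illustration, not the paper's implementation: DPCN++'s learned feature grids, differentiable solver, and rotation/scale handling are omitted, and only the correspondence-free, initialization-free translation estimate is shown.

```python
import numpy as np

def phase_correlation(a, b):
    """Estimate the integer translation between two equal-size 2D arrays.

    The normalized cross-power spectrum of two shifted signals is a pure
    phase ramp; its inverse FFT is a delta peaking at the shift.
    """
    Fa = np.fft.fft2(a)
    Fb = np.fft.fft2(b)
    cross_power = Fa * np.conj(Fb)
    cross_power /= np.abs(cross_power) + 1e-12  # keep phase, drop magnitude
    corr = np.fft.ifft2(cross_power).real
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # wrap peaks past the midpoint around to negative shifts
    return tuple(p if p <= s // 2 else p - s for p, s in zip(peak, corr.shape))

# usage: shift an image circularly by (5, -3) and recover the shift
rng = np.random.default_rng(0)
img = rng.random((64, 64))
shifted = np.roll(img, shift=(5, -3), axis=(0, 1))
print(phase_correlation(shifted, img))  # -> (5, -3)
```

No initial guess and no correspondences are needed, which is the property the article carries over to learned feature grids; rotation and scale are handled analogously in a log-polar/spherical spectrum.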
ISSN: 0162-8828, 2160-9292, 1939-3539
DOI: 10.1109/TPAMI.2023.3317501