SFGAN: Unsupervised Generative Adversarial Learning of 3D Scene Flow from the 3D Scene Self


Bibliographic Details
Published in:	Advanced Intelligent Systems 2022-04, Vol.4 (4), p.n/a
Main authors:	Wang, Guangming; Jiang, Chaokang; Shen, Zehang; Miao, Yanzi; Wang, Hesheng
Format:	Article
Language:	English
Subjects:	3D point clouds; Ablation; Design; Estimates; Experiments; generative adversarial network; Generative adversarial networks; Learning; Lidar; Motion perception; scene flow estimation; soft correspondence; Synthesis; Teaching methods; Three dimensional flow; Three dimensional models; Three dimensional motion; unsupervised learning
Online access:	Full text

Description
Summary:	Scene flow describes the 3D motion of each point between adjacent point clouds. It provides fundamental 3D motion perception for autonomous driving and service robots. Although red-green-blue-depth (RGBD) cameras and light detection and ranging (LiDAR) sensors capture discrete 3D points in space, objects and their motions are usually continuous in the macroscopic world. That is, objects remain self-consistent as they move from the current frame to the next. Based on this insight, a generative adversarial network (GAN) is used to self-learn 3D scene flow without ground truth. A fake point cloud is synthesized from the predicted scene flow and the point cloud of the first frame. Adversarial training is realized by having the generator synthesize a fake point cloud that is indistinguishable from the real one, while the discriminator tries to tell the real point cloud and the synthesized fake point cloud apart. Experiments on the Karlsruhe Institute of Technology and Toyota Technological Institute (KITTI) dataset show that our method achieves promising results. Like a human, the proposed method can identify similar local structures in two adjacent frames even without knowing the ground-truth scene flow. The local correspondence can then be correctly estimated, and from it the scene flow. An interactive preprint version of the article can be found here: https://www.authorea.com/doi/full/10.22541/au.163335790.03073492. Two point clouds $PC_t$ and $PC_{t+1}$ of consecutive frames are passed into the scene flow generator $G_{sf}$. The point cloud $PC_t$ at time $t$ is warped to $PC_{t+1}^{*}$ by the predicted scene flow $SF$. The discriminator $D_{pc}$ is designed to discriminate between $PC_{t+1}^{*}$ and $PC_{t+1}$. The $G_{sf}$ loss and the $D_{pc}$ loss are designed to optimize $G_{sf}$ and $D_{pc}$, respectively.
ISSN:	2640-4567
DOI:	10.1002/aisy.202100197
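
The warping and adversarial setup described in the summary can be illustrated with a minimal NumPy sketch. It assumes that warping means adding the predicted per-point flow to the first-frame points, and it uses the standard binary cross-entropy GAN losses as a stand-in, because this record does not state the article's actual loss functions. All function names (`warp_point_cloud`, `discriminator_loss`, `generator_loss`) are hypothetical, and the discriminator outputs are taken as given probabilities rather than computed by a network.

```python
import numpy as np


def warp_point_cloud(pc_t, scene_flow):
    """Synthesize the fake next-frame cloud: PC*_{t+1} = PC_t + SF.

    Both inputs are (N, 3) arrays; row i of scene_flow is the predicted
    3D displacement of point i between frame t and frame t+1.
    """
    pc_t = np.asarray(pc_t, dtype=float)
    scene_flow = np.asarray(scene_flow, dtype=float)
    if pc_t.ndim != 2 or pc_t.shape[1] != 3 or pc_t.shape != scene_flow.shape:
        raise ValueError("expected two arrays of identical shape (N, 3)")
    return pc_t + scene_flow


def _bce(prob, target, eps=1e-7):
    """Mean binary cross-entropy between probabilities and a 0/1 target."""
    prob = np.clip(np.asarray(prob, dtype=float), eps, 1.0 - eps)
    return float(-(target * np.log(prob) + (1.0 - target) * np.log(1.0 - prob)).mean())


def discriminator_loss(d_real, d_fake):
    """Assumed D_pc objective: score real clouds as 1 and warped clouds as 0."""
    return _bce(d_real, 1.0) + _bce(d_fake, 0.0)


def generator_loss(d_fake):
    """Assumed G_sf objective: make the discriminator score warped clouds as 1."""
    return _bce(d_fake, 1.0)


# Toy example: two points and a hand-written flow (not real data).
pc_t = np.array([[0.0, 0.0, 0.0], [1.0, 2.0, 3.0]])
flow = np.array([[1.0, 0.0, 0.0], [0.0, -1.0, 0.5]])
fake_pc_t1 = warp_point_cloud(pc_t, flow)
```

In this sketch the generator loss falls as the discriminator's score for the warped cloud rises, which is the sense in which a better scene flow estimate makes the synthesized frame harder to tell from the real one.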