Modeling the Real World with High-Density Visual Particle Dynamics
Main authors: | , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | We present High-Density Visual Particle Dynamics (HD-VPD), a learned world
model that can emulate the physical dynamics of real scenes by processing
massive latent point clouds containing 100K+ particles. To enable efficiency at
this scale, we introduce a novel family of Point Cloud Transformers (PCTs)
called Interlacers leveraging intertwined linear-attention Performer layers and
graph-based neighbour attention layers. We demonstrate the capabilities of
HD-VPD by modeling the dynamics of high degree-of-freedom bi-manual robots with
two RGB-D cameras. Compared to the previous graph neural network approach, our
Interlacer dynamics is twice as fast with the same prediction quality, and can
achieve higher quality using 4x as many particles. We illustrate how HD-VPD can
evaluate motion plan quality with robotic box-pushing and can-grasping tasks.
See videos and particle dynamics rendered by HD-VPD at
https://sites.google.com/view/hd-vpd. |
---|---|
DOI: | 10.48550/arxiv.2406.19800 |
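
The summary describes Interlacers as intertwining linear-attention layers with graph-based neighbour attention layers. A minimal sketch of that alternation, in plain Python, might look like the following. It is not the authors' implementation: the function names, the elu+1 feature map (a simple stand-in for Performer's random-feature approximation), the brute-force nearest-neighbour search, the residual updates, and all sizes are assumptions made for illustration.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)] for row in a]


def phi(t):
    """elu(t) + 1: a strictly positive feature map (assumed stand-in for Performer features)."""
    return t + 1.0 if t > 0 else math.exp(t)


def linear_attention(x, wq, wk, wv):
    """Global attention whose cost grows linearly with the number of particles."""
    q = [[phi(t) for t in row] for row in matmul(x, wq)]
    k = [[phi(t) for t in row] for row in matmul(x, wk)]
    v = matmul(x, wv)
    kv = matmul([list(col) for col in zip(*k)], v)  # k^T v: a small (d x d) summary
    ksum = [sum(col) for col in zip(*k)]            # used to normalise each row
    num = matmul(q, kv)
    return [[n / sum(a * b for a, b in zip(qi, ksum)) for n in ni]
            for qi, ni in zip(q, num)]


def neighbor_attention(x, pos, wq, wk, wv, k=4):
    """Softmax attention restricted to each particle's k nearest neighbours (self included)."""
    q, key, v = matmul(x, wq), matmul(x, wk), matmul(x, wv)
    scale = math.sqrt(len(q[0]))
    out = []
    for i, qi in enumerate(q):
        # Brute-force neighbour search; acceptable only for a tiny demo.
        nbrs = sorted(range(len(x)), key=lambda j: math.dist(pos[i], pos[j]))[:k]
        logits = [sum(a * b for a, b in zip(qi, key[j])) / scale for j in nbrs]
        m = max(logits)
        w = [math.exp(l - m) for l in logits]
        total = sum(w)
        out.append([sum(wj * v[j][c] for wj, j in zip(w, nbrs)) / total
                    for c in range(len(v[0]))])
    return out


def interlacer_sketch(x, pos, layers):
    """Alternate the two attention types, adding each result to the features."""
    for kind, (wq, wk, wv) in layers:
        if kind == "linear":
            upd = linear_attention(x, wq, wk, wv)
        else:
            upd = neighbor_attention(x, pos, wq, wk, wv)
        x = [[a + b for a, b in zip(xr, ur)] for xr, ur in zip(x, upd)]
    return x


def rand_matrix(rows, cols):
    return [[random.gauss(0.0, 0.5) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
N, D = 32, 4                      # hypothetical sizes, far below the 100K+ scale in the summary
pos = rand_matrix(N, 3)           # 3D particle positions
x = rand_matrix(N, D)             # latent particle features
layers = [(kind, (rand_matrix(D, D), rand_matrix(D, D), rand_matrix(D, D)))
          for kind in ("linear", "neighbor", "linear", "neighbor")]
out = interlacer_sketch(x, pos, layers)
print(len(out), len(out[0]))
```

The linear layer never forms the full particle-by-particle attention matrix, which is what makes global mixing affordable at high particle counts, while the neighbour layer keeps attention local to nearby particles.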