Unsupervised Dynamics Prediction with Object-Centric Kinematics
Main authors: | , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | Human perception involves discerning complex multi-object scenes into
time-static object appearance (i.e., size, shape, color) and time-varying object
motion (i.e., location, velocity, acceleration). This innate ability to
unconsciously understand the environment is the motivation behind the success
of dynamics modeling. Object-centric representations have emerged as a
promising tool for dynamics prediction, yet they primarily focus on the
objects' appearance, often overlooking other crucial attributes. In this paper,
we propose Object-Centric Kinematics (OCK), a framework for dynamics prediction
leveraging object-centric representations. Our model utilizes a novel component
named object kinematics, which comprises low-level structured states of
objects' position, velocity, and acceleration. The object kinematics are
obtained via either implicit or explicit approaches, enabling comprehensive
spatiotemporal object reasoning, and integrated through various transformer
mechanisms, facilitating effective object-centric dynamics modeling. Our model
demonstrates superior performance when handling objects and backgrounds in
complex scenes characterized by a wide range of object attributes and dynamic
movements. Moreover, our model demonstrates generalization capabilities across
diverse synthetic environments, highlighting its potential for broad
applicability in vision-related tasks. |
---|---|
DOI: | 10.48550/arxiv.2404.18423 |