Filtered-CoPhy: Unsupervised Learning of Counterfactual Physics in Pixel Space
International Conference on Learning Representations (2022)
Saved in:
Main authors: | , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | International Conference on Learning Representations (2022). Learning causal relationships in high-dimensional data (images, videos) is a hard task, as these relationships are often defined on low-dimensional manifolds and must be extracted from complex signals dominated by appearance, lighting, textures, and spurious correlations in the data. We present a method for learning counterfactual reasoning about physical processes in pixel space, which requires predicting the impact of interventions on initial conditions. Going beyond the identification of structural relationships, we deal with the challenging problem of forecasting raw video over long horizons. Our method does not require knowledge or supervision of ground-truth positions or other object or scene properties. Our model learns and acts on a suitable hybrid latent representation based on a combination of dense features, sets of 2D keypoints, and an additional latent vector per keypoint. We show that this captures the dynamics of physical processes better than purely dense or sparse representations. We introduce a new, challenging, and carefully designed counterfactual benchmark for predictions in pixel space and outperform strong baselines in physics-inspired ML and video prediction. |
DOI: | 10.48550/arxiv.2202.00368 |
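The abstract's hybrid latent representation (dense appearance features, a set of 2D keypoints, and an additional latent vector per keypoint) can be illustrated with a minimal sketch. This is not the paper's actual architecture: the shapes, function names, and the soft-argmax keypoint extraction (a common choice for unsupervised keypoint discovery) are all assumptions made for illustration.

```python
import numpy as np

# Illustrative sketch only: shapes and the soft-argmax extractor are
# assumptions, not the architecture described in the paper.

def soft_argmax_keypoints(heatmaps):
    """Turn K spatial heatmaps (K, H, W) into K 2D keypoints via soft-argmax."""
    K, H, W = heatmaps.shape
    flat = heatmaps.reshape(K, -1)
    probs = np.exp(flat - flat.max(axis=1, keepdims=True))
    probs /= probs.sum(axis=1, keepdims=True)
    probs = probs.reshape(K, H, W)
    ys = np.linspace(-1.0, 1.0, H)
    xs = np.linspace(-1.0, 1.0, W)
    ky = (probs.sum(axis=2) * ys).sum(axis=1)   # expected y per keypoint
    kx = (probs.sum(axis=1) * xs).sum(axis=1)   # expected x per keypoint
    return np.stack([kx, ky], axis=1)           # (K, 2), coordinates in [-1, 1]

def hybrid_state(dense_features, heatmaps, keypoint_latents):
    """Bundle the three components of a hybrid latent representation."""
    return {
        "dense": dense_features,                       # (C, H, W) dense features
        "keypoints": soft_argmax_keypoints(heatmaps),  # (K, 2) sparse positions
        "latents": keypoint_latents,                   # (K, D) per-keypoint vectors
    }

# Toy example with random tensors standing in for encoder outputs.
rng = np.random.default_rng(0)
state = hybrid_state(
    dense_features=rng.standard_normal((16, 8, 8)),
    heatmaps=rng.standard_normal((4, 8, 8)),
    keypoint_latents=rng.standard_normal((4, 3)),
)
print(state["keypoints"].shape)  # (4, 2)
```

The intuition, per the abstract, is that the keypoints carry sparse geometric state for dynamics, while the dense features and per-keypoint latents retain appearance information a purely sparse representation would lose.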