Multi-Object Representation Learning with Iterative Variational Inference
ICML 2019 (PMLR 97:2424-2433)
Main authors: | , , , , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Abstract: | Human perception is structured around objects which form the basis for our
higher-level cognition and impressive systematic generalization abilities. Yet
most work on representation learning focuses on feature learning without even
considering multiple objects, or treats segmentation as an (often supervised)
preprocessing step. Instead, we argue for the importance of learning to segment
and represent objects jointly. We demonstrate that, starting from the simple
assumption that a scene is composed of multiple entities, it is possible to
learn to segment images into interpretable objects with disentangled
representations. Our method learns -- without supervision -- to inpaint
occluded parts, and extrapolates to scenes with more objects and to unseen
objects with novel feature combinations. We also show that, due to the use of
iterative variational inference, our system is able to learn multi-modal
posteriors for ambiguous inputs and extends naturally to sequences. |
---|---|
DOI: | 10.48550/arxiv.1903.00450 |