Additive Decoders for Latent Variables Identification and Cartesian-Product Extrapolation
Main authors: | , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Summary: | We tackle the problems of latent variables identification and
"out-of-support" image generation in representation learning. We show that
both are possible for a class of decoders that we call additive, which are
reminiscent of decoders used for object-centric representation learning (OCRL)
and well suited for images that can be decomposed as a sum of object-specific
images. We provide conditions under which exactly solving the reconstruction
problem using an additive decoder is guaranteed to identify the blocks of
latent variables up to permutation and block-wise invertible transformations.
This guarantee relies only on very weak assumptions about the distribution of
the latent factors, which might present statistical dependencies and have an
almost arbitrarily shaped support. Our result provides a new setting where
nonlinear independent component analysis (ICA) is possible and adds to our
theoretical understanding of OCRL methods. We also show theoretically that
additive decoders can generate novel images by recombining observed factors of
variation in novel ways, an ability we refer to as Cartesian-product
extrapolation. We show empirically that additivity is crucial for both
identifiability and extrapolation on simulated data. |
DOI: | 10.48550/arxiv.2307.02598 |
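
The summary describes an additive decoder as one whose output is a sum of block-specific images. The following is a minimal NumPy sketch of that idea, not the authors' implementation: the random two-layer block decoders, the dimensions, and the names `make_block_decoder` and `additive_decode` are all illustrative assumptions. It shows why additivity permits recombining latent blocks that were never observed together: the output for a new pairing is determined by per-block terms that each appeared in an observed point.

```python
import numpy as np


def make_block_decoder(rng, block_dim, out_dim, hidden=16):
    """Hypothetical per-block decoder: a small random two-layer network."""
    w1 = rng.normal(size=(block_dim, hidden))
    w2 = rng.normal(size=(hidden, out_dim))

    def decode(z_block):
        return np.tanh(z_block @ w1) @ w2

    return decode


def additive_decode(block_decoders, z_blocks):
    """Additive decoder: sum of per-block outputs, f(z) = sum_B f_B(z_B)."""
    return sum(f(z) for f, z in zip(block_decoders, z_blocks))


rng = np.random.default_rng(0)
out_dim = 8  # stands in for a flattened image (assumption)
decoders = [
    make_block_decoder(rng, block_dim=2, out_dim=out_dim),
    make_block_decoder(rng, block_dim=2, out_dim=out_dim),
]

# Two "observed" latent points, each made of two blocks.
z_a = [rng.normal(size=2), rng.normal(size=2)]
z_b = [rng.normal(size=2), rng.normal(size=2)]

# Novel recombinations: each pairs a block from one point with a block
# from the other.
z_new = [z_a[0], z_b[1]]
z_swap = [z_b[0], z_a[1]]

x_a = additive_decode(decoders, z_a)
x_b = additive_decode(decoders, z_b)
x_new = additive_decode(decoders, z_new)
x_swap = additive_decode(decoders, z_swap)

# By additivity, the new output is the sum of per-block terms that each
# already appeared in an observed point.
expected = decoders[0](z_a[0]) + decoders[1](z_b[1])
```

In this sketch `x_new` equals `expected` up to floating-point error, and the four outputs satisfy `x_a + x_b == x_new + x_swap`. A generic non-additive decoder offers no such constraint between observed and recombined points.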