Interaction Asymmetry: A General Principle for Learning Composable Abstractions

Learning disentangled representations of concepts and re-composing them in unseen ways is crucial for generalizing to out-of-domain situations. However, the underlying properties of concepts that enable such disentanglement and compositional generalization remain poorly understood. In this work, we propose the principle of interaction asymmetry which states: "Parts of the same concept have more complex interactions than parts of different concepts". We formalize this via block diagonality conditions on the $(n+1)$th order derivatives of the generator mapping concepts to observed data, where different orders of "complexity" correspond to different $n$. Using this formalism, we prove that interaction asymmetry enables both disentanglement and compositional generalization. Our results unify recent theoretical results for learning concepts of objects, which we show are recovered as special cases with $n\!=\!0$ or $1$. We provide results for up to $n\!=\!2$, thus extending these prior works to more flexible generator functions, and conjecture that the same proof strategies generalize to larger $n$. Practically, our theory suggests that, to disentangle concepts, an autoencoder should penalize its latent capacity and the interactions between concepts during decoding. We propose an implementation of these criteria using a flexible Transformer-based VAE, with a novel regularizer on the attention weights of the decoder. On synthetic image datasets consisting of objects, we provide evidence that this model can achieve comparable object disentanglement to existing models that use more explicit object-centric priors.
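The abstract's practical criterion — penalizing interactions between concepts during decoding via a regularizer on the decoder's attention weights — can be illustrated with a minimal sketch. The paper's exact regularizer is not given in this record, so the entropy-based penalty below is an assumption chosen for illustration: it pushes each output location to attend to a single concept slot, limiting cross-concept mixing.

```python
import numpy as np

def attention_entropy_penalty(attn, eps=1e-9):
    """Hypothetical sketch of a decoder attention regularizer.

    attn: array of shape (num_outputs, num_slots); each row is an
    attention distribution over concept slots and sums to 1.
    Returns the mean per-row entropy. Minimizing this term encourages
    each output location to attend to one slot, i.e. it penalizes
    interactions between different concepts during decoding.
    """
    attn = np.clip(attn, eps, 1.0)
    return float(np.mean(-np.sum(attn * np.log(attn), axis=1)))

# One-hot attention (no cross-slot mixing) incurs a near-zero penalty;
# uniform attention incurs the maximal penalty log(num_slots).
sharp = np.eye(4)                # each output attends to exactly one slot
diffuse = np.full((4, 4), 0.25)  # each output mixes all four slots
assert attention_entropy_penalty(sharp) < attention_entropy_penalty(diffuse)
```

In a training loop, such a term would be added to the VAE objective with a weighting coefficient, alongside a capacity penalty on the latents.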

Detailed Description

Saved in:
Bibliographic Details
Main Authors: Brady, Jack, von Kügelgen, Julius, Lachapelle, Sébastien, Buchholz, Simon, Kipf, Thomas, Brendel, Wieland
Format: Article
Language: eng
Subjects:
Online Access: Order full text
description Learning disentangled representations of concepts and re-composing them in unseen ways is crucial for generalizing to out-of-domain situations. However, the underlying properties of concepts that enable such disentanglement and compositional generalization remain poorly understood. In this work, we propose the principle of interaction asymmetry which states: "Parts of the same concept have more complex interactions than parts of different concepts". We formalize this via block diagonality conditions on the $(n+1)$th order derivatives of the generator mapping concepts to observed data, where different orders of "complexity" correspond to different $n$. Using this formalism, we prove that interaction asymmetry enables both disentanglement and compositional generalization. Our results unify recent theoretical results for learning concepts of objects, which we show are recovered as special cases with $n\!=\!0$ or $1$. We provide results for up to $n\!=\!2$, thus extending these prior works to more flexible generator functions, and conjecture that the same proof strategies generalize to larger $n$. Practically, our theory suggests that, to disentangle concepts, an autoencoder should penalize its latent capacity and the interactions between concepts during decoding. We propose an implementation of these criteria using a flexible Transformer-based VAE, with a novel regularizer on the attention weights of the decoder. On synthetic image datasets consisting of objects, we provide evidence that this model can achieve comparable object disentanglement to existing models that use more explicit object-centric priors.
doi_str_mv 10.48550/arxiv.2411.07784
format Article
fulltext fulltext_linktorsrc
language eng
recordid cdi_arxiv_primary_2411_07784
source arXiv.org
subjects Computer Science - Computer Vision and Pattern Recognition
Computer Science - Learning
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-21T19%3A06%3A02IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Interaction%20Asymmetry:%20A%20General%20Principle%20for%20Learning%20Composable%20Abstractions&rft.au=Brady,%20Jack&rft.date=2024-11-12&rft_id=info:doi/10.48550/arxiv.2411.07784&rft_dat=%3Carxiv_GOX%3E2411_07784%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true