Identifying Representations for Intervention Extrapolation
Main authors: | , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Summary: | The premise of identifiable and causal representation learning is to improve
the current representation learning paradigm in terms of generalizability or
robustness. Despite recent progress in questions of identifiability, more
theoretical results demonstrating concrete advantages of these methods for
downstream tasks are needed. In this paper, we consider the task of
intervention extrapolation: predicting how interventions affect an outcome,
even when those interventions are not observed at training time, and show that
identifiable representations can provide an effective solution to this task
even if the interventions affect the outcome non-linearly. Our setup includes
an outcome Y, observed features X, which are generated as a non-linear
transformation of latent features Z, and exogenous action variables A, which
influence Z. The objective of intervention extrapolation is to predict how
interventions on A that lie outside the training support of A affect Y. Here,
extrapolation becomes possible if the effect of A on Z is linear and the
residual when regressing Z on A has full support. As Z is latent, we combine
the task of intervention extrapolation with identifiable representation
learning, which we call Rep4Ex: we aim to map the observed features X into a
subspace that allows for non-linear extrapolation in A. We show that the hidden
representation is identifiable up to an affine transformation in Z-space, which
is sufficient for intervention extrapolation. The identifiability is
characterized by a novel constraint describing the assumed linear effect of A on
Z. Based on this insight, we propose a method that enforces the linear
invariance constraint and can be combined with any type of autoencoder. We
validate our theoretical findings through synthetic experiments and show that
our approach succeeds in predicting the effects of unseen interventions. |
---|---|
DOI: | 10.48550/arxiv.2310.04295 |
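
The extrapolation argument in the summary can be illustrated with a small simulation. The sketch below is not the Rep4Ex method itself and rests on several simplifying assumptions that are not in the record: Z is scalar and observed directly (rather than learned from X with an autoencoder), there is no hidden confounding between the residual and the outcome noise, and the outcome function is fitted with a well-specified quadratic, which sidesteps the full-support condition. The function name `extrapolate` and all numeric values are hypothetical. It shows two things: a linear effect of A on Z lets training residuals be shifted to an unseen action value, and an affine transformation of the representation leaves the prediction unchanged.

```python
import numpy as np


def extrapolate(rep, A, Y, a_new, degree=2):
    """Plug-in estimate of E[Y | do(A = a_new)] from a scalar representation.

    Hypothetical helper for illustration only.
    """
    # Step 1: linear regression of the representation on A (with intercept).
    design = np.column_stack([np.ones_like(A), A])
    coef, *_ = np.linalg.lstsq(design, rep, rcond=None)
    resid = rep - design @ coef
    # Step 2: regress Y on the representation. A polynomial fit is an
    # assumption made here for simplicity, not the paper's estimator.
    f_hat = np.poly1d(np.polyfit(rep, Y, degree))
    # Step 3: shift the training residuals to the new action value and average.
    shifted = coef[0] + coef[1] * a_new + resid
    return float(np.mean(f_hat(shifted)))


rng = np.random.default_rng(0)
n = 200_000
A = rng.uniform(-1.0, 1.0, size=n)        # training support of A is [-1, 1]
V = rng.normal(0.0, 2.0, size=n)          # residual of Z given A
Z = 2.0 * A + V                           # linear effect of A on Z
Y = Z**2 + rng.normal(0.0, 0.5, size=n)   # outcome is non-linear in Z, hence in A

a_new = 3.0                               # lies outside the training support
est = extrapolate(Z, A, Y, a_new)
# An affine transformation of Z stands in for a representation that is
# identified only up to an affine map.
est_affine = extrapolate(-0.5 * Z + 3.0, A, Y, a_new)
# Under this toy model, E[(2a + V)^2] = (2a)^2 + Var(V).
truth = (2.0 * a_new) ** 2 + 2.0**2

print(est, est_affine, truth)
```

The two estimates agree up to floating-point error because quadratic functions of Z and of any affine transformation of Z span the same function class, and residuals from the linear regression transform accordingly.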