Equivariant Adaptation of Large Pretrained Models
Format: Article
Language: English
Abstract: Equivariant networks are specifically designed to ensure consistent behavior
with respect to a set of input transformations, leading to higher sample
efficiency and more accurate and robust predictions. However, redesigning each
component of prevalent deep neural network architectures to achieve chosen
equivariance is a difficult problem and can result in a computationally
expensive network during both training and inference. A recently proposed
alternative towards equivariance that removes the architectural constraints is
to use a simple canonicalization network that transforms the input to a
canonical form before feeding it to an unconstrained prediction network. We
show here that this approach can effectively be used to make a large pretrained
network equivariant. However, we observe that the produced canonical
orientations can be misaligned with those of the training distribution,
hindering performance. Using dataset-dependent priors to inform the
canonicalization function, we are able to make large pretrained models
equivariant while maintaining their performance. This significantly improves
the robustness of these models to deterministic transformations of the data,
such as rotations. We believe this equivariant adaptation of large pretrained
models can help their domain-specific applications with known symmetry priors.
DOI: 10.48550/arxiv.2310.01647
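
For illustration, the canonicalization approach described in the abstract might look roughly like the following PyTorch sketch, assuming the group of 90-degree rotations (C4) and an image classification task. This is not the authors' implementation; the names `CanonicalizationNet`, `CanonicalizedClassifier`, and `orientation_prior_loss`, as well as all architectural details, are illustrative assumptions.

```python
# Minimal sketch (not the authors' code): canonicalization-based equivariant
# adaptation of a frozen pretrained classifier for the C4 rotation group.
import torch
import torch.nn as nn
import torch.nn.functional as F


class CanonicalizationNet(nn.Module):
    """Small network that assigns a score to each 90-degree rotation of the input."""

    def __init__(self, in_channels: int = 3):
        super().__init__()
        self.score = nn.Sequential(
            nn.Conv2d(in_channels, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(16, 1),  # one score per rotated copy
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Score all four rotated copies; the argmax picks the canonical pose.
        scores = [self.score(torch.rot90(x, k, dims=(-2, -1))) for k in range(4)]
        return torch.cat(scores, dim=1)  # shape (batch, 4)


class CanonicalizedClassifier(nn.Module):
    """Wraps a frozen pretrained classifier with a canonicalization step."""

    def __init__(self, pretrained: nn.Module, canon: CanonicalizationNet):
        super().__init__()
        self.pretrained = pretrained.eval()
        for p in self.pretrained.parameters():
            p.requires_grad_(False)  # the large pretrained model stays frozen
        self.canon = canon

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        k = self.canon(x).argmax(dim=1)  # detected rotation index per sample
        # Undo the detected rotation so the frozen model sees its canonical pose;
        # class logits then become invariant to 90-degree rotations of the input.
        canonical = torch.stack(
            [torch.rot90(img, -int(ki), dims=(-2, -1)) for img, ki in zip(x, k)]
        )
        return self.pretrained(canonical)


def orientation_prior_loss(canon_scores: torch.Tensor) -> torch.Tensor:
    """Dataset-dependent prior: on training images, push the canonicalizer toward
    the identity rotation (index 0), so the canonical orientation matches the one
    the pretrained model was trained on."""
    target = torch.zeros(canon_scores.size(0), dtype=torch.long, device=canon_scores.device)
    return F.cross_entropy(canon_scores, target)
```

In this sketch the hard `argmax` blocks gradients into the canonicalizer, so in practice a differentiable selection (for example a straight-through estimator) or a separate objective such as the prior loss above would be used to train it.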