Permutation Equivariance of Transformers and Its Applications
Saved in:
Main Authors: | |
---|---|
Format: | Article |
Language: | English |
Subjects: | |
Online Access: | Order full text |
Abstract: | Revolutionizing the field of deep learning, Transformer-based models have
achieved remarkable performance in many tasks. Recent research has recognized
that these models are robust to shuffling, but only with respect to inter-token
permutation in the forward propagation. In this work, we propose our definition
of permutation equivariance, a broader concept covering both inter- and
intra-token permutation in the forward and backward propagation of neural
networks. We rigorously prove that this permutation equivariance property is
satisfied by most vanilla Transformer-based models with almost no adaptation.
We examine the property across a range of state-of-the-art models, including
ViT, BERT, and GPT, with experimental validation. Further, as a proof of
concept, we explore how real-world applications, including privacy-enhancing
split learning and model authorization, could exploit the permutation
equivariance property, which points to wider, intriguing application scenarios. |
DOI: | 10.48550/arxiv.2304.07735 |
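
The inter-token permutation equivariance mentioned in the abstract can be illustrated with a minimal sketch (not taken from the paper): a plain single-head self-attention layer without positional encodings, where permuting the input tokens permutes the output tokens in exactly the same way. The dimensions, weight names, and helper functions below are hypothetical choices made only for illustration.

```python
# Minimal sketch (assumed setup, not the paper's code): inter-token permutation
# equivariance of a single-head self-attention layer without positional encodings.
import numpy as np

rng = np.random.default_rng(0)
d = 8   # embedding dimension (hypothetical)
n = 5   # number of tokens (hypothetical)

# Random projection weights for queries, keys, and values.
W_q, W_k, W_v = (rng.standard_normal((d, d)) for _ in range(3))

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X):
    # X has shape (n, d): one row per token.
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    A = softmax(Q @ K.T / np.sqrt(d))   # (n, n) attention weights
    return A @ V                        # (n, d) output tokens

X = rng.standard_normal((n, d))
perm = rng.permutation(n)               # a random inter-token permutation

out_then_perm = self_attention(X)[perm]   # permute the output tokens
perm_then_out = self_attention(X[perm])   # permute the input tokens first

# True: shuffling the input tokens shuffles the output tokens identically.
print(np.allclose(out_then_perm, perm_then_out))
```

The check passes because, under an input permutation P, the attention matrix transforms as P A Pᵀ, so the permutations cancel against the permuted value matrix and only a row permutation of the output remains.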