Measure-to-measure interpolation using Transformers
Saved in:

Main Authors: , ,
Format: Article
Language: English
Subjects:
Online Access: Request full text
Summary: Transformers are deep neural network architectures that underpin the recent successes of large language models. Unlike more classical architectures, which can be viewed as point-to-point maps, a Transformer acts as a measure-to-measure map implemented as a specific interacting particle system on the unit sphere: the input is the empirical measure of the tokens in a prompt, and its evolution is governed by the continuity equation. In fact, Transformers are not limited to empirical measures and can in principle process any input measure. As the nature of the data processed by Transformers is expanding rapidly, it is important to investigate their expressive power as maps from an arbitrary measure to another arbitrary measure. To that end, we provide an explicit choice of parameters that allows a single Transformer to match $N$ arbitrary input measures to $N$ arbitrary target measures, under the minimal assumption that every input-target pair of measures can be matched by some transport map.
DOI: 10.48550/arxiv.2411.04551
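
For orientation, the interacting particle system and continuity equation mentioned in the summary can be written out schematically. The display below is a hedged sketch, not a formula quoted from the paper: the softmax-style weights and the inverse-temperature parameter $\beta$ are assumptions, simplified from the self-attention dynamics on the sphere that this line of work builds on. Tokens $x_1, \dots, x_n \in \mathbb{S}^{d-1}$ evolve as

$$
\dot{x}_i(t) = \mathbf{P}^{\perp}_{x_i(t)}\!\left(\frac{\sum_{j=1}^{n} e^{\beta \langle x_i(t),\, x_j(t)\rangle}\, x_j(t)}{\sum_{j=1}^{n} e^{\beta \langle x_i(t),\, x_j(t)\rangle}}\right), \qquad \mathbf{P}^{\perp}_{x} = \mathrm{Id} - x x^{\top},
$$

where the projection $\mathbf{P}^{\perp}_{x}$ onto the tangent space at $x$ keeps every token on the unit sphere. At the level of the empirical measure $\mu(t) = \frac{1}{n}\sum_{i=1}^{n} \delta_{x_i(t)}$, the same dynamics take the form of a continuity equation

$$
\partial_t \mu(t) + \operatorname{div}\!\big(\mu(t)\, \mathcal{X}[\mu(t)]\big) = 0,
$$

with a velocity field $\mathcal{X}[\mu]$ that depends on the measure itself; this formulation is meaningful for arbitrary input measures, not only empirical ones. The "minimal assumption" in the summary then reads: each input-target pair $(\mu_k, \nu_k)$, $k = 1, \dots, N$, admits some transport map $T_k$ with pushforward $(T_k)_{\#}\mu_k = \nu_k$.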