MIXING TOKENS WITH SPECTRAL TRANSFORM

Transformer systems and methods of using such transformer systems including computer programs encoded on a computer storage medium, for performing a deep learning task on an input sequence to generate an encoded output. In one aspect, one of the transformer systems includes an encoder architecture b...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Hauptverfasser: Lee-Thorp, James Patrick, Ainslie, Joshua Timothy, Ontañón, Santiago, Eckstein, Ilya
Format: Patent
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Transformer systems and methods of using such transformer systems including computer programs encoded on a computer storage medium, for performing a deep learning task on an input sequence to generate an encoded output. In one aspect, one of the transformer systems includes an encoder architecture block, comprising: a spectral transform mixing layer that receives input embeddings of input tokens and generates, as output, a spectral transform output along a sequence dimension of the input embeddings; and a feed forward layer that receives an input based on the input embeddings of input tokens and the spectral transform output and generates an output for a subsequent processing block.