Attention neural network with sparse attention mechanism

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for processing network inputs using an attention neural network having one or more sparse attention sub-layers. Each sparse attention sub-layer is configured to apply a sparse attention mechanism that...

Bibliographic details
Main authors: FAN, PHILIP, ONTAGNON SANTIAGO, ZAHIR MANZIR, DUBEY, KUMAR, AVINAVA, GURUGANESH, GURU, AINSLEY JOSHUA TIMOTHY, AHMED AMR
Format: Patent
Language: chi; eng
Description
Summary: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for processing network inputs using an attention neural network having one or more sparse attention sub-layers. Each sparse attention sub-layer is configured to apply a sparse attention mechanism that attends differently for input positions in a first proper subset of the input positions in the input to the sub-layer than for positions not in the first proper subset.
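The abstract does not say how the two groups of positions attend differently, so the following is only an illustrative sketch under one possible assumption: positions in the chosen proper subset attend to every position (and are visible to every position), while all other positions attend only within a local window. The names `allowed_keys`, `sparse_attention`, `global_positions`, and `window` are hypothetical and do not come from the patent.

```python
import math


def allowed_keys(i, n, global_positions, window):
    """Return the key positions that query position i may attend to.

    Assumption for illustration only: positions in `global_positions`
    attend everywhere; other positions see a local window of +/- `window`
    plus the global positions.
    """
    if i in global_positions:
        return list(range(n))
    return [j for j in range(n)
            if abs(i - j) <= window or j in global_positions]


def sparse_attention(queries, keys, values, global_positions, window=1):
    """Toy single-head sparse attention over lists of float vectors."""
    n = len(queries)
    global_positions = set(global_positions)
    # The subset must be a *proper* subset of all input positions.
    if not global_positions < set(range(n)):
        raise ValueError("global_positions must be a proper subset of positions")
    scale = math.sqrt(len(queries[0]))
    dim_out = len(values[0])
    outputs = []
    for i in range(n):
        allowed = allowed_keys(i, n, global_positions, window)
        # Scaled dot-product scores, computed only for the allowed keys.
        scores = [sum(q * k for q, k in zip(queries[i], keys[j])) / scale
                  for j in allowed]
        # Numerically stable softmax over the allowed keys.
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        outputs.append([
            sum(w / total * values[j][d] for w, j in zip(weights, allowed))
            for d in range(dim_out)
        ])
    return outputs
```

With six positions, `global_positions={0}`, and `window=1`, position 4 attends only to positions 0, 3, 4, and 5, whereas position 0 attends to all six. Under this assumed pattern the number of scored query-key pairs grows roughly linearly with sequence length rather than quadratically.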