Attention neural network with sparse attention mechanism
Main authors: | , , , , , , |
---|---|
Format: | Patent |
Language: | Chinese ; English |
Summary: | Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for processing network inputs using an attention neural network having one or more sparse attention sub-layers. Each sparse attention sub-layer is configured to apply a sparse attention mechanism that attends differently for input positions in a first proper subset of the input positions in the input to the sub-layer than for positions not in the first proper subset. |
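The summary states only that positions in a proper subset are attended differently from the remaining positions; it does not say how. As a rough illustration, the sketch below assumes one possible scheme: positions in the subset attend to, and are visible to, every position, while all other positions attend only within a local window. The function names `build_sparse_mask` and `sparse_attention`, the `window` parameter, and the windowing rule itself are hypothetical and are not taken from the patent record.

```python
import math


def build_sparse_mask(seq_len, global_positions, window):
    """Return mask[i][j], True when query position i may attend to key position j.

    Assumption (not from the record): positions in `global_positions` attend
    everywhere and are visible to everyone; other positions see a local window.
    """
    global_set = set(global_positions)
    if len(global_set) >= seq_len:
        raise ValueError("global_positions must be a proper subset of the positions")
    mask = []
    for i in range(seq_len):
        row = []
        for j in range(seq_len):
            if i in global_set or j in global_set:
                row.append(True)                  # full attention for the subset
            else:
                row.append(abs(i - j) <= window)  # local window for the rest
        mask.append(row)
    return mask


def sparse_attention(queries, keys, values, mask):
    """Scaled dot-product attention restricted to the pairs allowed by `mask`."""
    dim = len(queries[0])
    outputs = []
    for i, query in enumerate(queries):
        allowed = [j for j in range(len(keys)) if mask[i][j]]
        scores = [
            sum(a * b for a, b in zip(query, keys[j])) / math.sqrt(dim)
            for j in allowed
        ]
        top = max(scores)  # subtract the maximum for a numerically stable softmax
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out = [0.0] * len(values[0])
        for weight, j in zip(weights, allowed):
            for d, value in enumerate(values[j]):
                out[d] += (weight / total) * value
        outputs.append(out)
    return outputs


# Six positions; position 0 is the hypothetical "first proper subset".
mask = build_sparse_mask(seq_len=6, global_positions=[0], window=1)
vectors = [[float(i == j) for j in range(6)] for i in range(6)]  # one-hot rows
attended = sparse_attention(vectors, vectors, vectors, mask)
```

With one-hot values, each row of `attended` is simply the attention distribution for that query, so masked-out positions show up as exact zeros. Under this assumed scheme the number of scored pairs grows roughly linearly with sequence length for a fixed window and subset size, rather than quadratically.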