ATTENTION NEURAL NETWORKS WITH SPARSE ATTENTION MECHANISMS

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing network inputs using an attention neural network that has one or more sparse attention sub-layers. Each sparse attention sub-layer is configured to apply a sparse attention mechanism that a...
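
The abstract describes a sub-layer that attends differently for positions inside a proper subset of the input positions than for those outside it. The record does not specify the actual attention patterns, so the following is only a minimal sketch under assumed ones: positions in the subset attend to every position, while all other positions attend to a local window plus the subset. The names `sparse_attention_mask`, `sparse_attention`, `global_positions`, and `window` are hypothetical and do not come from the patent.

```python
import numpy as np

def sparse_attention_mask(seq_len, global_positions, window=1):
    """Return a boolean mask; mask[i, j] is True when query i may attend to key j.

    Assumed pattern (not taken from the patent text):
      - positions in `global_positions` attend to all positions;
      - all other positions attend to a local window and to the global positions.
    """
    subset = set(global_positions)
    # A proper subset must leave at least one position outside it.
    if not subset or len(subset) >= seq_len:
        raise ValueError("global_positions must be a non-empty proper subset")
    idx = np.arange(seq_len)
    # Local window for every position.
    mask = np.abs(idx[:, None] - idx[None, :]) <= window
    # Every position may attend to the subset positions.
    mask[:, sorted(subset)] = True
    # Subset positions may attend to every position.
    mask[sorted(subset), :] = True
    return mask

def sparse_attention(q, k, v, mask):
    """Scaled dot-product attention restricted to the pairs allowed by `mask`."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = np.where(mask, scores, -np.inf)      # disallowed pairs get zero weight
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

rng = np.random.default_rng(0)
seq_len, d = 8, 4
x = rng.standard_normal((seq_len, d))
mask = sparse_attention_mask(seq_len, global_positions=[0], window=1)
out, weights = sparse_attention(x, x, x, mask)
```

This sketch still computes the dense score matrix and only masks it, so it illustrates the attention pattern rather than the memory or compute savings a real sparse implementation would aim for.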

Bibliographic details
Main authors:	GURUGANESH, Guru; ONTAÑÓN, Santiago; ZAHEER, Manzil; AINSLIE, Joshua Timothy; PHAM, Philip; AHMED, Amr; DUBEY, Kumar Avinava
Format:	Patent
Language:	eng ; fre
Subjects:	CALCULATING; COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS; COMPUTING; COUNTING; PHYSICS

creator	GURUGANESH, Guru ; ONTAÑÓN, Santiago ; ZAHEER, Manzil ; AINSLIE, Joshua Timothy ; PHAM, Philip ; AHMED, Amr ; DUBEY, Kumar Avinava
description	Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing network inputs using an attention neural network that has one or more sparse attention sub-layers. Each sparse attention sub-layer is configured to apply a sparse attention mechanism that attends differently for input positions that are in a first proper subset of the input positions in the input to the sub-layer than for positions that are not in the first proper subset. L'invention concerne des procédés, des systèmes et un appareil, y compris des programmes informatiques codés sur un support de stockage informatique, permettant de traiter des entrées de réseau en utilisant un réseau d'attention éparse qui comprend une ou plusieurs sous-couches d'attention éparse. Chaque sous-couche d'attention éparse est configurée pour appliquer un mécanisme d'attention éparse qui se présente différemment pour des positions d'entrée qui sont dans un premier sous-ensemble approprié des positions d'entrée dans l'entrée à la sous-couche que pour des positions qui ne sont pas dans le premier sous-ensemble approprié.
format	Patent
fullrecord	<record><control><sourceid>epo_EVB</sourceid><recordid>TN_cdi_epo_espacenet_WO2021248139A1</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>WO2021248139A1</sourcerecordid><originalsourceid>FETCH-epo_espacenet_WO2021248139A13</originalsourceid><addsrcrecordid>eNrjZLByDAlx9Qvx9PdT8HMNDXL0AVIh4f5B3sEK4Z4hHgrBAY5Bwa4KCFW-rs4ejn6ewb7BPAysaYk5xam8UJqbQdnNNcTZQze1ID8-tbggMTk1L7UkPtzfyMDI0MjEwtDY0tHQmDhVAJdoKnM</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>patent</recordtype></control><display><type>patent</type><title>ATTENTION NEURAL NETWORKS WITH SPARSE ATTENTION MECHANISMS</title><source>esp@cenet</source><creator>GURUGANESH, Guru ; ONTAÑÓN, Santiago ; ZAHEER, Manzil ; AINSLIE, Joshua Timothy ; PHAM, Philip ; AHMED, Amr ; DUBEY, Kumar Avinava</creator><creatorcontrib>GURUGANESH, Guru ; ONTAÑÓN, Santiago ; ZAHEER, Manzil ; AINSLIE, Joshua Timothy ; PHAM, Philip ; AHMED, Amr ; DUBEY, Kumar Avinava</creatorcontrib><description>Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing network inputs using an attention neural network that has one or more sparse attention sub-layers. Each sparse attention sub-layer is configured to apply a sparse attention mechanism that attends differently for input positions that are in a first proper subset of the input positions in the input to the sub-layer than for positions that are not in the first proper subset. L'invention concerne des procédés, des systèmes et un appareil, y compris des programmes informatiques codés sur un support de stockage informatique, permettant de traiter des entrées de réseau en utilisant un réseau d'attention éparse qui comprend une ou plusieurs sous-couches d'attention éparse. 
Chaque sous-couche d'attention éparse est configurée pour appliquer un mécanisme d'attention éparse qui se présente différemment pour des positions d'entrée qui sont dans un premier sous-ensemble approprié des positions d'entrée dans l'entrée à la sous-couche que pour des positions qui ne sont pas dans le premier sous-ensemble approprié.</description><language>eng ; fre</language><subject>CALCULATING ; COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS ; COMPUTING ; COUNTING ; PHYSICS</subject><creationdate>2021</creationdate><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20211209&DB=EPODOC&CC=WO&NR=2021248139A1$$EHTML$$P50$$Gepo$$Hfree_for_read</linktohtml><link.rule.ids>230,308,776,881,25544,76293</link.rule.ids><linktorsrc>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20211209&DB=EPODOC&CC=WO&NR=2021248139A1$$EView_record_in_European_Patent_Office$$FView_record_in_$$GEuropean_Patent_Office$$Hfree_for_read</linktorsrc></links><search><creatorcontrib>GURUGANESH, Guru</creatorcontrib><creatorcontrib>ONTAÑÓN, Santiago</creatorcontrib><creatorcontrib>ZAHEER, Manzil</creatorcontrib><creatorcontrib>AINSLIE, Joshua Timothy</creatorcontrib><creatorcontrib>PHAM, Philip</creatorcontrib><creatorcontrib>AHMED, Amr</creatorcontrib><creatorcontrib>DUBEY, Kumar Avinava</creatorcontrib><title>ATTENTION NEURAL NETWORKS WITH SPARSE ATTENTION MECHANISMS</title><description>Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing network inputs using an attention neural network that has one or more sparse attention sub-layers. 
Each sparse attention sub-layer is configured to apply a sparse attention mechanism that attends differently for input positions that are in a first proper subset of the input positions in the input to the sub-layer than for positions that are not in the first proper subset. L'invention concerne des procédés, des systèmes et un appareil, y compris des programmes informatiques codés sur un support de stockage informatique, permettant de traiter des entrées de réseau en utilisant un réseau d'attention éparse qui comprend une ou plusieurs sous-couches d'attention éparse. Chaque sous-couche d'attention éparse est configurée pour appliquer un mécanisme d'attention éparse qui se présente différemment pour des positions d'entrée qui sont dans un premier sous-ensemble approprié des positions d'entrée dans l'entrée à la sous-couche que pour des positions qui ne sont pas dans le premier sous-ensemble approprié.</description><subject>CALCULATING</subject><subject>COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS</subject><subject>COMPUTING</subject><subject>COUNTING</subject><subject>PHYSICS</subject><fulltext>true</fulltext><rsrctype>patent</rsrctype><creationdate>2021</creationdate><recordtype>patent</recordtype><sourceid>EVB</sourceid><recordid>eNrjZLByDAlx9Qvx9PdT8HMNDXL0AVIh4f5B3sEK4Z4hHgrBAY5Bwa4KCFW-rs4ejn6ewb7BPAysaYk5xam8UJqbQdnNNcTZQze1ID8-tbggMTk1L7UkPtzfyMDI0MjEwtDY0tHQmDhVAJdoKnM</recordid><startdate>20211209</startdate><enddate>20211209</enddate><creator>GURUGANESH, Guru</creator><creator>ONTAÑÓN, Santiago</creator><creator>ZAHEER, Manzil</creator><creator>AINSLIE, Joshua Timothy</creator><creator>PHAM, Philip</creator><creator>AHMED, Amr</creator><creator>DUBEY, Kumar Avinava</creator><scope>EVB</scope></search><sort><creationdate>20211209</creationdate><title>ATTENTION NEURAL NETWORKS WITH SPARSE ATTENTION MECHANISMS</title><author>GURUGANESH, Guru ; ONTAÑÓN, Santiago ; ZAHEER, Manzil ; AINSLIE, Joshua Timothy ; PHAM, Philip ; AHMED, Amr ; DUBEY, Kumar 
Avinava</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-epo_espacenet_WO2021248139A13</frbrgroupid><rsrctype>patents</rsrctype><prefilter>patents</prefilter><language>eng ; fre</language><creationdate>2021</creationdate><topic>CALCULATING</topic><topic>COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS</topic><topic>COMPUTING</topic><topic>COUNTING</topic><topic>PHYSICS</topic><toplevel>online_resources</toplevel><creatorcontrib>GURUGANESH, Guru</creatorcontrib><creatorcontrib>ONTAÑÓN, Santiago</creatorcontrib><creatorcontrib>ZAHEER, Manzil</creatorcontrib><creatorcontrib>AINSLIE, Joshua Timothy</creatorcontrib><creatorcontrib>PHAM, Philip</creatorcontrib><creatorcontrib>AHMED, Amr</creatorcontrib><creatorcontrib>DUBEY, Kumar Avinava</creatorcontrib><collection>esp@cenet</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>GURUGANESH, Guru</au><au>ONTAÑÓN, Santiago</au><au>ZAHEER, Manzil</au><au>AINSLIE, Joshua Timothy</au><au>PHAM, Philip</au><au>AHMED, Amr</au><au>DUBEY, Kumar Avinava</au><format>patent</format><genre>patent</genre><ristype>GEN</ristype><title>ATTENTION NEURAL NETWORKS WITH SPARSE ATTENTION MECHANISMS</title><date>2021-12-09</date><risdate>2021</risdate><abstract>Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing network inputs using an attention neural network that has one or more sparse attention sub-layers. Each sparse attention sub-layer is configured to apply a sparse attention mechanism that attends differently for input positions that are in a first proper subset of the input positions in the input to the sub-layer than for positions that are not in the first proper subset. 
L'invention concerne des procédés, des systèmes et un appareil, y compris des programmes informatiques codés sur un support de stockage informatique, permettant de traiter des entrées de réseau en utilisant un réseau d'attention éparse qui comprend une ou plusieurs sous-couches d'attention éparse. Chaque sous-couche d'attention éparse est configurée pour appliquer un mécanisme d'attention éparse qui se présente différemment pour des positions d'entrée qui sont dans un premier sous-ensemble approprié des positions d'entrée dans l'entrée à la sous-couche que pour des positions qui ne sont pas dans le premier sous-ensemble approprié.</abstract><oa>free_for_read</oa></addata></record>
fulltext	fulltext_linktorsrc
language	eng ; fre
recordid	cdi_epo_espacenet_WO2021248139A1
source	esp@cenet
subjects	CALCULATING ; COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS ; COMPUTING ; COUNTING ; PHYSICS
title	ATTENTION NEURAL NETWORKS WITH SPARSE ATTENTION MECHANISMS
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-27T18%3A17%3A50IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-epo_EVB&rft_val_fmt=info:ofi/fmt:kev:mtx:patent&rft.genre=patent&rft.au=GURUGANESH,%20Guru&rft.date=2021-12-09&rft_id=info:doi/&rft_dat=%3Cepo_EVB%3EWO2021248139A1%3C/epo_EVB%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true