Improving Multi-Head Attention with Capsule Networks
Multi-head attention advances neural machine translation by computing multiple versions of attention in different subspaces, but it neglects the semantic overlap between these subspaces, which increases the difficulty of translation and hinders further improvement of translation performance. In this paper, we employ capsule networks to comb the information from the multiple attention heads so that similar information can be clustered and unique information can be preserved. To this end, we adopt two routing mechanisms, Dynamic Routing and EM Routing, to perform the clustering and separation. We conducted experiments on Chinese-to-English and English-to-German translation tasks and obtained consistent improvements over a strong Transformer baseline.
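The abstract names two routing mechanisms for combing the head outputs. Below are minimal NumPy sketches of each, assuming the per-head attention outputs have already been computed for a single position. The shapes, function names, the parameter `W`, the random responsibility initialization, and the omission of the pose matrices and activation terms of full EM Routing are all illustrative assumptions, not the authors' released implementation.

```python
# Sketch 1: Dynamic Routing (Sabour et al., 2017) over per-head outputs.
# All names and shapes here are illustrative assumptions.
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    """Scale each vector's norm into [0, 1) while keeping its direction."""
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

def dynamic_routing(head_outputs, W, num_iters=3):
    """Route n_heads input capsules (per-head attention vectors) to n_out
    output capsules by iterative agreement.

    head_outputs: (n_heads, d_in)               one vector per attention head
    W:            (n_heads, n_out, d_out, d_in) transformation matrices
                  (hypothetical parameter; learned in a real model)
    """
    # Prediction vectors: u_hat[i, j] = W[i, j] @ head_outputs[i]
    u_hat = np.einsum('ijkl,il->ijk', W, head_outputs)        # (n_heads, n_out, d_out)
    b = np.zeros(W.shape[:2])                                 # routing logits (n_heads, n_out)
    for _ in range(num_iters):
        c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)  # softmax over output capsules
        s = np.einsum('ij,ijk->jk', c, u_hat)                 # weighted sum per output capsule
        v = squash(s)                                         # (n_out, d_out)
        b = b + np.einsum('ijk,jk->ij', u_hat, v)             # reward agreement
    return v

# Toy usage: 8 heads of dim 64 routed into 4 output capsules of dim 64.
rng = np.random.default_rng(0)
heads = rng.normal(size=(8, 64))
W = rng.normal(scale=0.1, size=(8, 4, 64, 64))
print(dynamic_routing(heads, W).shape)  # (4, 64)
```

EM Routing replaces the agreement update with expectation-maximization over Gaussian clusters; the sketch below strips the full algorithm (Hinton et al., 2018) down to a plain diagonal Gaussian mixture over the head vectors, with randomly initialized responsibilities standing in for the lower layer's routing.

```python
def em_routing(head_outputs, n_out=4, num_iters=3, seed=0, eps=1e-8):
    """Heavily simplified EM Routing: cluster the per-head vectors into
    n_out diagonal Gaussians. Omits the pose matrices and the
    activation/cost terms of the full algorithm."""
    n, d = head_outputs.shape
    rng = np.random.default_rng(seed)
    r = rng.random((n, n_out))
    r = r / r.sum(axis=1, keepdims=True)          # random init breaks symmetry
    for _ in range(num_iters):
        # M-step: cluster means and variances from current responsibilities.
        rs = r.sum(axis=0) + eps                  # (n_out,)
        mu = (r.T @ head_outputs) / rs[:, None]   # (n_out, d)
        var = (r.T @ head_outputs ** 2) / rs[:, None] - mu ** 2 + eps
        # E-step: responsibilities from per-cluster Gaussian log-likelihood.
        log_p = -0.5 * (((head_outputs[:, None, :] - mu) ** 2 / var)
                        + np.log(2 * np.pi * var)).sum(axis=2)   # (n, n_out)
        log_p -= log_p.max(axis=1, keepdims=True)                # numerical stability
        p = np.exp(log_p)
        r = p / p.sum(axis=1, keepdims=True)
    return mu, r  # cluster means act as output capsules

mu, r = em_routing(heads)  # reuses `heads` from the sketch above
print(mu.shape, r.shape)   # (4, 64) (8, 4)
```

Presumably, in the full model the routed capsules would feed into the rest of the Transformer in place of the plain concatenation of heads; they are shown standalone here only to make the cluster-and-preserve behaviour concrete.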
Saved in:

| Main Authors: | Gu, Shuhao; Feng, Yang |
|---|---|
| Format: | Article |
| Language: | eng |
| Subjects: | Computer Science - Computation and Language |
| Online Access: | Order full text |
| container_end_page | |
|---|---|
| container_issue | |
| container_start_page | |
| container_title | |
| container_volume | |
| creator | Gu, Shuhao; Feng, Yang |
| description | Multi-head attention advances neural machine translation by computing multiple versions of attention in different subspaces, but it neglects the semantic overlap between these subspaces, which increases the difficulty of translation and hinders further improvement of translation performance. In this paper, we employ capsule networks to comb the information from the multiple attention heads so that similar information can be clustered and unique information can be preserved. To this end, we adopt two routing mechanisms, Dynamic Routing and EM Routing, to perform the clustering and separation. We conducted experiments on Chinese-to-English and English-to-German translation tasks and obtained consistent improvements over a strong Transformer baseline. |
| doi_str_mv | 10.48550/arxiv.1909.00188 |
| format | Article |
| fulltext | fulltext_linktorsrc |
| identifier | DOI: 10.48550/arxiv.1909.00188 |
| ispartof | |
| issn | |
| language | eng |
| recordid | cdi_arxiv_primary_1909_00188 |
| source | arXiv.org |
| subjects | Computer Science - Computation and Language |
| title | Improving Multi-Head Attention with Capsule Networks |
| url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-22T11%3A49%3A52IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Improving%20Multi-Head%20Attention%20with%20Capsule%20Networks&rft.au=Gu,%20Shuhao&rft.date=2019-08-31&rft_id=info:doi/10.48550/arxiv.1909.00188&rft_dat=%3Carxiv_GOX%3E1909_00188%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |