Dynamic Multi-Branch Layers for On-Device Neural Machine Translation

With the rapid development of artificial intelligence (AI), there is a trend in moving AI applications, such as neural machine translation (NMT), from cloud to mobile devices. Constrained by limited hardware resources and battery, the performance of on-device NMT systems is far from satisfactory. Inspired by conditional computation, we propose to improve the performance of on-device NMT systems with dynamic multi-branch layers. Specifically, we design a layer-wise dynamic multi-branch network with only one branch activated during training and inference. As not all branches are activated during training, we propose shared-private reparameterization to ensure sufficient training for each branch. At almost the same computational cost, our method achieves improvements over the Transformer model of up to 1.7 BLEU points on the WMT14 English-German translation task and 1.8 BLEU points on the WMT20 Chinese-English translation task. Compared with a strong baseline that also uses multiple branches, the proposed method is up to 1.5 times faster with the same number of parameters.

Detailed Description

Saved in:
Bibliographic Details
Published in: IEEE/ACM transactions on audio, speech, and language processing, 2022, Vol.30, p.958-967
Main Authors: Tan, Zhixing, Yang, Zeyuan, Zhang, Meng, Liu, Qun, Sun, Maosong, Liu, Yang
Format: Article
Language: eng
Subjects:
Online Access: Order full text
container_end_page 967
container_issue
container_start_page 958
container_title IEEE/ACM transactions on audio, speech, and language processing
container_volume 30
creator Tan, Zhixing
Yang, Zeyuan
Zhang, Meng
Liu, Qun
Sun, Maosong
Liu, Yang
description With the rapid development of artificial intelligence (AI), there is a trend in moving AI applications, such as neural machine translation (NMT), from cloud to mobile devices. Constrained by limited hardware resources and battery, the performance of on-device NMT systems is far from satisfactory. Inspired by conditional computation, we propose to improve the performance of on-device NMT systems with dynamic multi-branch layers. Specifically, we design a layer-wise dynamic multi-branch network with only one branch activated during training and inference. As not all branches are activated during training, we propose shared-private reparameterization to ensure sufficient training for each branch. At almost the same computational cost, our method achieves improvements over the Transformer model of up to 1.7 BLEU points on the WMT14 English-German translation task and 1.8 BLEU points on the WMT20 Chinese-English translation task. Compared with a strong baseline that also uses multiple branches, the proposed method is up to 1.5 times faster with the same number of parameters.
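To make the mechanism in the description concrete, below is a minimal PyTorch sketch of a dynamic multi-branch feed-forward layer with shared-private reparameterization. It is an illustration only, not the authors' implementation: the module name, the per-sequence argmax gating rule, and all parameter names are assumptions made for this example.

```python
# Hypothetical sketch of the ideas in the abstract; NOT the authors' code.
import torch
import torch.nn as nn

class DynamicMultiBranchFFN(nn.Module):
    """Feed-forward sublayer with several branches, only one of which is
    executed per input, so compute stays close to a single-branch layer."""

    def __init__(self, d_model: int, d_ff: int, num_branches: int):
        super().__init__()
        # Shared-private reparameterization: each branch's effective weight
        # is a shared weight plus a branch-private residual, so the shared
        # part is trained on every example even though only one branch is
        # activated at a time.
        self.w1_shared = nn.Parameter(torch.randn(d_ff, d_model) * 0.02)
        self.w1_private = nn.Parameter(torch.randn(num_branches, d_ff, d_model) * 0.02)
        self.w2_shared = nn.Parameter(torch.randn(d_model, d_ff) * 0.02)
        self.w2_private = nn.Parameter(torch.randn(num_branches, d_model, d_ff) * 0.02)
        self.gate = nn.Linear(d_model, num_branches)  # scores the branches

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model). This sketch picks one branch per
        # sequence from mean-pooled features; a real system might select
        # per token and use a straight-through or Gumbel trick so the gate
        # still receives gradients despite the hard argmax.
        scores = self.gate(x.mean(dim=1))              # (batch, num_branches)
        branch = scores.argmax(dim=-1)                 # (batch,)
        w1 = self.w1_shared + self.w1_private[branch]  # (batch, d_ff, d_model)
        w2 = self.w2_shared + self.w2_private[branch]  # (batch, d_model, d_ff)
        hidden = torch.relu(torch.einsum("bsd,bfd->bsf", x, w1))
        return torch.einsum("bsf,bdf->bsd", hidden, w2)

# Usage: a drop-in replacement for a Transformer FFN sublayer.
layer = DynamicMultiBranchFFN(d_model=512, d_ff=2048, num_branches=4)
out = layer(torch.randn(8, 20, 512))  # -> (8, 20, 512)
```

Because each branch's effective weight is the shared weight plus a private residual, the shared component accumulates gradients from every example even when only one branch fires, which is the stated purpose of the shared-private reparameterization: ensuring sufficient training for each branch.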
doi_str_mv 10.1109/TASLP.2022.3153257
format Article
fulltext fulltext_linktorsrc
identifier ISSN: 2329-9290
ispartof IEEE/ACM transactions on audio, speech, and language processing, 2022, Vol.30, p.958-967
issn 2329-9290
2329-9304
language eng
recordid cdi_ieee_primary_9729651
source IEEE Electronic Library (IEL)
subjects Artificial intelligence
Conditional computation
decoding
Electronic devices
Hardware
Machine translation
Mobile handsets
natural language processing
Performance enhancement
Performance evaluation
Training
Transformers
Translations
title Dynamic Multi-Branch Layers for On-Device Neural Machine Translation
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-17T12%3A36%3A16IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Dynamic%20Multi-Branch%20Layers%20for%20On-Device%20Neural%20Machine%20Translation&rft.jtitle=IEEE/ACM%20transactions%20on%20audio,%20speech,%20and%20language%20processing&rft.au=Tan,%20Zhixing&rft.date=2022&rft.volume=30&rft.spage=958&rft.epage=967&rft.pages=958-967&rft.issn=2329-9290&rft.eissn=2329-9304&rft.coden=ITASFA&rft_id=info:doi/10.1109/TASLP.2022.3153257&rft_dat=%3Cproquest_RIE%3E2637438737%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2637438737&rft_id=info:pmid/&rft_ieee_id=9729651&rfr_iscdi=true