Variational Neural Machine Translation with Normalizing Flows

creator Setiawan, Hendra; Sperber, Matthias; Nallasamy, Udhay; Paulik, Matthias
description Variational Neural Machine Translation (VNMT) is an attractive framework for modeling the generation of target translations, conditioned not only on the source sentence but also on some latent random variables. The latent variable modeling may introduce useful statistical dependencies that can improve translation accuracy. Unfortunately, learning informative latent variables is non-trivial, as the latent space can be prohibitively large, and the latent codes are prone to be ignored by many translation models at training time. Previous works impose strong assumptions on the distribution of the latent code and limit the choice of the NMT architecture. In this paper, we propose to apply the VNMT framework to the state-of-the-art Transformer and introduce a more flexible approximate posterior based on normalizing flows. We demonstrate the efficacy of our proposal under both in-domain and out-of-domain conditions, significantly outperforming strong baselines.
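The description above states that the approximate posterior over the latent code is made more flexible with normalizing flows. As a minimal, hypothetical sketch only (not the authors' implementation; all class, function, and parameter names below are invented for illustration), the following PyTorch snippet shows one common way such a posterior can be built: a diagonal-Gaussian base distribution computed from a pooled sentence encoding, reparameterized sampling, and a stack of planar flows (Rezende and Mohamed, 2015) whose log-determinants are accumulated for the variational objective. The paper itself may use a different flow family or conditioning scheme.

import torch
import torch.nn as nn
import torch.nn.functional as F


class PlanarFlow(nn.Module):
    # One planar transform f(z) = z + u * tanh(w^T z + b); its log-determinant
    # is log|1 + psi(z)^T u| with psi(z) = (1 - tanh^2(w^T z + b)) * w.
    def __init__(self, dim):
        super().__init__()
        self.u = nn.Parameter(torch.randn(dim) * 0.01)
        self.w = nn.Parameter(torch.randn(dim) * 0.01)
        self.b = nn.Parameter(torch.zeros(1))

    def forward(self, z):
        # Re-parameterize u so that w^T u >= -1, keeping the transform invertible.
        wu = torch.dot(self.w, self.u)
        u_hat = self.u + (F.softplus(wu) - 1.0 - wu) * self.w / (self.w.norm() ** 2 + 1e-8)
        lin = z @ self.w + self.b                          # shape: (batch,)
        f_z = z + torch.tanh(lin).unsqueeze(-1) * u_hat
        psi = (1.0 - torch.tanh(lin) ** 2).unsqueeze(-1) * self.w
        log_det = torch.log(torch.abs(1.0 + psi @ u_hat) + 1e-8)
        return f_z, log_det


class FlowPosterior(nn.Module):
    # Hypothetical posterior network: a pooled sentence encoding h is mapped to a
    # diagonal Gaussian, a sample is drawn via the reparameterization trick, and
    # K planar flows transform it into a more flexible latent code.
    def __init__(self, enc_dim, latent_dim, num_flows=4):
        super().__init__()
        self.mu = nn.Linear(enc_dim, latent_dim)
        self.log_var = nn.Linear(enc_dim, latent_dim)
        self.flows = nn.ModuleList(PlanarFlow(latent_dim) for _ in range(num_flows))

    def forward(self, h):
        mu, log_var = self.mu(h), self.log_var(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * log_var)   # z_0 ~ q_0(z | h)
        sum_log_det = z.new_zeros(z.size(0))
        for flow in self.flows:
            z, log_det = flow(z)
            sum_log_det = sum_log_det + log_det
        # z would feed the decoder; mu, log_var and sum_log_det enter the KL term of the ELBO.
        return z, mu, log_var, sum_log_det


if __name__ == "__main__":
    posterior = FlowPosterior(enc_dim=512, latent_dim=64)
    h = torch.randn(8, 512)                                # stand-in for pooled encoder states
    z, mu, log_var, log_det = posterior(h)
    print(z.shape, log_det.shape)                          # torch.Size([8, 64]) torch.Size([8])

Planar flows are used here only because they are the simplest flow to write down; the same wrapper works with any invertible transform that returns a sample together with its log-determinant.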
doi_str_mv 10.48550/arxiv.2005.13978
format Article
creationdate 2020-05-28
rights http://arxiv.org/licenses/nonexclusive-distrib/1.0
oa free_for_read
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.2005.13978
language eng
recordid cdi_arxiv_primary_2005_13978
source arXiv.org
subjects Computer Science - Computation and Language
title Variational Neural Machine Translation with Normalizing Flows
url https://arxiv.org/abs/2005.13978