Variational Neural Machine Translation with Normalizing Flows
Variational Neural Machine Translation (VNMT) is an attractive framework for modeling the generation of target translations, conditioned not only on the source sentence but also on some latent random variables. The latent variable modeling may introduce useful statistical dependencies that can improve translation accuracy. Unfortunately, learning informative latent variables is non-trivial, as the latent space can be prohibitively large, and the latent codes are prone to be ignored by many translation models at training time. Previous works impose strong assumptions on the distribution of the latent code and limit the choice of the NMT architecture. In this paper, we propose to apply the VNMT framework to the state-of-the-art Transformer and introduce a more flexible approximate posterior based on normalizing flows. We demonstrate the efficacy of our proposal under both in-domain and out-of-domain conditions, significantly outperforming strong baselines.
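The abstract describes replacing the usual diagonal-Gaussian approximate posterior in VNMT with a more flexible one built from normalizing flows. The following is a minimal, illustrative sketch of that idea in Python/PyTorch, assuming a stack of planar flows (Rezende & Mohamed, 2015) over a Gaussian base distribution; the class names, dimensions, and the pooled encoder input are hypothetical and are not taken from the paper's implementation.

```python
# Minimal sketch of a normalizing-flow approximate posterior q(z | x, y),
# assuming planar flows over a diagonal-Gaussian base.
# Illustrative only; not the authors' code or architecture.
import torch
import torch.nn as nn


class PlanarFlow(nn.Module):
    """One planar transform f(z) = z + u * tanh(w^T z + b)."""

    def __init__(self, dim: int):
        super().__init__()
        self.u = nn.Parameter(torch.randn(dim) * 0.01)
        self.w = nn.Parameter(torch.randn(dim) * 0.01)
        self.b = nn.Parameter(torch.zeros(1))

    def forward(self, z: torch.Tensor):
        # z: [batch, dim]
        pre = z @ self.w + self.b                                  # [batch]
        f_z = z + self.u * torch.tanh(pre).unsqueeze(-1)           # [batch, dim]
        # log |det df/dz| = log |1 + u^T psi(z)|, psi(z) = (1 - tanh^2(pre)) w
        psi = (1.0 - torch.tanh(pre) ** 2).unsqueeze(-1) * self.w  # [batch, dim]
        log_det = torch.log(torch.abs(1.0 + psi @ self.u) + 1e-8)  # [batch]
        return f_z, log_det


class FlowPosterior(nn.Module):
    """Gaussian base whose samples are pushed through K planar flows."""

    def __init__(self, enc_dim: int, latent_dim: int, num_flows: int = 4):
        super().__init__()
        self.mu = nn.Linear(enc_dim, latent_dim)
        self.log_var = nn.Linear(enc_dim, latent_dim)
        self.flows = nn.ModuleList(
            [PlanarFlow(latent_dim) for _ in range(num_flows)]
        )

    def forward(self, h: torch.Tensor):
        # h: pooled sentence encoding of the (source, target) pair, [batch, enc_dim]
        mu, log_var = self.mu(h), self.log_var(h)
        std = torch.exp(0.5 * log_var)
        z = mu + std * torch.randn_like(std)       # reparameterized base sample z_0
        sum_log_det = z.new_zeros(z.size(0))
        for flow in self.flows:
            z, log_det = flow(z)
            sum_log_det = sum_log_det + log_det
        # In the ELBO, log q(z_K) = log q(z_0) - sum_log_det; the decoder is
        # conditioned on z_K in addition to the source sentence.
        return z, mu, log_var, sum_log_det
```

In a training objective of this shape, the flows' accumulated log-determinant corrects the KL term of the usual variational lower bound; constraints that keep each planar transform invertible (e.g. u^T w >= -1) are omitted here for brevity.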
Saved in:
Main Authors: | Setiawan, Hendra; Sperber, Matthias; Nallasamy, Udhay; Paulik, Matthias |
---|---|
Format: | Article |
Language: | eng |
Subjects: | Computer Science - Computation and Language |
Online Access: | Order full text |
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | Setiawan, Hendra; Sperber, Matthias; Nallasamy, Udhay; Paulik, Matthias |
description | Variational Neural Machine Translation (VNMT) is an attractive framework for
modeling the generation of target translations, conditioned not only on the
source sentence but also on some latent random variables. The latent variable
modeling may introduce useful statistical dependencies that can improve
translation accuracy. Unfortunately, learning informative latent variables is
non-trivial, as the latent space can be prohibitively large, and the latent
codes are prone to be ignored by many translation models at training time.
Previous works impose strong assumptions on the distribution of the latent code
and limit the choice of the NMT architecture. In this paper, we propose to
apply the VNMT framework to the state-of-the-art Transformer and introduce a
more flexible approximate posterior based on normalizing flows. We demonstrate
the efficacy of our proposal under both in-domain and out-of-domain conditions,
significantly outperforming strong baselines. |
doi_str_mv | 10.48550/arxiv.2005.13978 |
format | Article |
fullrecord | arXiv record cdi_arxiv_primary_2005_13978 (Open Access Repository); date: 2020-05-28; rights: http://arxiv.org/licenses/nonexclusive-distrib/1.0; full text: https://arxiv.org/abs/2005.13978 |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.2005.13978 |
ispartof | |
issn | |
language | eng |
recordid | cdi_arxiv_primary_2005_13978 |
source | arXiv.org |
subjects | Computer Science - Computation and Language |
title | Variational Neural Machine Translation with Normalizing Flows |
url | https://arxiv.org/abs/2005.13978 |