A Case Study on Context-Aware Neural Machine Translation with Multi-Task Learning
In document-level neural machine translation (DocNMT), multi-encoder approaches are common for encoding the context and source sentences. Recent studies \cite{li-etal-2020-multi-encoder} have shown that the context encoder generates noise and makes the model robust to the choice of context. This paper further investigates this observation by explicitly modelling context encoding through multi-task learning (MTL) to make the model sensitive to the choice of context. We conduct experiments on a cascade MTL architecture, which consists of one encoder and two decoders. Generation of the source from the context is treated as an auxiliary task, and generation of the target from the source is the main task. We experiment with the German-English language pair on the News, TED, and Europarl corpora. Evaluation results show that the proposed MTL approach performs better than concatenation-based and multi-encoder DocNMT models in low-resource settings and is sensitive to the choice of context. However, we observe that the MTL models fail to generate the source from the context. These observations align with previous studies and may suggest that the available document-level parallel corpora are not context-aware, and that a robust sentence-level model can outperform context-aware models.
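To make the cascade MTL setup concrete, here is a minimal PyTorch sketch of the architecture as the abstract describes it: one shared Transformer encoder and two decoders, where an auxiliary decoder generates the source sentence from its context and the main decoder generates the target from the source. This is a sketch under assumptions, not the authors' implementation: the paper's cascade architecture may couple the two decoders differently, and the vocabulary size, model dimensions, and 0.5 auxiliary-loss weight below are illustrative choices.

```python
import torch
import torch.nn as nn

def causal_mask(sz):
    # Upper-triangular -inf mask so each position attends only to the past.
    return torch.triu(torch.full((sz, sz), float("-inf")), diagonal=1)

class CascadeMTLDocNMT(nn.Module):
    """One shared encoder, two decoders: the auxiliary decoder reconstructs
    the source sentence from its context (auxiliary task), and the main
    decoder translates the source into the target (main task)."""

    def __init__(self, vocab_size=32000, d_model=512, nhead=8, layers=6):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model, padding_idx=0)
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead, batch_first=True), layers)
        self.aux_decoder = nn.TransformerDecoder(
            nn.TransformerDecoderLayer(d_model, nhead, batch_first=True), layers)
        self.main_decoder = nn.TransformerDecoder(
            nn.TransformerDecoderLayer(d_model, nhead, batch_first=True), layers)
        self.proj = nn.Linear(d_model, vocab_size)  # shared output projection

    def forward(self, context, source, source_in, target_in):
        # The single shared encoder encodes both the context and the source.
        ctx_mem = self.encoder(self.embed(context))
        src_mem = self.encoder(self.embed(source))
        # Auxiliary task: generate the source from the encoded context.
        aux_h = self.aux_decoder(self.embed(source_in), ctx_mem,
                                 tgt_mask=causal_mask(source_in.size(1)))
        # Main task: generate the target from the encoded source.
        main_h = self.main_decoder(self.embed(target_in), src_mem,
                                   tgt_mask=causal_mask(target_in.size(1)))
        return self.proj(aux_h), self.proj(main_h)

# Toy training step with random token ids (0 is reserved for padding).
model = CascadeMTLDocNMT()
context = torch.randint(1, 32000, (2, 24))
source = torch.randint(1, 32000, (2, 16))
target = torch.randint(1, 32000, (2, 18))
source_in, source_out = source[:, :-1], source[:, 1:]  # teacher forcing
target_in, target_out = target[:, :-1], target[:, 1:]

aux_logits, main_logits = model(context, source, source_in, target_in)
criterion = nn.CrossEntropyLoss(ignore_index=0)
# Joint objective: main translation loss plus a weighted auxiliary
# reconstruction loss; the 0.5 weight is an assumption, not from the paper.
loss = (criterion(main_logits.transpose(1, 2), target_out)
        + 0.5 * criterion(aux_logits.transpose(1, 2), source_out))
loss.backward()
```

At inference time only the source-to-target path would be used; the context-to-source decoder serves purely as a training signal, which is consistent with the abstract's framing of source generation as an auxiliary task.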
Saved in:
Main authors: Appicharla, Ramakrishna; Gain, Baban; Pal, Santanu; Ekbal, Asif; Bhattacharyya, Pushpak
Format: Article
Language: English
Subjects: Computer Science - Computation and Language
Online access: Full text via arXiv
creator | Appicharla, Ramakrishna; Gain, Baban; Pal, Santanu; Ekbal, Asif; Bhattacharyya, Pushpak |
description | In document-level neural machine translation (DocNMT), multi-encoder approaches are common for encoding the context and source sentences. Recent studies \cite{li-etal-2020-multi-encoder} have shown that the context encoder generates noise and makes the model robust to the choice of context. This paper further investigates this observation by explicitly modelling context encoding through multi-task learning (MTL) to make the model sensitive to the choice of context. We conduct experiments on a cascade MTL architecture, which consists of one encoder and two decoders. Generation of the source from the context is treated as an auxiliary task, and generation of the target from the source is the main task. We experiment with the German-English language pair on the News, TED, and Europarl corpora. Evaluation results show that the proposed MTL approach performs better than concatenation-based and multi-encoder DocNMT models in low-resource settings and is sensitive to the choice of context. However, we observe that the MTL models fail to generate the source from the context. These observations align with previous studies and may suggest that the available document-level parallel corpora are not context-aware, and that a robust sentence-level model can outperform context-aware models. |
doi_str_mv | 10.48550/arxiv.2407.03076 |
format | Article |
creationdate | 2024-07-03 |
rights | http://creativecommons.org/publicdomain/zero/1.0 (free to read) |
identifier | DOI: 10.48550/arxiv.2407.03076 |
language | eng |
source | arXiv.org |
subjects | Computer Science - Computation and Language |
title | A Case Study on Context-Aware Neural Machine Translation with Multi-Task Learning |
url | https://arxiv.org/abs/2407.03076 |