A Case Study on Context-Aware Neural Machine Translation with Multi-Task Learning

In document-level neural machine translation (DocNMT), multi-encoder approaches are commonly used to encode the context and source sentences. Recent studies \cite{li-etal-2020-multi-encoder} have shown that the context encoder generates noise and makes the model robust to the choice of context. This paper investigates this observation further by explicitly modelling context encoding through multi-task learning (MTL) to make the model sensitive to the choice of context. We conduct experiments on a cascade MTL architecture consisting of one encoder and two decoders: generation of the source from the context is treated as an auxiliary task, and generation of the target from the source is the main task. We experiment on the German--English language pair with the News, TED, and Europarl corpora. Evaluation results show that the proposed MTL approach outperforms concatenation-based and multi-encoder DocNMT models in low-resource settings and is sensitive to the choice of context. However, we observe that the MTL models fail to generate the source from the context. These observations align with previous studies and may suggest that the available document-level parallel corpora are not context-aware, and that a robust sentence-level model can outperform context-aware models.
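
To make the cascade MTL setup described in the abstract concrete, below is a minimal PyTorch sketch of a model with one shared encoder and two decoders, trained with a weighted sum of the auxiliary (context-to-source) and main (source-to-target) cross-entropy losses. All module names, hyper-parameters, and the loss weight aux_weight are illustrative assumptions; the abstract specifies only the one-encoder/two-decoder structure and the two tasks.

# Hypothetical sketch of a cascade multi-task DocNMT model: one shared
# encoder, two decoders. Hyper-parameters and module layout are assumptions,
# not details taken from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CascadeMTL(nn.Module):
    def __init__(self, vocab_size, d_model=256, nhead=4, layers=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        # One encoder shared by the context and the source sentence.
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead, batch_first=True), layers)
        # Decoder 1 (auxiliary task): reconstruct the source from the context.
        self.aux_decoder = nn.TransformerDecoder(
            nn.TransformerDecoderLayer(d_model, nhead, batch_first=True), layers)
        # Decoder 2 (main task): generate the target from the source.
        self.main_decoder = nn.TransformerDecoder(
            nn.TransformerDecoderLayer(d_model, nhead, batch_first=True), layers)
        self.proj = nn.Linear(d_model, vocab_size)

    def forward(self, context, source, src_in, tgt_in):
        ctx_mem = self.encoder(self.embed(context))  # encoded context
        src_mem = self.encoder(self.embed(source))   # encoded source, same encoder
        aux_mask = nn.Transformer.generate_square_subsequent_mask(src_in.size(1))
        main_mask = nn.Transformer.generate_square_subsequent_mask(tgt_in.size(1))
        # Auxiliary task: teacher-forced decoding of the source from the context.
        aux = self.aux_decoder(self.embed(src_in), ctx_mem, tgt_mask=aux_mask)
        # Main task: teacher-forced decoding of the target from the source.
        main = self.main_decoder(self.embed(tgt_in), src_mem, tgt_mask=main_mask)
        return self.proj(aux), self.proj(main)

def mtl_loss(aux_logits, main_logits, src_labels, tgt_labels, aux_weight=0.5):
    # Joint objective: main translation loss plus a weighted auxiliary loss.
    # The weight 0.5 is an assumption, not a value from the paper.
    aux = F.cross_entropy(aux_logits.transpose(1, 2), src_labels)
    main = F.cross_entropy(main_logits.transpose(1, 2), tgt_labels)
    return main + aux_weight * aux

# Toy usage: random token ids, batch of 2, vocabulary of 100.
model = CascadeMTL(vocab_size=100)
ctx = torch.randint(0, 100, (2, 12))
src = torch.randint(0, 100, (2, 10))
tgt = torch.randint(0, 100, (2, 11))
aux_logits, main_logits = model(ctx, src, src[:, :-1], tgt[:, :-1])
loss = mtl_loss(aux_logits, main_logits, src[:, 1:], tgt[:, 1:])

How the two decoders are actually cascaded (for example, whether the main decoder attends to the auxiliary decoder's states) is not specified in the abstract, so this sketch simply shares the encoder between the two tasks.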

Bibliographic Details
Main Authors: Appicharla, Ramakrishna; Gain, Baban; Pal, Santanu; Ekbal, Asif; Bhattacharyya, Pushpak
Format: Article
Language: English
Published: 2024-07-03
Subjects: Computer Science - Computation and Language
DOI: 10.48550/arxiv.2407.03076
Source: arXiv.org
Online Access: https://arxiv.org/abs/2407.03076