A Case Study on Context-Aware Neural Machine Translation with Multi-Task Learning

In document-level neural machine translation (DocNMT), multi-encoder approaches are commonly used to encode the context and source sentences. Recent studies \cite{li-etal-2020-multi-encoder} have shown that the context encoder generates noise and makes the model robust to the choice of context. This paper investigates this observation further by explicitly modelling context encoding through multi-task learning (MTL) to make the model sensitive to the choice of context. We conduct experiments on a cascade MTL architecture consisting of one encoder and two decoders: generation of the source from the context is treated as an auxiliary task, and generation of the target from the source is the main task. We experiment on the German--English language pair with the News, TED, and Europarl corpora. Evaluation results show that the proposed MTL approach outperforms concatenation-based and multi-encoder DocNMT models in low-resource settings and is sensitive to the choice of context. However, we observe that the MTL models fail to generate the source from the context. These observations align with previous studies and may suggest that the available document-level parallel corpora are not context-aware, and that a robust sentence-level model can outperform context-aware models.
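
To make the cascade MTL setup described in the abstract concrete, below is a minimal PyTorch sketch of a model with one shared encoder and two decoders, trained with a weighted sum of the auxiliary (context-to-source) and main (source-to-target) cross-entropy losses. All module names, hyper-parameters, and the loss weight aux_weight are illustrative assumptions; the abstract specifies only the one-encoder/two-decoder structure and the two tasks.

# Hypothetical sketch of a cascade multi-task DocNMT model: one shared
# encoder, two decoders. Hyper-parameters and module layout are assumptions,
# not details taken from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CascadeMTL(nn.Module):
    def __init__(self, vocab_size, d_model=256, nhead=4, layers=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        # One encoder shared by the context and the source sentence.
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead, batch_first=True), layers)
        # Decoder 1 (auxiliary task): reconstruct the source from the context.
        self.aux_decoder = nn.TransformerDecoder(
            nn.TransformerDecoderLayer(d_model, nhead, batch_first=True), layers)
        # Decoder 2 (main task): generate the target from the source.
        self.main_decoder = nn.TransformerDecoder(
            nn.TransformerDecoderLayer(d_model, nhead, batch_first=True), layers)
        self.proj = nn.Linear(d_model, vocab_size)

    def forward(self, context, source, src_in, tgt_in):
        ctx_mem = self.encoder(self.embed(context))  # encoded context
        src_mem = self.encoder(self.embed(source))   # encoded source, same encoder
        aux_mask = nn.Transformer.generate_square_subsequent_mask(src_in.size(1))
        main_mask = nn.Transformer.generate_square_subsequent_mask(tgt_in.size(1))
        # Auxiliary task: teacher-forced decoding of the source from the context.
        aux = self.aux_decoder(self.embed(src_in), ctx_mem, tgt_mask=aux_mask)
        # Main task: teacher-forced decoding of the target from the source.
        main = self.main_decoder(self.embed(tgt_in), src_mem, tgt_mask=main_mask)
        return self.proj(aux), self.proj(main)

def mtl_loss(aux_logits, main_logits, src_labels, tgt_labels, aux_weight=0.5):
    # Joint objective: main translation loss plus a weighted auxiliary loss.
    # The weight 0.5 is an assumption, not a value from the paper.
    aux = F.cross_entropy(aux_logits.transpose(1, 2), src_labels)
    main = F.cross_entropy(main_logits.transpose(1, 2), tgt_labels)
    return main + aux_weight * aux

# Toy usage: random token ids, batch of 2, vocabulary of 100.
model = CascadeMTL(vocab_size=100)
ctx = torch.randint(0, 100, (2, 12))
src = torch.randint(0, 100, (2, 10))
tgt = torch.randint(0, 100, (2, 11))
aux_logits, main_logits = model(ctx, src, src[:, :-1], tgt[:, :-1])
loss = mtl_loss(aux_logits, main_logits, src[:, 1:], tgt[:, 1:])

How the two decoders are actually cascaded (for example, whether the main decoder attends to the auxiliary decoder's states) is not specified in the abstract, so this sketch simply shares the encoder between the two tasks.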

Bibliographic Details
Main Authors: Appicharla, Ramakrishna; Gain, Baban; Pal, Santanu; Ekbal, Asif; Bhattacharyya, Pushpak
Format: Article
Language: English
Published: 2024-07-03
Subjects: Computer Science - Computation and Language
DOI: 10.48550/arxiv.2407.03076
Source: arXiv.org
Online Access: https://arxiv.org/abs/2407.03076