Generating Gender Alternatives in Machine Translation

Bibliographic details
Main authors: Garg, Sarthak; Gheini, Mozhdeh; Emmanuel, Clara; Likhomanenko, Tatiana; Gao, Qin; Paulik, Matthias
Format: Article
Language: eng
creator Garg, Sarthak
Gheini, Mozhdeh
Emmanuel, Clara
Likhomanenko, Tatiana
Gao, Qin
Paulik, Matthias
description Machine translation (MT) systems often translate terms with ambiguous gender (e.g., English term "the nurse") into the gendered form that is most prevalent in the systems' training data (e.g., "enfermera", the Spanish term for a female nurse). This often reflects and perpetuates harmful stereotypes present in society. With MT user interfaces in mind that allow for resolving gender ambiguity in a frictionless manner, we study the problem of generating all grammatically correct gendered translation alternatives. We open source train and test datasets for five language pairs and establish benchmarks for this task. Our key technical contribution is a novel semi-supervised solution for generating alternatives that integrates seamlessly with standard MT models and maintains high performance without requiring additional components or increasing inference overhead.
doi_str_mv 10.48550/arxiv.2407.20438
format Article
date 2024-07-29
rights http://arxiv.org/licenses/nonexclusive-distrib/1.0
oa free_for_read
sourcetype Open Access Repository
linktorsrc https://arxiv.org/abs/2407.20438
backlink https://doi.org/10.48550/arXiv.2407.20438
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.2407.20438
language eng
recordid cdi_arxiv_primary_2407_20438
source arXiv.org
subjects Computer Science - Artificial Intelligence
Computer Science - Computation and Language
title Generating Gender Alternatives in Machine Translation
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-15T06%3A25%3A52IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Generating%20Gender%20Alternatives%20in%20Machine%20Translation&rft.au=Garg,%20Sarthak&rft.date=2024-07-29&rft_id=info:doi/10.48550/arxiv.2407.20438&rft_dat=%3Carxiv_GOX%3E2407_20438%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true