Generating Gender Alternatives in Machine Translation
Machine translation (MT) systems often translate terms with ambiguous gender (e.g., English term "the nurse") into the gendered form that is most prevalent in the systems' training data (e.g., "enfermera", the Spanish term for a female nurse). This often reflects and perpetu...
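Main authors: | Garg, Sarthak; Gheini, Mozhdeh; Emmanuel, Clara; Likhomanenko, Tatiana; Gao, Qin; Paulik, Matthias |
---|---|
Format: | Article |
Language: | eng |
Subjects: | Computer Science - Artificial Intelligence; Computer Science - Computation and Language |
Online access: | Order full text |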
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | Garg, Sarthak; Gheini, Mozhdeh; Emmanuel, Clara; Likhomanenko, Tatiana; Gao, Qin; Paulik, Matthias |
description | Machine translation (MT) systems often translate terms with ambiguous gender
(e.g., English term "the nurse") into the gendered form that is most prevalent
in the systems' training data (e.g., "enfermera", the Spanish term for a female
nurse). This often reflects and perpetuates harmful stereotypes present in
society. With MT user interfaces in mind that allow for resolving gender
ambiguity in a frictionless manner, we study the problem of generating all
grammatically correct gendered translation alternatives. We open source train
and test datasets for five language pairs and establish benchmarks for this
task. Our key technical contribution is a novel semi-supervised solution for
generating alternatives that integrates seamlessly with standard MT models and
maintains high performance without requiring additional components or
increasing inference overhead. |
doi_str_mv | 10.48550/arxiv.2407.20438 |
format | Article |
creationdate | 2024-07-29 |
rights | http://arxiv.org/licenses/nonexclusive-distrib/1.0 |
oa | free_for_read |
linktorsrc | https://arxiv.org/abs/2407.20438 |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.2407.20438 |
ispartof | |
issn | |
language | eng |
recordid | cdi_arxiv_primary_2407_20438 |
source | arXiv.org |
subjects | Computer Science - Artificial Intelligence; Computer Science - Computation and Language |
title | Generating Gender Alternatives in Machine Translation |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-15T06%3A25%3A52IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Generating%20Gender%20Alternatives%20in%20Machine%20Translation&rft.au=Garg,%20Sarthak&rft.date=2024-07-29&rft_id=info:doi/10.48550/arxiv.2407.20438&rft_dat=%3Carxiv_GOX%3E2407_20438%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |