Generating Gender Alternatives in Machine Translation
Main Authors: | , , , , , |
---|---|
Format: | Article |
Language: | English |
Summary: | Machine translation (MT) systems often translate terms with ambiguous gender (e.g., English term "the nurse") into the gendered form that is most prevalent in the systems' training data (e.g., "enfermera", the Spanish term for a female nurse). This often reflects and perpetuates harmful stereotypes present in society. With MT user interfaces in mind that allow for resolving gender ambiguity in a frictionless manner, we study the problem of generating all grammatically correct gendered translation alternatives. We open source train and test datasets for five language pairs and establish benchmarks for this task. Our key technical contribution is a novel semi-supervised solution for generating alternatives that integrates seamlessly with standard MT models and maintains high performance without requiring additional components or increasing inference overhead. |
DOI: | 10.48550/arxiv.2407.20438 |