How To Build Competitive Multi-gender Speech Translation Models For Controlling Speaker Gender Translation
Main authors: | , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Summary: | When translating from notional gender languages (e.g., English) into
grammatical gender languages (e.g., Italian), the generated translation
requires explicit gender assignments for various words, including those
referring to the speaker. When the source sentence does not convey the
speaker's gender, speech translation (ST) models either rely on the
possibly-misleading vocal traits of the speaker or default to the masculine
gender, the most frequent in existing training corpora. To avoid such biased
and not inclusive behaviors, the gender assignment of speaker-related
expressions should be guided by externally-provided metadata about the
speaker's gender. While previous work has shown that the most effective
solution is represented by separate, dedicated gender-specific models, the goal
of this paper is to achieve the same results by integrating the speaker's
gender metadata into a single "multi-gender" neural ST model, easier to
maintain. Our experiments demonstrate that a single multi-gender model
outperforms gender-specialized ones when trained from scratch (with gender
accuracy gains up to 12.9 for feminine forms), while fine-tuning from existing
ST models does not lead to competitive results. |
---|---|
DOI: | 10.48550/arxiv.2310.15114 |