Reducing Gender Bias in Word-Level Language Models with a Gender-Equalizing Loss Function
Main Authors: , , ,
Format: Article
Language: English
Subjects:
Online Access: Order full text
Summary: Gender bias exists in natural language datasets, which neural language models tend to learn, resulting in biased text generation. In this research, we propose a debiasing approach based on modifying the loss function. We introduce a new term to the loss function that attempts to equalize the probabilities of male and female words in the output. Using an array of bias evaluation metrics, we provide empirical evidence that our approach successfully mitigates gender bias in language models without increasing perplexity. Compared to existing debiasing strategies, data augmentation and word embedding debiasing, our method performs better in several respects, especially in reducing gender bias in occupation words. Finally, we introduce a combination of data augmentation and our approach and show that it outperforms existing strategies on all bias evaluation metrics.
DOI: 10.48550/arxiv.1905.12801
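The summary describes the core idea at a high level: augment the standard language-modeling loss with a term that pushes the predicted probabilities of paired male and female words toward each other. Below is a minimal PyTorch sketch of what such a term could look like. The function name, tensor shapes, and the choice of an absolute log-probability gap as the penalty are illustrative assumptions, not necessarily the paper's exact formulation; see the DOI above for the original definition.

```python
import torch
import torch.nn.functional as F

def equalizing_loss_term(logits, male_ids, female_ids, lam=1.0):
    """Hypothetical sketch of a gender-equalizing loss term.

    logits:     (batch, seq_len, vocab) raw language-model outputs
    male_ids:   LongTensor of vocabulary indices of male words
    female_ids: LongTensor of the matching female counterparts
    lam:        weight of the debiasing term relative to the LM loss
    """
    log_probs = F.log_softmax(logits, dim=-1)      # (B, T, V)
    male_lp = log_probs[..., male_ids]             # (B, T, num_pairs)
    female_lp = log_probs[..., female_ids]         # (B, T, num_pairs)
    # Penalize any gap between paired male/female log-probabilities,
    # nudging the model to assign them equal probability at each step.
    return lam * (male_lp - female_lp).abs().mean()

# Usage sketch: add the term to the usual cross-entropy LM loss.
# Vocabulary ids below are made up for illustration.
logits = torch.randn(2, 5, 100)                    # (batch, seq, vocab)
targets = torch.randint(100, (2, 5))
male_ids = torch.tensor([10])                      # e.g. id of "he"
female_ids = torch.tensor([42])                    # e.g. id of "she"
lm_loss = F.cross_entropy(logits.reshape(-1, 100), targets.reshape(-1))
total_loss = lm_loss + equalizing_loss_term(logits, male_ids, female_ids, lam=0.5)
```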