GeniL: A Multilingual Dataset on Generalizing Language
Main Authors: | , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online Access: | Order full text |
Summary: | Generative language models are transforming our digital ecosystem, but they
often inherit societal biases, for instance stereotypes associating certain
attributes with specific identity groups. While whether and how these biases
are mitigated may depend on the specific use cases, being able to effectively
detect instances of stereotype perpetuation is a crucial first step. Current
methods to assess the presence of stereotypes in generated language rely on simple
template- or co-occurrence-based measures, without accounting for the variety of
sentential contexts they manifest in. We argue that understanding the
sentential context is crucial for detecting instances of generalization. We
distinguish two types of generalizations: (1) language that merely mentions the
presence of a generalization ("people think the French are very rude"), and (2)
language that reinforces such a generalization ("as French they must be rude"),
from non-generalizing context ("My French friends think I am rude"). For
meaningful stereotype evaluations, we need to reliably distinguish such
instances of generalizations. We introduce the new task of detecting
generalization in language, and build GeniL, a multilingual dataset of over 50K
sentences from 9 languages (English, Arabic, Bengali, Spanish, French, Hindi,
Indonesian, Malay, and Portuguese) annotated for instances of generalizations.
We demonstrate that the likelihood of a co-occurrence being an instance of
generalization is usually low, and varies across different languages, identity
groups, and attributes. We build classifiers to detect generalization in
language with an overall PR-AUC of 58.7, with varying degrees of performance
across languages. Our research provides data and tools to enable a nuanced
understanding of stereotype perpetuation, a crucial step towards more inclusive
and responsible language technologies. |
---|---|
DOI: | 10.48550/arxiv.2404.05866 |