Insights on Disagreement Patterns in Multimodal Safety Perception across Diverse Rater Groups
Format: | Article |
---|---|
Language: | English |
Abstract: | AI systems crucially rely on human ratings, but these ratings are often aggregated, obscuring the inherent diversity of perspectives in real-world phenomena. This is particularly concerning when evaluating the safety of generative AI, where perceptions and associated harms can vary significantly across socio-cultural contexts. While recent research has studied the impact of demographic differences on annotating text, there is limited understanding of how these subjective variations affect multimodal safety in generative AI. To address this, we conduct a large-scale study employing highly parallel safety ratings of about 1000 text-to-image (T2I) generations from a demographically diverse pool of 630 raters, balanced across 30 intersectional groups spanning age, gender, and ethnicity. Our study shows that (1) there are significant differences across demographic groups (including intersectional groups) in how severe they assess the harm to be, and these differences vary across different types of safety violations; (2) the diverse rater pool captures annotation patterns that are substantially different from those of expert raters trained on a specific set of safety policies; and (3) the differences we observe in T2I safety are distinct from previously documented group-level differences in text-based safety tasks. To further understand these varying perspectives, we conduct a qualitative analysis of the open-ended explanations provided by raters. This analysis reveals core differences in why different groups perceive harms in T2I generations. Our findings underscore the critical need for incorporating diverse perspectives into the safety evaluation of generative AI, ensuring these systems are truly inclusive and reflect the values of all users. |
---|---|
DOI: | 10.48550/arxiv.2410.17032 |