Conformity in Large Language Models
Saved in:
Main authors:
Format: Article
Language: English
Subjects:
Online access: Order full text
Abstract: The conformity effect describes the tendency of individuals to align their responses with the majority. Studying this bias in large language models (LLMs) is crucial, as LLMs are increasingly used as conversation partners in various information-seeking and decision-making tasks to improve productivity. Thus, conformity to incorrect responses can compromise their effectiveness. In this paper, we adapt psychological experiments to examine the extent of conformity in state-of-the-art LLMs. Our findings reveal that all models tested exhibit varying levels of conformity toward the majority, regardless of their initial choice or correctness, across different knowledge domains. Notably, we are the first to show that LLMs are more likely to conform when they are more uncertain in their own prediction. We further explore factors that influence conformity, such as training paradigms and input characteristics, finding that instruction-tuned models are less susceptible to conformity, while increasing the naturalness of majority tones amplifies conformity. Finally, we propose two interventions, Devil's Advocate and Question Distillation, to mitigate conformity, providing insights into building more robust language models.
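The abstract describes adapting Asch-style psychological experiments, in which a majority voices an answer before the subject responds. As an illustration only (the prompt layout, the `build_conformity_prompt` helper, and the "Agent" phrasing are assumptions for this sketch, not details taken from the paper), such a conformity probe could be constructed like this:

```python
# Hypothetical sketch: build an Asch-style prompt in which several
# simulated "peers" endorse the same (possibly wrong) answer before
# the model under test is asked the question.

def build_conformity_prompt(question: str, choices: list[str],
                            majority_answer: str, n_peers: int = 4) -> str:
    """Return a prompt where n_peers agents all state majority_answer."""
    lines = [f"Question: {question}"]
    # Label the answer options (A), (B), ...
    lines += [f"({chr(ord('A') + i)}) {c}" for i, c in enumerate(choices)]
    # Simulated majority opinions preceding the model's own turn.
    lines += [f"Agent {i + 1}: I think the answer is {majority_answer}."
              for i in range(n_peers)]
    lines.append("Your answer:")
    return "\n".join(lines)

prompt = build_conformity_prompt(
    "What is the capital of Australia?",
    ["Sydney", "Canberra"],
    majority_answer="(A) Sydney",  # deliberately incorrect majority
)
print(prompt)
```

Comparing the model's answer on this prompt against its answer without the peer turns would indicate whether the stated majority shifted its response.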
DOI: 10.48550/arxiv.2410.12428