Self-Alignment: Improving Alignment of Cultural Values in LLMs via In-Context Learning
Format: Article
Language: English
Abstract: Improving the alignment of Large Language Models (LLMs) with respect to the
cultural values that they encode has become an increasingly important topic. In
this work, we study whether we can exploit existing knowledge about cultural
values at inference time to adjust model responses to cultural value probes. We
present a simple and inexpensive method that uses a combination of in-context
learning (ICL) and human survey data, and show that we can improve the
alignment to cultural values across 5 models that include both English-centric
and multilingual LLMs. Importantly, we show that our method could prove useful
in test languages other than English and can improve alignment to the cultural
values that correspond to a range of culturally diverse countries.
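The abstract sketches the method only at a high level: human survey data is turned into in-context demonstrations that precede a cultural value probe at inference time. Below is a minimal sketch of that idea in Python, assuming World Values Survey-style question/answer pairs; the `SurveyItem` type, the `build_icl_prompt` function, and the example items are hypothetical illustrations, not the paper's actual prompts or data.

```python
# Minimal sketch of the ICL idea from the abstract: prepend a few human
# survey responses for a target country to a cultural value probe, so the
# model conditions on that cultural context at inference time. All names
# and example items here are hypothetical illustrations.

from dataclasses import dataclass


@dataclass
class SurveyItem:
    """One survey question with the majority answer observed in a country."""
    question: str
    answer: str


# Toy stand-ins for World Values Survey-style responses for one country.
GERMANY_EXAMPLES = [
    SurveyItem("How important is family in your life?", "Very important"),
    SurveyItem("Is it justifiable to avoid a fare on public transport?",
               "Never justifiable"),
]


def build_icl_prompt(country: str, examples: list[SurveyItem], probe: str) -> str:
    """Format survey responses as in-context demonstrations before the probe."""
    lines = [f"The following are survey answers given by people in {country}."]
    for item in examples:
        lines.append(f"Q: {item.question}\nA: {item.answer}")
    lines.append(f"Q: {probe}\nA:")
    return "\n\n".join(lines)


if __name__ == "__main__":
    probe = "How important is religion in your life?"
    prompt = build_icl_prompt("Germany", GERMANY_EXAMPLES, probe)
    print(prompt)  # This prompt would then be sent to the LLM under study.
```

Because the demonstrations are plain text supplied at inference time, a sketch like this needs no fine-tuning and can be re-targeted to another country by swapping in that country's survey responses, which is what makes the approach inexpensive.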
DOI: 10.48550/arxiv.2408.16482