SPICA: Retrieving Scenarios for Pluralistic In-Context Alignment
Main authors: | , , , |
---|---|
Format: | Article |
Language: | English |
Subjects: | |
Summary: | When different groups' values differ, one approach to model alignment is to
steer models at inference time towards each group's preferences. However,
techniques like in-context learning only consider similarity when drawing
few-shot examples and not cross-group differences in values. We propose SPICA,
a framework that accounts for group-level differences during in-context example
retrieval. SPICA introduces three designs: scenario banks, group-informed
retrieval metrics, and in-context alignment prompts. From an evaluation of
SPICA on an alignment task collecting inputs from four demographic groups
($n = 544$), our metrics retrieve in-context examples that more closely match
observed preferences, with the best prompt configuration using multiple
contrastive responses to demonstrate examples. In an end-to-end evaluation
($n = 120$), we observe that SPICA is rated higher than similarity-based retrieval,
with groups seeing up to a +0.16 point improvement on a 5-point scale.
Additionally, gains from SPICA were more uniform, with all groups benefiting
from alignment rather than only some. Finally, we find that while a
group-agnostic approach can align to aggregated values, it is not the best fit
for divergent groups. |
---|---|
DOI: | 10.48550/arxiv.2411.10912 |
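
The summary contrasts similarity-only retrieval of few-shot examples with retrieval that also accounts for group-level differences. The sketch below illustrates that contrast only; it is not SPICA's actual metric, which the summary does not define. The `Scenario` fields, the `divergence` term (how far one group's rating sits from the cross-group mean), and the `weight` parameter are all hypothetical names and assumptions introduced for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Scenario:
    """One entry in a hypothetical scenario bank."""
    text: str
    embedding: list[float]           # assumed precomputed embedding
    group_ratings: dict[str, float]  # assumed mean rating per group, 1-5 scale


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query, bank, k, group=None, weight=0.5):
    """Return the top-k scenarios for a query embedding.

    With group=None this is plain similarity-based retrieval.
    With a group name, the score also rewards scenarios on which that
    group's rating diverges from the cross-group mean (an assumed,
    illustrative notion of a group-informed metric).
    """
    def score(s: Scenario) -> float:
        sim = cosine(query, s.embedding)
        if group is None:
            return sim
        mean = sum(s.group_ratings.values()) / len(s.group_ratings)
        # Ratings span 1-5, so the largest possible gap is 4.
        divergence = abs(s.group_ratings[group] - mean) / 4.0
        return (1 - weight) * sim + weight * divergence

    return sorted(bank, key=score, reverse=True)[:k]


# Toy data, invented for this example only.
bank = [
    Scenario("A", [1.0, 0.0], {"g1": 3.0, "g2": 3.0}),
    Scenario("B", [0.8, 0.6], {"g1": 5.0, "g2": 1.0}),
    Scenario("C", [0.0, 1.0], {"g1": 1.0, "g2": 5.0}),
]
query = [1.0, 0.0]

similarity_only = retrieve(query, bank, k=1)
group_informed = retrieve(query, bank, k=1, group="g1")
```

On this toy bank, similarity-only retrieval picks the closest scenario even though both groups rate it identically, while the group-aware score prefers a slightly less similar scenario on which the groups disagree. A real system would need the paper's own metric definitions in place of the `divergence` heuristic.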