Editing Personality for Large Language Models
This paper introduces an innovative task focused on editing the personality traits of Large Language Models (LLMs). This task seeks to adjust the models' responses to opinion-related questions on specified topics since an individual's personality often manifests in the form of their expres...
Gespeichert in:
Hauptverfasser: | , , , , , , |
---|---|
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | This paper introduces an innovative task focused on editing the personality
traits of Large Language Models (LLMs). This task seeks to adjust the models'
responses to opinion-related questions on specified topics since an
individual's personality often manifests in the form of their expressed
opinions, thereby showcasing different personality traits. Specifically, we
construct PersonalityEdit, a new benchmark dataset to address this task.
Drawing on the theory in Social Psychology, we isolate three representative
traits, namely Neuroticism, Extraversion, and Agreeableness, as the foundation
for our benchmark. We then gather data using GPT-4, generating responses that
align with a specified topic and embody the targeted personality trait. We
conduct comprehensive experiments involving various baselines and discuss the
representation of personality behavior in LLMs. Our findings uncover potential
challenges of the proposed task, illustrating several remaining issues. We
anticipate that our work can stimulate further annotation in model editing and
personality-related research. Code is available at
https://github.com/zjunlp/EasyEdit. |
---|---|
DOI: | 10.48550/arxiv.2310.02168 |