Personality testing of Large Language Models: Limited temporal stability, but highlighted prosociality
Main authors: | , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | As Large Language Models (LLMs) continue to gain popularity due to their
human-like traits and the intimacy they offer to users, their societal impact
inevitably expands. This leads to the rising necessity for comprehensive
studies to fully understand LLMs and reveal their potential opportunities,
drawbacks, and overall societal impact. With that in mind, this research
conducted an extensive investigation into seven LLMs, aiming to assess the
temporal stability and inter-rater agreement of their responses to personality
instruments at two time points. In addition, the LLMs' personality profiles were
analyzed and compared to human normative data. The findings revealed varying
levels of inter-rater agreement in the LLMs' responses over a short time, with
some LLMs showing higher agreement (e.g., Llama3 and GPT-4o) than others
(e.g., GPT-4 and Gemini). Furthermore, agreement depended on the instruments used
as well as on the domain or trait. This implies variable robustness in LLMs'
ability to reliably simulate stable personality characteristics. In the case of
scales that showed at least fair agreement, LLMs mostly displayed a socially
desirable profile in both agentic and communal domains, as well as a prosocial
personality profile reflected in higher agreeableness and conscientiousness and
lower Machiavellianism. Exhibiting temporal stability and coherent responses on
personality traits is crucial for AI systems due to their societal impact and
AI safety concerns. |
---|---|
DOI: | 10.48550/arxiv.2306.04308 |
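
The summary reports agreement between responses collected at two time points but does not name the statistic used. As a minimal sketch, assuming an unweighted Cohen's kappa over item-level Likert responses (the function name `cohen_kappa` and the sample data are hypothetical and not taken from the article), such agreement could be computed as follows:

```python
from collections import Counter


def cohen_kappa(time1, time2):
    """Unweighted Cohen's kappa for two equally long lists of item responses.

    Assumption: each list holds one model's answers to the same items,
    collected at two different time points.
    """
    if len(time1) != len(time2) or not time1:
        raise ValueError("response lists must be non-empty and of equal length")

    n = len(time1)

    # Observed agreement: share of items answered identically both times.
    observed = sum(a == b for a, b in zip(time1, time2)) / n

    # Chance agreement: product of the marginal category proportions.
    counts1, counts2 = Counter(time1), Counter(time2)
    categories = counts1.keys() | counts2.keys()
    expected = sum(counts1[c] * counts2[c] for c in categories) / (n * n)

    # If both lists use a single identical category, kappa is undefined;
    # this sketch treats that case as perfect agreement.
    if expected == 1.0:
        return 1.0

    return (observed - expected) / (1.0 - expected)


# Hypothetical responses of one model to eight items on a 1-5 scale.
responses_t1 = [5, 4, 4, 3, 5, 2, 4, 5]
responses_t2 = [5, 4, 3, 3, 5, 2, 4, 4]
kappa = cohen_kappa(responses_t1, responses_t2)
```

Unweighted kappa treats every disagreement as equally severe. For ordinal Likert items a weighted variant may be more appropriate, and the article itself may rely on a different coefficient.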