Can Large Language Models Understand You Better? An MBTI Personality Detection Dataset Aligned with Population Traits
Main authors: | , , , , , , , , , , , , , , |
---|---|
Format: | Article |
Language: | eng |
Keywords: | |
Abstract: | The Myers-Briggs Type Indicator (MBTI) is one of the most influential
personality theories reflecting individual differences in thinking, feeling,
and behaving. MBTI personality detection has garnered considerable research
interest and has evolved significantly over the years. However, this task tends
to be overly optimistic, as it currently does not align well with the natural
distribution of population personality traits. Specifically, (1) the
self-reported labels in existing datasets result in incorrect labeling issues,
and (2) the hard labels fail to capture the full range of population
personality distributions. In this paper, we optimize the task by constructing
MBTIBench, the first manually annotated high-quality MBTI personality detection
dataset with soft labels, under the guidance of psychologists. As for the first
challenge, MBTIBench effectively solves the incorrect labeling issues, which
account for 29.58% of the data. As for the second challenge, we estimate soft
labels by deriving the polarity tendency of samples. The obtained soft labels
confirm that there are more people with non-extreme personality traits.
Experimental results not only highlight the polarized predictions and biases in
LLMs as key directions for future research, but also confirm that soft labels
can provide more benefits to other psychological tasks than hard labels. The
code and data are available at https://github.com/Personality-NLP/MbtiBench. |
DOI: | 10.48550/arxiv.2412.12510 |