Chumor 2.0: Towards Benchmarking Chinese Humor Understanding
Format: Article
Language: English
Abstract: Existing humor datasets and evaluations predominantly focus on English,
leaving limited resources for culturally nuanced humor in non-English languages
like Chinese. To address this gap, we construct Chumor, the first Chinese humor
explanation dataset, which exceeds the size of existing humor datasets. Chumor is
sourced from Ruo Zhi Ba, a Chinese Reddit-like platform known for sharing
intellectually challenging and culturally specific jokes. We test ten LLMs
through direct and chain-of-thought prompting, revealing that Chumor poses
significant challenges to existing LLMs: their accuracy is only slightly above
random and far below human performance. In addition, our analysis shows that
human-annotated humor explanations are significantly better than those
generated by GPT-4o and ERNIE-4-turbo. We release Chumor at
https://huggingface.co/datasets/dnaihao/Chumor; our project page is at
https://dnaihao.github.io/Chumor-dataset/, our leaderboard is at
https://huggingface.co/spaces/dnaihao/Chumor, and our codebase is at
https://github.com/dnaihao/Chumor-dataset.
DOI: 10.48550/arxiv.2412.17729