Chumor 2.0: Towards Benchmarking Chinese Humor Understanding

Existing humor datasets and evaluations predominantly focus on English, leaving limited resources for culturally nuanced humor in non-English languages like Chinese. To address this gap, we construct Chumor, the first Chinese humor explanation dataset that exceeds the size of existing humor datasets...

Bibliographic details
Main authors: He, Ruiqi; He, Yushu; Bai, Longju; Liu, Jiarui; Sun, Zhenjie; Tang, Zenghao; Wang, He; Xia, Hanchen; Mihalcea, Rada; Deng, Naihao
Format: Article
Language: eng
creator He, Ruiqi
He, Yushu
Bai, Longju
Liu, Jiarui
Sun, Zhenjie
Tang, Zenghao
Wang, He
Xia, Hanchen
Mihalcea, Rada
Deng, Naihao
description Existing humor datasets and evaluations predominantly focus on English, leaving limited resources for culturally nuanced humor in non-English languages like Chinese. To address this gap, we construct Chumor, the first Chinese humor explanation dataset that exceeds the size of existing humor datasets. Chumor is sourced from Ruo Zhi Ba, a Chinese Reddit-like platform known for sharing intellectually challenging and culturally specific jokes. We test ten LLMs through direct and chain-of-thought prompting, revealing that Chumor poses significant challenges to existing LLMs, with their accuracy slightly above random and far below human. In addition, our analysis highlights that human-annotated humor explanations are significantly better than those generated by GPT-4o and ERNIE-4-turbo. We release Chumor at https://huggingface.co/datasets/dnaihao/Chumor, our project page is at https://dnaihao.github.io/Chumor-dataset/, our leaderboard is at https://huggingface.co/spaces/dnaihao/Chumor, and our codebase is at https://github.com/dnaihao/Chumor-dataset.
doi_str_mv 10.48550/arxiv.2412.17729
format Article
fullrecord <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2412_17729</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2412_17729</sourcerecordid><originalsourceid>FETCH-arxiv_primary_2412_177293</originalsourceid><addsrcrecordid>eNpjYJA0NNAzsTA1NdBPLKrILNMzMjE00jM0Nzey5GSwcc4ozc0vUjDSM7BSCMkvTyxKKVZwSs1LzshNLMrOzEtXcM7IzEstTlXwAKsLzUtJLSouScxLAcrxMLCmJeYUp_JCaW4GeTfXEGcPXbA98QVFmUBDKuNB9sWD7TMmrAIAnto0ng</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Chumor 2.0: Towards Benchmarking Chinese Humor Understanding</title><source>arXiv.org</source><creator>He, Ruiqi ; He, Yushu ; Bai, Longju ; Liu, Jiarui ; Sun, Zhenjie ; Tang, Zenghao ; Wang, He ; Xia, Hanchen ; Mihalcea, Rada ; Deng, Naihao</creator><creatorcontrib>He, Ruiqi ; He, Yushu ; Bai, Longju ; Liu, Jiarui ; Sun, Zhenjie ; Tang, Zenghao ; Wang, He ; Xia, Hanchen ; Mihalcea, Rada ; Deng, Naihao</creatorcontrib><description>Existing humor datasets and evaluations predominantly focus on English, leaving limited resources for culturally nuanced humor in non-English languages like Chinese. To address this gap, we construct Chumor, the first Chinese humor explanation dataset that exceeds the size of existing humor datasets. Chumor is sourced from Ruo Zhi Ba, a Chinese Reddit-like platform known for sharing intellectually challenging and culturally specific jokes. We test ten LLMs through direct and chain-of-thought prompting, revealing that Chumor poses significant challenges to existing LLMs, with their accuracy slightly above random and far below human. In addition, our analysis highlights that human-annotated humor explanations are significantly better than those generated by GPT-4o and ERNIE-4-turbo. 
We release Chumor at https://huggingface.co/datasets/dnaihao/Chumor, our project page is at https://dnaihao.github.io/Chumor-dataset/, our leaderboard is at https://huggingface.co/spaces/dnaihao/Chumor, and our codebase is at https://github.com/dnaihao/Chumor-dataset.</description><identifier>DOI: 10.48550/arxiv.2412.17729</identifier><language>eng</language><subject>Computer Science - Artificial Intelligence ; Computer Science - Computation and Language</subject><creationdate>2024-12</creationdate><rights>http://creativecommons.org/licenses/by-nc-sa/4.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,777,882</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2412.17729$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2412.17729$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>He, Ruiqi</creatorcontrib><creatorcontrib>He, Yushu</creatorcontrib><creatorcontrib>Bai, Longju</creatorcontrib><creatorcontrib>Liu, Jiarui</creatorcontrib><creatorcontrib>Sun, Zhenjie</creatorcontrib><creatorcontrib>Tang, Zenghao</creatorcontrib><creatorcontrib>Wang, He</creatorcontrib><creatorcontrib>Xia, Hanchen</creatorcontrib><creatorcontrib>Mihalcea, Rada</creatorcontrib><creatorcontrib>Deng, Naihao</creatorcontrib><title>Chumor 2.0: Towards Benchmarking Chinese Humor Understanding</title><description>Existing humor datasets and evaluations predominantly focus on English, leaving limited resources for culturally nuanced humor in non-English languages like Chinese. To address this gap, we construct Chumor, the first Chinese humor explanation dataset that exceeds the size of existing humor datasets. 
Chumor is sourced from Ruo Zhi Ba, a Chinese Reddit-like platform known for sharing intellectually challenging and culturally specific jokes. We test ten LLMs through direct and chain-of-thought prompting, revealing that Chumor poses significant challenges to existing LLMs, with their accuracy slightly above random and far below human. In addition, our analysis highlights that human-annotated humor explanations are significantly better than those generated by GPT-4o and ERNIE-4-turbo. We release Chumor at https://huggingface.co/datasets/dnaihao/Chumor, our project page is at https://dnaihao.github.io/Chumor-dataset/, our leaderboard is at https://huggingface.co/spaces/dnaihao/Chumor, and our codebase is at https://github.com/dnaihao/Chumor-dataset.</description><subject>Computer Science - Artificial Intelligence</subject><subject>Computer Science - Computation and Language</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNpjYJA0NNAzsTA1NdBPLKrILNMzMjE00jM0Nzey5GSwcc4ozc0vUjDSM7BSCMkvTyxKKVZwSs1LzshNLMrOzEtXcM7IzEstTlXwAKsLzUtJLSouScxLAcrxMLCmJeYUp_JCaW4GeTfXEGcPXbA98QVFmUBDKuNB9sWD7TMmrAIAnto0ng</recordid><startdate>20241223</startdate><enddate>20241223</enddate><creator>He, Ruiqi</creator><creator>He, Yushu</creator><creator>Bai, Longju</creator><creator>Liu, Jiarui</creator><creator>Sun, Zhenjie</creator><creator>Tang, Zenghao</creator><creator>Wang, He</creator><creator>Xia, Hanchen</creator><creator>Mihalcea, Rada</creator><creator>Deng, Naihao</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20241223</creationdate><title>Chumor 2.0: Towards Benchmarking Chinese Humor Understanding</title><author>He, Ruiqi ; He, Yushu ; Bai, Longju ; Liu, Jiarui ; Sun, Zhenjie ; Tang, Zenghao ; Wang, He ; Xia, Hanchen ; Mihalcea, Rada ; Deng, 
Naihao</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-arxiv_primary_2412_177293</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Computer Science - Artificial Intelligence</topic><topic>Computer Science - Computation and Language</topic><toplevel>online_resources</toplevel><creatorcontrib>He, Ruiqi</creatorcontrib><creatorcontrib>He, Yushu</creatorcontrib><creatorcontrib>Bai, Longju</creatorcontrib><creatorcontrib>Liu, Jiarui</creatorcontrib><creatorcontrib>Sun, Zhenjie</creatorcontrib><creatorcontrib>Tang, Zenghao</creatorcontrib><creatorcontrib>Wang, He</creatorcontrib><creatorcontrib>Xia, Hanchen</creatorcontrib><creatorcontrib>Mihalcea, Rada</creatorcontrib><creatorcontrib>Deng, Naihao</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>He, Ruiqi</au><au>He, Yushu</au><au>Bai, Longju</au><au>Liu, Jiarui</au><au>Sun, Zhenjie</au><au>Tang, Zenghao</au><au>Wang, He</au><au>Xia, Hanchen</au><au>Mihalcea, Rada</au><au>Deng, Naihao</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Chumor 2.0: Towards Benchmarking Chinese Humor Understanding</atitle><date>2024-12-23</date><risdate>2024</risdate><abstract>Existing humor datasets and evaluations predominantly focus on English, leaving limited resources for culturally nuanced humor in non-English languages like Chinese. To address this gap, we construct Chumor, the first Chinese humor explanation dataset that exceeds the size of existing humor datasets. Chumor is sourced from Ruo Zhi Ba, a Chinese Reddit-like platform known for sharing intellectually challenging and culturally specific jokes. 
We test ten LLMs through direct and chain-of-thought prompting, revealing that Chumor poses significant challenges to existing LLMs, with their accuracy slightly above random and far below human. In addition, our analysis highlights that human-annotated humor explanations are significantly better than those generated by GPT-4o and ERNIE-4-turbo. We release Chumor at https://huggingface.co/datasets/dnaihao/Chumor, our project page is at https://dnaihao.github.io/Chumor-dataset/, our leaderboard is at https://huggingface.co/spaces/dnaihao/Chumor, and our codebase is at https://github.com/dnaihao/Chumor-dataset.</abstract><doi>10.48550/arxiv.2412.17729</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.2412.17729
language eng
recordid cdi_arxiv_primary_2412_17729
source arXiv.org
subjects Computer Science - Artificial Intelligence
Computer Science - Computation and Language
title Chumor 2.0: Towards Benchmarking Chinese Humor Understanding
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-18T10%3A03%3A16IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Chumor%202.0:%20Towards%20Benchmarking%20Chinese%20Humor%20Understanding&rft.au=He,%20Ruiqi&rft.date=2024-12-23&rft_id=info:doi/10.48550/arxiv.2412.17729&rft_dat=%3Carxiv_GOX%3E2412_17729%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true