Chumor 2.0: Towards Benchmarking Chinese Humor Understanding
Existing humor datasets and evaluations predominantly focus on English, leaving limited resources for culturally nuanced humor in non-English languages like Chinese. To address this gap, we construct Chumor, the first Chinese humor explanation dataset that exceeds the size of existing humor datasets. Chumor is sourced from Ruo Zhi Ba, a Chinese Reddit-like platform known for sharing intellectually challenging and culturally specific jokes. We test ten LLMs through direct and chain-of-thought prompting, revealing that Chumor poses significant challenges to existing LLMs, with their accuracy slightly above random and far below human. In addition, our analysis highlights that human-annotated humor explanations are significantly better than those generated by GPT-4o and ERNIE-4-turbo. We release Chumor at https://huggingface.co/datasets/dnaihao/Chumor, our project page is at https://dnaihao.github.io/Chumor-dataset/, our leaderboard is at https://huggingface.co/spaces/dnaihao/Chumor, and our codebase is at https://github.com/dnaihao/Chumor-dataset.
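The evaluation setup described in the abstract (direct vs. chain-of-thought prompting on joke explanations, scored by accuracy) can be sketched as follows. This is a minimal illustration only: the prompt wording, the binary yes/no framing, and the function names are assumptions for the sketch, not the authors' exact templates.

```python
# Illustrative sketch of the two prompting conditions from the abstract:
# direct prompting vs. chain-of-thought (CoT) prompting, plus the accuracy
# metric used to compare LLMs against the human baseline. All prompt text
# here is hypothetical, not the paper's actual template.

def build_prompt(joke: str, explanation: str, chain_of_thought: bool) -> str:
    """Format a judgment prompt for one joke/explanation pair."""
    prompt = (
        f"Joke: {joke}\n"
        f"Candidate explanation: {explanation}\n"
        "Is this explanation correct? Answer 'yes' or 'no'."
    )
    if chain_of_thought:
        # CoT condition: elicit step-by-step reasoning before the answer.
        prompt += "\nLet's think step by step before answering."
    return prompt

def accuracy(predictions: list[str], labels: list[str]) -> float:
    """Fraction of model judgments that match the gold labels."""
    correct = sum(p == g for p, g in zip(predictions, labels))
    return correct / len(labels)
```

Each LLM would be queried once per pair under each condition; the paper reports that the resulting accuracies are only slightly above random and far below human performance.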
Saved in:
Main authors: | He, Ruiqi; He, Yushu; Bai, Longju; Liu, Jiarui; Sun, Zhenjie; Tang, Zenghao; Wang, He; Xia, Hanchen; Mihalcea, Rada; Deng, Naihao |
---|---|
Format: | Article |
Language: | eng |
Subjects: | Computer Science - Artificial Intelligence; Computer Science - Computation and Language |
Online access: | Order full text |
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | He, Ruiqi; He, Yushu; Bai, Longju; Liu, Jiarui; Sun, Zhenjie; Tang, Zenghao; Wang, He; Xia, Hanchen; Mihalcea, Rada; Deng, Naihao |
description | Existing humor datasets and evaluations predominantly focus on English,
leaving limited resources for culturally nuanced humor in non-English languages
like Chinese. To address this gap, we construct Chumor, the first Chinese humor
explanation dataset that exceeds the size of existing humor datasets. Chumor is
sourced from Ruo Zhi Ba, a Chinese Reddit-like platform known for sharing
intellectually challenging and culturally specific jokes. We test ten LLMs
through direct and chain-of-thought prompting, revealing that Chumor poses
significant challenges to existing LLMs, with their accuracy slightly above
random and far below human. In addition, our analysis highlights that
human-annotated humor explanations are significantly better than those
generated by GPT-4o and ERNIE-4-turbo. We release Chumor at
https://huggingface.co/datasets/dnaihao/Chumor, our project page is at
https://dnaihao.github.io/Chumor-dataset/, our leaderboard is at
https://huggingface.co/spaces/dnaihao/Chumor, and our codebase is at
https://github.com/dnaihao/Chumor-dataset. |
doi_str_mv | 10.48550/arxiv.2412.17729 |
format | Article |
creationdate | 2024-12-23 |
rights | http://creativecommons.org/licenses/by-nc-sa/4.0 |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.2412.17729 |
ispartof | |
issn | |
language | eng |
recordid | cdi_arxiv_primary_2412_17729 |
source | arXiv.org |
subjects | Computer Science - Artificial Intelligence; Computer Science - Computation and Language |
title | Chumor 2.0: Towards Benchmarking Chinese Humor Understanding |