Benchmarking LLMs for Translating Classical Chinese Poetry: Evaluating Adequacy, Fluency, and Elegance

Large language models (LLMs) have shown remarkable performance in translation tasks. However, there is an increasing demand for high-quality translations that are not only adequate but also fluent and elegant. To evaluate the extent to which current LLMs can meet these demands, we introduce a suitable benchmark (PoetMT) for translating classical Chinese poetry into English. This task requires not only adequacy in translating culturally and historically significant content but also strict adherence to linguistic fluency and poetic elegance. To overcome the limitations of traditional evaluation metrics, we propose an automatic evaluation metric based on GPT-4, which better evaluates translation quality in terms of adequacy, fluency, and elegance. Our evaluation study reveals that existing large language models fall short in this task. To address these issues, we propose RAT, a Retrieval-Augmented machine Translation method that enhances the translation process by incorporating knowledge related to classical poetry. Our dataset and code will be made available.
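
The record gives no implementation details, but the evaluation idea named in the abstract (prompting GPT-4 to judge a translation's adequacy, fluency, and elegance) can be illustrated with a minimal sketch. The prompt wording, the 1-5 scale, and the score_translation helper below are hypothetical and not taken from the paper; only the use of GPT-4 as the judge is grounded in the abstract.

```python
# Minimal sketch of a GPT-4-based judge for adequacy, fluency, and elegance.
# Assumes the `openai` Python package (v1+) and an OPENAI_API_KEY in the environment;
# the rubric wording and 1-5 scale below are illustrative, not the paper's metric.
from openai import OpenAI

client = OpenAI()

def score_translation(source_poem: str, translation: str) -> str:
    """Ask GPT-4 to rate one translation on three dimensions (hypothetical rubric)."""
    prompt = (
        "You are evaluating an English translation of a classical Chinese poem.\n"
        f"Source poem:\n{source_poem}\n\n"
        f"Translation:\n{translation}\n\n"
        "Rate the translation from 1 (poor) to 5 (excellent) on three dimensions:\n"
        "adequacy (faithfulness to the source), fluency (natural English), and\n"
        "elegance (poetic quality). Reply with one line per dimension."
    )
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    print(score_translation(
        "床前明月光，疑是地上霜。",
        "Before my bed the moonlight glows, like frost upon the ground.",
    ))
```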


Bibliographic Details
Published in: arXiv.org 2024-10
Main authors: Chen, Andong; Lianzhang Lou; Chen, Kehai; Bai, Xuefeng; Yang, Xiang; Yang, Muyun; Zhao, Tiejun; Zhang, Min
Format: Article
Language: English
Subjects: Adequacy; Large language models; Translating; Translations
Online access: Full text
creator Chen, Andong
Lianzhang Lou
Chen, Kehai
Bai, Xuefeng
Yang, Xiang
Yang, Muyun
Zhao, Tiejun
Zhang, Min
description Large language models (LLMs) have shown remarkable performance in translation tasks. However, there is an increasing demand for high-quality translations that are not only adequate but also fluent and elegant. To evaluate the extent to which current LLMs can meet these demands, we introduce a suitable benchmark (PoetMT) for translating classical Chinese poetry into English. This task requires not only adequacy in translating culturally and historically significant content but also strict adherence to linguistic fluency and poetic elegance. To overcome the limitations of traditional evaluation metrics, we propose an automatic evaluation metric based on GPT-4, which better evaluates translation quality in terms of adequacy, fluency, and elegance. Our evaluation study reveals that existing large language models fall short in this task. To address these issues, we propose RAT, a Retrieval-Augmented machine Translation method that enhances the translation process by incorporating knowledge related to classical poetry. Our dataset and code will be made available.
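
RAT is only named in this record; its retrieval corpus, prompt design, and underlying translation model are not described here. As a rough illustration of the general retrieval-augmented pattern the description implies, the sketch below looks up a background note about a poem and prepends it to a translation prompt. The POETRY_NOTES entries, the character-overlap retriever, and build_rat_prompt are all hypothetical.

```python
# Hypothetical sketch of a retrieval-augmented translation prompt in the spirit of
# the RAT method named above; the corpus, retriever, and prompt text are assumptions.

# Tiny stand-in knowledge base: notes about classical poems keyed by a source excerpt.
POETRY_NOTES = {
    "床前明月光": "Opening of Li Bai's 'Quiet Night Thoughts'; the moonlight evokes homesickness.",
    "春眠不觉晓": "Opening of Meng Haoran's 'Spring Dawn'; the poem describes oversleeping on a spring morning.",
}

def retrieve_note(poem: str) -> str:
    """Return the note whose key shares the most characters with the poem (naive retrieval)."""
    def overlap(key: str) -> int:
        return len(set(key) & set(poem))
    best_key = max(POETRY_NOTES, key=overlap)
    return POETRY_NOTES[best_key] if overlap(best_key) > 0 else ""

def build_rat_prompt(poem: str) -> str:
    """Assemble a translation prompt that includes the retrieved background knowledge."""
    note = retrieve_note(poem)
    knowledge = f"Background knowledge: {note}\n" if note else ""
    return (
        knowledge
        + "Translate the following classical Chinese poem into fluent, elegant English:\n"
        + poem
    )

if __name__ == "__main__":
    print(build_rat_prompt("床前明月光，疑是地上霜。"))
```

Running the script only prints the assembled prompt; in a full pipeline that prompt would then be sent to a translation-capable LLM.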
format Article
fulltext fulltext
identifier EISSN: 2331-8422
ispartof arXiv.org, 2024-10
issn 2331-8422
language eng
recordid cdi_proquest_journals_3094932998
source Free E-Journals
subjects Adequacy
Large language models
Translating
Translations
title Benchmarking LLMs for Translating Classical Chinese Poetry: Evaluating Adequacy, Fluency, and Elegance
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-12T04%3A47%3A08IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=Benchmarking%20LLMs%20for%20Translating%20Classical%20Chinese%20Poetry:Evaluating%20Adequacy,%20Fluency,%20and%20Elegance&rft.jtitle=arXiv.org&rft.au=Chen,%20Andong&rft.date=2024-10-17&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E3094932998%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3094932998&rft_id=info:pmid/&rfr_iscdi=true