Benchmarking LLMs for Translating Classical Chinese Poetry: Evaluating Adequacy, Fluency, and Elegance

Large language models (LLMs) have shown remarkable performance in translation tasks. However, there is an increasing demand for high-quality translations that are not only adequate but also fluent and elegant. To evaluate the extent to which current LLMs can meet these demands, we introduce a suitable benchmark (PoetMT) for translating classical Chinese poetry into English. This task requires not only adequacy in translating culturally and historically significant content but also strict adherence to linguistic fluency and poetic elegance. To overcome the limitations of traditional evaluation metrics, we propose an automatic evaluation metric based on GPT-4, which better evaluates translation quality in terms of adequacy, fluency, and elegance. Our evaluation study reveals that existing large language models fall short in this task. To address these issues, we propose RAT, a Retrieval-Augmented machine Translation method that enhances the translation process by incorporating knowledge related to classical poetry. Our dataset and code will be made available.
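
The record gives no implementation details, but the evaluation idea named in the abstract (prompting GPT-4 to judge a translation's adequacy, fluency, and elegance) can be illustrated with a minimal sketch. The prompt wording, the 1-5 scale, and the score_translation helper below are hypothetical and not taken from the paper; only the use of GPT-4 as the judge is grounded in the abstract.

```python
# Minimal sketch of a GPT-4-based judge for adequacy, fluency, and elegance.
# Assumes the `openai` Python package (v1+) and an OPENAI_API_KEY in the environment;
# the rubric wording and 1-5 scale below are illustrative, not the paper's metric.
from openai import OpenAI

client = OpenAI()

def score_translation(source_poem: str, translation: str) -> str:
    """Ask GPT-4 to rate one translation on three dimensions (hypothetical rubric)."""
    prompt = (
        "You are evaluating an English translation of a classical Chinese poem.\n"
        f"Source poem:\n{source_poem}\n\n"
        f"Translation:\n{translation}\n\n"
        "Rate the translation from 1 (poor) to 5 (excellent) on three dimensions:\n"
        "adequacy (faithfulness to the source), fluency (natural English), and\n"
        "elegance (poetic quality). Reply with one line per dimension."
    )
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    print(score_translation(
        "床前明月光，疑是地上霜。",
        "Before my bed the moonlight glows, like frost upon the ground.",
    ))
```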


Bibliographic Details
Published in: arXiv.org 2024-10
Main authors: Chen, Andong; Lianzhang Lou; Chen, Kehai; Bai, Xuefeng; Yang, Xiang; Yang, Muyun; Zhao, Tiejun; Zhang, Min
Format: Article
Language: English
Subjects: Adequacy; Large language models; Translating; Translations
Online access: Full text
creator Chen, Andong
Lianzhang Lou
Chen, Kehai
Bai, Xuefeng
Yang, Xiang
Yang, Muyun
Zhao, Tiejun
Zhang, Min
description Large language models (LLMs) have shown remarkable performance in translation tasks. However, there is an increasing demand for high-quality translations that are not only adequate but also fluent and elegant. To evaluate the extent to which current LLMs can meet these demands, we introduce a suitable benchmark (PoetMT) for translating classical Chinese poetry into English. This task requires not only adequacy in translating culturally and historically significant content but also strict adherence to linguistic fluency and poetic elegance. To overcome the limitations of traditional evaluation metrics, we propose an automatic evaluation metric based on GPT-4, which better evaluates translation quality in terms of adequacy, fluency, and elegance. Our evaluation study reveals that existing large language models fall short in this task. To address these issues, we propose RAT, a Retrieval-Augmented machine Translation method that enhances the translation process by incorporating knowledge related to classical poetry. Our dataset and code will be made available.
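
RAT is only named in this record; its retrieval corpus, prompt design, and underlying translation model are not described here. As a rough illustration of the general retrieval-augmented pattern the description implies, the sketch below looks up a background note about a poem and prepends it to a translation prompt. The POETRY_NOTES entries, the character-overlap retriever, and build_rat_prompt are all hypothetical.

```python
# Hypothetical sketch of a retrieval-augmented translation prompt in the spirit of
# the RAT method named above; the corpus, retriever, and prompt text are assumptions.

# Tiny stand-in knowledge base: notes about classical poems keyed by a source excerpt.
POETRY_NOTES = {
    "床前明月光": "Opening of Li Bai's 'Quiet Night Thoughts'; the moonlight evokes homesickness.",
    "春眠不觉晓": "Opening of Meng Haoran's 'Spring Dawn'; the poem describes oversleeping on a spring morning.",
}

def retrieve_note(poem: str) -> str:
    """Return the note whose key shares the most characters with the poem (naive retrieval)."""
    def overlap(key: str) -> int:
        return len(set(key) & set(poem))
    best_key = max(POETRY_NOTES, key=overlap)
    return POETRY_NOTES[best_key] if overlap(best_key) > 0 else ""

def build_rat_prompt(poem: str) -> str:
    """Assemble a translation prompt that includes the retrieved background knowledge."""
    note = retrieve_note(poem)
    knowledge = f"Background knowledge: {note}\n" if note else ""
    return (
        knowledge
        + "Translate the following classical Chinese poem into fluent, elegant English:\n"
        + poem
    )

if __name__ == "__main__":
    print(build_rat_prompt("床前明月光，疑是地上霜。"))
```

Running the script only prints the assembled prompt; in a full pipeline that prompt would then be sent to a translation-capable LLM.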
format Article
fulltext fulltext
identifier EISSN: 2331-8422
ispartof arXiv.org, 2024-10
issn 2331-8422
language eng
recordid cdi_proquest_journals_3094932998
source Free E-Journals
subjects Adequacy
Large language models
Translating
Translations
title Benchmarking LLMs for Translating Classical Chinese Poetry: Evaluating Adequacy, Fluency, and Elegance
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-12T04%3A47%3A08IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=Benchmarking%20LLMs%20for%20Translating%20Classical%20Chinese%20Poetry:Evaluating%20Adequacy,%20Fluency,%20and%20Elegance&rft.jtitle=arXiv.org&rft.au=Chen,%20Andong&rft.date=2024-10-17&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E3094932998%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3094932998&rft_id=info:pmid/&rfr_iscdi=true