Benchmarking LLMs for Translating Classical Chinese Poetry: Evaluating Adequacy, Fluency, and Elegance
Large language models (LLMs) have shown remarkable performance in translation tasks. However, there is an increasing demand for high-quality translations that are not only adequate but also fluent and elegant. To evaluate the extent to which current LLMs can meet these demands, we introduce a suitable bench...
Saved in:
Published in: | arXiv.org 2024-10 |
---|---|
Main authors: | Chen, Andong; Lianzhang Lou; Chen, Kehai; Bai, Xuefeng; Yang, Xiang; Yang, Muyun; Zhao, Tiejun; Zhang, Min |
Format: | Article |
Language: | eng |
Subjects: | Adequacy; Large language models; Translating; Translations |
Online access: | Full text |
container_title | arXiv.org |
creator | Chen, Andong; Lianzhang Lou; Chen, Kehai; Bai, Xuefeng; Yang, Xiang; Yang, Muyun; Zhao, Tiejun; Zhang, Min |
description | Large language models (LLMs) have shown remarkable performance in translation tasks. However, there is an increasing demand for high-quality translations that are not only adequate but also fluent and elegant. To evaluate the extent to which current LLMs can meet these demands, we introduce a suitable benchmark (PoetMT) for translating classical Chinese poetry into English. This task requires not only adequacy in translating culturally and historically significant content but also strict adherence to linguistic fluency and poetic elegance. To overcome the limitations of traditional evaluation metrics, we propose an automatic evaluation metric based on GPT-4, which better evaluates translation quality in terms of adequacy, fluency, and elegance. Our evaluation study reveals that existing large language models fall short in this task. To address these issues, we propose RAT, a Retrieval-Augmented machine Translation method that enhances the translation process by incorporating knowledge related to classical poetry. Our dataset and code will be made available. |
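The abstract describes the GPT-4-based metric only at a high level. A minimal sketch of how such a judge-style metric is typically wired up, with prompt wording, the 1-5 scale, and the reply format all being illustrative assumptions rather than the paper's actual protocol (the model call itself is omitted; only prompt construction and score parsing are shown):

```python
import re

# The three dimensions the PoetMT metric scores, per the abstract.
DIMENSIONS = ("adequacy", "fluency", "elegance")

def build_judge_prompt(source: str, translation: str) -> str:
    """Build a hypothetical judge prompt asking an LLM to rate a
    translation on each dimension from 1 to 5."""
    criteria = ", ".join(DIMENSIONS)
    return (
        f"Rate this English translation of a classical Chinese poem "
        f"on {criteria} (1-5 each).\n"
        f"Source: {source}\n"
        f"Translation: {translation}\n"
        f"Reply as 'adequacy: N, fluency: N, elegance: N'."
    )

def parse_scores(reply: str) -> dict:
    """Extract per-dimension integer scores from the judge's reply."""
    scores = {}
    for dim in DIMENSIONS:
        m = re.search(rf"{dim}\s*[:=]\s*([1-5])", reply, re.IGNORECASE)
        if m:
            scores[dim] = int(m.group(1))
    return scores

reply = "adequacy: 4, fluency: 5, elegance: 3"
print(parse_scores(reply))  # {'adequacy': 4, 'fluency': 5, 'elegance': 3}
```

Parsing the reply with a tolerant regex rather than strict string matching is a common defensive choice here, since judge models do not always reproduce the requested format exactly.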
format | Article |
fullrecord | ProQuest record 3094932998 (published 2024-10-17; Cornell University Library, arXiv.org, Ithaca; EISSN 2331-8422; open access) |
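The RAT method is likewise described only as "incorporating knowledge related to classical poetry" into the translation process. A minimal retrieval-augmented prompting sketch under that description, where the knowledge base, the word-overlap retrieval, and the prompt wording are all illustrative assumptions, not the paper's implementation:

```python
import re

# Hypothetical knowledge base of classical-poetry background notes.
KNOWLEDGE_BASE = [
    "The 'jade stairs' motif conventionally signals a neglected palace lady.",
    "Autumn geese are a classical emblem of letters from a distant loved one.",
    "Willow branches, broken at parting, pun on 'liu' (stay) in farewell poems.",
]

def _tokens(text: str) -> set:
    """Lowercase word tokens, punctuation stripped."""
    return set(re.findall(r"[a-z]+", text.lower()))

def retrieve(query: str, kb: list, k: int = 1) -> list:
    """Rank notes by word overlap with the query; return the top k."""
    q = _tokens(query)
    ranked = sorted(kb, key=lambda note: len(q & _tokens(note)), reverse=True)
    return ranked[:k]

def build_translation_prompt(poem: str, gloss: str) -> str:
    """Assemble a translation prompt augmented with retrieved notes."""
    notes = retrieve(gloss, KNOWLEDGE_BASE)
    context = "\n".join(f"- {n}" for n in notes)
    return (
        "Translate this classical Chinese poem into fluent, elegant English.\n"
        f"Background notes:\n{context}\n"
        f"Poem: {poem}\n"
    )

prompt = build_translation_prompt("玉階生白露", "white dew rises on the jade stairs")
```

A production system would use dense embeddings over a real corpus of annotations rather than word overlap, but the shape of the pipeline, retrieve then condition the translator on the retrieved knowledge, is the same.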
fulltext | fulltext |
identifier | EISSN: 2331-8422 |
ispartof | arXiv.org, 2024-10 |
issn | 2331-8422 |
language | eng |
recordid | cdi_proquest_journals_3094932998 |
source | Free E-Journals |
subjects | Adequacy; Large language models; Translating; Translations |
title | Benchmarking LLMs for Translating Classical Chinese Poetry: Evaluating Adequacy, Fluency, and Elegance |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-12T04%3A47%3A08IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=Benchmarking%20LLMs%20for%20Translating%20Classical%20Chinese%20Poetry:Evaluating%20Adequacy,%20Fluency,%20and%20Elegance&rft.jtitle=arXiv.org&rft.au=Chen,%20Andong&rft.date=2024-10-17&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E3094932998%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3094932998&rft_id=info:pmid/&rfr_iscdi=true |