PatentEval: Understanding Errors in Patent Generation

NAACL2024 - 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Jun 2024, Mexico City, Mexico In this work, we introduce a comprehensive error typology specifically designed for evaluating two distinct tasks in machine-generated patent texts: claims...

Full description

Bibliographic details
Main authors:	Zuo, You ; Gerdes, Kim ; de La Clergerie, Eric Villemonte ; Sagot, Benoît
Format:	Article
Language:	eng
Subjects:	Computer Science - Artificial Intelligence ; Computer Science - Computation and Language

container_end_page
container_issue
container_start_page
container_title
container_volume
creator	Zuo, You ; Gerdes, Kim ; de La Clergerie, Eric Villemonte ; Sagot, Benoît
description	NAACL2024 - 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Jun 2024, Mexico City, Mexico. In this work, we introduce a comprehensive error typology specifically designed for evaluating two distinct tasks in machine-generated patent texts: claims-to-abstract generation, and the generation of the next claim given previous ones. We have also developed a benchmark, PatentEval, for systematically assessing language models in this context. Our study includes a comparative analysis, annotated by humans, of various models. These range from those specifically adapted during training for tasks within the patent domain to the latest general-purpose large language models (LLMs). Furthermore, we explored and evaluated some metrics to approximate human judgments in patent text evaluation, analyzing the extent to which these metrics align with expert assessments. These approaches provide valuable insights into the capabilities and limitations of current language models in the specialized field of patent text generation.
doi_str_mv	10.48550/arxiv.2406.06589
format	Article
fullrecord	<record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2406_06589</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2406_06589</sourcerecordid><originalsourceid>FETCH-LOGICAL-a679-98b3cc9130aea4b47a7f334add9f13f9c1003aea5a620f0a60c3e8e489cb1d663</originalsourceid><addsrcrecordid>eNotzrtqAzEUBFA1KYLjD0gV_cBurnwlrZQumPUDDEnh1MtdPYzAloN2Mcnf-1lNMcNwGHsVUEujFLxT-UuneiZB16CVsc9MfdMY8tieaP_Bf7IPZRgp-5R3vC3lWAaeMr9v-DLkUGhMx_zCniLthzB95IRtF-12vqo2X8v1_HNTkW5sZU2PzlmBQIFkLxtqIqIk720UGK0TAHipFOkZRCANDoMJ0ljXC681Ttjb_fbm7n5LOlD5767-7ubHM9cqQD8</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>PatentEval: Understanding Errors in Patent Generation</title><source>arXiv.org</source><creator>Zuo, You ; Gerdes, Kim ; de La Clergerie, Eric Villemonte ; Sagot, Benoît</creator><creatorcontrib>Zuo, You ; Gerdes, Kim ; de La Clergerie, Eric Villemonte ; Sagot, Benoît</creatorcontrib><description>NAACL2024 - 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Jun 2024, Mexico City, Mexico In this work, we introduce a comprehensive error typology specifically designed for evaluating two distinct tasks in machine-generated patent texts: claims-to-abstract generation, and the generation of the next claim given previous ones. We have also developed a benchmark, PatentEval, for systematically assessing language models in this context. Our study includes a comparative analysis, annotated by humans, of various models. These range from those specifically adapted during training for tasks within the patent domain to the latest general-purpose large language models (LLMs). Furthermore, we explored and evaluated some metrics to approximate human judgments in patent text evaluation, analyzing the extent to which these metrics align with expert assessments. 
These approaches provide valuable insights into the capabilities and limitations of current language models in the specialized field of patent text generation.</description><identifier>DOI: 10.48550/arxiv.2406.06589</identifier><language>eng</language><subject>Computer Science - Artificial Intelligence ; Computer Science - Computation and Language</subject><creationdate>2024-06</creationdate><rights>http://arxiv.org/licenses/nonexclusive-distrib/1.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,776,881</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2406.06589$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2406.06589$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Zuo, You</creatorcontrib><creatorcontrib>Gerdes, Kim</creatorcontrib><creatorcontrib>de La Clergerie, Eric Villemonte</creatorcontrib><creatorcontrib>Sagot, Benoît</creatorcontrib><title>PatentEval: Understanding Errors in Patent Generation</title><description>NAACL2024 - 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Jun 2024, Mexico City, Mexico In this work, we introduce a comprehensive error typology specifically designed for evaluating two distinct tasks in machine-generated patent texts: claims-to-abstract generation, and the generation of the next claim given previous ones. We have also developed a benchmark, PatentEval, for systematically assessing language models in this context. Our study includes a comparative analysis, annotated by humans, of various models. 
These range from those specifically adapted during training for tasks within the patent domain to the latest general-purpose large language models (LLMs). Furthermore, we explored and evaluated some metrics to approximate human judgments in patent text evaluation, analyzing the extent to which these metrics align with expert assessments. These approaches provide valuable insights into the capabilities and limitations of current language models in the specialized field of patent text generation.</description><subject>Computer Science - Artificial Intelligence</subject><subject>Computer Science - Computation and Language</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNotzrtqAzEUBFA1KYLjD0gV_cBurnwlrZQumPUDDEnh1MtdPYzAloN2Mcnf-1lNMcNwGHsVUEujFLxT-UuneiZB16CVsc9MfdMY8tieaP_Bf7IPZRgp-5R3vC3lWAaeMr9v-DLkUGhMx_zCniLthzB95IRtF-12vqo2X8v1_HNTkW5sZU2PzlmBQIFkLxtqIqIk720UGK0TAHipFOkZRCANDoMJ0ljXC681Ttjb_fbm7n5LOlD5767-7ubHM9cqQD8</recordid><startdate>20240605</startdate><enddate>20240605</enddate><creator>Zuo, You</creator><creator>Gerdes, Kim</creator><creator>de La Clergerie, Eric Villemonte</creator><creator>Sagot, Benoît</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20240605</creationdate><title>PatentEval: Understanding Errors in Patent Generation</title><author>Zuo, You ; Gerdes, Kim ; de La Clergerie, Eric Villemonte ; Sagot, Benoît</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a679-98b3cc9130aea4b47a7f334add9f13f9c1003aea5a620f0a60c3e8e489cb1d663</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Computer Science - Artificial Intelligence</topic><topic>Computer Science - Computation and Language</topic><toplevel>online_resources</toplevel><creatorcontrib>Zuo, You</creatorcontrib><creatorcontrib>Gerdes, 
Kim</creatorcontrib><creatorcontrib>de La Clergerie, Eric Villemonte</creatorcontrib><creatorcontrib>Sagot, Benoît</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Zuo, You</au><au>Gerdes, Kim</au><au>de La Clergerie, Eric Villemonte</au><au>Sagot, Benoît</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>PatentEval: Understanding Errors in Patent Generation</atitle><date>2024-06-05</date><risdate>2024</risdate><abstract>NAACL2024 - 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Jun 2024, Mexico City, Mexico In this work, we introduce a comprehensive error typology specifically designed for evaluating two distinct tasks in machine-generated patent texts: claims-to-abstract generation, and the generation of the next claim given previous ones. We have also developed a benchmark, PatentEval, for systematically assessing language models in this context. Our study includes a comparative analysis, annotated by humans, of various models. These range from those specifically adapted during training for tasks within the patent domain to the latest general-purpose large language models (LLMs). Furthermore, we explored and evaluated some metrics to approximate human judgments in patent text evaluation, analyzing the extent to which these metrics align with expert assessments. These approaches provide valuable insights into the capabilities and limitations of current language models in the specialized field of patent text generation.</abstract><doi>10.48550/arxiv.2406.06589</doi><oa>free_for_read</oa></addata></record>
fulltext	fulltext_linktorsrc
identifier	DOI: 10.48550/arxiv.2406.06589
ispartof
issn
language	eng
recordid	cdi_arxiv_primary_2406_06589
source	arXiv.org
subjects	Computer Science - Artificial Intelligence ; Computer Science - Computation and Language
title	PatentEval: Understanding Errors in Patent Generation
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-13T11%3A35%3A08IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=PatentEval:%20Understanding%20Errors%20in%20Patent%20Generation&rft.au=Zuo,%20You&rft.date=2024-06-05&rft_id=info:doi/10.48550/arxiv.2406.06589&rft_dat=%3Carxiv_GOX%3E2406_06589%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true