How You Prompt Matters! Even Task-Oriented Constraints in Instructions Affect LLM-Generated Text Detection

Bibliographic Details
Published in: arXiv.org, 2024-06 (submitted 2024-06-12)
Main authors: Koike, Ryuto; Kaneko, Masahiro; Okazaki, Naoaki
Format: Article
Language: English
Subjects: Detectors; Large language models; Sensors; Texts
Publisher: Ithaca: Cornell University Library, arXiv.org
EISSN: 2331-8422
Online access: Full text (open access)
Description

To combat the misuse of Large Language Models (LLMs), many recent studies have presented LLM-generated-text detectors with promising performance. When users instruct LLMs to generate texts, the instruction can include different constraints depending on the user's need. However, most recent studies do not cover such diverse instruction patterns when creating datasets for LLM detection. In this paper, we reveal that even task-oriented constraints -- constraints that would naturally be included in an instruction and are not related to detection-evasion -- cause existing powerful detectors to have a large variance in detection performance. We focus on student essay writing as a realistic domain and manually create task-oriented constraints based on several factors for essay quality. Our experiments show that the standard deviation (SD) of current detector performance on texts generated by an instruction with such a constraint is significantly larger (up to an SD of 14.4 F1-score) than that by generating texts multiple times or paraphrasing the instruction. We also observe an overall trend where the constraints can make LLM detection more challenging than without them. Finally, our analysis indicates that the high instruction-following ability of LLMs fosters the large impact of such constraints on detection performance.
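
The abstract's central measurement is the spread (standard deviation) of a detector's F1-score across prompting conditions: instructions augmented with task-oriented constraints, regenerating from the same instruction, and paraphrasing the instruction. The minimal Python sketch below illustrates that comparison only; the per-run F1-scores and the example constraints are made-up placeholders, not the paper's detectors, prompts, or results.

    # Sketch: compare F1-score spread across three prompting conditions.
    # All numbers are hypothetical placeholders for illustration; the paper
    # reports constraint-induced SDs of up to 14.4 F1 for real detectors.
    #
    # Example task-oriented constraints (illustrative, not the paper's list):
    #   "Support your argument with concrete examples."
    #   "Keep the essay under 300 words."
    from statistics import stdev

    # Hypothetical F1-scores (percent) for one detector, five runs per condition.
    f1_by_condition = {
        "task_oriented_constraints": [78.0, 55.0, 71.0, 49.0, 66.0],
        "regeneration_same_prompt":  [70.0, 68.5, 69.0, 71.0, 69.5],
        "paraphrased_instruction":   [68.0, 72.0, 70.5, 69.0, 71.5],
    }

    for condition, scores in f1_by_condition.items():
        # Sample standard deviation of detection performance per condition.
        print(f"{condition}: SD = {stdev(scores):.1f} F1")

Under the paper's finding, the first condition's spread dwarfs the other two: benign, task-oriented constraints alone can swing detection performance far more than simply regenerating or paraphrasing the same instruction.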