Imitate Before Detect: Aligning Machine Stylistic Preference for Machine-Revised Text Detection
Saved in:
Main authors: | Chen, Jiaqi; Zhu, Xiaoye; Liu, Tianyang; Chen, Ying; Chen, Xinhui; Yuan, Yiwen; Leong, Chak Tou; Li, Zuchao; Long, Tang; Zhang, Lei; Yan, Chenyu; Mei, Guanghao; Zhang, Jie; Zhang, Lefei |
---|---|
Format: | Article |
Language: | eng |
Subjects: | Computer Science - Artificial Intelligence; Computer Science - Computation and Language; Computer Science - Cryptography and Security |
Online access: | Order full text |
creator | Chen, Jiaqi; Zhu, Xiaoye; Liu, Tianyang; Chen, Ying; Chen, Xinhui; Yuan, Yiwen; Leong, Chak Tou; Li, Zuchao; Long, Tang; Zhang, Lei; Yan, Chenyu; Mei, Guanghao; Zhang, Jie; Zhang, Lefei |
description | Large Language Models (LLMs) have revolutionized text generation, making
machine-generated text increasingly difficult to detect. Although past methods
perform well at detecting purely machine-generated text, they perform poorly at
distinguishing machine-revised text (rewriting, expansion, and polishing),
which may differ only slightly from its original human prompt. Since the
content of such text may originate from human prompts, detecting machine-revised
text often comes down to identifying distinctive machine styles, e.g., wording
favored by LLMs. However, existing methods struggle to detect machine-style
phrasing hidden within the content contributed by humans. We propose the
"Imitate Before Detect" (ImBD) approach, which first imitates the machine-style
token distribution, and then compares the distribution of the text under test
with the machine-style distribution to determine whether the text has been
machine-revised. To this end, we introduce style preference optimization (SPO),
which aligns a scoring LLM to the style preferences of machine-generated text.
The aligned scoring model is then used to calculate the style-conditional
probability curvature (Style-CPC), which quantifies the log-probability
difference between the original and conditionally sampled texts for effective
detection. We conduct extensive comparisons across various scenarios,
encompassing text revisions by six LLMs, four distinct text domains, and three
machine revision types. Compared to existing state-of-the-art methods, our
method yields a 13% increase in AUC for detecting text revised by open-source
LLMs, and improves performance by 5% and 19% for detecting GPT-3.5- and
GPT-4o-revised text, respectively. Notably, our method surpasses the
commercially trained GPT-Zero with just 1,000 samples and five minutes of SPO,
demonstrating its efficiency and effectiveness. |
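The curvature-style score described in the abstract can be illustrated with a minimal sketch. This is a hypothetical reconstruction, not the paper's implementation: it assumes the Style-CPC statistic is, like other probability-curvature detectors, the gap between the original text's log-probability under the aligned scoring model and the mean log-probability of conditionally sampled texts, normalized by the samples' spread. The function name and normalization are assumptions for illustration only.

```python
import math

def style_cpc(logp_original: float, logp_samples: list[float]) -> float:
    """Hypothetical sketch of a conditional probability curvature score.

    logp_original: log-probability of the text under test, as assigned by
                   the (style-aligned) scoring model.
    logp_samples:  log-probabilities of texts conditionally sampled from
                   the same model.

    Returns the original's log-probability gap over the sample mean,
    normalized by the samples' standard deviation; a larger score
    suggests the text sits closer to the machine-style distribution.
    """
    mean = sum(logp_samples) / len(logp_samples)
    var = sum((s - mean) ** 2 for s in logp_samples) / len(logp_samples)
    std = math.sqrt(var) or 1.0  # guard against zero spread
    return (logp_original - mean) / std
```

In use, the score would be thresholded: texts whose score exceeds a calibrated cutoff are flagged as machine-revised, since machine-styled text tends to be assigned unusually high probability by a model aligned to machine stylistic preferences.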
doi_str_mv | 10.48550/arxiv.2412.10432 |
format | Article |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.2412.10432 |
language | eng |
recordid | cdi_arxiv_primary_2412_10432 |
source | arXiv.org |
subjects | Computer Science - Artificial Intelligence; Computer Science - Computation and Language; Computer Science - Cryptography and Security |
title | Imitate Before Detect: Aligning Machine Stylistic Preference for Machine-Revised Text Detection |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-06T01%3A30%3A15IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Imitate%20Before%20Detect:%20Aligning%20Machine%20Stylistic%20Preference%20for%20Machine-Revised%20Text%20Detection&rft.au=Chen,%20Jiaqi&rft.date=2024-12-10&rft_id=info:doi/10.48550/arxiv.2412.10432&rft_dat=%3Carxiv_GOX%3E2412_10432%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |