Trojan Prompt Attacks on Graph Neural Networks

Graph Prompt Learning (GPL) has been introduced as a promising approach that uses prompts to adapt pre-trained GNN models to specific downstream tasks without requiring fine-tuning of the entire model. Despite the advantages of GPL, little attention has been given to its vulnerability to backdoor attacks, where an adversary can manipulate the model's behavior by embedding hidden triggers. Existing graph backdoor attacks rely on modifying model parameters during training, but this approach is impractical in GPL as GNN encoder parameters are frozen after pre-training. Moreover, downstream users may fine-tune their own task models on clean datasets, further complicating the attack. In this paper, we propose TGPA, a backdoor attack framework designed specifically for GPL. TGPA injects backdoors into graph prompts without modifying pre-trained GNN encoders and ensures high attack success rates and clean accuracy. To address the challenge of model fine-tuning by users, we introduce a finetuning-resistant poisoning approach that maintains the effectiveness of the backdoor even after downstream model adjustments. Extensive experiments on multiple datasets under various settings demonstrate the effectiveness of TGPA in compromising GPL models with fixed GNN encoders.
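
The abstract describes the attack only at a high level: a frozen pre-trained GNN encoder, a learnable graph prompt, and a poisoning objective that keeps clean accuracy high while steering triggered graphs to an attacker-chosen class. The following is a minimal, hypothetical PyTorch sketch of that general setting, not the authors' TGPA implementation; the toy encoder, the feature-based trigger, and the loss weight `lam` are illustrative assumptions.

```python
# Hypothetical illustration only (assumed architecture and trigger form), not the
# authors' TGPA code: the pre-trained encoder is frozen, and only the graph prompt
# and the downstream head are optimized to plant the backdoor.
import torch
import torch.nn.functional as F

class FrozenEncoder(torch.nn.Module):
    """Stand-in for a pre-trained GNN encoder with frozen parameters."""
    def __init__(self, in_dim, hid_dim):
        super().__init__()
        self.lin = torch.nn.Linear(in_dim, hid_dim)
        for p in self.parameters():
            p.requires_grad = False  # frozen after pre-training, as in GPL

    def forward(self, x, adj):
        # one message-passing step: aggregate neighbours, then transform
        return torch.relu(self.lin(adj @ x))

in_dim, hid_dim, n_classes = 16, 32, 4
encoder = FrozenEncoder(in_dim, hid_dim)
prompt = torch.nn.Parameter(torch.zeros(in_dim))   # learnable graph prompt
head = torch.nn.Linear(hid_dim, n_classes)         # downstream task head
trigger = torch.randn(in_dim) * 0.1                # fixed feature trigger (assumed form)
target_class = 0                                   # attacker-chosen label
opt = torch.optim.Adam([prompt, *head.parameters()], lr=1e-2)

def graph_logits(x, adj):
    h = encoder(x + prompt, adj)   # prompt is added to node features
    return head(h.mean(dim=0))     # mean-pool to a graph-level prediction

def poisoning_step(x, adj, y, lam=1.0):
    """x: node features (N, in_dim), adj: normalized adjacency (N, N), y: clean label."""
    clean_loss = F.cross_entropy(graph_logits(x, adj).unsqueeze(0), torch.tensor([y]))
    x_trig = x.clone()
    x_trig[0] += trigger           # attach the trigger to one node
    attack_loss = F.cross_entropy(graph_logits(x_trig, adj).unsqueeze(0),
                                  torch.tensor([target_class]))
    loss = clean_loss + lam * attack_loss  # preserve clean accuracy, force the trigger
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```

In a real GPL pipeline the frozen encoder would be an actual pre-trained GNN and the prompt might be a prompt graph rather than a feature offset; the point of the sketch is only that the backdoor is learned entirely through the prompt and task head while the encoder parameters stay untouched, which is the constraint the abstract highlights.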

Bibliographic Details

Main Authors: Lin, Minhua; Zhang, Zhiwei; Dai, Enyan; Wu, Zongyu; Wang, Yilong; Zhang, Xiang; Wang, Suhang
Format: Article
Language: English
Published: 2024-10-17
Subjects: Computer Science - Cryptography and Security; Computer Science - Learning
DOI: 10.48550/arxiv.2410.13974
Source: arXiv.org
Online Access: https://arxiv.org/abs/2410.13974