Manipulating Pre-trained Encoder for Targeted Poisoning Attacks in Contrastive Learning
Saved in:
Published in: | IEEE Transactions on Information Forensics and Security, 2024-01, Vol. 19, p. 1-1 |
---|---|
Main authors: | Chen, Jian; Gao, Yuan; Liu, Gaoyang; Abdelmoniem, Ahmed M.; Wang, Chen |
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Order full text |
container_end_page | 1 |
---|---|
container_issue | |
container_start_page | 1 |
container_title | IEEE transactions on information forensics and security |
container_volume | 19 |
creator | Chen, Jian; Gao, Yuan; Liu, Gaoyang; Abdelmoniem, Ahmed M.; Wang, Chen |
description | In recent years, contrastive learning has become a powerful approach to representation learning on large-scale unlabeled data, producing pre-trained encoders on top of which downstream classifiers are fine-tuned. However, recent research indicates that contrastive learning is vulnerable to data poisoning attacks, in which an attacker injects maliciously crafted samples into the unlabeled pre-training data. In this paper, we present a stealthier poisoning attack, dubbed PA-CL, that directly poisons the pre-trained encoder so that the downstream classifier's prediction for a single target instance can be steered to an attacker-desired class without affecting overall downstream classification performance. We observe that the poisoned pre-trained encoder produces a feature representation for the target sample that is highly similar to those of samples from the attacker-desired class, which leads the downstream classifier to misclassify the target sample into that class. We therefore formulate the attack as an optimization problem and design two novel loss functions: a target effectiveness loss that poisons the pre-trained encoder, and a model utility loss that preserves downstream classification performance. Experimental results on four real-world datasets demonstrate that the attack success rate of the proposed attack is on average 40% higher than that of three baseline attacks, while the fluctuation in the downstream classifier's prediction accuracy stays within 5%. |
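The record does not include the authors' actual formulation, so the following is only a minimal PyTorch sketch of the general idea described in the abstract: a "target effectiveness" term that pulls the target sample's embedding toward samples of the attacker-desired class, combined with a "model utility" term that keeps other embeddings close to a frozen copy of the clean encoder. All module names, the toy data, the cosine-similarity form of both losses, and the weighting are assumptions, not PA-CL itself.

```python
# Hypothetical sketch (not the authors' code) of poisoning a pre-trained encoder with
# two losses: target effectiveness (steer one target sample toward the desired class)
# and model utility (preserve embeddings of ordinary data, hence downstream accuracy).
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Toy encoder standing in for a contrastively pre-trained model (assumption).
encoder = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 16))
clean_encoder = copy.deepcopy(encoder).eval()   # frozen reference copy of the clean encoder
for p in clean_encoder.parameters():
    p.requires_grad_(False)

# Stand-in data: one target sample, reference samples assumed to come from the
# attacker-desired class, and ordinary samples used to preserve overall utility.
x_target = torch.randn(1, 32)
x_desired = torch.randn(8, 32)
x_clean = torch.randn(64, 32)

opt = torch.optim.Adam(encoder.parameters(), lr=1e-3)
lam = 1.0  # weight balancing the two losses (assumption)

for step in range(200):
    z_t = F.normalize(encoder(x_target), dim=1)
    z_d = F.normalize(encoder(x_desired), dim=1)

    # Target effectiveness loss: maximize cosine similarity between the target's
    # embedding and embeddings of the attacker-desired class samples.
    loss_target = 1.0 - (z_t @ z_d.t()).mean()

    # Model utility loss: keep embeddings of ordinary data close to the clean
    # encoder's embeddings, so downstream classification is roughly unaffected.
    z_c = F.normalize(encoder(x_clean), dim=1)
    with torch.no_grad():
        z_ref = F.normalize(clean_encoder(x_clean), dim=1)
    loss_utility = 1.0 - F.cosine_similarity(z_c, z_ref, dim=1).mean()

    loss = loss_target + lam * loss_utility
    opt.zero_grad()
    loss.backward()
    opt.step()

print(f"final combined loss: {loss.item():.4f}")
```

In this sketch the attacker only needs unlabeled reference samples of the desired class and a copy of the clean encoder; a downstream classifier fine-tuned on the poisoned encoder would then see the target sample land near that class in feature space, which matches the mechanism the abstract describes.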
doi_str_mv | 10.1109/TIFS.2024.3350389 |
format | Article |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1556-6013 |
ispartof | IEEE transactions on information forensics and security, 2024-01, Vol.19, p.1-1 |
issn | 1556-6013; 1556-6021 |
language | eng |
recordid | cdi_ieee_primary_10381885 |
source | IEEE Electronic Library (IEL) |
subjects | Behavioral sciences; Classification; Classifiers; Coders; contrastive learning; Design optimization; Feature extraction; Machine learning; Pipelines; poisoned pre-trained encoder; Poisons; Representations; Targeted poisoning attack; Task analysis; Testing; Toxicology; Training |
title | Manipulating Pre-trained Encoder for Targeted Poisoning Attacks in Contrastive Learning |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-18T10%3A56%3A10IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Manipulating%20Pre-trained%20Encoder%20for%20Targeted%20Poisoning%20Attacks%20in%20Contrastive%20Learning&rft.jtitle=IEEE%20transactions%20on%20information%20forensics%20and%20security&rft.au=Chen,%20Jian&rft.date=2024-01-01&rft.volume=19&rft.spage=1&rft.epage=1&rft.pages=1-1&rft.issn=1556-6013&rft.eissn=1556-6021&rft.coden=ITIFA6&rft_id=info:doi/10.1109/TIFS.2024.3350389&rft_dat=%3Cproquest_RIE%3E2913512800%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2913512800&rft_id=info:pmid/&rft_ieee_id=10381885&rfr_iscdi=true |