Label Privacy in Split Learning for Large Models with Parameter-Efficient Training

As deep learning models become larger and more expensive, many practitioners turn to fine-tuning APIs. These web services allow fine-tuning a model between two parties: the client that provides the data, and the server that hosts the model. While convenient, these APIs raise a new concern: the data of the client is at risk of privacy breach during the training procedure. This challenge presents an important practical case of vertical federated learning, where the two parties perform parameter-efficient fine-tuning (PEFT) of a large model. In this study, we systematically search for a way to fine-tune models over an API while keeping the labels private. We analyze the privacy of LoRA, a popular approach for parameter-efficient fine-tuning when training over an API. Using this analysis, we propose P³EFT, a multi-party split learning algorithm that takes advantage of existing PEFT properties to maintain privacy at a lower performance overhead. To validate our algorithm, we fine-tune DeBERTa-v2-XXLarge, Flan-T5 Large and LLaMA-2 7B using LoRA adapters on a range of NLP tasks. We find that P³EFT is competitive with existing privacy-preserving methods in multi-party and two-party setups while having higher accuracy.
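
To make the two-party setup concrete, below is a minimal sketch of split learning for classification in PyTorch: the server runs the model body, the client keeps the private labels and a small task head, and only activations and their gradients cross the API boundary. This is a generic illustration, not the paper's P³EFT protocol; all module names, sizes, and hyperparameters are assumptions.

import torch
import torch.nn as nn

torch.manual_seed(0)

# Server side: hosts the model body and exposes it through the fine-tuning API.
server_backbone = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 64))

# Client side: holds the private labels and a small task head.
client_head = nn.Linear(64, 2)
loss_fn = nn.CrossEntropyLoss()

opt_server = torch.optim.SGD(server_backbone.parameters(), lr=1e-2)
opt_client = torch.optim.SGD(client_head.parameters(), lr=1e-2)

features = torch.randn(8, 32)        # inputs sent to the server
labels = torch.randint(0, 2, (8,))   # labels never leave the client

# Server forward pass: only activations cross the API boundary.
activations = server_backbone(features)
sent = activations.detach().requires_grad_(True)

# Client computes the loss locally and backpropagates through its own head.
loss = loss_fn(client_head(sent), labels)
loss.backward()

# The client returns only the gradient w.r.t. the activations; the concern
# analyzed in the paper is that this gradient can still leak the labels.
activations.backward(sent.grad)
opt_server.step()
opt_client.step()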

Detailed Description

Bibliographic Details
Published in: arXiv.org 2024-12
Main Authors: Zmushko, Philip; Mansurov, Marat; Svirschevski, Ruslan; Kuznedelev, Denis; Ryabinin, Max; Beznosikov, Aleksandr
Format: Article
Language: English
Subjects: Algorithms; Deep learning; Federated learning; Labels; Machine learning; Parameters; Privacy; Web services
Online Access: Full Text
container_title arXiv.org
creator Zmushko, Philip; Mansurov, Marat; Svirschevski, Ruslan; Kuznedelev, Denis; Ryabinin, Max; Beznosikov, Aleksandr
description As deep learning models become larger and more expensive, many practitioners turn to fine-tuning APIs. These web services allow fine-tuning a model between two parties: the client that provides the data, and the server that hosts the model. While convenient, these APIs raise a new concern: the data of the client is at risk of privacy breach during the training procedure. This challenge presents an important practical case of vertical federated learning, where the two parties perform parameter-efficient fine-tuning (PEFT) of a large model. In this study, we systematically search for a way to fine-tune models over an API while keeping the labels private. We analyze the privacy of LoRA, a popular approach for parameter-efficient fine-tuning when training over an API. Using this analysis, we propose P³EFT, a multi-party split learning algorithm that takes advantage of existing PEFT properties to maintain privacy at a lower performance overhead. To validate our algorithm, we fine-tune DeBERTa-v2-XXLarge, Flan-T5 Large and LLaMA-2 7B using LoRA adapters on a range of NLP tasks. We find that P³EFT is competitive with existing privacy-preserving methods in multi-party and two-party setups while having higher accuracy.
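
As a concrete illustration of the adapter-based fine-tuning named in the description, the sketch below attaches LoRA adapters with the Hugging Face peft library. It shows plain LoRA fine-tuning of a sequence classifier, not the paper's P³EFT protocol; the rank, dropout, and target module names are illustrative assumptions.

from transformers import AutoModelForSequenceClassification
from peft import LoraConfig, get_peft_model

# Backbone named in the abstract; all hyperparameters below are assumed.
base_model = AutoModelForSequenceClassification.from_pretrained(
    "microsoft/deberta-v2-xxlarge", num_labels=2
)

lora_config = LoraConfig(
    r=8,                                          # adapter rank (assumed, not from the paper)
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["query_proj", "value_proj"],  # attention projections in DeBERTa-v2
    task_type="SEQ_CLS",
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only the small LoRA adapters are trainable
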
format Article
fulltext fulltext
identifier EISSN: 2331-8422
ispartof arXiv.org, 2024-12
issn 2331-8422
language eng
recordid cdi_proquest_journals_3148950180
source Free E-Journals
subjects Algorithms
Deep learning
Federated learning
Labels
Machine learning
Parameters
Privacy
Web services
title Label Privacy in Split Learning for Large Models with Parameter-Efficient Training
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-02T16%3A31%3A59IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=Label%20Privacy%20in%20Split%20Learning%20for%20Large%20Models%20with%20Parameter-Efficient%20Training&rft.jtitle=arXiv.org&rft.au=Zmushko,%20Philip&rft.date=2024-12-21&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E3148950180%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3148950180&rft_id=info:pmid/&rfr_iscdi=true