Fortify Your Foundations: Practical Privacy and Security for Foundation Model Deployments In The Cloud
Foundation Models (FMs) display exceptional performance in tasks such as natural language processing and are being applied across a growing range of disciplines. Although typically trained on large public datasets, FMs are often fine-tuned or integrated into Retrieval-Augmented Generation (RAG) systems, which rely on private data. This access, along with their size and costly training, heightens the risk of intellectual property theft. Moreover, multimodal FMs may expose sensitive information. In this work, we examine the FM threat model and discuss the practicality and comprehensiveness of various approaches for securing against them, such as ML-based methods and trusted execution environments (TEEs). We demonstrate that TEEs offer an effective balance between strong security properties, usability, and performance. Specifically, we present a solution achieving less than 10% overhead versus bare metal for the full Llama2 7B and 13B inference pipelines running inside Intel SGX and Intel TDX. We also share our configuration files and insights from our implementation. To our knowledge, our work is the first to show the practicality of TEEs for securing FMs.
Saved in:
Published in: | arXiv.org 2024-10 |
---|---|
Main authors: | Chrapek, Marcin; Vahldiek-Oberwagner, Anjo; Spoczynski, Marcin; Constable, Scott; Vij, Mona; Hoefler, Torsten |
Format: | Article |
Language: | eng |
Subjects: | Data augmentation; Natural language processing; Privacy; Security; Theft |
Online access: | Full text |
container_title | arXiv.org |
creator | Chrapek, Marcin; Vahldiek-Oberwagner, Anjo; Spoczynski, Marcin; Constable, Scott; Vij, Mona; Hoefler, Torsten |
description | Foundation Models (FMs) display exceptional performance in tasks such as natural language processing and are being applied across a growing range of disciplines. Although typically trained on large public datasets, FMs are often fine-tuned or integrated into Retrieval-Augmented Generation (RAG) systems, which rely on private data. This access, along with their size and costly training, heightens the risk of intellectual property theft. Moreover, multimodal FMs may expose sensitive information. In this work, we examine the FM threat model and discuss the practicality and comprehensiveness of various approaches for securing against them, such as ML-based methods and trusted execution environments (TEEs). We demonstrate that TEEs offer an effective balance between strong security properties, usability, and performance. Specifically, we present a solution achieving less than 10% overhead versus bare metal for the full Llama2 7B and 13B inference pipelines running inside Intel SGX and Intel TDX. We also share our configuration files and insights from our implementation. To our knowledge, our work is the first to show the practicality of TEEs for securing FMs. |
format | Article |
fulltext | fulltext |
identifier | EISSN: 2331-8422 |
ispartof | arXiv.org, 2024-10 |
issn | 2331-8422 |
language | eng |
recordid | cdi_proquest_journals_3115227245 |
source | Free E-Journals |
subjects | Data augmentation; Natural language processing; Privacy; Security; Theft |
title | Fortify Your Foundations: Practical Privacy and Security for Foundation Model Deployments In The Cloud |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-11T11%3A14%3A43IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=Fortify%20Your%20Foundations:%20Practical%20Privacy%20and%20Security%20for%20Foundation%20Model%20Deployments%20In%20The%20Cloud&rft.jtitle=arXiv.org&rft.au=Chrapek,%20Marcin&rft.date=2024-10-08&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E3115227245%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3115227245&rft_id=info:pmid/&rfr_iscdi=true |
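The abstract reports under 10% overhead versus bare metal for the TEE-based inference pipelines. As a minimal sketch of that metric (the throughput figures below are invented for illustration and do not come from the paper), the relative overhead can be computed from per-setting throughput:

```python
def relative_overhead(bare_metal_tps: float, tee_tps: float) -> float:
    """Fractional slowdown of TEE inference relative to bare metal,
    given throughput (e.g. tokens/s) measured in each setting."""
    return (bare_metal_tps - tee_tps) / bare_metal_tps

# Hypothetical numbers for illustration only:
# 20.0 tokens/s on bare metal vs. 18.5 tokens/s inside a TEE.
print(f"{relative_overhead(20.0, 18.5):.1%}")  # prints "7.5%"
```

An overhead below 0.10 under this definition corresponds to the paper's "less than 10% overhead versus bare metal" claim.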