Investigating Membership Inference Attacks under Data Dependencies

Training machine learning models on privacy-sensitive data has become a popular practice, driving innovation in ever-expanding fields. This has opened the door to new attacks that can have serious privacy implications. One such attack, the Membership Inference Attack (MIA), exposes whether or not a particular data point was used to train a model.

Full description

Saved in:
Bibliographic Details
Main Authors: Humphries, Thomas; Oya, Simon; Tulloch, Lindsey; Rafuse, Matthew; Goldberg, Ian; Hengartner, Urs; Kerschbaum, Florian
Format: Article
Language: eng
Subjects:
Online Access: Request full text
creator Humphries, Thomas
Oya, Simon
Tulloch, Lindsey
Rafuse, Matthew
Goldberg, Ian
Hengartner, Urs
Kerschbaum, Florian
description Training machine learning models on privacy-sensitive data has become a popular practice, driving innovation in ever-expanding fields. This has opened the door to new attacks that can have serious privacy implications. One such attack, the Membership Inference Attack (MIA), exposes whether or not a particular data point was used to train a model. A growing body of literature uses Differentially Private (DP) training algorithms as a defence against such attacks. However, these works evaluate the defence under the restrictive assumption that all members of the training set, as well as non-members, are independent and identically distributed. This assumption does not hold for many real-world use cases in the literature. Motivated by this, we evaluate membership inference with statistical dependencies among samples and explain why DP does not provide meaningful protection (the privacy parameter $\epsilon$ scales with the training set size $n$) in this more general case. We conduct a series of empirical evaluations with off-the-shelf MIAs using training sets built from real-world data showing different types of dependencies among samples. Our results reveal that training set dependencies can severely increase the performance of MIAs, and therefore assuming that data samples are statistically independent can significantly underestimate the performance of MIAs.
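The description above compresses two technical points worth unpacking. First, the remark that the privacy parameter $\epsilon$ scales with the training set size $n$ is closely related to the standard group-privacy property of differential privacy. A minimal statement of that textbook bound (quoted here as background, not a derivation taken from the paper itself):

```latex
% Group-privacy bound for a pure \epsilon-DP mechanism M: for any two
% datasets D, D' differing in k (possibly mutually dependent) records,
\[
  \Pr[M(D) \in S] \;\le\; e^{k\epsilon}\,\Pr[M(D') \in S]
  \quad \text{for all measurable sets } S .
\]
% When statistical dependencies tie all n training samples together,
% k can be as large as n, so the effective guarantee degrades from
% \epsilon to n\epsilon.
```

Second, to make the "off-the-shelf MIAs" concrete, below is a minimal sketch of a loss-threshold attack, one of the simplest such attacks. Everything in it (the synthetic probabilities, the `threshold` value, the function names) is an illustrative assumption, not code from the paper:

```python
import numpy as np

def cross_entropy_loss(probs: np.ndarray, labels: np.ndarray) -> np.ndarray:
    """Per-sample cross-entropy loss from predicted class probabilities."""
    eps = 1e-12  # guard against log(0)
    return -np.log(probs[np.arange(len(labels)), labels] + eps)

def loss_threshold_mia(probs: np.ndarray, labels: np.ndarray,
                       threshold: float) -> np.ndarray:
    """Guess 'member' (True) when the per-sample loss falls below threshold.

    Intuition: a model fits its training points more tightly, so members
    tend to have lower loss than non-members.
    """
    return cross_entropy_loss(probs, labels) < threshold

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-in for a trained model's softmax outputs on 100 query points.
    probs = rng.dirichlet(np.ones(10), size=100)
    labels = rng.integers(0, 10, size=100)
    guesses = loss_threshold_mia(probs, labels, threshold=1.0)
    print(f"flagged {guesses.sum()} of {len(guesses)} points as members")
```

In practice the threshold would be calibrated on points known to be outside the training set; the paper's empirical finding is that dependencies among training samples make attacks of this kind substantially more effective than an i.i.d. evaluation setup suggests.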
doi 10.48550/arxiv.2010.12112
format Article
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.2010.12112
language eng
recordid cdi_arxiv_primary_2010_12112
source arXiv.org
subjects Computer Science - Cryptography and Security
Computer Science - Learning
title Investigating Membership Inference Attacks under Data Dependencies
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-01T19%3A19%3A12IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Investigating%20Membership%20Inference%20Attacks%20under%20Data%20Dependencies&rft.au=Humphries,%20Thomas&rft.date=2020-10-22&rft_id=info:doi/10.48550/arxiv.2010.12112&rft_dat=%3Carxiv_GOX%3E2010_12112%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true