Understanding and Benchmarking the Impact of GDPR on Database Systems

The General Data Protection Regulation (GDPR) provides new rights and protections to European people concerning their personal data. We analyze GDPR from a systems perspective, translating its legal articles into a set of capabilities and characteristics that compliant systems must support. Our anal...

Detailed description

Bibliographic details
Published in: arXiv.org 2020-03
Main authors: Shastri, Supreeth; Banakar, Vinay; Wasserman, Melissa; Kumar, Arun; Chidambaram, Vijay
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page
container_issue
container_start_page
container_title arXiv.org
container_volume
creator Shastri, Supreeth
Banakar, Vinay
Wasserman, Melissa
Kumar, Arun
Chidambaram, Vijay
description The General Data Protection Regulation (GDPR) provides new rights and protections to European people concerning their personal data. We analyze GDPR from a systems perspective, translating its legal articles into a set of capabilities and characteristics that compliant systems must support. Our analysis reveals the phenomenon of metadata explosion, wherein large quantities of metadata need to be stored along with the personal data to satisfy the GDPR requirements. Our analysis also helps us identify new workloads that must be supported under GDPR. We design and implement an open-source benchmark called GDPRbench that consists of workloads and metrics needed to understand and assess personal-data processing database systems. To gauge the readiness of modern database systems for GDPR, we follow best practices and developer recommendations to modify Redis, PostgreSQL, and a commercial database system to be GDPR compliant. Our experiments demonstrate that the resulting GDPR compliant systems achieve poor performance on GDPR workloads, and that performance scales poorly as the volume of personal data increases. We discuss the real-world implications of these findings, and identify research challenges towards making GDPR compliance efficient in production environments. We release all of our software artifacts and datasets at http://www.gdprbench.org
doi_str_mv 10.48550/arxiv.1910.00728
format Article
fullrecord <record><control><sourceid>proquest_arxiv</sourceid><recordid>TN_cdi_arxiv_primary_1910_00728</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2300425815</sourcerecordid><originalsourceid>FETCH-LOGICAL-a525-1c64ffc61478b64c2fce03525fb3032845e0ae203c449ad53af0011255ac5463</originalsourceid><addsrcrecordid>eNotj01PwkAURScmJhLkB7hyEtfFN1_tsFRAJCHRiK6b1-mMFO20doqBf-8Arm5y383LOYTcMBhLrRTcY7evfsdsEguAjOsLMuBCsERLzq_IKIQtAPA040qJAZl_-NJ2oUdfVv6TxqCP1ptNjd3Xseg3li7rFk1PG0cXs9c32ng6wx4LDJauD6G3dbgmlw6_gx3955Csn-bv0-dk9bJYTh9WCSquEmZS6ZxJmcx0kUrDnbEg4sUVAgTXUllAy0EYKSdYKoEOgLEIikbJVAzJ7fnrSTFvuypSHvKjan5SjYu786Ltmp-dDX2-bXadj0g5FwCSK82U-AOy7FSx</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2300425815</pqid></control><display><type>article</type><title>Understanding and Benchmarking the Impact of GDPR on Database Systems</title><source>arXiv.org</source><source>Free E- Journals</source><creator>Shastri, Supreeth ; Banakar, Vinay ; Wasserman, Melissa ; Kumar, Arun ; Chidambaram, Vijay</creator><creatorcontrib>Shastri, Supreeth ; Banakar, Vinay ; Wasserman, Melissa ; Kumar, Arun ; Chidambaram, Vijay</creatorcontrib><description>The General Data Protection Regulation (GDPR) provides new rights and protections to European people concerning their personal data. We analyze GDPR from a systems perspective, translating its legal articles into a set of capabilities and characteristics that compliant systems must support. Our analysis reveals the phenomenon of metadata explosion, wherein large quantities of metadata needs to be stored along with the personal data to satisfy the GDPR requirements. Our analysis also helps us identify new workloads that must be supported under GDPR. We design and implement an open-source benchmark called GDPRbench that consists of workloads and metrics needed to understand and assess personal-data processing database systems. 
To gauge the readiness of modern database systems for GDPR, we follow best practices and developer recommendations to modify Redis, PostgreSQL, and a commercial database system to be GDPR compliant. Our experiments demonstrate that the resulting GDPR compliant systems achieve poor performance on GPDR workloads, and that performance scales poorly as the volume of personal data increases. We discuss the real-world implications of these findings, and identify research challenges towards making GDPR compliance efficient in production environments. We release all of our software artifacts and datasets at http://www.gdprbench.org</description><identifier>EISSN: 2331-8422</identifier><identifier>DOI: 10.48550/arxiv.1910.00728</identifier><language>eng</language><publisher>Ithaca: Cornell University Library, arXiv.org</publisher><subject>Computer Science - Databases ; Data processing ; General Data Protection Regulation ; Metadata ; Performance degradation ; Source code ; Workloads</subject><ispartof>arXiv.org, 2020-03</ispartof><rights>2020. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><rights>http://arxiv.org/licenses/nonexclusive-distrib/1.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,776,780,881,27904</link.rule.ids><backlink>$$Uhttps://doi.org/10.14778/3384345.3384354$$DView published paper (Access to full text may be restricted)$$Hfree_for_read</backlink><backlink>$$Uhttps://doi.org/10.48550/arXiv.1910.00728$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Shastri, Supreeth</creatorcontrib><creatorcontrib>Banakar, Vinay</creatorcontrib><creatorcontrib>Wasserman, Melissa</creatorcontrib><creatorcontrib>Kumar, Arun</creatorcontrib><creatorcontrib>Chidambaram, Vijay</creatorcontrib><title>Understanding and Benchmarking the Impact of GDPR on Database Systems</title><title>arXiv.org</title><description>The General Data Protection Regulation (GDPR) provides new rights and protections to European people concerning their personal data. We analyze GDPR from a systems perspective, translating its legal articles into a set of capabilities and characteristics that compliant systems must support. Our analysis reveals the phenomenon of metadata explosion, wherein large quantities of metadata needs to be stored along with the personal data to satisfy the GDPR requirements. Our analysis also helps us identify new workloads that must be supported under GDPR. We design and implement an open-source benchmark called GDPRbench that consists of workloads and metrics needed to understand and assess personal-data processing database systems. 
To gauge the readiness of modern database systems for GDPR, we follow best practices and developer recommendations to modify Redis, PostgreSQL, and a commercial database system to be GDPR compliant. Our experiments demonstrate that the resulting GDPR compliant systems achieve poor performance on GPDR workloads, and that performance scales poorly as the volume of personal data increases. We discuss the real-world implications of these findings, and identify research challenges towards making GDPR compliance efficient in production environments. We release all of our software artifacts and datasets at http://www.gdprbench.org</description><subject>Computer Science - Databases</subject><subject>Data processing</subject><subject>General Data Protection Regulation</subject><subject>Metadata</subject><subject>Performance degradation</subject><subject>Source code</subject><subject>Workloads</subject><issn>2331-8422</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><sourceid>GOX</sourceid><recordid>eNotj01PwkAURScmJhLkB7hyEtfFN1_tsFRAJCHRiK6b1-mMFO20doqBf-8Arm5y383LOYTcMBhLrRTcY7evfsdsEguAjOsLMuBCsERLzq_IKIQtAPA040qJAZl_-NJ2oUdfVv6TxqCP1ptNjd3Xseg3li7rFk1PG0cXs9c32ng6wx4LDJauD6G3dbgmlw6_gx3955Csn-bv0-dk9bJYTh9WCSquEmZS6ZxJmcx0kUrDnbEg4sUVAgTXUllAy0EYKSdYKoEOgLEIikbJVAzJ7fnrSTFvuypSHvKjan5SjYu786Ltmp-dDX2-bXadj0g5FwCSK82U-AOy7FSx</recordid><startdate>20200317</startdate><enddate>20200317</enddate><creator>Shastri, Supreeth</creator><creator>Banakar, Vinay</creator><creator>Wasserman, Melissa</creator><creator>Kumar, Arun</creator><creator>Chidambaram, Vijay</creator><general>Cornell University Library, 
arXiv.org</general><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>M7S</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20200317</creationdate><title>Understanding and Benchmarking the Impact of GDPR on Database Systems</title><author>Shastri, Supreeth ; Banakar, Vinay ; Wasserman, Melissa ; Kumar, Arun ; Chidambaram, Vijay</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a525-1c64ffc61478b64c2fce03525fb3032845e0ae203c449ad53af0011255ac5463</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Computer Science - Databases</topic><topic>Data processing</topic><topic>General Data Protection Regulation</topic><topic>Metadata</topic><topic>Performance degradation</topic><topic>Source code</topic><topic>Workloads</topic><toplevel>online_resources</toplevel><creatorcontrib>Shastri, Supreeth</creatorcontrib><creatorcontrib>Banakar, Vinay</creatorcontrib><creatorcontrib>Wasserman, Melissa</creatorcontrib><creatorcontrib>Kumar, Arun</creatorcontrib><creatorcontrib>Chidambaram, Vijay</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science &amp; Engineering Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>SciTech 
Premium Collection</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering Database</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering Collection</collection><collection>arXiv Computer Science</collection><collection>arXiv.org</collection><jtitle>arXiv.org</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Shastri, Supreeth</au><au>Banakar, Vinay</au><au>Wasserman, Melissa</au><au>Kumar, Arun</au><au>Chidambaram, Vijay</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Understanding and Benchmarking the Impact of GDPR on Database Systems</atitle><jtitle>arXiv.org</jtitle><date>2020-03-17</date><risdate>2020</risdate><eissn>2331-8422</eissn><abstract>The General Data Protection Regulation (GDPR) provides new rights and protections to European people concerning their personal data. We analyze GDPR from a systems perspective, translating its legal articles into a set of capabilities and characteristics that compliant systems must support. Our analysis reveals the phenomenon of metadata explosion, wherein large quantities of metadata needs to be stored along with the personal data to satisfy the GDPR requirements. Our analysis also helps us identify new workloads that must be supported under GDPR. We design and implement an open-source benchmark called GDPRbench that consists of workloads and metrics needed to understand and assess personal-data processing database systems. 
To gauge the readiness of modern database systems for GDPR, we follow best practices and developer recommendations to modify Redis, PostgreSQL, and a commercial database system to be GDPR compliant. Our experiments demonstrate that the resulting GDPR compliant systems achieve poor performance on GPDR workloads, and that performance scales poorly as the volume of personal data increases. We discuss the real-world implications of these findings, and identify research challenges towards making GDPR compliance efficient in production environments. We release all of our software artifacts and datasets at http://www.gdprbench.org</abstract><cop>Ithaca</cop><pub>Cornell University Library, arXiv.org</pub><doi>10.48550/arxiv.1910.00728</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier EISSN: 2331-8422
ispartof arXiv.org, 2020-03
issn 2331-8422
language eng
recordid cdi_arxiv_primary_1910_00728
source arXiv.org; Free E- Journals
subjects Computer Science - Databases
Data processing
General Data Protection Regulation
Metadata
Performance degradation
Source code
Workloads
title Understanding and Benchmarking the Impact of GDPR on Database Systems
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-25T17%3A56%3A51IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_arxiv&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Understanding%20and%20Benchmarking%20the%20Impact%20of%20GDPR%20on%20Database%20Systems&rft.jtitle=arXiv.org&rft.au=Shastri,%20Supreeth&rft.date=2020-03-17&rft.eissn=2331-8422&rft_id=info:doi/10.48550/arxiv.1910.00728&rft_dat=%3Cproquest_arxiv%3E2300425815%3C/proquest_arxiv%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2300425815&rft_id=info:pmid/&rfr_iscdi=true