Securing Big Data Scientific Workflows via Trusted Heterogeneous Environments

Big data workflow management systems (BDWMSs) have recently emerged as popular platforms for conducting large-scale data analytics in the cloud. However, the protection of data confidentiality and secure execution of workflow applications remains an important and challenging problem. Alth...

Bibliographic details
Published in: IEEE transactions on dependable and secure computing 2022-11, Vol.19 (6), p.4187-4203
Main authors: Mofrad, Saeid; Ahmed, Ishtiaq; Zhang, Fengwei; Lu, Shiyong; Yang, Ping; Cui, Heming
Format: Article
Language: English
Subjects:
Online access: Full text
container_end_page 4203
container_issue 6
container_start_page 4187
container_title IEEE transactions on dependable and secure computing
container_volume 19
creator Mofrad, Saeid
Ahmed, Ishtiaq
Zhang, Fengwei
Lu, Shiyong
Yang, Ping
Cui, Heming
description Big data workflow management systems (BDWMSs) have recently emerged as popular platforms for conducting large-scale data analytics in the cloud. However, the protection of data confidentiality and secure execution of workflow applications remains an important and challenging problem. Although a few data analytics systems, such as VC3 and Opaque, were developed to address security problems, they are limited to specific domains such as Map-Reduce-style and SQL query workflows. A generic secure framework for BDWMSs is still missing. In this article, we propose SecDATAVIEW, a distributed BDWMS that employs heterogeneous workers, such as Intel SGX and AMD SEV, to protect the execution of both workflows and workflow data, addressing three major security challenges: (1) reducing the size of the trusted computing base (TCB) of the big data workflow management system in the untrusted cloud by leveraging hardware-assisted trusted execution environments (TEEs) and software attestation; (2) supporting workflow tasks written in Java to overcome SGX's lack of support for Java programs; and (3) reducing the adverse impact of SGX enclave memory paging overhead through a "Hybrid" workflow task scheduling system that selectively deploys sensitive tasks to a mix of SGX and SEV worker nodes. Our experimental results show that SecDATAVIEW imposes moderate overhead on the workflow execution time.
doi_str_mv 10.1109/TDSC.2021.3123640
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2735379756</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>9594452</ieee_id><sourcerecordid>2735379756</sourcerecordid><originalsourceid>FETCH-LOGICAL-c336t-12047c7b3b4fad7a127576a27c15f9fb76f6c26e8821499a7ac45800a9da46e43</originalsourceid><addsrcrecordid>eNo9kE1PAjEQQBujiYj-AOOliefFfm63RwUUE4wHMB6bUqakCFtsdzH-e5dAPM0c3ptJHkK3lAwoJfphPpoNB4wwOuCU8VKQM9SjWtCCEFqdd7sUspBa0Ut0lfOaECYqLXrobQauTaFe4aewwiPbWDxzAeom-ODwZ0xffhN_Mt4Hi-epzQ0s8QQaSHEFNcQ243G9DynW287J1-jC202Gm9Pso4_n8Xw4KabvL6_Dx2nhOC-bgjIilFMLvhDeLpWlTElVWqYclV77hSp96VgJVcWo0Noq64SsCLF6aUUJgvfR_fHuLsXvFnJj1rFNdffSMMUlV1rJsqPokXIp5pzAm10KW5t-DSXmUM0cqplDNXOq1jl3RycAwD-vpRZCMv4HsnloVA</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2735379756</pqid></control><display><type>article</type><title>Securing Big Data Scientific Workflows via Trusted Heterogeneous Environments</title><source>IEEE Electronic Library (IEL)</source><creator>Mofrad, Saeid ; Ahmed, Ishtiaq ; Zhang, Fengwei ; Lu, Shiyong ; Yang, Ping ; Cui, Heming</creator><creatorcontrib>Mofrad, Saeid ; Ahmed, Ishtiaq ; Zhang, Fengwei ; Lu, Shiyong ; Yang, Ping ; Cui, Heming</creatorcontrib><description>Big data workflow management systems (BDWMS)s have recently emerged as popular data analytics platforms to conduct large-scale data analytics in the cloud. However, the protection of data confidentiality and secure execution of workflow applications remains an important and challenging problem. Although a few data analytics systems, such as VC3 and Opaque, were developed to address security problems, they are limited to specific domains such as Map-Reduce-style and SQL query workflows. A generic secure framework for BDWMSs is still missing. 
In this article, we propose SecDATAVIEW, a distributed BDWMS that employs heterogeneous workers, such as Intel SGX and AMD SEV, to protect both workflow and workflow data execution, addressing three major security challenges: (1) Reducing the TCB size of the big data workflow management system in the untrusted cloud by leveraging the hardware-assisted TEE and software attestation; (2) Supporting Java-written workflow tasks to overcome the limitation of SGX's lack of support for Java programs; and (3) Reducing the adverse impact of SGX enclave memory paging overhead through a "Hybrid" workflow task scheduling system that selectively deploys sensitive tasks to a mix of SGX and SEV worker nodes. Our experimental results show that SecDATAVIEW imposes moderate overhead on the workflow execution time.</description><identifier>ISSN: 1545-5971</identifier><identifier>EISSN: 1941-0018</identifier><identifier>DOI: 10.1109/TDSC.2021.3123640</identifier><identifier>CODEN: ITDSCM</identifier><language>eng</language><publisher>Washington: IEEE</publisher><subject>AMD SEV ; Big Data ; big data workflow ; Cloud computing ; Codes ; Computer science ; Data analysis ; heterogeneous cloud ; Hybrid systems ; Intel SGX ; Java ; Mathematical analysis ; Security ; Task analysis ; Task scheduling ; Trusted computing ; Workflow management systems ; Workflow software</subject><ispartof>IEEE transactions on dependable and secure computing, 2022-11, Vol.19 (6), p.4187-4203</ispartof><rights>Copyright IEEE Computer Society 2022</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c336t-12047c7b3b4fad7a127576a27c15f9fb76f6c26e8821499a7ac45800a9da46e43</citedby><cites>FETCH-LOGICAL-c336t-12047c7b3b4fad7a127576a27c15f9fb76f6c26e8821499a7ac45800a9da46e43</cites><orcidid>0000-0001-9058-2822 ; 0000-0001-9529-3165 ; 0000-0001-9654-7403 ; 0000-0003-3365-2526 ; 0000-0001-7746-440X ; 
0000-0002-7864-1815</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/9594452$$EHTML$$P50$$Gieee$$Hfree_for_read</linktohtml><link.rule.ids>315,782,786,798,27933,27934,54767</link.rule.ids></links><search><creatorcontrib>Mofrad, Saeid</creatorcontrib><creatorcontrib>Ahmed, Ishtiaq</creatorcontrib><creatorcontrib>Zhang, Fengwei</creatorcontrib><creatorcontrib>Lu, Shiyong</creatorcontrib><creatorcontrib>Yang, Ping</creatorcontrib><creatorcontrib>Cui, Heming</creatorcontrib><title>Securing Big Data Scientific Workflows via Trusted Heterogeneous Environments</title><title>IEEE transactions on dependable and secure computing</title><addtitle>TDSC</addtitle><description>Big data workflow management systems (BDWMS)s have recently emerged as popular data analytics platforms to conduct large-scale data analytics in the cloud. However, the protection of data confidentiality and secure execution of workflow applications remains an important and challenging problem. Although a few data analytics systems, such as VC3 and Opaque, were developed to address security problems, they are limited to specific domains such as Map-Reduce-style and SQL query workflows. A generic secure framework for BDWMSs is still missing. 
In this article, we propose SecDATAVIEW, a distributed BDWMS that employs heterogeneous workers, such as Intel SGX and AMD SEV, to protect both workflow and workflow data execution, addressing three major security challenges: (1) Reducing the TCB size of the big data workflow management system in the untrusted cloud by leveraging the hardware-assisted TEE and software attestation; (2) Supporting Java-written workflow tasks to overcome the limitation of SGX's lack of support for Java programs; and (3) Reducing the adverse impact of SGX enclave memory paging overhead through a "Hybrid" workflow task scheduling system that selectively deploys sensitive tasks to a mix of SGX and SEV worker nodes. Our experimental results show that SecDATAVIEW imposes moderate overhead on the workflow execution time.</description><subject>AMD SEV</subject><subject>Big Data</subject><subject>big data workflow</subject><subject>Cloud computing</subject><subject>Codes</subject><subject>Computer science</subject><subject>Data analysis</subject><subject>heterogeneous cloud</subject><subject>Hybrid systems</subject><subject>Intel SGX</subject><subject>Java</subject><subject>Mathematical analysis</subject><subject>Security</subject><subject>Task analysis</subject><subject>Task scheduling</subject><subject>Trusted computing</subject><subject>Workflow management systems</subject><subject>Workflow 
software</subject><issn>1545-5971</issn><issn>1941-0018</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><sourceid>ESBDL</sourceid><sourceid>RIE</sourceid><recordid>eNo9kE1PAjEQQBujiYj-AOOliefFfm63RwUUE4wHMB6bUqakCFtsdzH-e5dAPM0c3ptJHkK3lAwoJfphPpoNB4wwOuCU8VKQM9SjWtCCEFqdd7sUspBa0Ut0lfOaECYqLXrobQauTaFe4aewwiPbWDxzAeom-ODwZ0xffhN_Mt4Hi-epzQ0s8QQaSHEFNcQ243G9DynW287J1-jC202Gm9Pso4_n8Xw4KabvL6_Dx2nhOC-bgjIilFMLvhDeLpWlTElVWqYclV77hSp96VgJVcWo0Noq64SsCLF6aUUJgvfR_fHuLsXvFnJj1rFNdffSMMUlV1rJsqPokXIp5pzAm10KW5t-DSXmUM0cqplDNXOq1jl3RycAwD-vpRZCMv4HsnloVA</recordid><startdate>20221101</startdate><enddate>20221101</enddate><creator>Mofrad, Saeid</creator><creator>Ahmed, Ishtiaq</creator><creator>Zhang, Fengwei</creator><creator>Lu, Shiyong</creator><creator>Yang, Ping</creator><creator>Cui, Heming</creator><general>IEEE</general><general>IEEE Computer Society</general><scope>97E</scope><scope>ESBDL</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>JQ2</scope><orcidid>https://orcid.org/0000-0001-9058-2822</orcidid><orcidid>https://orcid.org/0000-0001-9529-3165</orcidid><orcidid>https://orcid.org/0000-0001-9654-7403</orcidid><orcidid>https://orcid.org/0000-0003-3365-2526</orcidid><orcidid>https://orcid.org/0000-0001-7746-440X</orcidid><orcidid>https://orcid.org/0000-0002-7864-1815</orcidid></search><sort><creationdate>20221101</creationdate><title>Securing Big Data Scientific Workflows via Trusted Heterogeneous Environments</title><author>Mofrad, Saeid ; Ahmed, Ishtiaq ; Zhang, Fengwei ; Lu, Shiyong ; Yang, Ping ; Cui, Heming</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c336t-12047c7b3b4fad7a127576a27c15f9fb76f6c26e8821499a7ac45800a9da46e43</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>AMD SEV</topic><topic>Big Data</topic><topic>big data 
workflow</topic><topic>Cloud computing</topic><topic>Codes</topic><topic>Computer science</topic><topic>Data analysis</topic><topic>heterogeneous cloud</topic><topic>Hybrid systems</topic><topic>Intel SGX</topic><topic>Java</topic><topic>Mathematical analysis</topic><topic>Security</topic><topic>Task analysis</topic><topic>Task scheduling</topic><topic>Trusted computing</topic><topic>Workflow management systems</topic><topic>Workflow software</topic><toplevel>online_resources</toplevel><creatorcontrib>Mofrad, Saeid</creatorcontrib><creatorcontrib>Ahmed, Ishtiaq</creatorcontrib><creatorcontrib>Zhang, Fengwei</creatorcontrib><creatorcontrib>Lu, Shiyong</creatorcontrib><creatorcontrib>Yang, Ping</creatorcontrib><creatorcontrib>Cui, Heming</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE Open Access Journals</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>ProQuest Computer Science Collection</collection><jtitle>IEEE transactions on dependable and secure computing</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Mofrad, Saeid</au><au>Ahmed, Ishtiaq</au><au>Zhang, Fengwei</au><au>Lu, Shiyong</au><au>Yang, Ping</au><au>Cui, Heming</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Securing Big Data Scientific Workflows via Trusted Heterogeneous Environments</atitle><jtitle>IEEE transactions on dependable and secure computing</jtitle><stitle>TDSC</stitle><date>2022-11-01</date><risdate>2022</risdate><volume>19</volume><issue>6</issue><spage>4187</spage><epage>4203</epage><pages>4187-4203</pages><issn>1545-5971</issn><eissn>1941-0018</eissn><coden>ITDSCM</coden><abstract>Big data workflow management systems (BDWMS)s have recently emerged as popular data 
analytics platforms to conduct large-scale data analytics in the cloud. However, the protection of data confidentiality and secure execution of workflow applications remains an important and challenging problem. Although a few data analytics systems, such as VC3 and Opaque, were developed to address security problems, they are limited to specific domains such as Map-Reduce-style and SQL query workflows. A generic secure framework for BDWMSs is still missing. In this article, we propose SecDATAVIEW, a distributed BDWMS that employs heterogeneous workers, such as Intel SGX and AMD SEV, to protect both workflow and workflow data execution, addressing three major security challenges: (1) Reducing the TCB size of the big data workflow management system in the untrusted cloud by leveraging the hardware-assisted TEE and software attestation; (2) Supporting Java-written workflow tasks to overcome the limitation of SGX's lack of support for Java programs; and (3) Reducing the adverse impact of SGX enclave memory paging overhead through a "Hybrid" workflow task scheduling system that selectively deploys sensitive tasks to a mix of SGX and SEV worker nodes. Our experimental results show that SecDATAVIEW imposes moderate overhead on the workflow execution time.</abstract><cop>Washington</cop><pub>IEEE</pub><doi>10.1109/TDSC.2021.3123640</doi><tpages>17</tpages><orcidid>https://orcid.org/0000-0001-9058-2822</orcidid><orcidid>https://orcid.org/0000-0001-9529-3165</orcidid><orcidid>https://orcid.org/0000-0001-9654-7403</orcidid><orcidid>https://orcid.org/0000-0003-3365-2526</orcidid><orcidid>https://orcid.org/0000-0001-7746-440X</orcidid><orcidid>https://orcid.org/0000-0002-7864-1815</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1545-5971
ispartof IEEE transactions on dependable and secure computing, 2022-11, Vol.19 (6), p.4187-4203
issn 1545-5971
1941-0018
language eng
recordid cdi_proquest_journals_2735379756
source IEEE Electronic Library (IEL)
subjects AMD SEV
Big Data
big data workflow
Cloud computing
Codes
Computer science
Data analysis
heterogeneous cloud
Hybrid systems
Intel SGX
Java
Mathematical analysis
Security
Task analysis
Task scheduling
Trusted computing
Workflow management systems
Workflow software
title Securing Big Data Scientific Workflows via Trusted Heterogeneous Environments
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-03T11%3A35%3A49IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Securing%20Big%20Data%20Scientific%20Workflows%20via%20Trusted%20Heterogeneous%20Environments&rft.jtitle=IEEE%20transactions%20on%20dependable%20and%20secure%20computing&rft.au=Mofrad,%20Saeid&rft.date=2022-11-01&rft.volume=19&rft.issue=6&rft.spage=4187&rft.epage=4203&rft.pages=4187-4203&rft.issn=1545-5971&rft.eissn=1941-0018&rft.coden=ITDSCM&rft_id=info:doi/10.1109/TDSC.2021.3123640&rft_dat=%3Cproquest_cross%3E2735379756%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2735379756&rft_id=info:pmid/&rft_ieee_id=9594452&rfr_iscdi=true