The Storage Versus Repair-Bandwidth Trade-off for Clustered Storage Systems

We study a generalization of the setting of regenerating codes, motivated by applications to storage systems consisting of clusters of storage nodes. There are $n$ clusters in total, with $m$ nodes per cluster. A data file is coded and stored across the $mn$ nodes, with each node storing $\alpha$ sy...

Detailed description


Bibliographic details
Published in:	IEEE transactions on information theory, 2018-08, Vol.64 (8), p.5783-5805
Main authors:	Prakash, N.; Abdrashitov, Vitaly; Medard, Muriel
Format:	Article
Language:	eng
Subjects:	Bandwidth; Bandwidths; Cloud computing; cloud storage; Cluster analysis; clustered storage; coding for storage; Data centers; Data collection; Data models; Distributed storage; Downloading; Eavesdropping; Encoding; Lower bounds; Maintenance engineering; Nodes; Optimization; regenerating codes; Repair; Storage systems; storage vs repair-bandwidth trade-off; Symbols; Tradeoffs

container_end_page	5805
container_issue	8
container_start_page	5783
container_title	IEEE transactions on information theory
container_volume	64
creator	Prakash, N.; Abdrashitov, Vitaly; Medard, Muriel
description	We study a generalization of the setting of regenerating codes, motivated by applications to storage systems consisting of clusters of storage nodes. There are $n$ clusters in total, with $m$ nodes per cluster. A data file is coded and stored across the $mn$ nodes, with each node storing $\alpha$ symbols. For availability of data, we require that the file be retrievable by downloading the entire content from any subset of $k$ clusters. Nodes represent entities that can fail. We distinguish between intra-cluster and inter-cluster bandwidth (BW) costs during node repair. Node-repair in a cluster is accomplished by downloading $\beta$ symbols each from any set of $d$ other clusters, dubbed remote helper clusters, and also up to $\alpha$ symbols each from any set of $\ell$ surviving nodes, dubbed local helper nodes, in the host cluster. We first identify the optimal trade-off between storage-overhead and inter-cluster repair-bandwidth under functional repair, and also present optimal exact-repair code constructions for a class of parameters. The new trade-off is strictly better than what is achievable via space-sharing existing coding solutions, whenever $\ell > 0$. We then obtain sharp lower bounds on the necessary intra-cluster repair BW to achieve optimal trade-off. Under functional repair, random linear network codes (RLNCs) simultaneously optimize usage of both inter- and intra-cluster repair BW; simulation results based on RLNCs suggest optimality of the bounds on intra-cluster repair-bandwidth. Our bounds reveal the interesting fact that, while it is beneficial to increase the number of local helper nodes $\ell$ in order to improve the storage-vs-inter-cluster-repair-BW trade-off, increa
doi_str_mv	10.1109/TIT.2018.2806342
format	Article
fullrecord	<record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_proquest_journals_2174496199</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>8292960</ieee_id><sourcerecordid>2174496199</sourcerecordid><originalsourceid>FETCH-LOGICAL-c333t-e8881ab85a180f636ba9acff543c35c29bde7f32bd1266c989d3b92d4199f22d3</originalsourceid><addsrcrecordid>eNo9kE1LAzEQhoMoWKt3wcuC59R8Nzlq8aNYEOzqNWQ3E7ul7dZkF-m_N6Wlp2FmnncGHoRuKRlRSsxDOS1HjFA9YpooLtgZGlApx9goKc7RgOQVNkLoS3SV0jK3QlI2QO_lAop510b3A8U3xNSn4hO2ron4yW38X-O7RVFG5wG3IRShjcVk1acOIvhTbr7Lg3W6RhfBrRLcHOsQfb08l5M3PPt4nU4eZ7jmnHcYtNbUVVo6qklQXFXOuDoEKXjNZc1M5WEcOKs8ZUrVRhvPK8O8oMYExjwfovvD3W1sf3tInV22fdzkl5bRsRBGZTJT5EDVsU0pQrDb2Kxd3FlK7F6ZzcrsXpk9KsuRu0OkAYATrplhRhH-D8tjZxQ</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2174496199</pqid></control><display><type>article</type><title>The Storage Versus Repair-Bandwidth Trade-off for Clustered Storage Systems</title><source>IEEE Electronic Library (IEL)</source><creator>Prakash, N. ; Abdrashitov, Vitaly ; Medard, Muriel</creator><creatorcontrib>Prakash, N. ; Abdrashitov, Vitaly ; Medard, Muriel</creatorcontrib><description><![CDATA[We study a generalization of the setting of regenerating codes, motivated by applications to storage systems consisting of clusters of storage nodes. There are <inline-formula> <tex-math notation="LaTeX">n </tex-math></inline-formula> clusters in total, with <inline-formula> <tex-math notation="LaTeX">m </tex-math></inline-formula> nodes per cluster. A data file is coded and stored across the <inline-formula> <tex-math notation="LaTeX">mn </tex-math></inline-formula> nodes, with each node storing <inline-formula> <tex-math notation="LaTeX">\alpha </tex-math></inline-formula> symbols. 
For availability of data, we require that the file be retrievable by downloading the entire content from any subset of <inline-formula> <tex-math notation="LaTeX">k </tex-math></inline-formula> clusters. Nodes represent entities that can fail. We distinguish between intra-cluster and inter-cluster bandwidth (BW) costs during node repair. Node-repair in a cluster is accomplished by downloading <inline-formula> <tex-math notation="LaTeX">\beta </tex-math></inline-formula> symbols each from any set of <inline-formula> <tex-math notation="LaTeX">d </tex-math></inline-formula> other clusters, dubbed remote helper clusters, and also up to <inline-formula> <tex-math notation="LaTeX">\alpha </tex-math></inline-formula> symbols each from any set of <inline-formula> <tex-math notation="LaTeX">\ell </tex-math></inline-formula> surviving nodes, dubbed local helper nodes, in the host cluster. We first identify the optimal trade-off between storage-overhead and inter-cluster repair-bandwidth under functional repair, and also present optimal exact-repair code constructions for a class of parameters. The new trade-off is strictly better than what is achievable via space-sharing existing coding solutions, whenever <inline-formula> <tex-math notation="LaTeX">\ell > 0 </tex-math></inline-formula>. We then obtain sharp lower bounds on the necessary intra-cluster repair BW to achieve optimal trade-off. Under functional repair, random linear network codes (RLNCs) simultaneously optimize usage of both inter- and intra-cluster repair BW; simulation results based on RLNCs suggest optimality of the bounds on intra-cluster repair-bandwidth. 
Our bounds reveal the interesting fact that, while it is beneficial to increase the number of local helper nodes <inline-formula> <tex-math notation="LaTeX">\ell </tex-math></inline-formula> in order to improve the storage-vs-inter-cluster-repair-BW trade-off, increasing <inline-formula> <tex-math notation="LaTeX">\ell </tex-math></inline-formula> not only increases intra-cluster BW in the host-cluster, but also increases the intra-cluster BW in the remote helper clusters. We also analyze resilience of the clustered storage system against passive eavesdropping by providing file-size bounds and optimal code constructions.]]></description><identifier>ISSN: 0018-9448</identifier><identifier>EISSN: 1557-9654</identifier><identifier>DOI: 10.1109/TIT.2018.2806342</identifier><identifier>CODEN: IETTAW</identifier><language>eng</language><publisher>New York: IEEE</publisher><subject>Bandwidth ; Bandwidths ; Cloud computing ; cloud storage ; Cluster analysis ; clustered storage ; coding for storage ; Data centers ; Data collection ; Data models ; Distributed storage ; Downloading ; Eavesdropping ; Encoding ; Lower bounds ; Maintenance engineering ; Nodes ; Optimization ; regenerating codes ; Repair ; Storage systems ; storage vs repair-bandwidth trade-off ; Symbols ; Tradeoffs</subject><ispartof>IEEE transactions on information theory, 2018-08, Vol.64 (8), p.5783-5805</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2018</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c333t-e8881ab85a180f636ba9acff543c35c29bde7f32bd1266c989d3b92d4199f22d3</citedby><cites>FETCH-LOGICAL-c333t-e8881ab85a180f636ba9acff543c35c29bde7f32bd1266c989d3b92d4199f22d3</cites><orcidid>0000-0002-1428-7124 ; 0000-0003-2064-8018</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/8292960$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>315,782,786,798,27933,27934,54767</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/8292960$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Prakash, N.</creatorcontrib><creatorcontrib>Abdrashitov, Vitaly</creatorcontrib><creatorcontrib>Medard, Muriel</creatorcontrib><title>The Storage Versus Repair-Bandwidth Trade-off for Clustered Storage Systems</title><title>IEEE transactions on information theory</title><addtitle>TIT</addtitle><description><![CDATA[We study a generalization of the setting of regenerating codes, motivated by applications to storage systems consisting of clusters of storage nodes. There are <inline-formula> <tex-math notation="LaTeX">n </tex-math></inline-formula> clusters in total, with <inline-formula> <tex-math notation="LaTeX">m </tex-math></inline-formula> nodes per cluster. A data file is coded and stored across the <inline-formula> <tex-math notation="LaTeX">mn </tex-math></inline-formula> nodes, with each node storing <inline-formula> <tex-math notation="LaTeX">\alpha </tex-math></inline-formula> symbols. For availability of data, we require that the file be retrievable by downloading the entire content from any subset of <inline-formula> <tex-math notation="LaTeX">k </tex-math></inline-formula> clusters. 
Nodes represent entities that can fail. We distinguish between intra-cluster and inter-cluster bandwidth (BW) costs during node repair. Node-repair in a cluster is accomplished by downloading <inline-formula> <tex-math notation="LaTeX">\beta </tex-math></inline-formula> symbols each from any set of <inline-formula> <tex-math notation="LaTeX">d </tex-math></inline-formula> other clusters, dubbed remote helper clusters, and also up to <inline-formula> <tex-math notation="LaTeX">\alpha </tex-math></inline-formula> symbols each from any set of <inline-formula> <tex-math notation="LaTeX">\ell </tex-math></inline-formula> surviving nodes, dubbed local helper nodes, in the host cluster. We first identify the optimal trade-off between storage-overhead and inter-cluster repair-bandwidth under functional repair, and also present optimal exact-repair code constructions for a class of parameters. The new trade-off is strictly better than what is achievable via space-sharing existing coding solutions, whenever <inline-formula> <tex-math notation="LaTeX">\ell > 0 </tex-math></inline-formula>. We then obtain sharp lower bounds on the necessary intra-cluster repair BW to achieve optimal trade-off. Under functional repair, random linear network codes (RLNCs) simultaneously optimize usage of both inter- and intra-cluster repair BW; simulation results based on RLNCs suggest optimality of the bounds on intra-cluster repair-bandwidth. Our bounds reveal the interesting fact that, while it is beneficial to increase the number of local helper nodes <inline-formula> <tex-math notation="LaTeX">\ell </tex-math></inline-formula> in order to improve the storage-vs-inter-cluster-repair-BW trade-off, increasing <inline-formula> <tex-math notation="LaTeX">\ell </tex-math></inline-formula> not only increases intra-cluster BW in the host-cluster, but also increases the intra-cluster BW in the remote helper clusters. 
We also analyze resilience of the clustered storage system against passive eavesdropping by providing file-size bounds and optimal code constructions.]]></description><subject>Bandwidth</subject><subject>Bandwidths</subject><subject>Cloud computing</subject><subject>cloud storage</subject><subject>Cluster analysis</subject><subject>clustered storage</subject><subject>coding for storage</subject><subject>Data centers</subject><subject>Data collection</subject><subject>Data models</subject><subject>Distributed storage</subject><subject>Downloading</subject><subject>Eavesdropping</subject><subject>Encoding</subject><subject>Lower bounds</subject><subject>Maintenance engineering</subject><subject>Nodes</subject><subject>Optimization</subject><subject>regenerating codes</subject><subject>Repair</subject><subject>Storage systems</subject><subject>storage vs repair-bandwidth trade-off</subject><subject>Symbols</subject><subject>Tradeoffs</subject><issn>0018-9448</issn><issn>1557-9654</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2018</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNo9kE1LAzEQhoMoWKt3wcuC59R8Nzlq8aNYEOzqNWQ3E7ul7dZkF-m_N6Wlp2FmnncGHoRuKRlRSsxDOS1HjFA9YpooLtgZGlApx9goKc7RgOQVNkLoS3SV0jK3QlI2QO_lAop510b3A8U3xNSn4hO2ron4yW38X-O7RVFG5wG3IRShjcVk1acOIvhTbr7Lg3W6RhfBrRLcHOsQfb08l5M3PPt4nU4eZ7jmnHcYtNbUVVo6qklQXFXOuDoEKXjNZc1M5WEcOKs8ZUrVRhvPK8O8oMYExjwfovvD3W1sf3tInV22fdzkl5bRsRBGZTJT5EDVsU0pQrDb2Kxd3FlK7F6ZzcrsXpk9KsuRu0OkAYATrplhRhH-D8tjZxQ</recordid><startdate>20180801</startdate><enddate>20180801</enddate><creator>Prakash, N.</creator><creator>Abdrashitov, Vitaly</creator><creator>Medard, Muriel</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0002-1428-7124</orcidid><orcidid>https://orcid.org/0000-0003-2064-8018</orcidid></search><sort><creationdate>20180801</creationdate><title>The Storage Versus Repair-Bandwidth Trade-off for Clustered Storage Systems</title><author>Prakash, N. ; Abdrashitov, Vitaly ; Medard, Muriel</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c333t-e8881ab85a180f636ba9acff543c35c29bde7f32bd1266c989d3b92d4199f22d3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2018</creationdate><topic>Bandwidth</topic><topic>Bandwidths</topic><topic>Cloud computing</topic><topic>cloud storage</topic><topic>Cluster analysis</topic><topic>clustered storage</topic><topic>coding for storage</topic><topic>Data centers</topic><topic>Data collection</topic><topic>Data models</topic><topic>Distributed storage</topic><topic>Downloading</topic><topic>Eavesdropping</topic><topic>Encoding</topic><topic>Lower bounds</topic><topic>Maintenance engineering</topic><topic>Nodes</topic><topic>Optimization</topic><topic>regenerating codes</topic><topic>Repair</topic><topic>Storage systems</topic><topic>storage vs repair-bandwidth trade-off</topic><topic>Symbols</topic><topic>Tradeoffs</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Prakash, N.</creatorcontrib><creatorcontrib>Abdrashitov, Vitaly</creatorcontrib><creatorcontrib>Medard, Muriel</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library 
(IEL)</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics & Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>IEEE transactions on information theory</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Prakash, N.</au><au>Abdrashitov, Vitaly</au><au>Medard, Muriel</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>The Storage Versus Repair-Bandwidth Trade-off for Clustered Storage Systems</atitle><jtitle>IEEE transactions on information theory</jtitle><stitle>TIT</stitle><date>2018-08-01</date><risdate>2018</risdate><volume>64</volume><issue>8</issue><spage>5783</spage><epage>5805</epage><pages>5783-5805</pages><issn>0018-9448</issn><eissn>1557-9654</eissn><coden>IETTAW</coden><abstract><![CDATA[We study a generalization of the setting of regenerating codes, motivated by applications to storage systems consisting of clusters of storage nodes. There are <inline-formula> <tex-math notation="LaTeX">n </tex-math></inline-formula> clusters in total, with <inline-formula> <tex-math notation="LaTeX">m </tex-math></inline-formula> nodes per cluster. A data file is coded and stored across the <inline-formula> <tex-math notation="LaTeX">mn </tex-math></inline-formula> nodes, with each node storing <inline-formula> <tex-math notation="LaTeX">\alpha </tex-math></inline-formula> symbols. 
For availability of data, we require that the file be retrievable by downloading the entire content from any subset of <inline-formula> <tex-math notation="LaTeX">k </tex-math></inline-formula> clusters. Nodes represent entities that can fail. We distinguish between intra-cluster and inter-cluster bandwidth (BW) costs during node repair. Node-repair in a cluster is accomplished by downloading <inline-formula> <tex-math notation="LaTeX">\beta </tex-math></inline-formula> symbols each from any set of <inline-formula> <tex-math notation="LaTeX">d </tex-math></inline-formula> other clusters, dubbed remote helper clusters, and also up to <inline-formula> <tex-math notation="LaTeX">\alpha </tex-math></inline-formula> symbols each from any set of <inline-formula> <tex-math notation="LaTeX">\ell </tex-math></inline-formula> surviving nodes, dubbed local helper nodes, in the host cluster. We first identify the optimal trade-off between storage-overhead and inter-cluster repair-bandwidth under functional repair, and also present optimal exact-repair code constructions for a class of parameters. The new trade-off is strictly better than what is achievable via space-sharing existing coding solutions, whenever <inline-formula> <tex-math notation="LaTeX">\ell > 0 </tex-math></inline-formula>. We then obtain sharp lower bounds on the necessary intra-cluster repair BW to achieve optimal trade-off. Under functional repair, random linear network codes (RLNCs) simultaneously optimize usage of both inter- and intra-cluster repair BW; simulation results based on RLNCs suggest optimality of the bounds on intra-cluster repair-bandwidth. 
Our bounds reveal the interesting fact that, while it is beneficial to increase the number of local helper nodes <inline-formula> <tex-math notation="LaTeX">\ell </tex-math></inline-formula> in order to improve the storage-vs-inter-cluster-repair-BW trade-off, increasing <inline-formula> <tex-math notation="LaTeX">\ell </tex-math></inline-formula> not only increases intra-cluster BW in the host-cluster, but also increases the intra-cluster BW in the remote helper clusters. We also analyze resilience of the clustered storage system against passive eavesdropping by providing file-size bounds and optimal code constructions.]]></abstract><cop>New York</cop><pub>IEEE</pub><doi>10.1109/TIT.2018.2806342</doi><tpages>23</tpages><orcidid>https://orcid.org/0000-0002-1428-7124</orcidid><orcidid>https://orcid.org/0000-0003-2064-8018</orcidid><oa>free_for_read</oa></addata></record>
fulltext	fulltext_linktorsrc
identifier	ISSN: 0018-9448
ispartof	IEEE transactions on information theory, 2018-08, Vol.64 (8), p.5783-5805
issn	0018-9448 1557-9654
language	eng
recordid	cdi_proquest_journals_2174496199
source	IEEE Electronic Library (IEL)
subjects	Bandwidth; Bandwidths; Cloud computing; cloud storage; Cluster analysis; clustered storage; coding for storage; Data centers; Data collection; Data models; Distributed storage; Downloading; Eavesdropping; Encoding; Lower bounds; Maintenance engineering; Nodes; Optimization; regenerating codes; Repair; Storage systems; storage vs repair-bandwidth trade-off; Symbols; Tradeoffs
title	The Storage Versus Repair-Bandwidth Trade-off for Clustered Storage Systems
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-11-28T16%3A49%3A46IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=The%20Storage%20Versus%20Repair-Bandwidth%20Trade-off%20for%20Clustered%20Storage%20Systems&rft.jtitle=IEEE%20transactions%20on%20information%20theory&rft.au=Prakash,%20N.&rft.date=2018-08-01&rft.volume=64&rft.issue=8&rft.spage=5783&rft.epage=5805&rft.pages=5783-5805&rft.issn=0018-9448&rft.eissn=1557-9654&rft.coden=IETTAW&rft_id=info:doi/10.1109/TIT.2018.2806342&rft_dat=%3Cproquest_RIE%3E2174496199%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2174496199&rft_id=info:pmid/&rft_ieee_id=8292960&rfr_iscdi=true