The Storage Versus Repair-Bandwidth Trade-off for Clustered Storage Systems

We study a generalization of the setting of regenerating codes, motivated by applications to storage systems consisting of clusters of storage nodes. There are n clusters in total, with m nodes per cluster. A data file is coded and stored across the mn nodes, with each node storing \alpha sy...

Bibliographic details
Published in: IEEE transactions on information theory 2018-08, Vol.64 (8), p.5783-5805
Main authors: Prakash, N. ; Abdrashitov, Vitaly ; Medard, Muriel
Format: Article
Language: eng
container_end_page 5805
container_issue 8
container_start_page 5783
container_title IEEE transactions on information theory
container_volume 64
creator Prakash, N.
Abdrashitov, Vitaly
Medard, Muriel
description We study a generalization of the setting of regenerating codes, motivated by applications to storage systems consisting of clusters of storage nodes. There are n clusters in total, with m nodes per cluster. A data file is coded and stored across the mn nodes, with each node storing \alpha symbols. For availability of data, we require that the file be retrievable by downloading the entire content from any subset of k clusters. Nodes represent entities that can fail. We distinguish between intra-cluster and inter-cluster bandwidth (BW) costs during node repair. Node-repair in a cluster is accomplished by downloading \beta symbols each from any set of d other clusters, dubbed remote helper clusters, and also up to \alpha symbols each from any set of \ell surviving nodes, dubbed local helper nodes, in the host cluster. We first identify the optimal trade-off between storage-overhead and inter-cluster repair-bandwidth under functional repair, and also present optimal exact-repair code constructions for a class of parameters. The new trade-off is strictly better than what is achievable via space-sharing existing coding solutions, whenever \ell > 0. We then obtain sharp lower bounds on the necessary intra-cluster repair BW to achieve optimal trade-off. Under functional repair, random linear network codes (RLNCs) simultaneously optimize usage of both inter- and intra-cluster repair BW; simulation results based on RLNCs suggest optimality of the bounds on intra-cluster repair-bandwidth. Our bounds reveal the interesting fact that, while it is beneficial to increase the number of local helper nodes \ell in order to improve the storage-vs-inter-cluster-repair-BW trade-off, increasing \ell not only increases intra-cluster BW in the host-cluster, but also increases the intra-cluster BW in the remote helper clusters. We also analyze resilience of the clustered storage system against passive eavesdropping by providing file-size bounds and optimal code constructions.
doi_str_mv 10.1109/TIT.2018.2806342
format Article
fullrecord <record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_proquest_journals_2174496199</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>8292960</ieee_id><sourcerecordid>2174496199</sourcerecordid><originalsourceid>FETCH-LOGICAL-c333t-e8881ab85a180f636ba9acff543c35c29bde7f32bd1266c989d3b92d4199f22d3</originalsourceid><addsrcrecordid>eNo9kE1LAzEQhoMoWKt3wcuC59R8Nzlq8aNYEOzqNWQ3E7ul7dZkF-m_N6Wlp2FmnncGHoRuKRlRSsxDOS1HjFA9YpooLtgZGlApx9goKc7RgOQVNkLoS3SV0jK3QlI2QO_lAop510b3A8U3xNSn4hO2ron4yW38X-O7RVFG5wG3IRShjcVk1acOIvhTbr7Lg3W6RhfBrRLcHOsQfb08l5M3PPt4nU4eZ7jmnHcYtNbUVVo6qklQXFXOuDoEKXjNZc1M5WEcOKs8ZUrVRhvPK8O8oMYExjwfovvD3W1sf3tInV22fdzkl5bRsRBGZTJT5EDVsU0pQrDb2Kxd3FlK7F6ZzcrsXpk9KsuRu0OkAYATrplhRhH-D8tjZxQ</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2174496199</pqid></control><display><type>article</type><title>The Storage Versus Repair-Bandwidth Trade-off for Clustered Storage Systems</title><source>IEEE Electronic Library (IEL)</source><creator>Prakash, N. ; Abdrashitov, Vitaly ; Medard, Muriel</creator><creatorcontrib>Prakash, N. ; Abdrashitov, Vitaly ; Medard, Muriel</creatorcontrib><description><![CDATA[We study a generalization of the setting of regenerating codes, motivated by applications to storage systems consisting of clusters of storage nodes. There are <inline-formula> <tex-math notation="LaTeX">n </tex-math></inline-formula> clusters in total, with <inline-formula> <tex-math notation="LaTeX">m </tex-math></inline-formula> nodes per cluster. A data file is coded and stored across the <inline-formula> <tex-math notation="LaTeX">mn </tex-math></inline-formula> nodes, with each node storing <inline-formula> <tex-math notation="LaTeX">\alpha </tex-math></inline-formula> symbols. 
For availability of data, we require that the file be retrievable by downloading the entire content from any subset of <inline-formula> <tex-math notation="LaTeX">k </tex-math></inline-formula> clusters. Nodes represent entities that can fail. We distinguish between intra-cluster and inter-cluster bandwidth (BW) costs during node repair. Node-repair in a cluster is accomplished by downloading <inline-formula> <tex-math notation="LaTeX">\beta </tex-math></inline-formula> symbols each from any set of <inline-formula> <tex-math notation="LaTeX">d </tex-math></inline-formula> other clusters, dubbed remote helper clusters, and also up to <inline-formula> <tex-math notation="LaTeX">\alpha </tex-math></inline-formula> symbols each from any set of <inline-formula> <tex-math notation="LaTeX">\ell </tex-math></inline-formula> surviving nodes, dubbed local helper nodes, in the host cluster. We first identify the optimal trade-off between storage-overhead and inter-cluster repair-bandwidth under functional repair, and also present optimal exact-repair code constructions for a class of parameters. The new trade-off is strictly better than what is achievable via space-sharing existing coding solutions, whenever <inline-formula> <tex-math notation="LaTeX">\ell > 0 </tex-math></inline-formula>. We then obtain sharp lower bounds on the necessary intra-cluster repair BW to achieve optimal trade-off. Under functional repair, random linear network codes (RLNCs) simultaneously optimize usage of both inter- and intra-cluster repair BW; simulation results based on RLNCs suggest optimality of the bounds on intra-cluster repair-bandwidth. 
Our bounds reveal the interesting fact that, while it is beneficial to increase the number of local helper nodes <inline-formula> <tex-math notation="LaTeX">\ell </tex-math></inline-formula> in order to improve the storage-vs-inter-cluster-repair-BW trade-off, increasing <inline-formula> <tex-math notation="LaTeX">\ell </tex-math></inline-formula> not only increases intra-cluster BW in the host-cluster, but also increases the intra-cluster BW in the remote helper clusters. We also analyze resilience of the clustered storage system against passive eavesdropping by providing file-size bounds and optimal code constructions.]]></description><identifier>ISSN: 0018-9448</identifier><identifier>EISSN: 1557-9654</identifier><identifier>DOI: 10.1109/TIT.2018.2806342</identifier><identifier>CODEN: IETTAW</identifier><language>eng</language><publisher>New York: IEEE</publisher><subject>Bandwidth ; Bandwidths ; Cloud computing ; cloud storage ; Cluster analysis ; clustered storage ; coding for storage ; Data centers ; Data collection ; Data models ; Distributed storage ; Downloading ; Eavesdropping ; Encoding ; Lower bounds ; Maintenance engineering ; Nodes ; Optimization ; regenerating codes ; Repair ; Storage systems ; storage vs repair-bandwidth trade-off ; Symbols ; Tradeoffs</subject><ispartof>IEEE transactions on information theory, 2018-08, Vol.64 (8), p.5783-5805</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2018</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c333t-e8881ab85a180f636ba9acff543c35c29bde7f32bd1266c989d3b92d4199f22d3</citedby><cites>FETCH-LOGICAL-c333t-e8881ab85a180f636ba9acff543c35c29bde7f32bd1266c989d3b92d4199f22d3</cites><orcidid>0000-0002-1428-7124 ; 0000-0003-2064-8018</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/8292960$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>315,782,786,798,27933,27934,54767</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/8292960$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Prakash, N.</creatorcontrib><creatorcontrib>Abdrashitov, Vitaly</creatorcontrib><creatorcontrib>Medard, Muriel</creatorcontrib><title>The Storage Versus Repair-Bandwidth Trade-off for Clustered Storage Systems</title><title>IEEE transactions on information theory</title><addtitle>TIT</addtitle><description><![CDATA[We study a generalization of the setting of regenerating codes, motivated by applications to storage systems consisting of clusters of storage nodes. There are <inline-formula> <tex-math notation="LaTeX">n </tex-math></inline-formula> clusters in total, with <inline-formula> <tex-math notation="LaTeX">m </tex-math></inline-formula> nodes per cluster. A data file is coded and stored across the <inline-formula> <tex-math notation="LaTeX">mn </tex-math></inline-formula> nodes, with each node storing <inline-formula> <tex-math notation="LaTeX">\alpha </tex-math></inline-formula> symbols. For availability of data, we require that the file be retrievable by downloading the entire content from any subset of <inline-formula> <tex-math notation="LaTeX">k </tex-math></inline-formula> clusters. 
Nodes represent entities that can fail. We distinguish between intra-cluster and inter-cluster bandwidth (BW) costs during node repair. Node-repair in a cluster is accomplished by downloading <inline-formula> <tex-math notation="LaTeX">\beta </tex-math></inline-formula> symbols each from any set of <inline-formula> <tex-math notation="LaTeX">d </tex-math></inline-formula> other clusters, dubbed remote helper clusters, and also up to <inline-formula> <tex-math notation="LaTeX">\alpha </tex-math></inline-formula> symbols each from any set of <inline-formula> <tex-math notation="LaTeX">\ell </tex-math></inline-formula> surviving nodes, dubbed local helper nodes, in the host cluster. We first identify the optimal trade-off between storage-overhead and inter-cluster repair-bandwidth under functional repair, and also present optimal exact-repair code constructions for a class of parameters. The new trade-off is strictly better than what is achievable via space-sharing existing coding solutions, whenever <inline-formula> <tex-math notation="LaTeX">\ell > 0 </tex-math></inline-formula>. We then obtain sharp lower bounds on the necessary intra-cluster repair BW to achieve optimal trade-off. Under functional repair, random linear network codes (RLNCs) simultaneously optimize usage of both inter- and intra-cluster repair BW; simulation results based on RLNCs suggest optimality of the bounds on intra-cluster repair-bandwidth. Our bounds reveal the interesting fact that, while it is beneficial to increase the number of local helper nodes <inline-formula> <tex-math notation="LaTeX">\ell </tex-math></inline-formula> in order to improve the storage-vs-inter-cluster-repair-BW trade-off, increasing <inline-formula> <tex-math notation="LaTeX">\ell </tex-math></inline-formula> not only increases intra-cluster BW in the host-cluster, but also increases the intra-cluster BW in the remote helper clusters. 
We also analyze resilience of the clustered storage system against passive eavesdropping by providing file-size bounds and optimal code constructions.]]></description><subject>Bandwidth</subject><subject>Bandwidths</subject><subject>Cloud computing</subject><subject>cloud storage</subject><subject>Cluster analysis</subject><subject>clustered storage</subject><subject>coding for storage</subject><subject>Data centers</subject><subject>Data collection</subject><subject>Data models</subject><subject>Distributed storage</subject><subject>Downloading</subject><subject>Eavesdropping</subject><subject>Encoding</subject><subject>Lower bounds</subject><subject>Maintenance engineering</subject><subject>Nodes</subject><subject>Optimization</subject><subject>regenerating codes</subject><subject>Repair</subject><subject>Storage systems</subject><subject>storage vs repair-bandwidth trade-off</subject><subject>Symbols</subject><subject>Tradeoffs</subject><issn>0018-9448</issn><issn>1557-9654</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2018</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNo9kE1LAzEQhoMoWKt3wcuC59R8Nzlq8aNYEOzqNWQ3E7ul7dZkF-m_N6Wlp2FmnncGHoRuKRlRSsxDOS1HjFA9YpooLtgZGlApx9goKc7RgOQVNkLoS3SV0jK3QlI2QO_lAop510b3A8U3xNSn4hO2ron4yW38X-O7RVFG5wG3IRShjcVk1acOIvhTbr7Lg3W6RhfBrRLcHOsQfb08l5M3PPt4nU4eZ7jmnHcYtNbUVVo6qklQXFXOuDoEKXjNZc1M5WEcOKs8ZUrVRhvPK8O8oMYExjwfovvD3W1sf3tInV22fdzkl5bRsRBGZTJT5EDVsU0pQrDb2Kxd3FlK7F6ZzcrsXpk9KsuRu0OkAYATrplhRhH-D8tjZxQ</recordid><startdate>20180801</startdate><enddate>20180801</enddate><creator>Prakash, N.</creator><creator>Abdrashitov, Vitaly</creator><creator>Medard, Muriel</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0002-1428-7124</orcidid><orcidid>https://orcid.org/0000-0003-2064-8018</orcidid></search><sort><creationdate>20180801</creationdate><title>The Storage Versus Repair-Bandwidth Trade-off for Clustered Storage Systems</title><author>Prakash, N. ; Abdrashitov, Vitaly ; Medard, Muriel</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c333t-e8881ab85a180f636ba9acff543c35c29bde7f32bd1266c989d3b92d4199f22d3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2018</creationdate><topic>Bandwidth</topic><topic>Bandwidths</topic><topic>Cloud computing</topic><topic>cloud storage</topic><topic>Cluster analysis</topic><topic>clustered storage</topic><topic>coding for storage</topic><topic>Data centers</topic><topic>Data collection</topic><topic>Data models</topic><topic>Distributed storage</topic><topic>Downloading</topic><topic>Eavesdropping</topic><topic>Encoding</topic><topic>Lower bounds</topic><topic>Maintenance engineering</topic><topic>Nodes</topic><topic>Optimization</topic><topic>regenerating codes</topic><topic>Repair</topic><topic>Storage systems</topic><topic>storage vs repair-bandwidth trade-off</topic><topic>Symbols</topic><topic>Tradeoffs</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Prakash, N.</creatorcontrib><creatorcontrib>Abdrashitov, Vitaly</creatorcontrib><creatorcontrib>Medard, Muriel</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library 
(IEL)</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>IEEE transactions on information theory</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Prakash, N.</au><au>Abdrashitov, Vitaly</au><au>Medard, Muriel</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>The Storage Versus Repair-Bandwidth Trade-off for Clustered Storage Systems</atitle><jtitle>IEEE transactions on information theory</jtitle><stitle>TIT</stitle><date>2018-08-01</date><risdate>2018</risdate><volume>64</volume><issue>8</issue><spage>5783</spage><epage>5805</epage><pages>5783-5805</pages><issn>0018-9448</issn><eissn>1557-9654</eissn><coden>IETTAW</coden><abstract><![CDATA[We study a generalization of the setting of regenerating codes, motivated by applications to storage systems consisting of clusters of storage nodes. There are <inline-formula> <tex-math notation="LaTeX">n </tex-math></inline-formula> clusters in total, with <inline-formula> <tex-math notation="LaTeX">m </tex-math></inline-formula> nodes per cluster. A data file is coded and stored across the <inline-formula> <tex-math notation="LaTeX">mn </tex-math></inline-formula> nodes, with each node storing <inline-formula> <tex-math notation="LaTeX">\alpha </tex-math></inline-formula> symbols. 
For availability of data, we require that the file be retrievable by downloading the entire content from any subset of <inline-formula> <tex-math notation="LaTeX">k </tex-math></inline-formula> clusters. Nodes represent entities that can fail. We distinguish between intra-cluster and inter-cluster bandwidth (BW) costs during node repair. Node-repair in a cluster is accomplished by downloading <inline-formula> <tex-math notation="LaTeX">\beta </tex-math></inline-formula> symbols each from any set of <inline-formula> <tex-math notation="LaTeX">d </tex-math></inline-formula> other clusters, dubbed remote helper clusters, and also up to <inline-formula> <tex-math notation="LaTeX">\alpha </tex-math></inline-formula> symbols each from any set of <inline-formula> <tex-math notation="LaTeX">\ell </tex-math></inline-formula> surviving nodes, dubbed local helper nodes, in the host cluster. We first identify the optimal trade-off between storage-overhead and inter-cluster repair-bandwidth under functional repair, and also present optimal exact-repair code constructions for a class of parameters. The new trade-off is strictly better than what is achievable via space-sharing existing coding solutions, whenever <inline-formula> <tex-math notation="LaTeX">\ell > 0 </tex-math></inline-formula>. We then obtain sharp lower bounds on the necessary intra-cluster repair BW to achieve optimal trade-off. Under functional repair, random linear network codes (RLNCs) simultaneously optimize usage of both inter- and intra-cluster repair BW; simulation results based on RLNCs suggest optimality of the bounds on intra-cluster repair-bandwidth. 
Our bounds reveal the interesting fact that, while it is beneficial to increase the number of local helper nodes <inline-formula> <tex-math notation="LaTeX">\ell </tex-math></inline-formula> in order to improve the storage-vs-inter-cluster-repair-BW trade-off, increasing <inline-formula> <tex-math notation="LaTeX">\ell </tex-math></inline-formula> not only increases intra-cluster BW in the host-cluster, but also increases the intra-cluster BW in the remote helper clusters. We also analyze resilience of the clustered storage system against passive eavesdropping by providing file-size bounds and optimal code constructions.]]></abstract><cop>New York</cop><pub>IEEE</pub><doi>10.1109/TIT.2018.2806342</doi><tpages>23</tpages><orcidid>https://orcid.org/0000-0002-1428-7124</orcidid><orcidid>https://orcid.org/0000-0003-2064-8018</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 0018-9448
ispartof IEEE transactions on information theory, 2018-08, Vol.64 (8), p.5783-5805
issn 0018-9448
1557-9654
language eng
recordid cdi_proquest_journals_2174496199
source IEEE Electronic Library (IEL)
subjects Bandwidth
Bandwidths
Cloud computing
cloud storage
Cluster analysis
clustered storage
coding for storage
Data centers
Data collection
Data models
Distributed storage
Downloading
Eavesdropping
Encoding
Lower bounds
Maintenance engineering
Nodes
Optimization
regenerating codes
Repair
Storage systems
storage vs repair-bandwidth trade-off
Symbols
Tradeoffs
title The Storage Versus Repair-Bandwidth Trade-off for Clustered Storage Systems
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-11-28T16%3A49%3A46IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=The%20Storage%20Versus%20Repair-Bandwidth%20Trade-off%20for%20Clustered%20Storage%20Systems&rft.jtitle=IEEE%20transactions%20on%20information%20theory&rft.au=Prakash,%20N.&rft.date=2018-08-01&rft.volume=64&rft.issue=8&rft.spage=5783&rft.epage=5805&rft.pages=5783-5805&rft.issn=0018-9448&rft.eissn=1557-9654&rft.coden=IETTAW&rft_id=info:doi/10.1109/TIT.2018.2806342&rft_dat=%3Cproquest_RIE%3E2174496199%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2174496199&rft_id=info:pmid/&rft_ieee_id=8292960&rfr_iscdi=true
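
The system model in the abstract can be made concrete with a little bookkeeping. The sketch below is illustrative only and is not taken from the paper: the class name `ClusteredStorageParams`, its method names, and the range checks are assumptions chosen for this example. It encodes only the quantities the abstract states directly (mn nodes storing \alpha symbols each, \beta symbols from each of d remote helper clusters, and up to \alpha symbols from each of \ell local helper nodes). It does not reproduce the paper's file-size bounds or its trade-off curve.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusteredStorageParams:
    """Hypothetical container for the parameters named in the abstract."""
    n: int      # number of clusters
    m: int      # nodes per cluster
    k: int      # the file must be retrievable from any k clusters
    d: int      # remote helper clusters contacted during a node repair
    ell: int    # local helper nodes in the host cluster
    alpha: int  # symbols stored per node
    beta: int   # symbols downloaded from each remote helper cluster

    def __post_init__(self):
        # Sanity checks implied by the wording of the abstract (assumption:
        # the paper may restrict the parameter regime further).
        if not 1 <= self.k <= self.n:
            raise ValueError("k must satisfy 1 <= k <= n")
        if not 0 <= self.d <= self.n - 1:
            raise ValueError("d counts *other* clusters, so d <= n - 1")
        if not 0 <= self.ell <= self.m - 1:
            raise ValueError("ell counts surviving host nodes, so ell <= m - 1")

    def total_nodes(self) -> int:
        return self.n * self.m

    def total_storage(self) -> int:
        # Every one of the mn nodes stores alpha symbols.
        return self.total_nodes() * self.alpha

    def data_collection_download(self) -> int:
        # Retrieval downloads the entire content of k clusters.
        return self.k * self.m * self.alpha

    def inter_cluster_repair_bw(self) -> int:
        # beta symbols from each of d remote helper clusters.
        return self.d * self.beta

    def max_local_repair_download(self) -> int:
        # Upper limit only: "up to" alpha symbols from each local helper.
        return self.ell * self.alpha


params = ClusteredStorageParams(n=5, m=3, k=3, d=4, ell=2, alpha=4, beta=1)
print(params.total_nodes(), params.inter_cluster_repair_bw())
```

With these example values a repair pulls 4 symbols across cluster boundaries and at most 8 symbols inside the host cluster. The numbers are arbitrary and are not claimed to lie on the optimal trade-off.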