Cluster-Aware Scattered Repair in Erasure-Coded Storage: Design and Analysis
Erasure coding is a storage-efficient means to guarantee data reliability in today's commodity storage systems, yet its repair performance is seriously hindered by the substantial repair traffic. Repair in clustered storage systems is even more complicated because of the scarcity of the cross-cluste...
Published in: | IEEE transactions on computers 2021-11, Vol.70 (11), p.1861-1874 |
---|---|
Main authors: | Shen, Zhirong ; Lin, Shiyao ; Shu, Jiwu ; Xie, Chengxin ; Huang, Zhijie ; Fu, Yingxun |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | 1874 |
---|---|
container_issue | 11 |
container_start_page | 1861 |
container_title | IEEE transactions on computers |
container_volume | 70 |
creator | Shen, Zhirong Lin, Shiyao Shu, Jiwu Xie, Chengxin Huang, Zhijie Fu, Yingxun |
description | Erasure coding is a storage-efficient means to guarantee data reliability in today's commodity storage systems, yet its repair performance is seriously hindered by the substantial repair traffic. Repair in clustered storage systems is even more complicated because of the scarcity of cross-cluster bandwidth. We present ClusterSR, a cluster-aware scattered repair approach. ClusterSR minimizes the cross-cluster repair traffic by carefully choosing the clusters for reading and repairing chunks. It further balances the cross-cluster repair traffic by scheduling the repair of multiple chunks. Large-scale simulation and Alibaba Cloud ECS experiments show that ClusterSR can reduce the cross-cluster repair traffic by 5.6-52.7 percent and improve the repair throughput by 14.4-68.8 percent. |
doi_str_mv | 10.1109/TC.2020.3028353 |
format | Article |
fullrecord | <record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_ieee_primary_9210857</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>9210857</ieee_id><sourcerecordid>2580099407</sourcerecordid><originalsourceid>FETCH-LOGICAL-c330t-85c20e00775002f8123cd839abe9b52bda6351ba02e47f4d8619c8716bda639b3</originalsourceid><addsrcrecordid>eNo9kN9LwzAQx4MoOKfPPvgS8DnbJWmaxLdR5w8YCG4-h7S9jo7ZzqRF9t-vc8On4-4-3-P4EHLPYcI52OkqmwgQMJEgjFTygoy4UppZq9JLMgLghlmZwDW5iXEDAKkAOyKLbNvHDgOb_fqAdFn4buiwpJ-483WgdUPnwcc-IMvacpgvuzb4NT7RZ4z1uqG-Kems8dt9rOMtuar8NuLduY7J18t8lb2xxcfrezZbsEJK6JhRhQAE0FoBiMpwIYvSSOtztLkSeelTqXjuQWCiq6Q0KbeF0Tz929hcjsnj6e4utD89xs5t2j4MT0QnlAGwNgE9UNMTVYQ2xoCV24X624e94-COytwqc0dl7qxsSDycEjUi_tNWcDBKywNhbmVx</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2580099407</pqid></control><display><type>article</type><title>Cluster-Aware Scattered Repair in Erasure-Coded Storage: Design and Analysis</title><source>IEEE Electronic Library (IEL)</source><creator>Shen, Zhirong ; Lin, Shiyao ; Shu, Jiwu ; Xie, Chengxin ; Huang, Zhijie ; Fu, Yingxun</creator><creatorcontrib>Shen, Zhirong ; Lin, Shiyao ; Shu, Jiwu ; Xie, Chengxin ; Huang, Zhijie ; Fu, Yingxun</creatorcontrib><description><![CDATA[Erasure coding is a storage-efficient means to guarantee data reliability in today's commodity storage systems, yet its repair performance is seriously hindered by the substantial repair traffic. Repair in clustered storage systems is even complicated because of the scarcity of the cross-cluster bandwidth. We present <inline-formula><tex-math notation="LaTeX">{\sf ClusterSR}</tex-math> <mml:math><mml:mi mathvariant="sans-serif">ClusterSR</mml:mi></mml:math><inline-graphic xlink:href="shen-ieq1-3028353.gif"/> </inline-formula>, a cluster-aware scattered repair approach. 
<inline-formula><tex-math notation="LaTeX">{\sf ClusterSR}</tex-math> <mml:math><mml:mi mathvariant="sans-serif">ClusterSR</mml:mi></mml:math><inline-graphic xlink:href="shen-ieq2-3028353.gif"/> </inline-formula> minimizes the cross-cluster repair traffic by carefully choosing the clusters for reading and repairing chunks. It further balances the cross-cluster repair traffic by scheduling the repair of multiple chunks. Large-scale simulation and Alibaba Cloud ECS experiments show that <inline-formula><tex-math notation="LaTeX">{\sf ClusterSR}</tex-math> <mml:math><mml:mi mathvariant="sans-serif">ClusterSR</mml:mi></mml:math><inline-graphic xlink:href="shen-ieq3-3028353.gif"/> </inline-formula> can reduce 5.6-52.7 percent of the cross-cluster repair traffic and improve 14.4-68.8 percent of the repair throughput.]]></description><identifier>ISSN: 0018-9340</identifier><identifier>EISSN: 1557-9956</identifier><identifier>DOI: 10.1109/TC.2020.3028353</identifier><identifier>CODEN: ITCOB4</identifier><language>eng</language><publisher>New York: IEEE</publisher><subject>Bandwidth ; Clusters ; Computer architecture ; Cross-cluster repair traffic ; Data centers ; Encoding ; Fault tolerance ; Fault tolerant systems ; full duplex transmission ; load balancing ; Maintenance engineering ; Reliability analysis ; Repair ; scattered repair ; Storage systems</subject><ispartof>IEEE transactions on computers, 2021-11, Vol.70 (11), p.1861-1874</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2021</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c330t-85c20e00775002f8123cd839abe9b52bda6351ba02e47f4d8619c8716bda639b3</citedby><cites>FETCH-LOGICAL-c330t-85c20e00775002f8123cd839abe9b52bda6351ba02e47f4d8619c8716bda639b3</cites><orcidid>0000-0003-2673-5868 ; 0000-0002-5796-7314</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/9210857$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,780,784,796,27923,27924,54757</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/9210857$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Shen, Zhirong</creatorcontrib><creatorcontrib>Lin, Shiyao</creatorcontrib><creatorcontrib>Shu, Jiwu</creatorcontrib><creatorcontrib>Xie, Chengxin</creatorcontrib><creatorcontrib>Huang, Zhijie</creatorcontrib><creatorcontrib>Fu, Yingxun</creatorcontrib><title>Cluster-Aware Scattered Repair in Erasure-Coded Storage: Design and Analysis</title><title>IEEE transactions on computers</title><addtitle>TC</addtitle><description><![CDATA[Erasure coding is a storage-efficient means to guarantee data reliability in today's commodity storage systems, yet its repair performance is seriously hindered by the substantial repair traffic. Repair in clustered storage systems is even complicated because of the scarcity of the cross-cluster bandwidth. We present <inline-formula><tex-math notation="LaTeX">{\sf ClusterSR}</tex-math> <mml:math><mml:mi mathvariant="sans-serif">ClusterSR</mml:mi></mml:math><inline-graphic xlink:href="shen-ieq1-3028353.gif"/> </inline-formula>, a cluster-aware scattered repair approach. 
<inline-formula><tex-math notation="LaTeX">{\sf ClusterSR}</tex-math> <mml:math><mml:mi mathvariant="sans-serif">ClusterSR</mml:mi></mml:math><inline-graphic xlink:href="shen-ieq2-3028353.gif"/> </inline-formula> minimizes the cross-cluster repair traffic by carefully choosing the clusters for reading and repairing chunks. It further balances the cross-cluster repair traffic by scheduling the repair of multiple chunks. Large-scale simulation and Alibaba Cloud ECS experiments show that <inline-formula><tex-math notation="LaTeX">{\sf ClusterSR}</tex-math> <mml:math><mml:mi mathvariant="sans-serif">ClusterSR</mml:mi></mml:math><inline-graphic xlink:href="shen-ieq3-3028353.gif"/> </inline-formula> can reduce 5.6-52.7 percent of the cross-cluster repair traffic and improve 14.4-68.8 percent of the repair throughput.]]></description><subject>Bandwidth</subject><subject>Clusters</subject><subject>Computer architecture</subject><subject>Cross-cluster repair traffic</subject><subject>Data centers</subject><subject>Encoding</subject><subject>Fault tolerance</subject><subject>Fault tolerant systems</subject><subject>full duplex transmission</subject><subject>load balancing</subject><subject>Maintenance engineering</subject><subject>Reliability analysis</subject><subject>Repair</subject><subject>scattered repair</subject><subject>Storage 
systems</subject><issn>0018-9340</issn><issn>1557-9956</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNo9kN9LwzAQx4MoOKfPPvgS8DnbJWmaxLdR5w8YCG4-h7S9jo7ZzqRF9t-vc8On4-4-3-P4EHLPYcI52OkqmwgQMJEgjFTygoy4UppZq9JLMgLghlmZwDW5iXEDAKkAOyKLbNvHDgOb_fqAdFn4buiwpJ-483WgdUPnwcc-IMvacpgvuzb4NT7RZ4z1uqG-Kems8dt9rOMtuar8NuLduY7J18t8lb2xxcfrezZbsEJK6JhRhQAE0FoBiMpwIYvSSOtztLkSeelTqXjuQWCiq6Q0KbeF0Tz929hcjsnj6e4utD89xs5t2j4MT0QnlAGwNgE9UNMTVYQ2xoCV24X624e94-COytwqc0dl7qxsSDycEjUi_tNWcDBKywNhbmVx</recordid><startdate>20211101</startdate><enddate>20211101</enddate><creator>Shen, Zhirong</creator><creator>Lin, Shiyao</creator><creator>Shu, Jiwu</creator><creator>Xie, Chengxin</creator><creator>Huang, Zhijie</creator><creator>Fu, Yingxun</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0003-2673-5868</orcidid><orcidid>https://orcid.org/0000-0002-5796-7314</orcidid></search><sort><creationdate>20211101</creationdate><title>Cluster-Aware Scattered Repair in Erasure-Coded Storage: Design and Analysis</title><author>Shen, Zhirong ; Lin, Shiyao ; Shu, Jiwu ; Xie, Chengxin ; Huang, Zhijie ; Fu, Yingxun</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c330t-85c20e00775002f8123cd839abe9b52bda6351ba02e47f4d8619c8716bda639b3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Bandwidth</topic><topic>Clusters</topic><topic>Computer architecture</topic><topic>Cross-cluster repair traffic</topic><topic>Data centers</topic><topic>Encoding</topic><topic>Fault 
tolerance</topic><topic>Fault tolerant systems</topic><topic>full duplex transmission</topic><topic>load balancing</topic><topic>Maintenance engineering</topic><topic>Reliability analysis</topic><topic>Repair</topic><topic>scattered repair</topic><topic>Storage systems</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Shen, Zhirong</creatorcontrib><creatorcontrib>Lin, Shiyao</creatorcontrib><creatorcontrib>Shu, Jiwu</creatorcontrib><creatorcontrib>Xie, Chengxin</creatorcontrib><creatorcontrib>Huang, Zhijie</creatorcontrib><creatorcontrib>Fu, Yingxun</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics & Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>IEEE transactions on computers</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Shen, Zhirong</au><au>Lin, Shiyao</au><au>Shu, Jiwu</au><au>Xie, Chengxin</au><au>Huang, Zhijie</au><au>Fu, Yingxun</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Cluster-Aware Scattered Repair in Erasure-Coded Storage: Design and Analysis</atitle><jtitle>IEEE transactions on 
computers</jtitle><stitle>TC</stitle><date>2021-11-01</date><risdate>2021</risdate><volume>70</volume><issue>11</issue><spage>1861</spage><epage>1874</epage><pages>1861-1874</pages><issn>0018-9340</issn><eissn>1557-9956</eissn><coden>ITCOB4</coden><abstract><![CDATA[Erasure coding is a storage-efficient means to guarantee data reliability in today's commodity storage systems, yet its repair performance is seriously hindered by the substantial repair traffic. Repair in clustered storage systems is even complicated because of the scarcity of the cross-cluster bandwidth. We present <inline-formula><tex-math notation="LaTeX">{\sf ClusterSR}</tex-math> <mml:math><mml:mi mathvariant="sans-serif">ClusterSR</mml:mi></mml:math><inline-graphic xlink:href="shen-ieq1-3028353.gif"/> </inline-formula>, a cluster-aware scattered repair approach. <inline-formula><tex-math notation="LaTeX">{\sf ClusterSR}</tex-math> <mml:math><mml:mi mathvariant="sans-serif">ClusterSR</mml:mi></mml:math><inline-graphic xlink:href="shen-ieq2-3028353.gif"/> </inline-formula> minimizes the cross-cluster repair traffic by carefully choosing the clusters for reading and repairing chunks. It further balances the cross-cluster repair traffic by scheduling the repair of multiple chunks. Large-scale simulation and Alibaba Cloud ECS experiments show that <inline-formula><tex-math notation="LaTeX">{\sf ClusterSR}</tex-math> <mml:math><mml:mi mathvariant="sans-serif">ClusterSR</mml:mi></mml:math><inline-graphic xlink:href="shen-ieq3-3028353.gif"/> </inline-formula> can reduce 5.6-52.7 percent of the cross-cluster repair traffic and improve 14.4-68.8 percent of the repair throughput.]]></abstract><cop>New York</cop><pub>IEEE</pub><doi>10.1109/TC.2020.3028353</doi><tpages>14</tpages><orcidid>https://orcid.org/0000-0003-2673-5868</orcidid><orcidid>https://orcid.org/0000-0002-5796-7314</orcidid><oa>free_for_read</oa></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 0018-9340 |
ispartof | IEEE transactions on computers, 2021-11, Vol.70 (11), p.1861-1874 |
issn | 0018-9340 1557-9956 |
language | eng |
recordid | cdi_ieee_primary_9210857 |
source | IEEE Electronic Library (IEL) |
subjects | Bandwidth Clusters Computer architecture Cross-cluster repair traffic Data centers Encoding Fault tolerance Fault tolerant systems full duplex transmission load balancing Maintenance engineering Reliability analysis Repair scattered repair Storage systems |
title | Cluster-Aware Scattered Repair in Erasure-Coded Storage: Design and Analysis |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-11T01%3A55%3A55IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Cluster-Aware%20Scattered%20Repair%20in%20Erasure-Coded%20Storage:%20Design%20and%20Analysis&rft.jtitle=IEEE%20transactions%20on%20computers&rft.au=Shen,%20Zhirong&rft.date=2021-11-01&rft.volume=70&rft.issue=11&rft.spage=1861&rft.epage=1874&rft.pages=1861-1874&rft.issn=0018-9340&rft.eissn=1557-9956&rft.coden=ITCOB4&rft_id=info:doi/10.1109/TC.2020.3028353&rft_dat=%3Cproquest_RIE%3E2580099407%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2580099407&rft_id=info:pmid/&rft_ieee_id=9210857&rfr_iscdi=true |