Cost-effective BlackWater Raft on Highly Unreliable Nodes at Scale Out

The Raft algorithm maintains strong consistency across data replicas in the cloud. It divides nodes into leaders and followers to satisfy read/write requests spanning geo-diverse sites. As workload increases, Raft should scale out its performance proportionally. However, tradition...

Detailed description

Bibliographic details
Main authors:	Xu, Zichen ; Du, Yunxiao ; Zhang, Kanqi ; Huang, Jiacheng ; Liu, Jie ; Gao, Jingxiong ; Stewart, Christopher
Format:	Article
Language:	eng
Subjects:	Computer Science - Distributed, Parallel, and Cluster Computing
Online access:	Order full text

creator	Xu, Zichen ; Du, Yunxiao ; Zhang, Kanqi ; Huang, Jiacheng ; Liu, Jie ; Gao, Jingxiong ; Stewart, Christopher
description	The Raft algorithm maintains strong consistency across data replicas in the cloud. It divides nodes into leaders and followers to satisfy read/write requests spanning geo-diverse sites. As workload increases, Raft should scale out its performance proportionally. However, traditional scale-out techniques encounter bottlenecks in Raft, and when the provisioned sites exhaust local resources, the performance loss grows exponentially. To provide scalability in Raft, this paper proposes a cost-effective mechanism for elastic auto-scaling, called BlackWater-Raft or BW-Raft. BW-Raft extends the original Raft with the following abstractions: (1) secretary nodes that take over expensive log synchronization operations from the leader, relaxing the performance constraints on locks; (2) massive low-cost observer nodes that handle reads only, improving throughput for typical data-intensive services. These abstractions are stateless, allowing elastic scale-out on unreliable yet cheap spot instances. In theory, we demonstrate that BW-Raft can maintain Raft's strong consistency guarantees when scaling out, processing a 50X increase in the number of nodes compared to the original Raft. We have prototyped BW-Raft on key-value services and evaluated it against many state-of-the-art systems on Amazon EC2 and Alibaba Cloud. Our results show that within the same budget, BW-Raft's resource footprint increments are 5-7X smaller than Multi-Raft, and 2X better than the original Raft. Using spot instances, BW-Raft can reduce costs by 84.5% compared to Multi-Raft. In real-world experiments, BW-Raft improves goodput of the 95th-percentile SLO by 9.4X, thus serving as an alternative for services scaling out with strong consistency.
doi_str_mv	10.48550/arxiv.2203.07920
format	Article
creationdate	2022-03-15
rights	http://arxiv.org/licenses/nonexclusive-distrib/1.0
oa	free_for_read
linktorsrc	https://arxiv.org/abs/2203.07920
fulltext	fulltext_linktorsrc
identifier	DOI: 10.48550/arxiv.2203.07920
language	eng
recordid	cdi_arxiv_primary_2203_07920
source	arXiv.org
subjects	Computer Science - Distributed, Parallel, and Cluster Computing
title	Cost-effective BlackWater Raft on Highly Unreliable Nodes at Scale Out
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-08T05%3A14%3A05IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Cost-effective%20BlackWater%20Raft%20on%20Highly%20Unreliable%20Nodes%20at%20Scale%20Out&rft.au=Xu,%20Zichen&rft.date=2022-03-15&rft_id=info:doi/10.48550/arxiv.2203.07920&rft_dat=%3Carxiv_GOX%3E2203_07920%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true