CheckMate: Evaluating Checkpointing Protocols for Streaming Dataflows

Stream processing in the last decade has seen broad adoption in both commercial and research settings. One key element for this success is the ability of modern stream processors to handle failures while ensuring exactly-once processing guarantees. At the moment of writing, virtually all stream proc...

Bibliographic Details
Published in: arXiv.org 2024-03
Main authors: Siachamis, George; Psarakis, Kyriakos; Fragkoulis, Marios; Arie van Deursen; Carbone, Paris; Katsifodimos, Asterios
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page
container_issue
container_start_page
container_title arXiv.org
container_volume
creator Siachamis, George
Psarakis, Kyriakos
Fragkoulis, Marios
Arie van Deursen
Carbone, Paris
Katsifodimos, Asterios
description Stream processing in the last decade has seen broad adoption in both commercial and research settings. One key element for this success is the ability of modern stream processors to handle failures while ensuring exactly-once processing guarantees. At the time of writing, virtually all stream processors that guarantee exactly-once processing implement a variant of Apache Flink's coordinated checkpoints, an extension of the original Chandy-Lamport checkpoints from 1985. However, the reasons behind this prevalence of the coordinated approach remain anecdotal, as reported by practitioners of the stream processing community. At the same time, common checkpointing approaches, such as the uncoordinated and the communication-induced ones, remain largely unexplored. This paper is the first to address this gap by (i) shedding light on why practitioners have favored the coordinated approach and (ii) investigating whether there are viable alternatives. To this end, we implement three checkpointing approaches that we surveyed and adapted for the distinct needs of streaming dataflows. Our analysis shows that the coordinated approach outperforms the uncoordinated and communication-induced protocols under uniformly distributed workloads. To our surprise, however, the uncoordinated approach is not only competitive with the coordinated one in uniformly distributed workloads, but it also outperforms the coordinated approach in skewed workloads. We conclude that rather than blindly employing coordinated checkpointing, research should focus on optimizing the very promising uncoordinated approach, as it can address issues with skew and support prevalent cyclic queries. We believe that our findings can trigger further research into checkpointing mechanisms.
format Article
fullrecord <record><control><sourceid>proquest</sourceid><recordid>TN_cdi_proquest_journals_2972955163</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2972955163</sourcerecordid><originalsourceid>FETCH-proquest_journals_29729551633</originalsourceid><addsrcrecordid>eNpjYuA0MjY21LUwMTLiYOAtLs4yMDAwMjM3MjU15mRwdc5ITc72TSxJtVJwLUvMKU0sycxLVwCLFuRn5oF5AUX5JfnJ-TnFCmn5RQrBJUWpibkgcZfEksS0nPzyYh4G1rTEnOJUXijNzaDs5hri7KFbUJRfWJpaXBKflV9alAeUijeyNDeyNDU1NDM2Jk4VAJTPOzw</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2972955163</pqid></control><display><type>article</type><title>CheckMate: Evaluating Checkpointing Protocols for Streaming Dataflows</title><source>Free E- Journals</source><creator>Siachamis, George ; Psarakis, Kyriakos ; Fragkoulis, Marios ; Arie van Deursen ; Carbone, Paris ; Katsifodimos, Asterios</creator><creatorcontrib>Siachamis, George ; Psarakis, Kyriakos ; Fragkoulis, Marios ; Arie van Deursen ; Carbone, Paris ; Katsifodimos, Asterios</creatorcontrib><description>Stream processing in the last decade has seen broad adoption in both commercial and research settings. One key element for this success is the ability of modern stream processors to handle failures while ensuring exactly-once processing guarantees. At the moment of writing, virtually all stream processors that guarantee exactly-once processing implement a variant of Apache Flink's coordinated checkpoints - an extension of the original Chandy-Lamport checkpoints from 1985. However, the reasons behind this prevalence of the coordinated approach remain anecdotal, as reported by practitioners of the stream processing community. At the same time, common checkpointing approaches, such as the uncoordinated and the communication-induced ones, remain largely unexplored. 
This paper is the first to address this gap by i) shedding light on why practitioners have favored the coordinated approach and ii) by investigating whether there are viable alternatives. To this end, we implement three checkpointing approaches that we surveyed and adapted for the distinct needs of streaming dataflows. Our analysis shows that the coordinated approach outperforms the uncoordinated and communication-induced protocols under uniformly distributed workloads. To our surprise, however, the uncoordinated approach is not only competitive to the coordinated one in uniformly distributed workloads, but it also outperforms the coordinated approach in skewed workloads. We conclude that rather than blindly employing coordinated checkpointing, research should focus on optimizing the very promising uncoordinated approach, as it can address issues with skew and support prevalent cyclic queries. We believe that our findings can trigger further research into checkpointing mechanisms.</description><identifier>EISSN: 2331-8422</identifier><language>eng</language><publisher>Ithaca: Cornell University Library, arXiv.org</publisher><subject>Checkpointing ; Processors ; Workload ; Workloads</subject><ispartof>arXiv.org, 2024-03</ispartof><rights>2024. This work is published under http://creativecommons.org/licenses/by-nc-nd/4.0/ (the “License”). 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>776,780</link.rule.ids></links><search><creatorcontrib>Siachamis, George</creatorcontrib><creatorcontrib>Psarakis, Kyriakos</creatorcontrib><creatorcontrib>Fragkoulis, Marios</creatorcontrib><creatorcontrib>Arie van Deursen</creatorcontrib><creatorcontrib>Carbone, Paris</creatorcontrib><creatorcontrib>Katsifodimos, Asterios</creatorcontrib><title>CheckMate: Evaluating Checkpointing Protocols for Streaming Dataflows</title><title>arXiv.org</title><description>Stream processing in the last decade has seen broad adoption in both commercial and research settings. One key element for this success is the ability of modern stream processors to handle failures while ensuring exactly-once processing guarantees. At the moment of writing, virtually all stream processors that guarantee exactly-once processing implement a variant of Apache Flink's coordinated checkpoints - an extension of the original Chandy-Lamport checkpoints from 1985. However, the reasons behind this prevalence of the coordinated approach remain anecdotal, as reported by practitioners of the stream processing community. At the same time, common checkpointing approaches, such as the uncoordinated and the communication-induced ones, remain largely unexplored. This paper is the first to address this gap by i) shedding light on why practitioners have favored the coordinated approach and ii) by investigating whether there are viable alternatives. To this end, we implement three checkpointing approaches that we surveyed and adapted for the distinct needs of streaming dataflows. 
Our analysis shows that the coordinated approach outperforms the uncoordinated and communication-induced protocols under uniformly distributed workloads. To our surprise, however, the uncoordinated approach is not only competitive to the coordinated one in uniformly distributed workloads, but it also outperforms the coordinated approach in skewed workloads. We conclude that rather than blindly employing coordinated checkpointing, research should focus on optimizing the very promising uncoordinated approach, as it can address issues with skew and support prevalent cyclic queries. We believe that our findings can trigger further research into checkpointing mechanisms.</description><subject>Checkpointing</subject><subject>Processors</subject><subject>Workload</subject><subject>Workloads</subject><issn>2331-8422</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>BENPR</sourceid><recordid>eNpjYuA0MjY21LUwMTLiYOAtLs4yMDAwMjM3MjU15mRwdc5ITc72TSxJtVJwLUvMKU0sycxLVwCLFuRn5oF5AUX5JfnJ-TnFCmn5RQrBJUWpibkgcZfEksS0nPzyYh4G1rTEnOJUXijNzaDs5hri7KFbUJRfWJpaXBKflV9alAeUijeyNDeyNDU1NDM2Jk4VAJTPOzw</recordid><startdate>20240320</startdate><enddate>20240320</enddate><creator>Siachamis, George</creator><creator>Psarakis, Kyriakos</creator><creator>Fragkoulis, Marios</creator><creator>Arie van Deursen</creator><creator>Carbone, Paris</creator><creator>Katsifodimos, Asterios</creator><general>Cornell University Library, arXiv.org</general><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>M7S</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope></search><sort><creationdate>20240320</creationdate><title>CheckMate: Evaluating Checkpointing Protocols 
for Streaming Dataflows</title><author>Siachamis, George ; Psarakis, Kyriakos ; Fragkoulis, Marios ; Arie van Deursen ; Carbone, Paris ; Katsifodimos, Asterios</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-proquest_journals_29729551633</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Checkpointing</topic><topic>Processors</topic><topic>Workload</topic><topic>Workloads</topic><toplevel>online_resources</toplevel><creatorcontrib>Siachamis, George</creatorcontrib><creatorcontrib>Psarakis, Kyriakos</creatorcontrib><creatorcontrib>Fragkoulis, Marios</creatorcontrib><creatorcontrib>Arie van Deursen</creatorcontrib><creatorcontrib>Carbone, Paris</creatorcontrib><creatorcontrib>Katsifodimos, Asterios</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science &amp; Engineering Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering Database</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering Collection</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Siachamis, George</au><au>Psarakis, 
Kyriakos</au><au>Fragkoulis, Marios</au><au>Arie van Deursen</au><au>Carbone, Paris</au><au>Katsifodimos, Asterios</au><format>book</format><genre>document</genre><ristype>GEN</ristype><atitle>CheckMate: Evaluating Checkpointing Protocols for Streaming Dataflows</atitle><jtitle>arXiv.org</jtitle><date>2024-03-20</date><risdate>2024</risdate><eissn>2331-8422</eissn><abstract>Stream processing in the last decade has seen broad adoption in both commercial and research settings. One key element for this success is the ability of modern stream processors to handle failures while ensuring exactly-once processing guarantees. At the moment of writing, virtually all stream processors that guarantee exactly-once processing implement a variant of Apache Flink's coordinated checkpoints - an extension of the original Chandy-Lamport checkpoints from 1985. However, the reasons behind this prevalence of the coordinated approach remain anecdotal, as reported by practitioners of the stream processing community. At the same time, common checkpointing approaches, such as the uncoordinated and the communication-induced ones, remain largely unexplored. This paper is the first to address this gap by i) shedding light on why practitioners have favored the coordinated approach and ii) by investigating whether there are viable alternatives. To this end, we implement three checkpointing approaches that we surveyed and adapted for the distinct needs of streaming dataflows. Our analysis shows that the coordinated approach outperforms the uncoordinated and communication-induced protocols under uniformly distributed workloads. To our surprise, however, the uncoordinated approach is not only competitive to the coordinated one in uniformly distributed workloads, but it also outperforms the coordinated approach in skewed workloads. 
We conclude that rather than blindly employing coordinated checkpointing, research should focus on optimizing the very promising uncoordinated approach, as it can address issues with skew and support prevalent cyclic queries. We believe that our findings can trigger further research into checkpointing mechanisms.</abstract><cop>Ithaca</cop><pub>Cornell University Library, arXiv.org</pub><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier EISSN: 2331-8422
ispartof arXiv.org, 2024-03
issn 2331-8422
language eng
recordid cdi_proquest_journals_2972955163
source Free E- Journals
subjects Checkpointing
Processors
Workload
Workloads
title CheckMate: Evaluating Checkpointing Protocols for Streaming Dataflows
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-10T19%3A11%3A52IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=CheckMate:%20Evaluating%20Checkpointing%20Protocols%20for%20Streaming%20Dataflows&rft.jtitle=arXiv.org&rft.au=Siachamis,%20George&rft.date=2024-03-20&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E2972955163%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2972955163&rft_id=info:pmid/&rfr_iscdi=true