Integrative Dynamic Reconfiguration in a Parallel Stream Processing Engine
Load balancing, operator instance collocations and horizontal scaling are critical issues in Parallel Stream Processing Engines to achieve low data processing latency, optimized cluster utilization and minimized communication cost respectively. In previous work, these issues are typically tackled se...
Saved in:
Main Authors: | Madsen, Kasper Grud Skat ; Zhou, Yongluan ; Cao, Jianneng |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online Access: | Order full text |
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | Madsen, Kasper Grud Skat ; Zhou, Yongluan ; Cao, Jianneng |
description | Load balancing, operator instance collocations and horizontal scaling are
critical issues in Parallel Stream Processing Engines to achieve low data
processing latency, optimized cluster utilization and minimized communication
cost respectively. In previous work, these issues are typically tackled
separately and independently. We argue that these problems are tightly coupled
in the sense that they all need to determine the allocations of workloads and
migrate computational states at runtime. Optimizing them independently would
result in suboptimal solutions. Therefore, in this paper, we investigate how
these three issues can be modeled as one integrated optimization problem. In
particular, we first consider jobs where workload allocations have little
effect on the communication cost, and model the problem of load balance as a
Mixed-Integer Linear Program. Afterwards, we present an extended solution
called ALBIC, which supports general jobs. We implement the proposed techniques
on top of Apache Storm, an open-source Parallel Stream Processing Engine. The
extensive experimental results over both synthetic and real datasets show that
our techniques clearly outperform existing approaches. |
doi_str_mv | 10.48550/arxiv.1602.03770 |
format | Article |
fullrecord | <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_1602_03770</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>1602_03770</sourcerecordid><originalsourceid>FETCH-LOGICAL-a670-6a764311be376077c3067e1d145cc2a4ef88fb1acbb7291637d8f5efb42d911f3</originalsourceid><addsrcrecordid>eNotz0FOwzAQhWFvWKCWA7DCF0jwxI4nXValQFElKug-mjjjyFLiICdU9PaohdWT_sWTPiHuQeWmKkv1SOknnHKwqsiVRlS34m0XZ-4SzeHE8ukcaQhOfrAbow_d96WPUYYoSR4oUd9zLz_nxDTIQxodT1OIndzGLkReihtP_cR3_7sQx-ftcfOa7d9fdpv1PiOLKrOE1miAhjVahei0ssjQgimdK8iwryrfALmmwWIFVmNb-ZJ9Y4p2BeD1Qjz83V4x9VcKA6VzfUHVV5T-BVVAR30</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Integrative Dynamic Reconfiguration in a Parallel Stream Processing Engine</title><source>arXiv.org</source><creator>Madsen, Kasper Grud Skat ; Zhou, Yongluan ; Cao, Jianneng</creator><creatorcontrib>Madsen, Kasper Grud Skat ; Zhou, Yongluan ; Cao, Jianneng</creatorcontrib><description>Load balancing, operator instance collocations and horizontal scaling are
critical issues in Parallel Stream Processing Engines to achieve low data
processing latency, optimized cluster utilization and minimized communication
cost respectively. In previous work, these issues are typically tackled
separately and independently. We argue that these problems are tightly coupled
in the sense that they all need to determine the allocations of workloads and
migrate computational states at runtime. Optimizing them independently would
result in suboptimal solutions. Therefore, in this paper, we investigate how
these three issues can be modeled as one integrated optimization problem. In
particular, we first consider jobs where workload allocations have little
effect on the communication cost, and model the problem of load balance as a
Mixed-Integer Linear Program. Afterwards, we present an extended solution
called ALBIC, which supports general jobs. We implement the proposed techniques
on top of Apache Storm, an open-source Parallel Stream Processing Engine. The
extensive experimental results over both synthetic and real datasets show that
our techniques clearly outperform existing approaches.</description><identifier>DOI: 10.48550/arxiv.1602.03770</identifier><language>eng</language><subject>Computer Science - Distributed, Parallel, and Cluster Computing</subject><creationdate>2016-02</creationdate><rights>http://arxiv.org/licenses/nonexclusive-distrib/1.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,776,881</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/1602.03770$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.1602.03770$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Madsen, Kasper Grud Skat</creatorcontrib><creatorcontrib>Zhou, Yongluan</creatorcontrib><creatorcontrib>Cao, Jianneng</creatorcontrib><title>Integrative Dynamic Reconfiguration in a Parallel Stream Processing Engine</title><description>Load balancing, operator instance collocations and horizontal scaling are
critical issues in Parallel Stream Processing Engines to achieve low data
processing latency, optimized cluster utilization and minimized communication
cost respectively. In previous work, these issues are typically tackled
separately and independently. We argue that these problems are tightly coupled
in the sense that they all need to determine the allocations of workloads and
migrate computational states at runtime. Optimizing them independently would
result in suboptimal solutions. Therefore, in this paper, we investigate how
these three issues can be modeled as one integrated optimization problem. In
particular, we first consider jobs where workload allocations have little
effect on the communication cost, and model the problem of load balance as a
Mixed-Integer Linear Program. Afterwards, we present an extended solution
called ALBIC, which supports general jobs. We implement the proposed techniques
on top of Apache Storm, an open-source Parallel Stream Processing Engine. The
extensive experimental results over both synthetic and real datasets show that
our techniques clearly outperform existing approaches.</description><subject>Computer Science - Distributed, Parallel, and Cluster Computing</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2016</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNotz0FOwzAQhWFvWKCWA7DCF0jwxI4nXValQFElKug-mjjjyFLiICdU9PaohdWT_sWTPiHuQeWmKkv1SOknnHKwqsiVRlS34m0XZ-4SzeHE8ukcaQhOfrAbow_d96WPUYYoSR4oUd9zLz_nxDTIQxodT1OIndzGLkReihtP_cR3_7sQx-ftcfOa7d9fdpv1PiOLKrOE1miAhjVahei0ssjQgimdK8iwryrfALmmwWIFVmNb-ZJ9Y4p2BeD1Qjz83V4x9VcKA6VzfUHVV5T-BVVAR30</recordid><startdate>20160211</startdate><enddate>20160211</enddate><creator>Madsen, Kasper Grud Skat</creator><creator>Zhou, Yongluan</creator><creator>Cao, Jianneng</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20160211</creationdate><title>Integrative Dynamic Reconfiguration in a Parallel Stream Processing Engine</title><author>Madsen, Kasper Grud Skat ; Zhou, Yongluan ; Cao, Jianneng</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a670-6a764311be376077c3067e1d145cc2a4ef88fb1acbb7291637d8f5efb42d911f3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2016</creationdate><topic>Computer Science - Distributed, Parallel, and Cluster Computing</topic><toplevel>online_resources</toplevel><creatorcontrib>Madsen, Kasper Grud Skat</creatorcontrib><creatorcontrib>Zhou, Yongluan</creatorcontrib><creatorcontrib>Cao, Jianneng</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Madsen, Kasper Grud Skat</au><au>Zhou, Yongluan</au><au>Cao, Jianneng</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Integrative Dynamic Reconfiguration in a Parallel Stream Processing 
Engine</atitle><date>2016-02-11</date><risdate>2016</risdate><abstract>Load balancing, operator instance collocations and horizontal scaling are
critical issues in Parallel Stream Processing Engines to achieve low data
processing latency, optimized cluster utilization and minimized communication
cost respectively. In previous work, these issues are typically tackled
separately and independently. We argue that these problems are tightly coupled
in the sense that they all need to determine the allocations of workloads and
migrate computational states at runtime. Optimizing them independently would
result in suboptimal solutions. Therefore, in this paper, we investigate how
these three issues can be modeled as one integrated optimization problem. In
particular, we first consider jobs where workload allocations have little
effect on the communication cost, and model the problem of load balance as a
Mixed-Integer Linear Program. Afterwards, we present an extended solution
called ALBIC, which supports general jobs. We implement the proposed techniques
on top of Apache Storm, an open-source Parallel Stream Processing Engine. The
extensive experimental results over both synthetic and real datasets show that
our techniques clearly outperform existing approaches.</abstract><doi>10.48550/arxiv.1602.03770</doi><oa>free_for_read</oa></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.1602.03770 |
ispartof | |
issn | |
language | eng |
recordid | cdi_arxiv_primary_1602_03770 |
source | arXiv.org |
subjects | Computer Science - Distributed, Parallel, and Cluster Computing |
title | Integrative Dynamic Reconfiguration in a Parallel Stream Processing Engine |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-22T07%3A00%3A19IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Integrative%20Dynamic%20Reconfiguration%20in%20a%20Parallel%20Stream%20Processing%20Engine&rft.au=Madsen,%20Kasper%20Grud%20Skat&rft.date=2016-02-11&rft_id=info:doi/10.48550/arxiv.1602.03770&rft_dat=%3Carxiv_GOX%3E1602_03770%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |