An Adaptive Efficiency-Fairness Meta-Scheduler for Data-Intensive Computing

In data-intensive cluster computing platforms such as Hadoop YARN, efficiency and fairness are two important factors in system design and optimization. Previous studies target either efficiency or fairness alone, without considering the tradeoff between them. Recent studie...

Bibliographic details
Published in:	IEEE transactions on services computing 2019-11, Vol.12 (6), p.865-879
Main authors:	Niu, Zhaojie; Tang, Shanjiang; He, Bingsheng
Format:	Article
Language:	eng
Subjects:	Adaptation models; Clusters; Computation; Computer simulation; data-intensive; Design factors; Design optimization; Efficiency; efficiency-fairness tradeoff; Flexible printed circuits; Google; Hadoop YARN; Machine learning; Meta-scheduling; Optimization; Resource management; Scheduling algorithms; Simulation; Systems design; Task scheduling; Tradeoffs; Workload

container_end_page	879
container_issue	6
container_start_page	865
container_title	IEEE transactions on services computing
container_volume	12
creator	Niu, Zhaojie ; Tang, Shanjiang ; He, Bingsheng
description	In data-intensive cluster computing platforms such as Hadoop YARN, efficiency and fairness are two important factors in system design and optimization. Previous studies target either efficiency or fairness alone, without considering the tradeoff between them. Recent studies observe that such a tradeoff exists because of resource contention between users/jobs. By leveraging existing schedulers, a meta-scheduler can dynamically choose one of them for job/task scheduling at runtime. In this paper, we propose a meta-scheduler called FLEX to realize the tradeoff between system efficiency and fairness in Hadoop YARN. FLEX combines multiple existing schedulers into a single aggregated view without any modification to the original schedulers. Equipped with these candidate schedulers, FLEX uses a machine learning approach to adaptively choose the most suitable scheduler according to the characteristics of the currently running workload and the user-defined Service Level Agreement (SLA). We implement FLEX in Hadoop YARN, conduct experiments on a real deployment in a local cluster, and perform simulation studies with production traces. Experimental results show that FLEX outperforms the state-of-the-art approach in two respects: 1) given a predefined threshold on the fairness loss, FLEX reduces the makespan by up to 22 and 24 percent in the real deployment and the large-scale simulation, respectively; 2) given a predefined threshold on the makespan reduction, FLEX reduces the fairness loss by up to 75 and 73 percent in the real deployment and the large-scale simulation, respectively.
doi_str_mv	10.1109/TSC.2016.2635133
format	Article
fullrecord	<record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_proquest_journals_2325191167</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>7765048</ieee_id><sourcerecordid>2325191167</sourcerecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2325191167</pqid></control><display><type>article</type><title>An Adaptive Efficiency-Fairness Meta-Scheduler for Data-Intensive Computing</title><source>IEEE Electronic Library (IEL)</source><creator>Niu, Zhaojie ; Tang, Shanjiang ; He, Bingsheng</creator><creatorcontrib>Niu, Zhaojie ; Tang, Shanjiang ; He, Bingsheng</creatorcontrib><identifier>ISSN: 1939-1374</identifier><identifier>EISSN: 2372-0204</identifier><identifier>DOI: 10.1109/TSC.2016.2635133</identifier><identifier>CODEN: ITSCAD</identifier><language>eng</language><publisher>Piscataway: IEEE</publisher><ispartof>IEEE transactions on services computing, 2019-11, Vol.12 (6), p.865-879</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2019</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><orcidid>0000-0002-3552-2170</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/7765048$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,776,780,792,27903,27904,54736</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/7765048$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Niu, Zhaojie</au><au>Tang, Shanjiang</au><au>He, Bingsheng</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>An Adaptive Efficiency-Fairness Meta-Scheduler for Data-Intensive Computing</atitle><jtitle>IEEE transactions on services computing</jtitle><stitle>TSC</stitle><date>2019-11-01</date><risdate>2019</risdate><volume>12</volume><issue>6</issue><spage>865</spage><epage>879</epage><pages>865-879</pages><issn>1939-1374</issn><eissn>2372-0204</eissn><coden>ITSCAD</coden><cop>Piscataway</cop><pub>IEEE</pub><doi>10.1109/TSC.2016.2635133</doi><tpages>15</tpages><orcidid>https://orcid.org/0000-0002-3552-2170</orcidid></addata></record>
fulltext	fulltext_linktorsrc
identifier	ISSN: 1939-1374
ispartof	IEEE transactions on services computing, 2019-11, Vol.12 (6), p.865-879
issn	1939-1374 2372-0204
language	eng
recordid	cdi_proquest_journals_2325191167
source	IEEE Electronic Library (IEL)
subjects	Adaptation models ; Clusters ; Computation ; Computer simulation ; data-intensive ; Design factors ; Design optimization ; Efficiency ; efficiency-fairness tradeoff ; Flexible printed circuits ; Google ; Hadoop YARN ; Machine learning ; Meta-scheduling ; Optimization ; Resource management ; Scheduling algorithms ; Simulation ; Systems design ; Task scheduling ; Tradeoffs ; Workload
title	An Adaptive Efficiency-Fairness Meta-Scheduler for Data-Intensive Computing
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-25T13%3A58%3A20IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=An%20Adaptive%20Efficiency-Fairness%20Meta-Scheduler%20for%20Data-Intensive%20Computing&rft.jtitle=IEEE%20transactions%20on%20services%20computing&rft.au=Niu,%20Zhaojie&rft.date=2019-11-01&rft.volume=12&rft.issue=6&rft.spage=865&rft.epage=879&rft.pages=865-879&rft.issn=1939-1374&rft.eissn=2372-0204&rft.coden=ITSCAD&rft_id=info:doi/10.1109/TSC.2016.2635133&rft_dat=%3Cproquest_RIE%3E2325191167%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2325191167&rft_id=info:pmid/&rft_ieee_id=7765048&rfr_iscdi=true
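
The selection step summarized in the abstract (pick one of several existing schedulers so that the fairness loss stays under a user-defined threshold while the makespan is as small as possible) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the FLEX implementation: the names `Prediction` and `choose_scheduler`, the scheduler labels, and all numbers are hypothetical, and the machine-learning model that would produce the per-scheduler predictions is replaced here by hand-written values.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Prediction:
    """Hypothetical predicted outcome of running one candidate scheduler."""
    makespan: float       # predicted completion time of the workload
    fairness_loss: float  # predicted fairness loss, 0.0 = no loss


def choose_scheduler(
    predictions: Dict[str, Prediction],
    max_fairness_loss: float,
) -> Optional[str]:
    """Return the candidate with the smallest predicted makespan among those
    whose predicted fairness loss stays within the SLA threshold.

    Returns None when no candidate satisfies the threshold.
    """
    feasible = {
        name: p for name, p in predictions.items()
        if p.fairness_loss <= max_fairness_loss
    }
    if not feasible:
        return None
    return min(feasible, key=lambda name: feasible[name].makespan)


# Illustrative numbers only; they are not taken from the paper.
candidates = {
    "fair": Prediction(makespan=100.0, fairness_loss=0.00),
    "hybrid": Prediction(makespan=85.0, fairness_loss=0.10),
    "efficiency_first": Prediction(makespan=70.0, fairness_loss=0.40),
}
selected = choose_scheduler(candidates, max_fairness_loss=0.15)
print(selected)  # hybrid
```

The second SLA mode mentioned in the abstract (a threshold on makespan reduction, minimizing fairness loss) would presumably swap the roles of the two fields in the filter and the `min` key.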