Kronos: towards bus contention-aware job scheduling in warehouse scale computers

While researchers have proposed many techniques to mitigate the contention on the shared cache and memory bandwidth, none of them has considered the memory bus contention due to split lock. Our study shows that the split lock may cause 9X longer data access latency without saturating the memory band...

Detailed description

Bibliographic details
Published in:	Frontiers of Computer Science 2023-02, Vol.17 (1), p.171101, Article 171101
Main authors:	XUE, Shuai; ZHAO, Shang; CHEN, Quan; SONG, Zhuo; CHEN, Shanpei; MA, Tao; YANG, Yong; ZHENG, Wenli; GUO, Minyi
Format:	Article
Language:	eng
Subjects:	Bandwidths; bus contention; cloud; Computer Science; Employment; high performance; Nodes; Policies; Polynomials; Regression models; Research Article; schedule; Scheduling; split lock; Tenants
Online access:	Full text

container_end_page
container_issue	1
container_start_page	171101
container_title	Frontiers of Computer Science
container_volume	17
creator	XUE, Shuai; ZHAO, Shang; CHEN, Quan; SONG, Zhuo; CHEN, Shanpei; MA, Tao; YANG, Yong; ZHENG, Wenli; GUO, Minyi
description	While researchers have proposed many techniques to mitigate the contention on the shared cache and memory bandwidth, none of them has considered the memory bus contention due to split lock. Our study shows that the split lock may cause 9X longer data access latency without saturating the memory bandwidth. To minimize the impact of split lock, we propose Kronos, a runtime system composed of an online bus contention tolerance meter and a bus contention-aware job scheduler. The meter characterizes the tolerance of jobs to the “pressure” of bus contention and builds a tolerance model with the polynomial regression technique. The job scheduler allocates user jobs to the physical nodes in a contention-aware manner. We design three scheduling policies that, respectively, minimize the number of required nodes while ensuring the Service Level Agreement (SLA) of all user jobs, minimize the number of jobs that suffer from SLA violation when there are not enough nodes, and maximize the overall performance without considering SLA violation. Adopting the three policies, Kronos reduces the number of required nodes by 42.1% while ensuring the SLA of all jobs, reduces the number of jobs that suffer from SLA violation without enough nodes by 72.8%, and improves the overall performance by 35.2% without considering SLA.
doi_str_mv	10.1007/s11704-021-0418-5
format	Article
fullrecord	<record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2918721605</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2918721605</sourcerecordid><originalsourceid>FETCH-LOGICAL-c393t-13ad3aea4f87689eaa6833f0a6723c86e22ac928ad07cf241751dc49f8a292113</originalsourceid><addsrcrecordid>eNp9kc1LwzAchoMoOOb-AG8Fz9HklzYf3mT4hQM96Dlkabp2bMlMWsT_3pSK3nZKeHmfN_AEoUtKrikh4iZRKkiJCVBMSipxdYJmQFSFARg__buDPEeLlLaEECBQVQAz9PYSgw_ptujDl4l1KtZDKmzwvfN9Fzw2OXXFNqyLZFtXD7vOb4rOF2PchiG5nJudy8j-MPQupgt01phdcovfc44-Hu7fl0949fr4vLxbYcsU6zFlpmbGmbKRgkvljOGSsYYYLoBZyR2AsQqkqYmwDZRUVLS2pWqkAQWUsjm6mnYPMXwOLvV6G4bo85MaFJUCKCfV0RZXSkkQnOcWnVo2hpSia_QhdnsTvzUlejSsJ8M6G9ajYT0uw8Sk3PUbF_-Xj0Fygtpu07ro6kN0Kekm_0HfZXlH0B8_vo8T</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2918721605</pqid></control><display><type>article</type><title>Kronos: towards bus contention-aware job scheduling in warehouse scale computers</title><source>ProQuest Central UK/Ireland</source><source>SpringerLink Journals - AutoHoldings</source><source>ProQuest Central</source><creator>XUE, Shuai ; ZHAO, Shang ; CHEN, Quan ; SONG, Zhuo ; CHEN, Shanpei ; MA, Tao ; YANG, Yong ; ZHENG, Wenli ; GUO, Minyi</creator><creatorcontrib>XUE, Shuai ; ZHAO, Shang ; CHEN, Quan ; SONG, Zhuo ; CHEN, Shanpei ; MA, Tao ; YANG, Yong ; ZHENG, Wenli ; GUO, Minyi</creatorcontrib><description>While researchers have proposed many techniques to mitigate the contention on the shared cache and memory bandwidth, none of them has considered the memory bus contention due to split lock. Our study shows that the split lock may cause 9X longer data access latency without saturating the memory bandwidth. To minimize the impact of split lock, we propose Kronos, a runtime system composed of an online bus contention tolerance meter and a bus contention-aware job scheduler. 
The meter characterizes the tolerance of jobs to the “pressure” of bus contention and builds a tolerance model with the polynomial regression technique. The job scheduler allocates user jobs to the physical nodes in a contention aware manner. We design three scheduling policies that minimize the number of required nodes while ensuring the Service Level Agreement (SLA) of all the user jobs, minimize the number of jobs that suffer from SLA violation without enough nodes, and maximize the overall performance without considering the SLA violation, respectively. Adopting the three policies, Kronos reduces the number of the required nodes by 42.1% while ensuring the SLA of all the jobs, reduces the number of the jobs that suffer from SLA violation without enough nodes by 72.8%, and improves the overall performance by 35.2% without considering SLA.</description><identifier>ISSN: 2095-2228</identifier><identifier>EISSN: 2095-2236</identifier><identifier>DOI: 10.1007/s11704-021-0418-5</identifier><language>eng</language><publisher>Beijing: Higher Education Press</publisher><subject>Bandwidths ; bus contention ; cloud ; Computer Science ; Employment ; high performance ; Nodes ; Policies ; Polynomials ; Regression models ; Research Article ; schedule ; Scheduling ; split lock ; Tenants</subject><ispartof>Frontiers of Computer Science, 2023-02, Vol.17 (1), p.171101, Article 171101</ispartof><rights>Copyright reserved, 2021, Higher Education Press 2021</rights><rights>Higher Education Press 2023</rights><rights>Higher Education Press 
2023.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c393t-13ad3aea4f87689eaa6833f0a6723c86e22ac928ad07cf241751dc49f8a292113</citedby><cites>FETCH-LOGICAL-c393t-13ad3aea4f87689eaa6833f0a6723c86e22ac928ad07cf241751dc49f8a292113</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/s11704-021-0418-5$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://www.proquest.com/docview/2918721605?pq-origsite=primo$$EHTML$$P50$$Gproquest$$H</linktohtml><link.rule.ids>314,780,784,21387,27923,27924,33743,41487,42556,43804,51318,64384,64388,72240</link.rule.ids></links><search><creatorcontrib>XUE, Shuai</creatorcontrib><creatorcontrib>ZHAO, Shang</creatorcontrib><creatorcontrib>CHEN, Quan</creatorcontrib><creatorcontrib>SONG, Zhuo</creatorcontrib><creatorcontrib>CHEN, Shanpei</creatorcontrib><creatorcontrib>MA, Tao</creatorcontrib><creatorcontrib>YANG, Yong</creatorcontrib><creatorcontrib>ZHENG, Wenli</creatorcontrib><creatorcontrib>GUO, Minyi</creatorcontrib><title>Kronos: towards bus contention-aware job scheduling in warehouse scale computers</title><title>Frontiers of Computer Science</title><addtitle>Front. Comput. Sci</addtitle><description>While researchers have proposed many techniques to mitigate the contention on the shared cache and memory bandwidth, none of them has considered the memory bus contention due to split lock. Our study shows that the split lock may cause 9X longer data access latency without saturating the memory bandwidth. To minimize the impact of split lock, we propose Kronos, a runtime system composed of an online bus contention tolerance meter and a bus contention-aware job scheduler. 
The meter characterizes the tolerance of jobs to the “pressure” of bus contention and builds a tolerance model with the polynomial regression technique. The job scheduler allocates user jobs to the physical nodes in a contention aware manner. We design three scheduling policies that minimize the number of required nodes while ensuring the Service Level Agreement (SLA) of all the user jobs, minimize the number of jobs that suffer from SLA violation without enough nodes, and maximize the overall performance without considering the SLA violation, respectively. Adopting the three policies, Kronos reduces the number of the required nodes by 42.1% while ensuring the SLA of all the jobs, reduces the number of the jobs that suffer from SLA violation without enough nodes by 72.8%, and improves the overall performance by 35.2% without considering SLA.</description><subject>Bandwidths</subject><subject>bus contention</subject><subject>cloud</subject><subject>Computer Science</subject><subject>Employment</subject><subject>high performance</subject><subject>Nodes</subject><subject>Policies</subject><subject>Polynomials</subject><subject>Regression models</subject><subject>Research Article</subject><subject>schedule</subject><subject>Scheduling</subject><subject>split 
lock</subject><subject>Tenants</subject><issn>2095-2228</issn><issn>2095-2236</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><sourceid>GNUQQ</sourceid><recordid>eNp9kc1LwzAchoMoOOb-AG8Fz9HklzYf3mT4hQM96Dlkabp2bMlMWsT_3pSK3nZKeHmfN_AEoUtKrikh4iZRKkiJCVBMSipxdYJmQFSFARg__buDPEeLlLaEECBQVQAz9PYSgw_ptujDl4l1KtZDKmzwvfN9Fzw2OXXFNqyLZFtXD7vOb4rOF2PchiG5nJudy8j-MPQupgt01phdcovfc44-Hu7fl0949fr4vLxbYcsU6zFlpmbGmbKRgkvljOGSsYYYLoBZyR2AsQqkqYmwDZRUVLS2pWqkAQWUsjm6mnYPMXwOLvV6G4bo85MaFJUCKCfV0RZXSkkQnOcWnVo2hpSia_QhdnsTvzUlejSsJ8M6G9ajYT0uw8Sk3PUbF_-Xj0Fygtpu07ro6kN0Kekm_0HfZXlH0B8_vo8T</recordid><startdate>20230201</startdate><enddate>20230201</enddate><creator>XUE, Shuai</creator><creator>ZHAO, Shang</creator><creator>CHEN, Quan</creator><creator>SONG, Zhuo</creator><creator>CHEN, Shanpei</creator><creator>MA, Tao</creator><creator>YANG, Yong</creator><creator>ZHENG, Wenli</creator><creator>GUO, Minyi</creator><general>Higher Education Press</general><general>Springer Nature B.V</general><scope>AAYXX</scope><scope>CITATION</scope><scope>JQ2</scope><scope>8FE</scope><scope>8FG</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>GNUQQ</scope><scope>HCIFZ</scope><scope>K7-</scope><scope>P5Z</scope><scope>P62</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope></search><sort><creationdate>20230201</creationdate><title>Kronos: towards bus contention-aware job scheduling in warehouse scale computers</title><author>XUE, Shuai ; ZHAO, Shang ; CHEN, Quan ; SONG, Zhuo ; CHEN, Shanpei ; MA, Tao ; YANG, Yong ; ZHENG, Wenli ; GUO, 
Minyi</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c393t-13ad3aea4f87689eaa6833f0a6723c86e22ac928ad07cf241751dc49f8a292113</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Bandwidths</topic><topic>bus contention</topic><topic>cloud</topic><topic>Computer Science</topic><topic>Employment</topic><topic>high performance</topic><topic>Nodes</topic><topic>Policies</topic><topic>Polynomials</topic><topic>Regression models</topic><topic>Research Article</topic><topic>schedule</topic><topic>Scheduling</topic><topic>split lock</topic><topic>Tenants</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>XUE, Shuai</creatorcontrib><creatorcontrib>ZHAO, Shang</creatorcontrib><creatorcontrib>CHEN, Quan</creatorcontrib><creatorcontrib>SONG, Zhuo</creatorcontrib><creatorcontrib>CHEN, Shanpei</creatorcontrib><creatorcontrib>MA, Tao</creatorcontrib><creatorcontrib>YANG, Yong</creatorcontrib><creatorcontrib>ZHENG, Wenli</creatorcontrib><creatorcontrib>GUO, Minyi</creatorcontrib><collection>CrossRef</collection><collection>ProQuest Computer Science Collection</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Central UK/Ireland</collection><collection>Advanced Technologies & Aerospace Collection</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>ProQuest Central Student</collection><collection>SciTech Premium Collection</collection><collection>Computer Science Database</collection><collection>Advanced Technologies & Aerospace Database</collection><collection>ProQuest Advanced Technologies & Aerospace 
Collection</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><jtitle>Frontiers of Computer Science</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>XUE, Shuai</au><au>ZHAO, Shang</au><au>CHEN, Quan</au><au>SONG, Zhuo</au><au>CHEN, Shanpei</au><au>MA, Tao</au><au>YANG, Yong</au><au>ZHENG, Wenli</au><au>GUO, Minyi</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Kronos: towards bus contention-aware job scheduling in warehouse scale computers</atitle><jtitle>Frontiers of Computer Science</jtitle><stitle>Front. Comput. Sci</stitle><date>2023-02-01</date><risdate>2023</risdate><volume>17</volume><issue>1</issue><spage>171101</spage><pages>171101-</pages><artnum>171101</artnum><issn>2095-2228</issn><eissn>2095-2236</eissn><abstract>While researchers have proposed many techniques to mitigate the contention on the shared cache and memory bandwidth, none of them has considered the memory bus contention due to split lock. Our study shows that the split lock may cause 9X longer data access latency without saturating the memory bandwidth. To minimize the impact of split lock, we propose Kronos, a runtime system composed of an online bus contention tolerance meter and a bus contention-aware job scheduler. The meter characterizes the tolerance of jobs to the “pressure” of bus contention and builds a tolerance model with the polynomial regression technique. The job scheduler allocates user jobs to the physical nodes in a contention aware manner. 
We design three scheduling policies that minimize the number of required nodes while ensuring the Service Level Agreement (SLA) of all the user jobs, minimize the number of jobs that suffer from SLA violation without enough nodes, and maximize the overall performance without considering the SLA violation, respectively. Adopting the three policies, Kronos reduces the number of the required nodes by 42.1% while ensuring the SLA of all the jobs, reduces the number of the jobs that suffer from SLA violation without enough nodes by 72.8%, and improves the overall performance by 35.2% without considering SLA.</abstract><cop>Beijing</cop><pub>Higher Education Press</pub><doi>10.1007/s11704-021-0418-5</doi></addata></record>
fulltext	fulltext
identifier	ISSN: 2095-2228
ispartof	Frontiers of Computer Science, 2023-02, Vol.17 (1), p.171101, Article 171101
issn	2095-2228 2095-2236
language	eng
recordid	cdi_proquest_journals_2918721605
source	ProQuest Central UK/Ireland; SpringerLink Journals - AutoHoldings; ProQuest Central
subjects	Bandwidths; bus contention; cloud; Computer Science; Employment; high performance; Nodes; Policies; Polynomials; Regression models; Research Article; schedule; Scheduling; split lock; Tenants
title	Kronos: towards bus contention-aware job scheduling in warehouse scale computers
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-10T22%3A58%3A16IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Kronos:%20towards%20bus%20contention-aware%20job%20scheduling%20in%20warehouse%20scale%20computers&rft.jtitle=Frontiers%20of%20Computer%20Science&rft.au=XUE,%20Shuai&rft.date=2023-02-01&rft.volume=17&rft.issue=1&rft.spage=171101&rft.pages=171101-&rft.artnum=171101&rft.issn=2095-2228&rft.eissn=2095-2236&rft_id=info:doi/10.1007/s11704-021-0418-5&rft_dat=%3Cproquest_cross%3E2918721605%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2918721605&rft_id=info:pmid/&rfr_iscdi=true