Interference and Need Aware Workload Colocation in Hyperscale Datacenters

Datacenters suffer from resource utilization inefficiencies due to the conflicting goals of service owners and platform providers. Service owners intending to maintain Service Level Objectives (SLO) for themselves typically request a conservative amount of resources. Platform providers want to incre...

Bibliographic details
Published in:	arXiv.org 2022-07
Main authors:	Chakraborti, Sayak; Coutinho, Brian; Dwarkadas, Sandhya; Malani, Parth; Sharma, Bikash
Format:	Article
Language:	eng
Subjects:	Computer architecture; Data centers; Degradation; Efficiency; Heterogeneity; Interference; Network latency; Operating costs; Resource allocation; Resource utilization; Tuning; Workload; Workloads
Online access:	Full text

container_end_page
container_issue
container_start_page
container_title	arXiv.org
container_volume
creator	Chakraborti, Sayak ; Coutinho, Brian ; Dwarkadas, Sandhya ; Malani, Parth ; Sharma, Bikash
description	Datacenters suffer from resource utilization inefficiencies due to the conflicting goals of service owners and platform providers. Service owners intending to maintain Service Level Objectives (SLO) for themselves typically request a conservative amount of resources. Platform providers want to increase operational efficiency to reduce capital and operating costs. Achieving both operational efficiency and SLO for individual services at the same time is challenging due to the diversity in service workload characteristics, resource usage patterns that are dependent on input load, heterogeneity in platform, memory, I/O, and network architecture, and resource bundling. This paper presents a tunable approach to resource allocation that accounts for both dynamic service resource needs and platform heterogeneity. In addition, an online K-Means-based service classification method is used in conjunction with an offline sensitivity component. Our tunable approach allows trading resource utilization efficiency for absolute SLO guarantees based on the service owners' sensitivity to its SLO. We evaluate our tunable resource allocator at scale in a private cloud environment with mostly latency-critical workloads. When tuning for operational efficiency, we demonstrate up to ~50% reduction in required machines; ~40% reduction in Total-Cost-of-Ownership (TCO); and ~60% reduction in CPU and memory fragmentation, but at the cost of increasing the number of tasks experiencing degradation of SLO by up to ~25% compared to the baseline. When tuning for SLO, by introducing interference-aware colocation, we can tune the solver to reduce tasks experiencing degradation of SLO by up to ~22% compared to the baseline, but at an additional cost of ~30% in terms of the number of hosts. We highlight this trade-off between TCO and SLO violations, and offer tuning based on the requirements of the platform owners.
format	Article
fullrecord	<record><control><sourceid>proquest</sourceid><recordid>TN_cdi_proquest_journals_2695191521</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2695191521</sourcerecordid><originalsourceid>FETCH-proquest_journals_26951915213</originalsourceid><addsrcrecordid>eNqNjLEKwjAUAIMgWLT_8MC50CSm2lGqUhcnwbE80ldoDUlNUsS_V8EPcLrhjpuxREjJs91GiAVLQxjyPBfFViglE3Y-20i-I09WE6Bt4ULUwv6JnuDm_N04bKFyxmmMvbPQW6hfI_mg0RAcMKKm7yKs2LxDEyj9ccnWp-O1qrPRu8dEITaDm7z9qEYUpeIlV4LL_6o3aPM8PA</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2695191521</pqid></control><display><type>article</type><title>Interference and Need Aware Workload Colocation in Hyperscale Datacenters</title><source>Free E- Journals</source><creator>Chakraborti, Sayak ; Coutinho, Brian ; Dwarkadas, Sandhya ; Malani, Parth ; Sharma, Bikash</creator><creatorcontrib>Chakraborti, Sayak ; Coutinho, Brian ; Dwarkadas, Sandhya ; Malani, Parth ; Sharma, Bikash</creatorcontrib><description>Datacenters suffer from resource utilization inefficiencies due to the conflicting goals of service owners and platform providers. Service owners intending to maintain Service Level Objectives (SLO) for themselves typically request a conservative amount of resources. Platform providers want to increase operational efficiency to reduce capital and operating costs. Achieving both operational efficiency and SLO for individual services at the same time is challenging due to the diversity in service workload characteristics, resource usage patterns that are dependent on input load, heterogeneity in platform, memory, I/O, and network architecture, and resource bundling. This paper presents a tunable approach to resource allocation that accounts for both dynamic service resource needs and platform heterogeneity. In addition, an online K-Means-based service classification method is used in conjunction with an offline sensitivity component. 
Our tunable approach allows trading resource utilization efficiency for absolute SLO guarantees based on the service owners' sensitivity to its SLO. We evaluate our tunable resource allocator at scale in a private cloud environment with mostly latency-critical workloads. When tuning for operational efficiency, we demonstrate up to ~50% reduction in required machines; ~40% reduction in Total-Cost-of-Ownership (TCO); and ~60% reduction in CPU and memory fragmentation, but at the cost of increasing the number of tasks experiencing degradation of SLO by up to ~25% compared to the baseline. When tuning for SLO, by introducing interference-aware colocation, we can tune the solver to reduce tasks experiencing degradation of SLO by up to ~22% compared to the baseline, but at an additional cost of ~30% in terms of the number of hosts. We highlight this trade-off between TCO and SLO violations, and offer tuning based on the requirements of the platform owners.</description><identifier>EISSN: 2331-8422</identifier><language>eng</language><publisher>Ithaca: Cornell University Library, arXiv.org</publisher><subject>Computer architecture ; Data centers ; Degradation ; Efficiency ; Heterogeneity ; Interference ; Network latency ; Operating costs ; Resource allocation ; Resource utilization ; Tuning ; Workload ; Workloads</subject><ispartof>arXiv.org, 2022-07</ispartof><rights>2022. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>780,784</link.rule.ids></links><search><creatorcontrib>Chakraborti, Sayak</creatorcontrib><creatorcontrib>Coutinho, Brian</creatorcontrib><creatorcontrib>Dwarkadas, Sandhya</creatorcontrib><creatorcontrib>Malani, Parth</creatorcontrib><creatorcontrib>Sharma, Bikash</creatorcontrib><title>Interference and Need Aware Workload Colocation in Hyperscale Datacenters</title><title>arXiv.org</title><description>Datacenters suffer from resource utilization inefficiencies due to the conflicting goals of service owners and platform providers. Service owners intending to maintain Service Level Objectives (SLO) for themselves typically request a conservative amount of resources. Platform providers want to increase operational efficiency to reduce capital and operating costs. Achieving both operational efficiency and SLO for individual services at the same time is challenging due to the diversity in service workload characteristics, resource usage patterns that are dependent on input load, heterogeneity in platform, memory, I/O, and network architecture, and resource bundling. This paper presents a tunable approach to resource allocation that accounts for both dynamic service resource needs and platform heterogeneity. In addition, an online K-Means-based service classification method is used in conjunction with an offline sensitivity component. Our tunable approach allows trading resource utilization efficiency for absolute SLO guarantees based on the service owners' sensitivity to its SLO. 
We evaluate our tunable resource allocator at scale in a private cloud environment with mostly latency-critical workloads. When tuning for operational efficiency, we demonstrate up to ~50% reduction in required machines; ~40% reduction in Total-Cost-of-Ownership (TCO); and ~60% reduction in CPU and memory fragmentation, but at the cost of increasing the number of tasks experiencing degradation of SLO by up to ~25% compared to the baseline. When tuning for SLO, by introducing interference-aware colocation, we can tune the solver to reduce tasks experiencing degradation of SLO by up to ~22% compared to the baseline, but at an additional cost of ~30% in terms of the number of hosts. We highlight this trade-off between TCO and SLO violations, and offer tuning based on the requirements of the platform owners.</description><subject>Computer architecture</subject><subject>Data centers</subject><subject>Degradation</subject><subject>Efficiency</subject><subject>Heterogeneity</subject><subject>Interference</subject><subject>Network latency</subject><subject>Operating costs</subject><subject>Resource allocation</subject><subject>Resource utilization</subject><subject>Tuning</subject><subject>Workload</subject><subject>Workloads</subject><issn>2331-8422</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><recordid>eNqNjLEKwjAUAIMgWLT_8MC50CSm2lGqUhcnwbE80ldoDUlNUsS_V8EPcLrhjpuxREjJs91GiAVLQxjyPBfFViglE3Y-20i-I09WE6Bt4ULUwv6JnuDm_N04bKFyxmmMvbPQW6hfI_mg0RAcMKKm7yKs2LxDEyj9ccnWp-O1qrPRu8dEITaDm7z9qEYUpeIlV4LL_6o3aPM8PA</recordid><startdate>20220725</startdate><enddate>20220725</enddate><creator>Chakraborti, Sayak</creator><creator>Coutinho, Brian</creator><creator>Dwarkadas, Sandhya</creator><creator>Malani, Parth</creator><creator>Sharma, 
Bikash</creator><general>Cornell University Library, arXiv.org</general><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>M7S</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope></search><sort><creationdate>20220725</creationdate><title>Interference and Need Aware Workload Colocation in Hyperscale Datacenters</title><author>Chakraborti, Sayak ; Coutinho, Brian ; Dwarkadas, Sandhya ; Malani, Parth ; Sharma, Bikash</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-proquest_journals_26951915213</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Computer architecture</topic><topic>Data centers</topic><topic>Degradation</topic><topic>Efficiency</topic><topic>Heterogeneity</topic><topic>Interference</topic><topic>Network latency</topic><topic>Operating costs</topic><topic>Resource allocation</topic><topic>Resource utilization</topic><topic>Tuning</topic><topic>Workload</topic><topic>Workloads</topic><toplevel>online_resources</toplevel><creatorcontrib>Chakraborti, Sayak</creatorcontrib><creatorcontrib>Coutinho, Brian</creatorcontrib><creatorcontrib>Dwarkadas, Sandhya</creatorcontrib><creatorcontrib>Malani, Parth</creatorcontrib><creatorcontrib>Sharma, Bikash</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science & Engineering Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection 
(ProQuest)</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering Database</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering Collection</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Chakraborti, Sayak</au><au>Coutinho, Brian</au><au>Dwarkadas, Sandhya</au><au>Malani, Parth</au><au>Sharma, Bikash</au><format>book</format><genre>document</genre><ristype>GEN</ristype><atitle>Interference and Need Aware Workload Colocation in Hyperscale Datacenters</atitle><jtitle>arXiv.org</jtitle><date>2022-07-25</date><risdate>2022</risdate><eissn>2331-8422</eissn><abstract>Datacenters suffer from resource utilization inefficiencies due to the conflicting goals of service owners and platform providers. Service owners intending to maintain Service Level Objectives (SLO) for themselves typically request a conservative amount of resources. Platform providers want to increase operational efficiency to reduce capital and operating costs. Achieving both operational efficiency and SLO for individual services at the same time is challenging due to the diversity in service workload characteristics, resource usage patterns that are dependent on input load, heterogeneity in platform, memory, I/O, and network architecture, and resource bundling. This paper presents a tunable approach to resource allocation that accounts for both dynamic service resource needs and platform heterogeneity. 
In addition, an online K-Means-based service classification method is used in conjunction with an offline sensitivity component. Our tunable approach allows trading resource utilization efficiency for absolute SLO guarantees based on the service owners' sensitivity to its SLO. We evaluate our tunable resource allocator at scale in a private cloud environment with mostly latency-critical workloads. When tuning for operational efficiency, we demonstrate up to ~50% reduction in required machines; ~40% reduction in Total-Cost-of-Ownership (TCO); and ~60% reduction in CPU and memory fragmentation, but at the cost of increasing the number of tasks experiencing degradation of SLO by up to ~25% compared to the baseline. When tuning for SLO, by introducing interference-aware colocation, we can tune the solver to reduce tasks experiencing degradation of SLO by up to ~22% compared to the baseline, but at an additional cost of ~30% in terms of the number of hosts. We highlight this trade-off between TCO and SLO violations, and offer tuning based on the requirements of the platform owners.</abstract><cop>Ithaca</cop><pub>Cornell University Library, arXiv.org</pub><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	EISSN: 2331-8422
ispartof	arXiv.org, 2022-07
issn	2331-8422
language	eng
recordid	cdi_proquest_journals_2695191521
source	Free E- Journals
subjects	Computer architecture ; Data centers ; Degradation ; Efficiency ; Heterogeneity ; Interference ; Network latency ; Operating costs ; Resource allocation ; Resource utilization ; Tuning ; Workload ; Workloads
title	Interference and Need Aware Workload Colocation in Hyperscale Datacenters
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-06T06%3A30%3A14IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=Interference%20and%20Need%20Aware%20Workload%20Colocation%20in%20Hyperscale%20Datacenters&rft.jtitle=arXiv.org&rft.au=Chakraborti,%20Sayak&rft.date=2022-07-25&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E2695191521%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2695191521&rft_id=info:pmid/&rfr_iscdi=true