SafeTail: Efficient Tail Latency Optimization in Edge Service Scheduling via Computational Redundancy Management

Optimizing tail latency while efficiently managing computational resources is crucial for delivering high-performance, latency-sensitive services in edge computing. Emerging applications, such as augmented reality, require low-latency computing services with high reliability on user devices, which o...

Bibliographic Details
Main Authors:	Shokhanda, Jyoti ; Pal, Utkarsh ; Kumar, Aman ; Chattopadhyay, Soumi ; Bhattacharya, Arani
Format:	Article
Language:	eng
Subjects:	Computer Science - Learning
Online Access:	Order full text

container_end_page
container_issue
container_start_page
container_title
container_volume
creator	Shokhanda, Jyoti ; Pal, Utkarsh ; Kumar, Aman ; Chattopadhyay, Soumi ; Bhattacharya, Arani
description	Optimizing tail latency while efficiently managing computational resources is crucial for delivering high-performance, latency-sensitive services in edge computing. Emerging applications, such as augmented reality, require low-latency computing services with high reliability on user devices, which often have limited computational capabilities. Consequently, these devices depend on nearby edge servers for processing. However, inherent uncertainties in network and computation latencies stemming from variability in wireless networks and fluctuating server loads make service delivery on time challenging. Existing approaches often focus on optimizing median latency but fall short of addressing the specific challenges of tail latency in edge environments, particularly under uncertain network and computational conditions. Although some methods do address tail latency, they typically rely on fixed or excessive redundancy and lack adaptability to dynamic network conditions, often being designed for cloud environments rather than the unique demands of edge computing. In this paper, we introduce SafeTail, a framework that meets both median and tail response time targets, with tail latency defined as latency beyond the 90th percentile threshold. SafeTail addresses this challenge by selectively replicating services across multiple edge servers to meet target latencies. SafeTail employs a reward-based deep learning framework to learn optimal placement strategies, balancing the need to achieve target latencies with minimizing additional resource usage. Through trace-driven simulations, SafeTail demonstrated near-optimal performance and outperformed most baseline strategies across three diverse services.
doi_str_mv	10.48550/arxiv.2408.17171
format	Article
fullrecord	<record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2408_17171</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2408_17171</sourcerecordid><originalsourceid>FETCH-arxiv_primary_2408_171713</originalsourceid><addsrcrecordid>eNqFjkELgkAUhPfSIaof0Kn3BzItJekqRociSO_yWJ_2YF3FVsl-fat0jzkMwwzDJ8Tacx0_DAJ3h-2be2fvu6HjHa3mokmwoBRZnSAuCpZM2sCY4YqGtBzg3hiu-IOGaw2sIc5LgoTanqV1-aS8U6xL6BkhqqumM9MUFTxspXMcT26osaTKni_FrED1otXPF2JzjtPosp3YsqblCtshGxmzifHwf_EFEmpIfw</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>SafeTail: Efficient Tail Latency Optimization in Edge Service Scheduling via Computational Redundancy Management</title><source>arXiv.org</source><creator>Shokhanda, Jyoti ; Pal, Utkarsh ; Kumar, Aman ; Chattopadhyay, Soumi ; Bhattacharya, Arani</creator><creatorcontrib>Shokhanda, Jyoti ; Pal, Utkarsh ; Kumar, Aman ; Chattopadhyay, Soumi ; Bhattacharya, Arani</creatorcontrib><description>Optimizing tail latency while efficiently managing computational resources is crucial for delivering high-performance, latency-sensitive services in edge computing. Emerging applications, such as augmented reality, require low-latency computing services with high reliability on user devices, which often have limited computational capabilities. Consequently, these devices depend on nearby edge servers for processing. However, inherent uncertainties in network and computation latencies stemming from variability in wireless networks and fluctuating server loads make service delivery on time challenging. Existing approaches often focus on optimizing median latency but fall short of addressing the specific challenges of tail latency in edge environments, particularly under uncertain network and computational conditions. 
Although some methods do address tail latency, they typically rely on fixed or excessive redundancy and lack adaptability to dynamic network conditions, often being designed for cloud environments rather than the unique demands of edge computing. In this paper, we introduce SafeTail, a framework that meets both median and tail response time targets, with tail latency defined as latency beyond the 90^th percentile threshold. SafeTail addresses this challenge by selectively replicating services across multiple edge servers to meet target latencies. SafeTail employs a reward-based deep learning framework to learn optimal placement strategies, balancing the need to achieve target latencies with minimizing additional resource usage. Through trace-driven simulations, SafeTail demonstrated near-optimal performance and outperformed most baseline strategies across three diverse services.</description><identifier>DOI: 10.48550/arxiv.2408.17171</identifier><language>eng</language><subject>Computer Science - Learning</subject><creationdate>2024-08</creationdate><rights>http://creativecommons.org/licenses/by/4.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,776,881</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2408.17171$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2408.17171$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Shokhanda, Jyoti</creatorcontrib><creatorcontrib>Pal, Utkarsh</creatorcontrib><creatorcontrib>Kumar, Aman</creatorcontrib><creatorcontrib>Chattopadhyay, Soumi</creatorcontrib><creatorcontrib>Bhattacharya, Arani</creatorcontrib><title>SafeTail: Efficient Tail Latency Optimization in Edge Service 
Scheduling via Computational Redundancy Management</title><description>Optimizing tail latency while efficiently managing computational resources is crucial for delivering high-performance, latency-sensitive services in edge computing. Emerging applications, such as augmented reality, require low-latency computing services with high reliability on user devices, which often have limited computational capabilities. Consequently, these devices depend on nearby edge servers for processing. However, inherent uncertainties in network and computation latencies stemming from variability in wireless networks and fluctuating server loads make service delivery on time challenging. Existing approaches often focus on optimizing median latency but fall short of addressing the specific challenges of tail latency in edge environments, particularly under uncertain network and computational conditions. Although some methods do address tail latency, they typically rely on fixed or excessive redundancy and lack adaptability to dynamic network conditions, often being designed for cloud environments rather than the unique demands of edge computing. In this paper, we introduce SafeTail, a framework that meets both median and tail response time targets, with tail latency defined as latency beyond the 90^th percentile threshold. SafeTail addresses this challenge by selectively replicating services across multiple edge servers to meet target latencies. SafeTail employs a reward-based deep learning framework to learn optimal placement strategies, balancing the need to achieve target latencies with minimizing additional resource usage. 
Through trace-driven simulations, SafeTail demonstrated near-optimal performance and outperformed most baseline strategies across three diverse services.</description><subject>Computer Science - Learning</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNqFjkELgkAUhPfSIaof0Kn3BzItJekqRociSO_yWJ_2YF3FVsl-fat0jzkMwwzDJ8Tacx0_DAJ3h-2be2fvu6HjHa3mokmwoBRZnSAuCpZM2sCY4YqGtBzg3hiu-IOGaw2sIc5LgoTanqV1-aS8U6xL6BkhqqumM9MUFTxspXMcT26osaTKni_FrED1otXPF2JzjtPosp3YsqblCtshGxmzifHwf_EFEmpIfw</recordid><startdate>20240830</startdate><enddate>20240830</enddate><creator>Shokhanda, Jyoti</creator><creator>Pal, Utkarsh</creator><creator>Kumar, Aman</creator><creator>Chattopadhyay, Soumi</creator><creator>Bhattacharya, Arani</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20240830</creationdate><title>SafeTail: Efficient Tail Latency Optimization in Edge Service Scheduling via Computational Redundancy Management</title><author>Shokhanda, Jyoti ; Pal, Utkarsh ; Kumar, Aman ; Chattopadhyay, Soumi ; Bhattacharya, Arani</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-arxiv_primary_2408_171713</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Computer Science - Learning</topic><toplevel>online_resources</toplevel><creatorcontrib>Shokhanda, Jyoti</creatorcontrib><creatorcontrib>Pal, Utkarsh</creatorcontrib><creatorcontrib>Kumar, Aman</creatorcontrib><creatorcontrib>Chattopadhyay, Soumi</creatorcontrib><creatorcontrib>Bhattacharya, Arani</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Shokhanda, Jyoti</au><au>Pal, Utkarsh</au><au>Kumar, Aman</au><au>Chattopadhyay, 
Soumi</au><au>Bhattacharya, Arani</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>SafeTail: Efficient Tail Latency Optimization in Edge Service Scheduling via Computational Redundancy Management</atitle><date>2024-08-30</date><risdate>2024</risdate><abstract>Optimizing tail latency while efficiently managing computational resources is crucial for delivering high-performance, latency-sensitive services in edge computing. Emerging applications, such as augmented reality, require low-latency computing services with high reliability on user devices, which often have limited computational capabilities. Consequently, these devices depend on nearby edge servers for processing. However, inherent uncertainties in network and computation latencies stemming from variability in wireless networks and fluctuating server loads make service delivery on time challenging. Existing approaches often focus on optimizing median latency but fall short of addressing the specific challenges of tail latency in edge environments, particularly under uncertain network and computational conditions. Although some methods do address tail latency, they typically rely on fixed or excessive redundancy and lack adaptability to dynamic network conditions, often being designed for cloud environments rather than the unique demands of edge computing. In this paper, we introduce SafeTail, a framework that meets both median and tail response time targets, with tail latency defined as latency beyond the 90^th percentile threshold. SafeTail addresses this challenge by selectively replicating services across multiple edge servers to meet target latencies. SafeTail employs a reward-based deep learning framework to learn optimal placement strategies, balancing the need to achieve target latencies with minimizing additional resource usage. 
Through trace-driven simulations, SafeTail demonstrated near-optimal performance and outperformed most baseline strategies across three diverse services.</abstract><doi>10.48550/arxiv.2408.17171</doi><oa>free_for_read</oa></addata></record>
fulltext	fulltext_linktorsrc
identifier	DOI: 10.48550/arxiv.2408.17171
ispartof
issn
language	eng
recordid	cdi_arxiv_primary_2408_17171
source	arXiv.org
subjects	Computer Science - Learning
title	SafeTail: Efficient Tail Latency Optimization in Edge Service Scheduling via Computational Redundancy Management
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-10T16%3A26%3A35IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=SafeTail:%20Efficient%20Tail%20Latency%20Optimization%20in%20Edge%20Service%20Scheduling%20via%20Computational%20Redundancy%20Management&rft.au=Shokhanda,%20Jyoti&rft.date=2024-08-30&rft_id=info:doi/10.48550/arxiv.2408.17171&rft_dat=%3Carxiv_GOX%3E2408_17171%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true