Reducing Tail Latency via Safe and Simple Duplication

Bibliographic Details
Main authors:	Bashir, Hafiz Mohsin; Faisal, Abdullah Bin; Jamshed, Muhammad Asim; Vondras, Peter; Iftikhar, Ali Musa; Qazi, Ihsan Ayyub; Dogar, Fahad R
Format:	Article
Language:	English
Subjects:	Computer Science - Distributed, Parallel, and Cluster Computing; Computer Science - Networking and Internet Architecture

Description
Summary:	Duplication can be a powerful strategy for overcoming stragglers in cloud services, but is often used conservatively because of the risk of overloading the system. We present duplicate-aware scheduling (DAS), which makes duplication safe and easy to use by leveraging the two well-known primitives of prioritization and purging. To support DAS across diverse layers of a cloud system (e.g., network, storage), we propose the D-Stage abstraction, which decouples the duplication policy from the mechanism and facilitates working with legacy layers of a system. Using this abstraction, we evaluate the benefits of DAS for two data-parallel applications (HDFS, an in-memory workload generator) and a network function (Snort-based IDS cluster). Our experiments on the public cloud and Emulab show that DAS is safe to use and that the tail latency improvement holds across a wide range of workloads.
DOI:	10.48550/arxiv.1905.13352
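
Illustrative sketch: the summary names prioritization and purging as the two primitives behind DAS but does not spell out the mechanism. The Python below is a minimal toy under one plausible reading, namely that duplicate copies are queued at lower priority than primary requests, and that copies of a request become obsolete once any copy has completed. The class `DuplicateAwareQueue`, its methods, and the constants `PRIMARY` and `DUPLICATE` are hypothetical names chosen for this example; they are not taken from the paper or from the D-Stage interface.

```python
import heapq
import itertools

# Assumption for this sketch: a lower value is served first,
# so primaries always go ahead of duplicates.
PRIMARY, DUPLICATE = 0, 1


class DuplicateAwareQueue:
    """Toy single-server queue: primaries before duplicates, with purging."""

    def __init__(self):
        self._heap = []                # entries: (priority, seq, request_id)
        self._seq = itertools.count()  # FIFO tie-breaker within a priority
        self._purged = set()           # ids whose queued copies are obsolete

    def submit(self, request_id, priority):
        heapq.heappush(self._heap, (priority, next(self._seq), request_id))

    def purge(self, request_id):
        # Lazy purge: mark the id; stale copies are dropped when popped.
        # (Toy simplification: a purged id is never accepted again.)
        self._purged.add(request_id)

    def next_request(self):
        # Return the next live (request_id, priority), or None if idle.
        while self._heap:
            priority, _, request_id = heapq.heappop(self._heap)
            if request_id not in self._purged:
                return request_id, priority
        return None


# Two servers: request "a" has its primary on q1 and a duplicate on q2;
# request "b" has its primary on q2.
q1, q2 = DuplicateAwareQueue(), DuplicateAwareQueue()
q2.submit("a", DUPLICATE)
q2.submit("b", PRIMARY)
q1.submit("a", PRIMARY)

first_on_q2 = q2.next_request()  # "b": a primary overtakes the older duplicate
first_on_q1 = q1.next_request()  # "a": served as a primary on q1
q2.purge("a")                    # "a" is done, so its duplicate is obsolete
after_purge = q2.next_request()  # None: the duplicate was never executed
```

In this toy, the duplicate of "a" never delays the primary "b" and is discarded once "a" finishes elsewhere, which is the intuition for why duplication under these two primitives need not overload the system.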