Reducing Tail Latency via Safe and Simple Duplication
Main authors: | , , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | Duplication can be a powerful strategy for overcoming stragglers in cloud
services, but is often used conservatively because of the risk of overloading
the system. We present duplicate-aware scheduling or DAS, which makes
duplication safe and easy to use, by leveraging the two well-known primitives
of prioritization and purging. To support DAS across diverse layers of a cloud
system (e.g., network, storage, etc.), we propose the D-Stage abstraction, which
decouples the duplication policy from the mechanism, and facilitates working
with legacy layers of a system. Using this abstraction, we evaluate the
benefits of DAS for two data parallel applications (HDFS, an in-memory workload
generator) and a network function (snort-based IDS cluster). Our experiments on
the public cloud and Emulab show that DAS is safe to use, and the tail latency
improvement holds across a wide range of workloads. |
---|---|
DOI: | 10.48550/arxiv.1905.13352 |