Load is not what you should balance: Introducing Prequal
Main authors: | , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Summary: | We present Prequal (Probing to Reduce Queuing and Latency), a load balancer
for distributed multi-tenant systems. Prequal aims to minimize real-time
request latency in the presence of heterogeneous server capacities and
non-uniform, time-varying antagonist load. It actively probes server load to
leverage the power-of-d-choices paradigm, extending it with asynchronous and
reusable probes. Cutting against received wisdom, Prequal does not balance CPU
load, but instead selects servers according to estimated latency and active
requests-in-flight (RIF). We explore its major design features on a testbed
system and evaluate it on YouTube, where it has been deployed for more than two
years. Prequal has dramatically decreased tail latency, error rates, and
resource use, enabling YouTube and other production systems at Google to run at
much higher utilization. |
---|---|
DOI: | 10.48550/arxiv.2312.10172 |
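
The summary says Prequal probes a few servers (the power-of-d-choices idea) and selects by estimated latency and requests-in-flight (RIF) rather than CPU load, but it does not spell out how the two signals are combined. The following is a minimal sketch of one plausible combination, not the authors' implementation: the `Probe` fields, the `rif_limit` threshold, and both function names are hypothetical and chosen only for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class Probe:
    """One probe response; all field names here are hypothetical."""
    server: str
    rif: int               # requests in flight reported by the server
    est_latency_ms: float  # the server's own latency estimate


def sample_probes(fleet, d, rng=random):
    """Power-of-d-choices: inspect d randomly chosen servers, not the whole fleet."""
    return rng.sample(fleet, min(d, len(fleet)))


def pick_server(probes, rif_limit):
    """Choose a server from a small pool of probe responses.

    Assumed rule (not stated in the summary): servers whose RIF is at or
    below rif_limit compete on estimated latency; if every probed server
    exceeds the limit, fall back to the one with the fewest requests in flight.
    """
    if not probes:
        raise ValueError("need at least one probe response")
    below = [p for p in probes if p.rif <= rif_limit]
    if below:
        return min(below, key=lambda p: p.est_latency_ms).server
    return min(probes, key=lambda p: p.rif).server
```

In this sketch, server "c" in a pool such as `[Probe("a", 2, 40.0), Probe("b", 1, 55.0), Probe("c", 9, 12.0)]` is skipped when `rif_limit=4` despite its low latency estimate, because its queue is already deep; the choice then falls to "a", the faster of the two lightly loaded servers.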