CloudSentry: Two-Stage Heavy Hitter Detection for Cloud-Scale Gateway Overload Protection
Published in: | IEEE Transactions on Parallel and Distributed Systems, 2024-04, Vol. 35 (4), p. 616-633 |
---|---|
Main authors: | , , , , , , , , , , , |
Format: | Article |
Language: | English |
Abstract: | Cloud vendors provide shared resources to millions of tenants across the world to achieve economies of scale. At the same time, the cloud network maintains performance isolation between tenants, as if each used its own private, dedicated resources. However, heavy hitters caused by a single tenant at cloud gateways break this isolation, undermining the predictable performance expected by other tenants. Preventing this makes heavy hitter detection a key concern at performance-critical cloud gateways, but detection faces a dilemma between fine granularity and low overhead. In this work, we present CloudSentry, a scalable two-stage heavy hitter detection system for multi-tenant cloud gateways that addresses this dilemma. CloudSentry uses CPU utilization as an indicator of heavy hitters and runs a lightweight coarse-grained detection 24/7 to detect CPU spikes. It then invokes a fine-grained detection to precisely dump and analyze the potential heavy-hitter packets at the CPU spikes. After that, a more comprehensive analysis associates heavy hitters with the cloud service scenarios and invokes a corresponding backpressure procedure. CloudSentry significantly reduces memory, computation, and storage overhead compared with existing approaches. In a gateway cluster under an average traffic throughput of 251 Gbps, CloudSentry consumes only 2%-5% CPU utilization and 8 KB of run-time memory, and produces only 10 MB of heavy hitter logs in one month. Additionally, as it has been deployed in Alibaba Cloud for over two years, we share case studies and deployment experience in this article. |
---|---|
ISSN: | 1045-9219 1558-2183 |
DOI: | 10.1109/TPDS.2023.3301852 |
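
The two-stage idea described in the abstract can be illustrated with a minimal Python sketch. This is not CloudSentry's implementation: the thresholds, the function names (`is_cpu_spike`, `find_heavy_hitters`, `detect`), and the representation of a packet dump as a list of tenant keys are all assumptions made for illustration. The sketch only shows the control flow: a cheap CPU check runs on every sample, and the costlier per-packet analysis runs only when that check reports a spike.

```python
from collections import Counter
from typing import Callable, Iterable, List, Tuple

# Hypothetical thresholds; the abstract does not state the real values.
CPU_SPIKE_THRESHOLD = 0.80    # stage 1: utilization treated as a spike
HEAVY_SHARE_THRESHOLD = 0.50  # stage 2: packet share that marks a heavy hitter


def is_cpu_spike(cpu_utilization: float,
                 threshold: float = CPU_SPIKE_THRESHOLD) -> bool:
    """Stage 1 (coarse-grained): cheap check on one CPU utilization sample."""
    return cpu_utilization >= threshold


def find_heavy_hitters(packet_keys: Iterable[str],
                       share_threshold: float = HEAVY_SHARE_THRESHOLD
                       ) -> List[Tuple[str, float]]:
    """Stage 2 (fine-grained): count dumped packets per key and return the
    keys whose share of the dump is at least share_threshold."""
    counts = Counter(packet_keys)
    total = sum(counts.values())
    if total == 0:
        return []
    return [(key, n / total)
            for key, n in counts.most_common()
            if n / total >= share_threshold]


def detect(cpu_utilization: float,
           dump_packets: Callable[[], Iterable[str]]) -> List[Tuple[str, float]]:
    """Invoke the packet dump (a hypothetical callback) only during a spike."""
    if not is_cpu_spike(cpu_utilization):
        return []
    return find_heavy_hitters(dump_packets())
```

For example, `detect(0.95, lambda: ["tenant-a"] * 8 + ["tenant-b"] * 2)` reports `tenant-a` with a share of 0.8, while `detect(0.30, ...)` returns an empty list without calling the dump callback at all, which is where the overhead saving in a two-stage design comes from.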