Joint Deployment and Request Routing for Microservice Call Graphs in Data Centers


Bibliographic Details
Published in: IEEE Transactions on Parallel and Distributed Systems, 2023-11, Vol. 34 (11), pp. 2994-3011
Authors: Hu, Yi; Wang, Hao; Wang, Liangyuan; Hu, Menglan; Peng, Kai; Veeravalli, Bharadwaj
Format: Article
Language: English
Description
Abstract: Microservices are an architectural and organizational paradigm for Internet application development. In cloud data centers, delay-sensitive applications receive massive numbers of user requests, which are fed into multiple queues and subsequently served by multiple microservice instances. Accordingly, effective deployment of multiple queues and containers can significantly reduce queuing, processing, and communication delays. Due to the complexity of call dependencies and probabilistic routing paths, the deployment of service instances interacts closely with request routing, which greatly complicates service orchestration. It is therefore valuable to consider service deployment and request routing simultaneously and in a fine-grained manner. However, most existing studies treat them as two independent, locally optimized components, ignoring data dependencies and instance-level deployment. This paper therefore proposes to jointly optimize the deployment and request routing of microservice call graphs based on fine-grained queuing network analysis and container orchestration. We first formulate the problem as a mixed-integer nonlinear program and exploit open Jackson queuing networks to model intrinsic data dependencies and analyze response latency. To optimize overall cost and latency, this paper presents an efficient two-stage heuristic algorithm, consisting of a resource-splitting-based deployment approach and a partition-mapping-based routing method. The paper also provides mathematical analysis of the performance and complexity of the proposed algorithm. Finally, comprehensive trace-driven experiments demonstrate that our approach outperforms existing microservice benchmarks, reducing average deployment cost by 27.4% and end-to-end response latency by 15.1% on average.
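To illustrate the kind of analysis the abstract refers to, the sketch below computes mean end-to-end response latency for a microservice call graph modeled as an open Jackson network with probabilistic routing. This is a minimal textbook sketch, not the paper's algorithm: the 3-service topology, arrival rates, service rates, and routing probabilities are invented for illustration.

```python
# Hedged sketch: mean response latency of a microservice call graph
# modeled as an open Jackson queuing network. All numbers below are
# illustrative assumptions, not values from the paper.

gamma = [5.0, 0.0, 0.0]            # external arrival rate (req/s) per service
mu = [10.0, 8.0, 12.0]             # service rate (req/s) per service
P = [[0.0, 0.6, 0.4],              # routing probabilities: P[i][j] = Pr(i -> j)
     [0.0, 0.0, 0.5],
     [0.0, 0.0, 0.0]]              # remaining probability mass exits the system
n = len(gamma)

# Traffic equations: lambda = gamma + lambda P. Since this call graph is a
# DAG listed in topological order, a single forward pass solves them.
lam = [0.0] * n
for i in range(n):
    lam[i] = gamma[i] + sum(lam[j] * P[j][i] for j in range(i))

assert all(l < m for l, m in zip(lam, mu)), "stability requires lambda_i < mu_i"

# Jackson's theorem: each node behaves as an independent M/M/1 queue, so the
# expected number of requests at node i is L_i = lambda_i / (mu_i - lambda_i).
# Little's law over the whole network then gives the mean end-to-end response
# time T = sum(L_i) / total external arrival rate.
L = [l / (m - l) for l, m in zip(lam, mu)]
T = sum(L) / sum(gamma)

print(f"per-service throughput: {lam}")      # [5.0, 3.0, 3.5]
print(f"mean response time: {T:.4f} s")      # 0.4024 s
```

In this toy setting, raising a bottleneck service's rate mu_i (e.g. by deploying more container instances) directly shrinks its term lambda_i / (mu_i - lambda_i), which is the coupling between deployment and routing that the paper's joint optimization exploits.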
ISSN: 1045-9219, 1558-2183
DOI: 10.1109/TPDS.2023.3311767