On System-Wide Safety Staffing of Large-Scale Parallel Server Networks

Bibliographic details
Published in: Operations Research, 2023-03, Vol. 71 (2), pp. 415-432
Main author: Hmedi, Hassan
Format: Article
Language: English
Online access: Full text
Description
Summary: Large-scale stochastic networks are widely used for a variety of systems including telecommunications, patient flows, service and data centers, and so on. Because of their complexity, ensuring the stability of these networks by allocating the required resources needed by each customer class is quite challenging. When there are sufficient resources to serve each customer class, the existence of a policy that stabilizes the system is trivial. One can decouple the network and assign the required resources to each customer class. Conversely, one can anticipate that, when each class is resourceless, it is impossible to stabilize the system independent of the policy used. However, previous analyses did not tackle this question. In this work, we assume that some classes have excess resources, whereas others are deficient. We provide a full characterization of the stability of these networks by identifying a parameter that describes the overall excess or lack of resources in the network. We introduce a “system-wide safety staffing” (SWSS) parameter for multiclass multipool networks of any tree topology, Markovian or non-Markovian, in the Halfin-Whitt regime. This parameter can be regarded as the optimal reallocation of the capacity fluctuations (positive or negative) of order √n when each server pool uses a square-root staffing rule. We provide an explicit form of the SWSS as a function of the system parameters, which is derived using a graph theoretic approach based on Gaussian elimination. For Markovian networks, we give an equivalent characterization of the SWSS parameter via the drift parameters of the limiting diffusion. We show that if the SWSS parameter is negative, the limiting diffusion and the diffusion-scaled queueing processes are transient under any Markov control and cannot have a stationary distribution when this parameter is zero.
If it is positive, we show that the diffusion-scaled queueing processes are uniformly stabilizable; that is, there exists a scheduling policy under which the stationary distributions of the controlled processes are tight over the size of the network. In addition, there exists a control under which the limiting controlled diffusion is exponentially ergodic. Thus, we identified a necessary and sufficient condition for the uniform stabilizability of such networks in the Halfin-Whitt regime. We use a constant control resulting from the leaf elimination algorithm to stabilize the limiting controlled diffusion while a family of
ISSN: 0030-364X, 1526-5463
DOI: 10.1287/opre.2021.2256
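
The summary refers to each server pool using a square-root staffing rule. As a rough illustration only, the sketch below shows the classical single-pool version of that rule: staff the offered load plus a safety term proportional to its square root. It is not the article's SWSS formula, which concerns multiclass multipool tree networks; the function name `square_root_staffing` and the parameters `arrival_rate`, `service_rate`, and `beta` are assumptions introduced here for illustration.

```python
import math


def square_root_staffing(arrival_rate: float, service_rate: float, beta: float) -> int:
    """Classical single-pool square-root staffing (illustrative assumption,
    not the article's SWSS parameter).

    offered load R = arrival_rate / service_rate
    servers      n = ceil(R + beta * sqrt(R))

    A positive beta gives excess capacity of order sqrt(R);
    a negative beta gives a capacity deficit of the same order.
    """
    if arrival_rate <= 0 or service_rate <= 0:
        raise ValueError("arrival_rate and service_rate must be positive")
    offered_load = arrival_rate / service_rate
    return math.ceil(offered_load + beta * math.sqrt(offered_load))


if __name__ == "__main__":
    # Same offered load, with a safety margin, none, and a deficit.
    for beta in (0.5, 0.0, -0.5):
        print(beta, square_root_staffing(100.0, 1.0, beta))
```

In the network setting described in the summary, some pools would correspond to a positive `beta` and others to a negative one; the SWSS parameter is described there as the optimal reallocation of these fluctuations across the whole network.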