Blind Dynamic Resource Allocation in Closed Networks via Mirror Backpressure

Bibliographic details
Main authors:	Kanoria, Yash; Qian, Pengyu
Format:	Article
Language:	English
Subjects:	Mathematics - Optimization and Control; Mathematics - Probability

Description
Summary:	We study the problem of maximizing payoff generated over a period of time in a general class of closed queueing networks with a finite, fixed number of supply units which circulate in the system. Demand arrives stochastically, and serving a demand unit (customer) causes a supply unit to relocate from the "origin" to the "destination" of the customer. The key challenge is to manage the distribution of supply in the network. We consider general controls including customer entry control, pricing, and assignment. Motivating applications include shared transportation platforms and scrip systems. Inspired by the mirror descent algorithm for optimization and the backpressure policy for network control, we introduce a rich family of Mirror Backpressure (MBP) control policies. The MBP policies are simple and practical, and crucially do not need any statistical knowledge of the demand (customer) arrival rates (these rates are permitted to vary in time). Under mild conditions, we propose MBP policies that are provably near optimal. Specifically, our policies lose at most $O(\frac{K}{T}+\frac{1}{K} + \sqrt{\eta K})$ payoff per customer relative to the optimal policy that knows the demand arrival rates, where $K$ is the number of supply units, $T$ is the total number of customers over the time horizon, and $\eta$ is the demand process' average rate of change per customer arrival. An adaptation of MBP is found to perform well in numerical experiments based on data from ride-hailing.
DOI:	10.48550/arxiv.1903.02764
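
Illustration:	The loss bound quoted in the summary, $O(\frac{K}{T}+\frac{1}{K}+\sqrt{\eta K})$, can be explored numerically. The sketch below is not taken from the paper: the function name `mbp_loss_bound_terms` and the constant `c` are hypothetical, and the constant hidden by the $O(\cdot)$ notation is unknown here, so it is set to 1 purely as an assumption. The output shows only how the three terms scale with $K$, $T$, and $\eta$, not actual payoff losses.

```python
import math


def mbp_loss_bound_terms(K, T, eta, c=1.0):
    """Evaluate the three terms of the bound O(K/T + 1/K + sqrt(eta*K)).

    K   : number of supply units (positive)
    T   : total number of customers over the horizon (positive)
    eta : average rate of change of the demand process per arrival (>= 0)
    c   : ASSUMED stand-in for the constant hidden by the O(.) notation;
          the abstract does not state it, so 1.0 is an arbitrary choice.
    """
    if K <= 0 or T <= 0 or eta < 0:
        raise ValueError("need K > 0, T > 0 and eta >= 0")
    k_over_t = c * K / T                # first term, K/T
    one_over_k = c / K                  # second term, 1/K
    sqrt_eta_k = c * math.sqrt(eta * K)  # third term, sqrt(eta*K)
    return {
        "K_over_T": k_over_t,
        "one_over_K": one_over_k,
        "sqrt_eta_K": sqrt_eta_k,
        "total": k_over_t + one_over_k + sqrt_eta_k,
    }


if __name__ == "__main__":
    # Compare stationary demand (eta = 0) with slowly varying demand.
    for eta in (0.0, 1e-6):
        terms = mbp_loss_bound_terms(K=100, T=10_000, eta=eta)
        print(eta, terms)
```

With $\eta = 0$ the third term vanishes and only the $K/T$ and $1/K$ terms remain, which is the stationary-demand case of the bound.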