An Improved Analysis of LP-based Control for Revenue Management
Main authors: | , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | In this paper, we study a class of revenue management problems where the
decision maker aims to maximize the total revenue subject to budget constraints
on multiple types of resources over a finite horizon. At each time, a new
order/customer/bid is revealed with a request of some resource(s) and a reward,
and the decision maker needs to either accept or reject the order. Upon the
acceptance of the order, the resource request must be satisfied and the
associated revenue (reward) can be collected. We consider a stochastic setting
where the orders are sampled i.i.d., i.e., the reward-request pair at each
time is drawn from an unknown distribution with finite support. The formulation
covers many classic applications such as the quantity-based network revenue
management problem and the Adwords problem. We focus on the classic LP-based
adaptive algorithm and consider regret as the performance measure defined by
the gap between the optimal objective value of the certainty-equivalent linear
program (LP) and the expected revenue obtained by the online algorithm. Our
contribution is two-fold: (i) when the underlying LP is nondegenerate, the
algorithm achieves a problem-dependent regret upper bound that is independent
of the horizon/number of time periods $T$; (ii) when the underlying LP is
degenerate, the algorithm achieves a regret upper bound that scales on the
order of $\sqrt{T}\log T$. To our knowledge, both results are new and improve
the best existing bounds for the LP-based adaptive algorithm in the
corresponding setting. We conclude with numerical experiments to further
demonstrate our findings. |
DOI: | 10.48550/arxiv.2101.11092 |
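
The summary describes an LP-based adaptive policy and a regret measured against the certainty-equivalent LP. The following is a minimal sketch of that idea, not the authors' implementation. It assumes a single resource and a known order distribution, whereas the paper treats multiple resources and an unknown distribution. Under the single-resource assumption the LP reduces to a fractional knapsack, so no LP solver is needed. All function and parameter names (`solve_fluid_lp`, `lp_value`, `run_adaptive`) are hypothetical.

```python
import random


def solve_fluid_lp(probs, rewards, sizes, budget_rate):
    """Solve max sum_j p_j r_j x_j  s.t.  sum_j p_j a_j x_j <= budget_rate, 0 <= x_j <= 1.

    Assumption: one resource and strictly positive probs and sizes, so the LP
    is a fractional knapsack and a greedy pass by reward per unit is optimal.
    """
    x = [0.0] * len(probs)
    remaining = max(budget_rate, 0.0)
    order = sorted(range(len(probs)), key=lambda j: rewards[j] / sizes[j], reverse=True)
    for j in order:
        load = probs[j] * sizes[j]  # expected per-period consumption if x_j = 1
        x[j] = min(1.0, remaining / load)
        remaining = max(0.0, remaining - x[j] * load)
    return x


def lp_value(probs, rewards, x):
    """Per-period objective value of the fluid LP at acceptance probabilities x."""
    return sum(p * r * xj for p, r, xj in zip(probs, rewards, x))


def run_adaptive(probs, rewards, sizes, budget, horizon, rng):
    """Re-solve the LP each period with the remaining budget and remaining time.

    The arriving order of type j is accepted with probability x_j, provided
    enough of the resource is left. Returns (revenue, remaining budget).
    """
    remaining = budget
    revenue = 0.0
    types = range(len(probs))
    for t in range(horizon):
        periods_left = horizon - t
        x = solve_fluid_lp(probs, rewards, sizes, remaining / periods_left)
        j = rng.choices(types, weights=probs)[0]  # sample the next order type
        if sizes[j] <= remaining and rng.random() < x[j]:
            remaining -= sizes[j]
            revenue += rewards[j]
    return revenue, remaining


if __name__ == "__main__":
    probs, rewards, sizes = [0.5, 0.5], [4.0, 1.0], [1.0, 1.0]
    budget, horizon = 75.0, 100
    x0 = solve_fluid_lp(probs, rewards, sizes, budget / horizon)
    benchmark = horizon * lp_value(probs, rewards, x0)  # certainty-equivalent LP value
    runs = [run_adaptive(probs, rewards, sizes, budget, horizon, random.Random(s))[0]
            for s in range(200)]
    print("estimated regret:", benchmark - sum(runs) / len(runs))
```

The regret printed by the demo is a Monte Carlo estimate of the gap between the LP benchmark and the average collected revenue, so it varies with the seeds and the number of runs.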