Dynamic Courier Capacity Acquisition in Rapid Delivery Systems: A Deep Q-Learning Approach

Bibliographic details
Published in: Transportation Science 2024-01, Vol. 58 (1), p. 67-93
Main author: Auad, Ramon
Format: Article
Language: English
Online access: Full text
Description
Summary: With the recent boom of the gig economy, urban delivery systems have experienced substantial demand growth. In such systems, orders are delivered to customers from local distribution points respecting a delivery time promise. An important example is a restaurant meal delivery system, where delivery times are expected to be minutes after an order is placed. The system serves orders by making use of couriers that continuously perform pickups and deliveries. Operating such a rapid delivery system is very challenging, primarily because of the high service expectations and the considerable uncertainty in both demand and delivery capacity. Delivery providers typically plan courier shifts for an operating period based on a demand forecast. However, because of the high demand volatility, it may at times during the operating period be necessary to adjust and dynamically add couriers. We study the problem of dynamically adding courier capacity in a rapid delivery system and propose a deep reinforcement-learning approach to obtain a policy that balances the cost of adding couriers and the cost of service quality degradation because of insufficient delivery capacity. Specifically, we seek to ensure that a high fraction of orders is delivered on time with a small number of courier hours. A computational study in the meal delivery space shows that a learned policy outperforms policies representing current practice and demonstrates the potential of deep learning for solving operational problems in highly stochastic logistic settings.
History: This paper has been accepted for the Transportation Science Special Issue on Machine-Learning Methods and Applications in Large-Scale Route Planning Problems.
Funding: This work was supported by Agencia Nacional de Investigación y Desarrollo [72180404].
Supplemental Material: The e-companion is available at https://doi.org/10.1287/trsc.2022.0042.
ISSN: 0041-1655, 1526-5447
DOI: 10.1287/trsc.2022.0042