Online food ordering delivery strategies based on deep reinforcement learning
With the rapid development of Online to Offline (O2O) business, millions of transactions are performed on the popular online food ordering platforms each day. Efficient dispatching of orders and dynamic adjustment of delivery routes are critical to the success of the O2O platforms. However, the vast...
Gespeichert in:
Veröffentlicht in: | Applied intelligence (Dordrecht, Netherlands) Netherlands), 2022-04, Vol.52 (6), p.6853-6865 |
---|---|
Hauptverfasser: | , , , |
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | With the rapid development of Online to Offline (O2O) business, millions of transactions are performed on the popular online food ordering platforms each day. Efficient dispatching of orders and dynamic adjustment of delivery routes are critical to the success of the O2O platforms. However, the vast volume of transactions and the computational complexity of delivery routes pose significant challenges to the efficient dispatching of orders. The action to dispatch orders and the resulting state transition of couriers form a Markov decision process (MDP). The reinforcement learning technique had proven its capability of dealing with MDP. This paper proposes a Double Deep Q Netwok (DQN) based reinforcement learning framework that gradually tests and learns the order dispatching policy by communicating with an O2O simulation model developed by SUMO. The preliminary experimental results using the real order data demonstrate the effectiveness and efficiency of the proposed Double-DQN based order dispatcher. Also, different state encoding schemes are designed and tested to improve the performance of the Double-DQN based dispatcher. |
---|---|
ISSN: | 0924-669X 1573-7497 |
DOI: | 10.1007/s10489-021-02750-3 |