Traffic pattern-aware elevator dispatching via deep reinforcement learning

Bibliographic details
Published in:	Advanced Engineering Informatics, 2024-08, Vol. 61, Article 102497
Main authors:	Wan, Jiansong; Lee, Kanghoon; Shin, Hayong
Format:	Article
Language:	English
Subjects:	Deep reinforcement learning; Elevator dispatching; Semi-Markov decision process; Traffic pattern awareness
Online access:	Full text

Description
Summary:	This study addresses the elevator dispatching problem using deep reinforcement learning, with a specific emphasis on traffic pattern awareness. Previous studies on reinforcement learning-based elevator dispatching have largely focused on training separate models for single traffic patterns, such as up-peak, down-peak, lunch-peak, and inter-floor. This separate training approach is computationally burdensome and also introduces practical complexity, because it requires an auxiliary model that predicts the current traffic pattern to guide dispatching decisions. In contrast, our goal is to develop a unified, traffic pattern-aware dispatching model. We formulate the elevator dispatching problem as a Semi-Markov Decision Process (SMDP) with novel state representation, action space, and reward function designs. To solve the formulated SMDP, we propose a Dueling Double Deep Q-Network (D3QN) architecture and an associated training algorithm. To ensure traffic pattern awareness, we train our model in a unified ‘All in One’ traffic scenario, employing two practical techniques to enhance the training process: (1) temporal grouping with gradient surgery and (2) incorporation of passenger arrival information. Empirical evaluations confirm the superiority of our model over multiple benchmarks, including those relying on separate, pattern-specific models. Remarkably, our unified model demonstrates robust performance across unseen traffic scenarios and performs exceptionally well in single traffic patterns despite being trained solely on the unified ‘All in One’ scenario. The short inference time for decision-making further supports the model’s practical viability. The incremental benefit contributed by each of the introduced techniques is also investigated. Our code is available at https://github.com/jswan95/RL-based-traffic-pattern-aware-elevator-dispatching
ISSN:	1474-0346
DOI:	10.1016/j.aei.2024.102497
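
Illustrative sketch (not taken from the article or its repository): the summary names three building blocks, namely a dueling Q-network, a Double DQN target in a semi-Markov setting, and gradient surgery. The Python below shows one common textbook form of each, as a rough aid to reading the summary. All function names, the mean-centred advantage aggregation, the `gamma ** sojourn_time` discount, and the PCGrad-style projection are assumptions made here for illustration; the authors' actual state, reward, and training designs may differ and should be taken from the linked repository.

```python
from typing import List, Sequence


def dueling_q_values(state_value: float, advantages: Sequence[float]) -> List[float]:
    """Combine V(s) and A(s, a) into Q(s, a) using mean-centred advantages.

    Assumption: the common 'subtract the mean advantage' aggregation.
    """
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + a - mean_adv for a in advantages]


def double_dqn_smdp_target(
    reward: float,
    sojourn_time: float,
    next_q_online: Sequence[float],
    next_q_target: Sequence[float],
    gamma: float = 0.99,
    done: bool = False,
) -> float:
    """Double DQN target with a time-dependent discount.

    Assumption: in an SMDP the time between decisions varies, so the
    discount is applied as gamma ** sojourn_time instead of a fixed gamma.
    """
    if done:
        return reward
    # The online network selects the action; the target network evaluates it.
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + (gamma ** sojourn_time) * next_q_target[best_action]


def project_conflicting(grad_a: Sequence[float], grad_b: Sequence[float]) -> List[float]:
    """PCGrad-style projection (assumed form of 'gradient surgery').

    If grad_a conflicts with grad_b (negative dot product), remove the
    component of grad_a that points against grad_b; otherwise keep grad_a.
    """
    dot = sum(x * y for x, y in zip(grad_a, grad_b))
    if dot >= 0:
        return list(grad_a)
    norm_sq = sum(y * y for y in grad_b)
    return [x - (dot / norm_sq) * y for x, y in zip(grad_a, grad_b)]


if __name__ == "__main__":
    print(dueling_q_values(1.0, [1.0, 2.0, 3.0]))
    print(double_dqn_smdp_target(1.0, 2.0, [0.1, 0.9], [5.0, 3.0], gamma=0.5))
    print(project_conflicting([1.0, 0.0], [-1.0, 1.0]))
```

In this sketch, the gradients passed to `project_conflicting` would come from different groups of training samples (for example, groups formed by time of day, which is one possible reading of "temporal grouping"); how the article actually forms and combines its groups is not stated in the summary.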