A mixed-integer programming-based Q-learning approach for electric bus scheduling with multiple termini and service routes

Bibliographic Details
Published in: Transportation Research Part C: Emerging Technologies, 2024-05, Vol. 162, p. 104570, Article 104570
Authors: Yan, Yimo; Wen, Haomin; Deng, Yang; Chow, Andy H.F.; Wu, Qihao; Kuo, Yong-Hong
Format: Article
Language: English
Online access: Full text
Description
Abstract: Electric buses (EBs) are considered a more environmentally friendly mode of public transit. In addition to other practical challenges, including high infrastructure costs and short driving ranges, the operations of EBs are more demanding due to the necessary battery charging activities. Consequently, more sophisticated optimisation models and algorithms are required for effective operations. This paper presents an EB scheduling problem with multiple termini and service routes. Various realistic but complicated factors, such as shared facilities at multiple termini, the flexibility of plugging and unplugging chargers before an EB is fully charged, stochastic travel times, and EB breakdowns, are considered. We propose an integrated learning and mixed-integer linear programming (MILP) framework to overcome the computational difficulties of solving the problem. The framework leverages the strengths of both components: the reinforcement learning agent learns from the outcomes of state–action pairs, while the MILP constraints guarantee solution feasibility, together enabling fast computation. Q-learning and Twin Delayed Deep Deterministic Policy Gradient (TD3) are adopted as our training methods. We conduct numerical experiments on artificial instances and realistic instances of a bus network in Hong Kong to assess the performance of our proposed approach. The results show that our proposed framework outperforms the benchmark optimisation approach in terms of the penalty on missed service trips, average headway, and variance of headway. The benefits of our proposed framework are more significant under a highly stochastic environment.

Highlights:
• Electric bus scheduling with multiple termini and service routes.
• Shared facilities at multiple termini and the flexibility of plugging and unplugging chargers considered.
• Integration of Q-learning and mixed-integer linear programming for solving the problem.
• Experiments on artificial instances and a real-world case to demonstrate the high computational performance.
• Managerial insights derived from the experiments.
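To illustrate the idea of combining a learning layer with feasibility constraints, the following is a minimal, self-contained sketch of tabular Q-learning in which a feasibility filter stands in for the MILP layer. It is not the authors' implementation: the toy state space (time slot, discretised state of charge), the action set, the rewards, and the feasible_actions check are all illustrative assumptions.

```python
# Minimal tabular Q-learning sketch for a single-bus toy version of the
# scheduling problem. NOT the paper's method: states, actions, rewards and
# the feasibility check standing in for the MILP layer are assumptions.
import random
from collections import defaultdict

SLOTS = 24           # decision epochs in a day (assumed)
BATTERY_LEVELS = 5   # discretised state of charge: 0 (empty) .. 4 (full)
ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1

def feasible_actions(soc):
    """Stand-in for the MILP layer: only return actions that keep the
    schedule feasible (e.g. a trip cannot be served on a low battery)."""
    acts = ["idle"] if soc == BATTERY_LEVELS - 1 else ["idle", "charge"]
    if soc >= 2:                 # assumed minimum charge to cover one trip
        acts.append("serve")
    return acts

def step(slot, soc, action):
    """Toy transition and reward: reward served trips, penalise missed
    trips (idling) and, mildly, time spent charging."""
    if action == "serve":
        return (slot + 1, soc - 2), 1.0
    if action == "charge":
        return (slot + 1, min(soc + 1, BATTERY_LEVELS - 1)), -0.1
    return (slot + 1, soc), -1.0   # idle => missed service trip penalty

Q = defaultdict(float)             # Q[(slot, soc, action)]

for episode in range(2000):
    slot, soc = 0, BATTERY_LEVELS - 1
    while slot < SLOTS:
        acts = feasible_actions(soc)
        if random.random() < EPSILON:            # epsilon-greedy exploration
            action = random.choice(acts)
        else:
            action = max(acts, key=lambda a: Q[(slot, soc, a)])
        (next_slot, next_soc), reward = step(slot, soc, action)
        future = 0.0
        if next_slot < SLOTS:                    # bootstrap over feasible actions only
            future = max(Q[(next_slot, next_soc, a)]
                         for a in feasible_actions(next_soc))
        Q[(slot, soc, action)] += ALPHA * (reward + GAMMA * future
                                           - Q[(slot, soc, action)])
        slot, soc = next_slot, next_soc

# Greedy policy for a fully charged bus at the first few decision epochs
for slot in range(3):
    soc = BATTERY_LEVELS - 1
    print(slot, max(feasible_actions(soc), key=lambda a: Q[(slot, soc, a)]))
```

In the paper's framework the feasibility layer is a MILP over the full fleet and charger assignments; the sketch replaces it with a simple rule purely to show how restricting the action set keeps every learned schedule feasible.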
ISSN: 0968-090X, 1879-2359
DOI: 10.1016/j.trc.2024.104570