A bi-objective deep reinforcement learning approach for low-carbon-emission high-speed railway alignment design

Bibliographic details
Published in: Transportation Research Part C: Emerging Technologies, 2023-02, Vol. 147, Article 104006
Main authors: He, Qing; Gao, Tianci; Gao, Yan; Li, Huailong; Schonfeld, Paul; Zhu, Ying; Li, Qilong; Wang, Ping
Format: Article
Language: English
Description
Abstract:
• Optimize railway alignment with bi-objective deep reinforcement learning.
• Propose a multiobjective deep deterministic policy gradient (MODDPG) algorithm.
• Quantify the life-cycle carbon emissions due to energy use.
Reasonable design and planning of alignments are crucial for both the economic investment and the environmental impact of high-speed railway projects. Approaches that integrate economic and environmental factors, and can thus select an economical and eco-friendly railway alignment, are in high demand. To address this need, this study focuses on optimizing a railway's comprehensive investment, including the construction and environmental costs, as well as the railway's life-cycle carbon emissions caused by the production of building materials and the trains' energy consumption. A novel railway alignment optimization model is formulated based on the multi-objective reinforcement learning (MORL) framework to reduce the total railway cost, accounting for both the construction cost and environmental factors. In the proposed model, a deep deterministic policy gradient (DDPG) algorithm is enhanced with an envelope algorithm that optimizes the convex envelope of multi-objective Q-values, ensuring efficient consistency between the entire space of preferences in a domain and the corresponding optimal policies. Finally, the proposed model is applied to a real-world high-speed railway project. Results show that the MORL model can automatically explore and optimize railway alignments, and that it produces less expensive and more eco-friendly solutions than manual work while satisfying various alignment constraints.
ISSN: 0968-090X
DOI: 10.1016/j.trc.2022.104006