Safe, efficient, and comfortable velocity control based on reinforcement learning for autonomous driving



Bibliographic Details
Published in: Transportation Research Part C: Emerging Technologies, 2020-08, Vol. 117, p. 102662, Article 102662
Authors: Zhu, Meixin; Wang, Yinhai; Pu, Ziyuan; Hu, Jingyun; Wang, Xuesong; Ke, Ruimin
Format: Article
Language: English
Subjects:
Online access: Full text
Description
Abstract:
•Reinforcement learning for safe, efficient, and comfortable vehicle velocity control.
•A reward function is developed by combining driving features.
•A collision avoidance strategy is incorporated for safety and faster convergence.
•The model outperforms human drivers and runs faster than MPC.
A model for velocity control during car following is proposed based on reinforcement learning (RL). To optimize driving performance, a reward function is developed by referencing human driving data and combining driving features related to safety, efficiency, and comfort. With the developed reward function, the RL agent learns to control vehicle speed so as to maximize cumulative rewards, through trial and error in the simulation environment. To avoid potentially unsafe actions, a collision avoidance strategy is incorporated into the proposed RL model for safety checks. The safety check strategy is used during both the model training and testing phases, which results in faster convergence and zero collisions. A total of 1,341 car-following events extracted from the Next Generation Simulation (NGSIM) dataset are used to train and test the proposed model. The performance of the proposed model is evaluated by comparison with empirical NGSIM data and with an adaptive cruise control (ACC) algorithm implemented through model predictive control (MPC). The experimental results show that the proposed model is capable of safe, efficient, and comfortable velocity control and outperforms human drivers in that it 1) has larger time-to-collision (TTC) values than those of human drivers, 2) maintains efficient and safe headways of around 1.2 s, and 3) follows the lead vehicle comfortably with smooth acceleration (its jerk is only a third of that of human drivers). Compared with the MPC-based ACC algorithm, the proposed model performs better in terms of safety and comfort, and especially in running speed during testing (more than 200 times faster). The results indicate that the proposed approach could contribute to the development of better autonomous driving systems. The source code for this paper can be found at https://github.com/MeixinZhu/Velocity_control.
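The abstract describes a reward that combines safety (TTC), efficiency (time headway around 1.2 s), and comfort (jerk) terms, plus a rule-based collision-avoidance check applied during training and testing. The sketch below is only an illustration of that general idea, not the authors' implementation (which is at the GitHub link above); all function names, weights, and thresholds are assumptions.

import numpy as np

def reward(spacing, ego_speed, lead_speed, jerk,
           w_safe=1.0, w_eff=1.0, w_comfort=0.1,
           desired_headway=1.2, max_jerk=3.0):
    """Illustrative reward combining safety (TTC), efficiency (headway), and comfort (jerk)."""
    rel_speed = ego_speed - lead_speed                      # closing speed (m/s)
    ttc = spacing / rel_speed if rel_speed > 1e-6 else np.inf
    r_safe = -1.0 if ttc < 4.0 else 0.0                     # penalize small TTC (threshold assumed)
    headway = spacing / max(ego_speed, 1e-6)                # time headway (s)
    r_eff = -abs(headway - desired_headway)                 # encourage headway near 1.2 s
    r_comfort = -(jerk / max_jerk) ** 2                     # penalize harsh jerk
    return w_safe * r_safe + w_eff * r_eff + w_comfort * r_comfort

def safety_check(action, spacing, ego_speed, lead_speed,
                 max_decel=3.0, dt=0.1):
    """Illustrative collision-avoidance override of the RL action."""
    # Predicted spacing one step ahead if the lead vehicle brakes hard (assumed worst case).
    worst_spacing = spacing + (lead_speed - ego_speed) * dt - 0.5 * max_decel * dt ** 2
    if worst_spacing < 2.0:          # minimum safe gap in meters (assumed)
        return -max_decel            # replace the action with maximum deceleration
    return action

In such a setup, the RL agent's proposed acceleration would be passed through safety_check before being applied in the simulator, which is one way a hard safety layer can yield zero collisions and faster convergence as the abstract reports.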
ISSN: 0968-090X
1879-2359
DOI: 10.1016/j.trc.2020.102662