Proximal Policy Optimization Through a Deep Reinforcement Learning Framework for Multiple Autonomous Vehicles at a Non-Signalized Intersection

Advanced deep reinforcement learning shows promise as an approach to addressing continuous control tasks, especially in mixed-autonomy traffic. In this study, we present a deep reinforcement-learning-based model that considers the effectiveness of leading autonomous vehicles in mixed-autonomy traffi...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Veröffentlicht in:	Applied sciences 2020-08, Vol.10 (16), p.5722
Hauptverfasser:	Quang Tran, Duy, Bae, Sang-Hoon
Format:	Artikel
Sprache:	eng
Schlagworte:	Algorithms Application programming interface Artificial intelligence Automation Autonomous vehicles Autonomy Computer simulation Control tasks Deep learning deep reinforcement learning Delay time Experiments flow framework Human error Learning Methods multiple autonomous vehicles Neural networks Optimization proximal policy optimization Reinforcement Researchers Simulation simulation of urban mobility (SUMO) Traffic Traffic congestion Traffic control Traffic intersections Traffic speed Vehicles
Online-Zugang:	Volltext
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	Advanced deep reinforcement learning shows promise as an approach to addressing continuous control tasks, especially in mixed-autonomy traffic. In this study, we present a deep reinforcement-learning-based model that considers the effectiveness of leading autonomous vehicles in mixed-autonomy traffic at a non-signalized intersection. This model integrates the Flow framework, the simulation of urban mobility simulator, and a reinforcement learning library. We also propose a set of proximal policy optimization hyperparameters to obtain reliable simulation performance. First, the leading autonomous vehicles at the non-signalized intersection are considered with varying autonomous vehicle penetration rates that range from 10% to 100% in 10% increments. Second, the proximal policy optimization hyperparameters are input into the multiple perceptron algorithm for the leading autonomous vehicle experiment. Finally, the superiority of the proposed model is evaluated using all human-driven vehicle and leading human-driven vehicle experiments. We demonstrate that full-autonomy traffic can improve the average speed and delay time by 1.38 times and 2.55 times, respectively, compared with all human-driven vehicle experiments. Our proposed method generates more positive effects when the autonomous vehicle penetration rate increases. Additionally, the leading autonomous vehicle experiment can be used to dissipate the stop-and-go waves at a non-signalized intersection.
ISSN:	2076-3417 2076-3417
DOI:	10.3390/app10165722