Intersection decision making for autonomous vehicles based on improved PPO algorithm

Detailed description

Bibliographic details
Published in: IET Intelligent Transport Systems, December 2024, Vol. 18 (S1), pp. 2921-2938
Authors: Guo, Dong; He, Shoulin; Ji, Shouwen
Format: Article
Language: English
Online access: Full text
Description
Abstract: The deployment of autonomous vehicles (AVs) in complex urban environments faces numerous challenges, especially at intersections where they coexist with human-driven vehicles (HVs), resulting in increased safety risks. In response, this study proposes an improved control strategy based on the Proximal Policy Optimization (PPO) algorithm, designed specifically for hybrid intersections and known as MSA-PPO. First, a Self-Attention Mechanism (SAM) is introduced into the algorithmic framework to quickly identify, from different perspectives, the surrounding vehicles with the greatest impact on the ego vehicle, accelerating data processing and improving decision quality. Second, an invalid-action masking mechanism is adopted to reduce the action space, ensuring that actions are selected only from feasible sets and thereby enhancing decision efficiency. Finally, comparative and ablation experiments in hybrid-intersection simulation environments of varying complexity are conducted to validate the algorithm's effectiveness. The results show that, compared with the baseline algorithms, the improved algorithm converges faster, achieves higher decision accuracy, and maintains the highest driving speeds. In this article, we improve the proximal policy optimisation (PPO) algorithm in deep reinforcement learning and propose the MSA-PPO algorithm. It adopts a self-attention mechanism for input data processing, effectively identifying and focusing on the most important information in the interactions between vehicles. In addition, it applies an invalid-action masking mechanism to select only valid actions under specific conditions, narrowing the decision space, greatly improving learning efficiency, and thus improving the overall performance of the system.
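Invalid-action masking of the kind the abstract describes is commonly implemented by setting the logits of infeasible actions to negative infinity before the softmax, so the policy assigns them zero probability and can only sample from the feasible set. The sketch below is a minimal, dependency-free illustration of that idea; the function name, logit values, and mask are illustrative assumptions, not code from the paper.

```python
import math

def masked_action_probs(logits, action_mask):
    """Turn raw policy logits into action probabilities, masking
    invalid actions.

    Infeasible actions (action_mask[i] == False) get a logit of
    -inf, so after the softmax they receive exactly zero
    probability. (Hypothetical helper for illustration only.)
    """
    masked = [l if ok else float("-inf")
              for l, ok in zip(logits, action_mask)]
    # Numerically stable softmax: subtract the max logit first.
    m = max(masked)
    exps = [math.exp(l - m) for l in masked]
    total = sum(exps)
    return [e / total for e in exps]

# Example: 4 discrete actions; action 2 is infeasible for the ego
# vehicle in the current state (e.g. accelerating into an occupied cell).
probs = masked_action_probs([1.0, 0.5, 2.0, 0.1],
                            [True, True, False, True])
```

In a full PPO implementation the same mask must also be applied when computing the log-probabilities for the policy-gradient loss, so that training and sampling see the same restricted action space.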
ISSN:1751-956X
1751-9578
DOI:10.1049/itr2.12593