Dynamic traffic signal control using mean field multi‐agent reinforcement learning in large scale road‐networks

Full description

Bibliographic details
Published in: IET Intelligent Transport Systems, 2023-09, Vol. 17 (9), pp. 1715-1728
Main authors: Hu, Tianfeng; Hu, Zhiqun; Lu, Zhaoming; Wen, Xiangming
Format: Article
Language: English
Online access: Full text
Description
Abstract: Multi‐agent reinforcement learning (MARL) has played an increasingly important role in intelligent traffic signal control due to its self‐learning ability. However, existing algorithms focus only on the design of the signal timing mechanism and ignore the exponential growth of the joint action dimension as the number of intersections increases, which ultimately makes learning difficult. In this paper, traditional traffic methods are introduced into MARL to flexibly determine the phase and duration at each intersection. The proposed MARL algorithm, based on mean field theory, approximates the interactions among a large number of agents as pairwise (two‐agent) interactions, which effectively reduces the dimension of the joint action space in a multi‐agent environment and makes learning more robust. In addition, to improve the performance of traditional traffic methods, a recurrent neural network (RNN) is combined with an improved Webster's formula with revised parameters to dynamically determine the phase duration from the historical traffic flow volume. The simulation results indicate that the proposed algorithm scales better than the baseline methods and has great potential for application in large‐scale road networks.
ISSN: 1751-956X, 1751-9578
DOI: 10.1049/itr2.12364
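
For readers unfamiliar with the Webster's formula mentioned in the abstract, the following is a minimal sketch of the classic textbook version, not the paper's improved variant. This record does not give the revised parameters or the RNN-based flow prediction, so the constants 1.5 and 5 are the classic Webster values, and the function and parameter names are hypothetical.

```python
def _check_flow_ratios(flow_ratios):
    """Return Y, the sum of critical flow ratios, after a sanity check."""
    total_ratio = sum(flow_ratios)
    if not 0.0 < total_ratio < 1.0:
        raise ValueError("sum of flow ratios must lie strictly between 0 and 1")
    return total_ratio


def webster_cycle(lost_time_s, flow_ratios):
    """Classic Webster optimal cycle length: C0 = (1.5 * L + 5) / (1 - Y).

    lost_time_s: total lost time L per cycle, in seconds.
    flow_ratios: critical flow ratio (flow / saturation flow) per phase.
    The constants 1.5 and 5 are the classic values, an assumption here;
    the paper uses revised parameters not stated in this record.
    """
    total_ratio = _check_flow_ratios(flow_ratios)
    return (1.5 * lost_time_s + 5.0) / (1.0 - total_ratio)


def green_splits(cycle_s, lost_time_s, flow_ratios):
    """Split the effective green time across phases in proportion to flow ratio."""
    total_ratio = _check_flow_ratios(flow_ratios)
    effective_green_s = cycle_s - lost_time_s
    return [effective_green_s * y / total_ratio for y in flow_ratios]


if __name__ == "__main__":
    ratios = [0.3, 0.2]          # illustrative values, not from the paper
    cycle = webster_cycle(12.0, ratios)
    print(cycle, green_splits(cycle, 12.0, ratios))
```

With 12 s of lost time and flow ratios of 0.3 and 0.2, the cycle length is 46 s and the effective green times are roughly 20.4 s and 13.6 s. In the approach the abstract describes, the flow ratios would presumably come from a prediction based on historical traffic volume rather than being fixed inputs.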