Adaptive Broad Deep Reinforcement Learning for Intelligent Traffic Light Control

Bibliographic details
Published in:	IEEE Internet of Things Journal, 2024-09, Vol. 11 (17), p. 28496-28507
Main authors:	Zhu, Ruijie; Wu, Shuning; Li, Lulu; Ding, Wenting; Lv, Ping; Sui, Luyao
Format:	Article
Language:	English
Subjects:	Adaptive control; Algorithms; Artificial neural networks; Autonomous vehicles; Broad learning system (BLS); broad reinforcement learning (BRL); Deep learning; Deep reinforcement learning; deep reinforcement learning (DRL); Feature extraction; Finite element analysis; Internet of Things; Machine learning; multiagent DRL (MADRL); Multiagent systems; Optimization; Random sampling; Reagents; Robustness; Traffic control; Traffic information; traffic light control (TLC); Traffic signals; Training

Description
Summary:	Deep reinforcement learning (DRL) combines deep learning and reinforcement learning (RL) and has superior autonomous decision-making capabilities. Unlike DRL, which employs deep neural networks (DNNs), broad RL (BRL) adopts the broad learning system (BLS), which is built from flat networks, to generate the strategy. This article proposes the multiagent adaptive broad-DRL (ABDRL) approach for traffic light control (TLC), which combines the broad network with the deep network structure. Specifically, the structure of ABDRL first expands in the form of flat broad networks. Then, a feature representation module containing DNNs is employed to extract the critical traffic information. In addition, experiences sampled randomly by the experience replay mechanism cannot effectively reflect the current training status of the agent. To alleviate the impact of random sampling, the forgetful experience mechanism (FEM) is incorporated into ABDRL. The FEM enables the agent to discriminate the importance of the experiences stored in the experience replay buffer, improving robustness and adaptability. We validate the effectiveness of ABDRL in TLC, and the results show that ABDRL outperforms state-of-the-art multiagent DRL (MADRL) algorithms in optimality and robustness.
ISSN:	2327-4662
DOI:	10.1109/JIOT.2024.3401829
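
Illustrative note: the summary contrasts uniform random sampling from an experience replay buffer with a mechanism that discriminates the importance of stored experiences. The record does not say how the FEM assigns importance, so the sketch below is not the authors' method. It assumes, purely for illustration, that importance decays geometrically with an experience's age. The class name `RecencyWeightedReplayBuffer` and the `decay` parameter are hypothetical.

```python
import random
from collections import deque


class RecencyWeightedReplayBuffer:
    """Replay buffer that samples newer experiences more often than older ones.

    Hypothetical stand-in for a "forgetful" mechanism; NOT the paper's FEM.
    """

    def __init__(self, capacity, decay=0.99, seed=None):
        if not 0.0 < decay <= 1.0:
            raise ValueError("decay must be in (0, 1]")
        # A bounded deque evicts the oldest experience once capacity is reached.
        self.buffer = deque(maxlen=capacity)
        self.decay = decay  # assumed importance decay per step of age
        self.rng = random.Random(seed)

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def weights(self):
        # The newest experience has age 0 and weight 1; older ones shrink geometrically.
        n = len(self.buffer)
        return [self.decay ** (n - 1 - i) for i in range(n)]

    def sample(self, batch_size):
        # Weighted sampling with replacement; decay=1.0 recovers uniform sampling.
        return self.rng.choices(list(self.buffer), weights=self.weights(), k=batch_size)
```

With `decay=1.0` every stored experience is equally likely to be drawn, which corresponds to the plain random sampling the summary describes as the baseline. Smaller values shift each batch toward recent experiences.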