Policy-Based Reinforcement Learning for Training Autonomous Driving Agents in Urban Areas With Affordance Learning

Bibliographic details
Published in: IEEE Transactions on Intelligent Transportation Systems, 2022-08, Vol. 23 (8), pp. 12562-12571
Main authors: Ahmed, Marwa; Abobakr, Ahmed; Lim, Chee Peng; Nahavandi, Saeid
Format: Article
Language: English
Description
Abstract: Learning to drive in urban areas is an open challenge for autonomous vehicles (AVs), because it requires complex decision making in multi-task coordination environments. In this paper, we propose a hybrid framework with a new perception model that uses affordance learning to simplify the surrounding urban scenes for training an AV agent, along with a planned trajectory and the associated driving measurements. Our proposed solution encompasses two main aspects. Firstly, a supervised learning network maps the input sensory data to affordance predictions. The predicted affordances provide a low-dimensional representation of the scenes surrounding the AV in the form of key perception indicators, e.g., true or false with respect to a traffic light signal. Secondly, a deep deterministic policy gradient model is devised that maps the perception information to a series of actions. We evaluate the proposed solution using the CARLA driving simulator in an urban town and assess its performance in a new, unseen town under different weather conditions. The quantitative and qualitative results indicate that our proposed solution generalizes well to different traffic situations and environmental conditions. In a comparative study, our proposed solution also outperforms other baseline methods in handling various AV driving tasks with different levels of difficulty. In addition, the model trained with simulated scenes yields promising prediction results when tested on recorded video streams from real-world highway and suburban environments with varying traffic and weather conditions.
ISSN: 1524-9050, 1558-0016
DOI: 10.1109/TITS.2021.3115235
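
The two-stage design described in the abstract (affordance prediction followed by a deterministic policy) can be illustrated with a minimal sketch. It is not the authors' implementation: the affordance fields other than the traffic-light flag, the normalization constant, the network sizes, and the names `Affordances`, `build_state`, and `DeterministicActor` are all assumptions made for illustration. The perception network is replaced by a hand-filled affordance record, and the critic and the training loop of a deep deterministic policy gradient agent are omitted.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class Affordances:
    """Hypothetical low-dimensional scene summary (stage 1 output)."""
    red_light: bool         # true/false indicator, as in the abstract's example
    lead_distance_m: float  # assumed field: distance to the vehicle ahead
    lane_offset_m: float    # assumed field: lateral offset from lane centre

    def as_vector(self):
        # 50.0 is an arbitrary normalization constant chosen for this sketch.
        return [1.0 if self.red_light else 0.0,
                self.lead_distance_m / 50.0,
                self.lane_offset_m]


def build_state(affordances, measurements):
    """Concatenate affordances with driving measurements (e.g. speed)."""
    return affordances.as_vector() + list(measurements)


class DeterministicActor:
    """Tiny untrained policy: state -> [steer, throttle], each in [-1, 1]."""

    def __init__(self, state_dim, hidden_dim=8, action_dim=2, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(state_dim)]
                   for _ in range(hidden_dim)]
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(hidden_dim)]
                   for _ in range(action_dim)]

    def act(self, state):
        # ReLU hidden layer, then tanh to bound each action component.
        hidden = [max(0.0, sum(w * s for w, s in zip(row, state)))
                  for row in self.w1]
        return [math.tanh(sum(w * h for w, h in zip(row, hidden)))
                for row in self.w2]


if __name__ == "__main__":
    aff = Affordances(red_light=True, lead_distance_m=12.0, lane_offset_m=-0.3)
    state = build_state(aff, measurements=[0.4, 0.1])
    actor = DeterministicActor(state_dim=len(state))
    print(actor.act(state))
```

The point of the sketch is the interface: the policy sees only a handful of numbers rather than raw sensor data, which is the simplification the abstract attributes to affordance learning.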