Hierarchical reinforcement learning for self-driving decision-making without reliance on labelled driving data


Bibliographic details
Published in:	IET Intelligent Transport Systems, 2020-05, Vol. 14 (5), pp. 297-305
Main authors:	Duan, Jingliang; Eben Li, Shengbo; Guan, Yang; Sun, Qi; Cheng, Bo
Format:	Article
Language:	English (eng)
Keywords:	asynchronous parallel reinforcement learners; control engineering computing; decision making; driver information systems; driving decisions; driving in lane; fully-connected neural networks; hierarchical reinforcement learning; highway driving scenario; high-level manoeuvre selection; labelled driving data; learning (artificial intelligence); left lane change; low-level motion control; motion control; neural nets; parallel processing; right lane change; road traffic control; self-driving cars; Special Issue: AI Applications to Intelligent Vehicles for Advancing Intelligent Transport Systems; supervised learning

Description
Summary:	Decision making for self-driving cars is usually tackled either by manually encoding rules derived from drivers' behaviour or by imitating drivers' control inputs with supervised learning techniques. Both approaches rely on large amounts of driving data to cover all possible driving scenarios. This study presents a hierarchical reinforcement learning method for self-driving decision making that does not depend on a large amount of labelled driving data. The method covers both high-level manoeuvre selection and low-level motion control in the lateral and longitudinal directions. The authors first decompose the driving task into three manoeuvres (driving in lane, right lane change and left lane change) and learn a sub-policy for each manoeuvre. A master policy is then learned to choose which manoeuvre policy to execute in the current state. All policies, master and manoeuvre alike, are represented by fully-connected neural networks and trained with asynchronous parallel reinforcement learners, which build a mapping from sensory outputs to driving decisions. A separate state space and reward function is designed for each manoeuvre. Applied to a highway driving scenario, the method is shown to realise smooth and safe decision making for self-driving cars.
ISSN:	1751-956X; 1751-9578
DOI:	10.1049/iet-its.2019.0317
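
The two-level structure described in the summary (a master policy that selects one of three manoeuvres, and a per-manoeuvre sub-policy that outputs lateral and longitudinal commands) can be illustrated with a minimal sketch. This is not the authors' implementation: the class names, layer sizes, state dimensions, greedy manoeuvre selection and the tanh squashing of the outputs are all assumptions made for illustration, the networks use random untrained weights, and the asynchronous parallel training described in the summary is omitted entirely.

```python
import math
import random

# The three manoeuvres named in the summary.
MANOEUVRES = ("drive_in_lane", "left_lane_change", "right_lane_change")


class MLP:
    """Tiny fully-connected network with tanh hidden layers (random, untrained weights)."""

    def __init__(self, sizes, rng):
        self.layers = []
        for n_in, n_out in zip(sizes[:-1], sizes[1:]):
            weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
            biases = [0.0] * n_out
            self.layers.append((weights, biases))

    def forward(self, x):
        for i, (weights, biases) in enumerate(self.layers):
            x = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
            if i < len(self.layers) - 1:  # no activation on the output layer
                x = [math.tanh(v) for v in x]
        return x


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


class HierarchicalPolicy:
    """Master policy picks a manoeuvre; the matching sub-policy outputs motion commands."""

    def __init__(self, master_state_dim, manoeuvre_state_dims, hidden=16, seed=0):
        rng = random.Random(seed)
        self.master = MLP([master_state_dim, hidden, len(MANOEUVRES)], rng)
        # Each manoeuvre has its own state space, hence its own input size (assumed values).
        # Two outputs per sub-policy: a lateral and a longitudinal command.
        self.sub_policies = {
            name: MLP([manoeuvre_state_dims[name], hidden, 2], rng) for name in MANOEUVRES
        }

    def act(self, master_state, manoeuvre_states):
        probs = softmax(self.master.forward(master_state))
        # Greedy selection is an assumption; a trained agent might sample instead.
        choice = MANOEUVRES[max(range(len(probs)), key=probs.__getitem__)]
        raw = self.sub_policies[choice].forward(manoeuvre_states[choice])
        lateral, longitudinal = (math.tanh(v) for v in raw)  # squash to [-1, 1]
        return choice, (lateral, longitudinal)


# Usage with made-up state vectors (dimensions are hypothetical).
dims = {"drive_in_lane": 4, "left_lane_change": 5, "right_lane_change": 5}
policy = HierarchicalPolicy(master_state_dim=6, manoeuvre_state_dims=dims)
states = {name: [0.1] * dim for name, dim in dims.items()}
manoeuvre, (lateral_cmd, longitudinal_cmd) = policy.act([0.2] * 6, states)
print(manoeuvre, lateral_cmd, longitudinal_cmd)
```

Keeping a separate network and state vector per manoeuvre mirrors the summary's statement that each manoeuvre has its own state space; in a trained system each sub-policy would also be optimised against its own reward function.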