Learning Smooth Motion Planning for Intelligent Aerial Transportation Vehicles by Stable Auxiliary Gradient


Bibliographic Details
Published in: IEEE Transactions on Intelligent Transportation Systems, 2022-12, Vol. 23 (12), p. 24464-24473
Main Authors: Piao, Haiyin; Yu, Jin; Mo, Li; Yang, Xin; Liu, Zhimin; Sun, Zhixiao; Lu, Ming; Yang, Zhen; Zhou, Deyun
Format: Article
Language: English
Description
Abstract: Deep Reinforcement Learning (DRL) has recently been widely applied to real-time motion planning for intelligent aerial transportation vehicles. When interacting with the environment, DRL-driven aerial vehicles inevitably switch steering actions at high frequency during both the exploration and execution phases, causing the well-known flight trajectory oscillation issue, which destabilizes the flight dynamics and, in serious cases, even endangers flight safety. Unfortunately, there is hardly any literature on achieving flight trajectory smoothness in DRL-based motion planning. In view of this, we formalize the practical flight trajectory smoothing problem as a three-level Nested pArameterized Smooth Trajectory Optimization (NASTO) form. On this basis, a novel Stable Auxiliary Gradient (SAG) algorithm is proposed, which significantly smooths DRL-generated flight motions by constructing two independent optimization aspects: the major gradient and the stable auxiliary gradient. Experimental results reveal that the proposed SAG algorithm outperforms baseline DRL-based motion planning algorithms for intelligent aerial transportation vehicles in terms of both learning efficiency and flight motion smoothness.
ISSN: 1524-9050, 1558-0016
DOI: 10.1109/TITS.2022.3198766
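
The abstract describes SAG only at a high level: a major gradient driving the primary DRL objective, plus a stable auxiliary gradient that smooths the generated motions. The Python sketch below is not the paper's SAG algorithm; it merely illustrates the general idea of pairing a policy-gradient loss with an auxiliary penalty on consecutive-action differences. All names (combined_loss, smooth_coef, the dummy tensors) and the PyTorch framing are assumptions made for illustration.

import torch

def combined_loss(log_probs, advantages, actions, smooth_coef=0.1):
    # "Major" objective: a plain policy-gradient surrogate loss.
    major_loss = -(log_probs * advantages).mean()
    # Auxiliary objective: penalize large changes between consecutive
    # actions along the trajectory to discourage oscillatory steering.
    smooth_loss = (actions[1:] - actions[:-1]).pow(2).mean()
    return major_loss + smooth_coef * smooth_loss

# Dummy tensors standing in for one sampled trajectory (16 steps, 2-D actions).
log_probs = torch.randn(16, requires_grad=True)
advantages = torch.randn(16)
actions = torch.randn(16, 2, requires_grad=True)

loss = combined_loss(log_probs, advantages, actions)
loss.backward()  # gradients flow through both the major and auxiliary terms

In this toy setup, the smoothness penalty simply adds a second gradient source to the same update; how the paper keeps the two gradients "independent" and stable is specified in the article itself, not reproduced here.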