Visual Navigation with Asynchronous Proximal Policy Optimization in Artificial Agents

Bibliographic Details
Published in: Journal of Robotics, Vol. 2020 (2020), pp. 1-7
Main authors: Zeng, Fanyu; Wang, Chen
Format: Article
Language: English
Abstract: Vanilla policy gradient methods suffer from high variance, leading to unstable policies whose performance fluctuates drastically between training iterations. To address this issue, we analyze the policy optimization process of a navigation method based on deep reinforcement learning (DRL) that uses asynchronous gradient descent for optimization. A navigation variant, asynchronous proximal policy optimization navigation (appoNav), is presented that can guarantee monotonic policy improvement during policy optimization. Experiments are conducted in DeepMind Lab, and the results show that artificial agents trained with appoNav perform better than those trained with the compared algorithm.
ISSN: 1687-9600, 1687-9619
DOI: 10.1155/2020/8702962
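
The abstract describes combining asynchronous gradient descent with proximal policy optimization, whose clipped surrogate objective is what bounds each policy update. As a rough illustration of that objective, below is a minimal PyTorch sketch of the standard PPO-Clip loss (Schulman et al., 2017); the function name, argument names, and the clip_eps=0.2 default are illustrative assumptions, not the authors' appoNav implementation.

```python
import torch

def ppo_clip_loss(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Clipped surrogate loss from PPO (illustrative sketch, not appoNav itself)."""
    # Probability ratio r_t(theta) = pi_theta(a_t|s_t) / pi_theta_old(a_t|s_t),
    # computed in log space for numerical stability.
    ratio = torch.exp(new_log_probs - old_log_probs)
    # Unclipped and clipped surrogate terms; clipping keeps the new policy
    # close to the old one, which bounds each update and underpins the
    # (approximately) monotonic improvement the abstract refers to.
    surr1 = ratio * advantages
    surr2 = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # Take the pessimistic (elementwise minimum) bound and negate,
    # since optimizers minimize while PPO maximizes the surrogate.
    return -torch.min(surr1, surr2).mean()
```

In an asynchronous setup of the kind the abstract describes, one would expect each worker to compute such a loss on its own rollouts and apply gradients to shared parameters, A3C-style; the exact scheme used by appoNav is given in the paper itself.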