Risk-Aware Model-Based Control

Model-Based Reinforcement Learning (MBRL) algorithms have been shown to have an advantage on data-efficiency, but often overshadowed by state-of-the-art model-free methods in performance, especially when facing high-dimensional and complex problems. In this work, a novel MBRL method is proposed, cal...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Veröffentlicht in:	Frontiers in robotics and AI 2021-03, Vol.8, p.617839-617839
Hauptverfasser:	Yu, Chen, Rosendo, Andre
Format:	Artikel
Sprache:	eng
Schlagworte:	conditional value at risk data efficiency dynamics model machine learning reinforcement learning risk awareness Robotics and AI
Online-Zugang:	Volltext
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	Model-Based Reinforcement Learning (MBRL) algorithms have been shown to have an advantage on data-efficiency, but often overshadowed by state-of-the-art model-free methods in performance, especially when facing high-dimensional and complex problems. In this work, a novel MBRL method is proposed, called Risk-Aware Model-Based Control (RAMCO). It combines uncertainty-aware deep dynamics models and the risk assessment technique Conditional Value at Risk (CVaR). This mechanism is appropriate for real-world application since it takes epistemic risk into consideration. In addition, we use a model-free solver to produce warm-up training data, and this setting improves the performance in low-dimensional environments and covers the shortage of MBRL's nature in the high-dimensional scenarios. In comparison with other state-of-the-art reinforcement learning algorithms, we show that it produces superior results on a walking robot model. We also evaluate the method with an Eidos environment, which is a novel experimental method with multi-dimensional randomly initialized deep neural networks to measure the performance of any reinforcement learning algorithm, and the advantages of RAMCO are highlighted.
ISSN:	2296-9144 2296-9144
DOI:	10.3389/frobt.2021.617839