PP-PG: Combining Parameter Perturbation with Policy Gradient Methods for Effective and Efficient Explorations in Deep Reinforcement Learning

Efficient and stable exploration remains a key challenge for deep reinforcement learning (DRL) operating in high-dimensional action and state spaces. Recently, a more promising approach by combining the exploration in the action space with the exploration in the parameters space has been proposed to...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Veröffentlicht in:	ACM transactions on intelligent systems and technology 2021-07, Vol.12 (3), p.1-21
Hauptverfasser:	Li, Shilei, Li, Meng, Su, Jiongming, Chen, Shaofei, Yuan, Zhimin, Ye, Qing
Format:	Artikel
Sprache:	eng
Online-Zugang:	Volltext
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Schreiben Sie den ersten Kommentar!