Asynchronous Parallel Policy Gradient Methods for the Linear Quadratic Regulator

Learning policies in an asynchronous parallel way is essential to the numerous successes of RL for solving large-scale problems. However, their convergence performance is still not rigorously evaluated. To this end, we adopt the asynchronous parallel zero-order policy gradient (AZOPG) method to solv...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Hauptverfasser:	Sha, Xingyu, Zhao, Feiran, You, Keyou
Format:	Artikel
Sprache:	eng
Schlagworte:	Mathematics - Optimization and Control
Online-Zugang:	Volltext bestellen
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Schreiben Sie den ersten Kommentar!