Off-Policy Actor-Critic with Emphatic Weightings

Journal of Machine Learning Research 24 (2023) 1-63 A variety of theoretically-sound policy gradient algorithms exist for the on-policy setting due to the policy gradient theorem, which provides a simplified form for the gradient. The off-policy setting, however, has been less clear due to the exist...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Hauptverfasser: Graves, Eric, Imani, Ehsan, Kumaraswamy, Raksha, White, Martha
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!