Tree Search-Based Policy Optimization under Stochastic Execution Delay

The standard formulation of Markov decision processes (MDPs) assumes that the agent's decisions are executed immediately. However, in numerous realistic applications such as robotics or healthcare, actions are performed with a delay whose value can even be stochastic. In this work, we introduce...

Bibliographic Details
Published in:	arXiv.org 2024-04
Main Authors:	Valensi, David; Derman, Esther; Mannor, Shie; Dalal, Gal
Format:	Article
Language:	English
Subjects:	Algorithms; Delay; Markov processes; Policies; Robotics; Searching
Online Access:	Full text