The actor-critic algorithm as multi-time-scale stochastic approximation
The actor-critic algorithm of Barto and others for simulation-based optimization of Markov decision processes is cast as a two-time-scale stochastic approximation. Convergence analysis, approximation issues, and an example are studied.
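The two-time-scale structure named in the abstract can be sketched in generic form. The notation here is the conventional one for such schemes and is an assumption, not taken from this record: iterates $x_n$ and $y_n$, drift functions $h$ and $g$, noise terms $M_n$ and $M'_n$, and step sizes $a(n)$ and $b(n)$.

```latex
\begin{aligned}
x_{n+1} &= x_n + a(n)\,\bigl[h(x_n, y_n) + M_{n+1}\bigr], \\
y_{n+1} &= y_n + b(n)\,\bigl[g(x_n, y_n) + M'_{n+1}\bigr], \\
\sum_n a(n) &= \sum_n b(n) = \infty, \qquad
\sum_n \bigl(a(n)^2 + b(n)^2\bigr) < \infty, \qquad
\frac{b(n)}{a(n)} \to 0 .
\end{aligned}
```

Because $b(n)/a(n) \to 0$, the iterate $x_n$ moves on the faster time scale and sees $y_n$ as nearly constant. In an actor-critic reading, the critic's value estimates would plausibly play the role of $x_n$ and the actor's policy parameters that of $y_n$; the exact iteration used in the article may differ.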
Published in: | Sadhana (Bangalore) 1997-08, Vol.22 (4), p.525-543 |
---|---|
Main authors: | , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
ISSN: | 0256-2499 0973-7677 |
DOI: | 10.1007/BF02745577 |