Continuous adaptive critic designs
Main authors: | , , |
---|---|
Format: | Conference proceedings |
Language: | eng |
Subjects: | |
Summary: | A continuous formulation of an adaptive critic design (ACD) is investigated. Connections are made to the discrete case, where backpropagation through time (BPTT) and real-time recurrent learning (RTRL) are prevalent. A second-order actor adaptation, based on Newton's method, is established for fast actor convergence. A fast critic update for concurrent actor-critic training is also outlined; it keeps the Bellman optimality condition correct to a first-order approximation after actor changes. |
---|---|
ISSN: | 2161-4393 2161-4407 |
DOI: | 10.1109/IJCNN.2005.1556403 |