Off-Policy: Model-Free Optimal Synchronization Control for Complex Dynamical Networks
Published in: | Neural Processing Letters, 2022-08, Vol. 54 (4), pp. 2941-2958 |
---|---|
Format: | Article |
Language: | English |
Online access: | Full text |
Abstract: | In this paper, a novel off-policy iterative algorithm is developed, which only uses the measurement data along the trajectory of the system to deal with the optimal control problem of the discrete-time complex dynamic networks. By approximating the solutions of the coupled Hamilton–Jacobi–Bellman equations, a local performance index is defined to solve the optimal synchronization problem for discrete-time nonlinear complex dynamic networks without knowing the node dynamics and the topology of the directed graph. Based on this, an off-policy iteration algorithm is designed to iteratively improve the target policy, and the convergence of the algorithm is proved theoretically. Actor-critic neural networks along with the gradient descent approach are employed to approximate optimal control policies and performance index functions using the data generated by applying prescribed behavior policies. Finally, two numerical simulation examples are given to show the effectiveness of our proposed method. |
---|---|
ISSN: | 1370-4621, 1573-773X |
DOI: | 10.1007/s11063-022-10748-2 |