Output Feedback H∞ Control for Linear Discrete-Time Multi-Player Systems With Multi-Source Disturbances Using Off-Policy Q-Learning

Bibliographic Details
Published in:	IEEE Access, 2020, Vol. 8, pp. 208938-208951
Main Authors:	Xiao, Zhenfei; Li, Jinna; Li, Ping
Format:	Article
Language:	English
Subjects:	Adaptive control; Adaptive dynamic programming; Algorithms; Control methods; Discrete time systems; Disturbances; Dynamic programming; Game theory; Games; H-infinity control; Heuristic algorithms; H∞ control; Machine learning; Nash equilibrium; Optimal control; Output feedback; Performance analysis; Reinforcement learning; System dynamics
Online Access:	Full text

Description
Summary:	In this paper, a data-driven optimal control method based on adaptive dynamic programming and game theory is presented for finding output feedback solutions to the H∞ control problem for linear discrete-time multi-player systems subject to multi-source disturbances. We first transform the H∞ control problem into a multi-player game problem and obtain its theoretical solutions according to game theory. Since the system state may not be measurable, we then derive output-feedback-based control policies and disturbances through mathematical operations. Given the advantages of off-policy reinforcement learning (RL) over on-policy RL, a novel off-policy game Q-learning algorithm that handles mixed competition and cooperation among players is developed, so that the H∞ control problem can be solved for linear multi-player systems without knowledge of the system dynamics. Moreover, rigorous proofs of algorithm convergence and of the unbiasedness of the solutions are presented. Finally, simulation results demonstrate the effectiveness of the proposed method.
ISSN:	2169-3536
DOI:	10.1109/ACCESS.2020.3038674