Multi-agent differential graphical games: Online adaptive learning solution for synchronization with optimality

Bibliographic details
Published in: Automatica (Oxford), 2012-08, Vol. 48 (8), pp. 1598-1611
Main authors: Vamvoudakis, Kyriakos G.; Lewis, Frank L.; Hudas, Greg R.
Format: Article
Language: English
Description
Summary: Multi-agent systems arise in several domains of engineering and can be used to solve problems that are difficult for an individual agent to solve. Strategies for team decision problems, including optimal control and N-player games (H-infinity control and non-zero-sum games), are normally computed off-line by solving associated matrix equations such as the coupled Riccati equations or coupled Hamilton–Jacobi equations. With that approach, however, players cannot change their objectives online in real time without requiring a completely new off-line solution for the new strategies. Therefore, in this paper we bring together cooperative control, reinforcement learning, and game theory to present a multi-agent formulation for the online solution of team games. The notion of graphical games is developed for dynamical systems, where the dynamics and performance indices for each node depend only on local neighbor information. It is shown that standard definitions for Nash equilibrium are not sufficient for graphical games and a new definition of “Interactive Nash Equilibrium” is given. We give a cooperative policy iteration algorithm for graphical games that converges to the best response when the neighbors of each agent do not update their policies, and to the cooperative Nash equilibrium when all agents update their policies simultaneously. This is used to develop methods for online adaptive learning solutions of graphical games in real time along with proofs of stability and convergence.
ISSN: 0005-1098, 1873-2836
DOI: 10.1016/j.automatica.2012.05.074