Solving Two-Player General-Sum Games Between Swarms
Format: Article
Language: English
Abstract: Hamilton-Jacobi-Isaacs (HJI) PDEs are the governing equations for two-player general-sum games. Unlike Reinforcement Learning (RL) methods, which are data-intensive approaches to learning the value function, solving the HJ PDE guarantees convergence to the Nash equilibrium value of the game when it exists. A caveat, however, is that solving HJ PDEs becomes intractable as the state dimension grows. To circumvent this curse of dimensionality (CoD), physics-informed machine learning methods with supervision can be used and have been shown to be effective in generating equilibrial policies in two-player general-sum games. In this work, we extend existing work on agent-level two-player games to a two-player swarm-level game, in which two sub-swarms play a general-sum game. We use the Kolmogorov forward equation as the dynamic model for the evolution of the swarm densities. Results show that policies generated by the physics-informed neural network (PINN) yield a higher payoff than a Nash Double Deep Q-Network (Nash DDQN) agent and perform comparably to numerical solvers.
DOI: 10.48550/arxiv.2310.01682
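
The abstract describes solving an HJI PDE with a physics-informed neural network. As a rough illustration of the general PINN idea (not the authors' implementation), the sketch below trains a small network on the residual of a toy time-dependent HJ equation V_t + H(x, ∇V) = 0 with a terminal condition; the network architecture, Hamiltonian, control bounds, and terminal cost are all placeholder assumptions made for this example.

```python
# Illustrative PINN sketch for a toy HJ equation V_t + H(x, grad_x V) = 0,
# V(x, T) = g(x). All modeling choices below are assumptions, not the paper's setup.
import torch
import torch.nn as nn

T = 1.0  # time horizon (assumed)

class ValueNet(nn.Module):
    """Small MLP approximating V(t, x) for a 2-D state."""
    def __init__(self, state_dim=2, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + 1, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, 1),
        )

    def forward(self, t, x):
        return self.net(torch.cat([t, x], dim=-1))

def hamiltonian(grad_v):
    # Placeholder Hamiltonian for single-integrator players with componentwise
    # control bounds |u| <= 1 and |d| <= 0.5 (assumed for illustration):
    # H = min_u max_d grad_v . (u + d) = -0.5 * ||grad_v||_1
    return -0.5 * grad_v.abs().sum(dim=-1, keepdim=True)

def terminal_cost(x):
    # Placeholder terminal payoff g(x).
    return (x ** 2).sum(dim=-1, keepdim=True)

def pinn_loss(model, n_pde=1024, n_term=256):
    # PDE residual at random interior collocation points.
    t = torch.rand(n_pde, 1, requires_grad=True) * T
    x = torch.rand(n_pde, 2, requires_grad=True) * 2 - 1
    v = model(t, x)
    v_t, v_x = torch.autograd.grad(v.sum(), (t, x), create_graph=True)
    residual = v_t + hamiltonian(v_x)

    # Terminal condition V(x, T) = g(x) at random terminal points.
    x_T = torch.rand(n_term, 2) * 2 - 1
    v_T = model(torch.full((n_term, 1), T), x_T)

    return (residual ** 2).mean() + ((v_T - terminal_cost(x_T)) ** 2).mean()

model = ValueNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(2000):
    opt.zero_grad()
    loss = pinn_loss(model)
    loss.backward()
    opt.step()
```

Once trained, the value gradient ∇_x V can be used to read off each player's feedback policy from the minimizing/maximizing arguments of the Hamiltonian; in the paper's swarm setting this is done over densities evolving under the Kolmogorov forward equation rather than over the toy state used here.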