Pontryagin Neural Operator for Solving Parametric General-Sum Differential Games
Main authors: | , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | The values of two-player general-sum differential games are viscosity
solutions to Hamilton-Jacobi-Isaacs (HJI) equations. Value and policy
approximations for such games suffer from the curse of dimensionality (CoD).
Alleviating CoD through physics-informed neural networks (PINN) encounters
convergence issues when differentiable values with large Lipschitz constants
are present due to state constraints. On top of these challenges, it is often
necessary to learn generalizable values and policies across a parametric space
of games, e.g., for game parameter inference when information is incomplete. To
address these challenges, we propose in this paper a Pontryagin-mode neural
operator that outperforms the current state-of-the-art hybrid PINN model on
safety performance across games with parametric state constraints. Our key
contribution is the introduction of a costate loss defined on the discrepancy
between forward and backward costate rollouts, which are computationally cheap.
We show that the costate dynamics, which can reflect state constraint
violations, effectively enable the learning of differentiable values with large
Lipschitz constants, without requiring manually supervised data as suggested by
the hybrid PINN model. More importantly, we show that the close relationship
between costates and policies makes the former critical in learning feedback
control policies with generalizable safety performance. |
---|---|
DOI: | 10.48550/arxiv.2401.01502 |