Score-Based Equilibrium Learning in Multi-Player Finite Games with Imperfect Information
Main authors: | , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | Real-world games, which concern imperfect information, multiple players, and
simultaneous moves, are less frequently discussed in the existing literature of
game theory. While reinforcement learning (RL) provides a general framework to
extend the game theoretical algorithms, the assumptions that guarantee their
convergence towards Nash equilibria may no longer hold in real-world games.
Starting from the definition of the Nash distribution, we construct a
continuous-time dynamic named imperfect-information exponential-decay
score-based learning (IESL) to find approximate Nash equilibria in games with
the above-mentioned features. Theoretical analysis demonstrates that IESL
yields equilibrium-approaching policies in imperfect information simultaneous
games with the basic assumption of concavity. Experimental results show that
IESL manages to find approximate Nash equilibria in four canonical poker
scenarios and significantly outperforms three other representative algorithms
in 3-player Leduc poker, manifesting its equilibrium-finding ability even in
practical sequential games. Furthermore, related to the concept of game
hypomonotonicity, a trade-off between the convergence of the IESL dynamic and
the ultimate NashConv of the convergent policies is observed from the
perspectives of both theory and experiment. |
---|---|
DOI: | 10.48550/arxiv.2306.00350 |