Generalized Individual Q-learning for Polymatrix Games with Partial Observations
Abstract: This paper addresses the challenge of limited observations in non-cooperative multi-agent systems where agents have only partial access to other agents' actions. We present generalized individual Q-learning dynamics that combine belief-based and payoff-based learning for networked interconnections of more than two self-interested agents. This approach leverages access to opponents' actions whenever possible, demonstrably achieving faster (guaranteed) convergence to quantal response equilibrium in multi-agent zero-sum and potential polymatrix games. Notably, the dynamics reduce to the well-studied smoothed fictitious play and individual Q-learning under full and no access to opponent actions, respectively. We further quantify the improvement in convergence rate due to observing opponents' actions through numerical simulations.
DOI: 10.48550/arxiv.2409.02663
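
The abstract describes dynamics that use belief-based updates whenever an opponent's action is observed and fall back on payoff-based updates otherwise. The following is a minimal illustrative sketch of that idea for a single learning agent; the payoff matrix, the observation probability `p_observe`, the temperature `tau`, and both update rules here are assumptions made for illustration, not the paper's exact dynamics.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2x2 payoff matrix for the learning agent (illustrative only;
# the paper treats general zero-sum and potential polymatrix games).
A = np.array([[ 1.0, -1.0],
              [-1.0,  1.0]])

n_actions = 2
tau = 0.5          # temperature of the logit (quantal response) choice rule
p_observe = 0.5    # assumed probability that the opponent's action is visible

def logit(values, tau):
    """Softmax choice probabilities; their fixed points are quantal responses."""
    z = np.exp(values / tau)
    return z / z.sum()

q = np.zeros(n_actions)                       # payoff estimates for own actions
belief = np.full(n_actions, 1.0 / n_actions)  # belief over opponent actions

for t in range(1, 5001):
    step = 1.0 / t                     # decaying learning rate
    a = rng.choice(n_actions, p=logit(q, tau))
    b = rng.choice(n_actions)          # stand-in opponent playing uniformly
    reward = A[a, b]

    if rng.random() < p_observe:
        # Belief-based update (fictitious-play style): the opponent's action
        # is observed, so refine the belief and re-evaluate all own actions.
        belief += step * (np.eye(n_actions)[b] - belief)
        q = A @ belief
    else:
        # Payoff-based update (individual Q-learning style): only the realized
        # payoff of the chosen action is available.
        q[a] += step * (reward - q[a])

print("final choice probabilities:", logit(q, tau))
```

In this caricature, setting `p_observe = 1` collapses the loop to smoothed fictitious play against the empirical opponent frequencies, while `p_observe = 0` yields plain individual Q-learning, mirroring the two limiting cases named in the abstract.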