Generating and Adapting to Diverse Ad Hoc Partners in Hanabi
Published in: | IEEE Transactions on Games 2023-06, Vol.15 (2), p.228-241 |
---|---|
Main authors: | , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Abstract: | Hanabi is a cooperative game that brings the problem of modeling other players to the forefront. In this game, coordinated groups of players can leverage preestablished conventions to great effect. In this article, we focus on ad hoc settings with no previous coordination between partners. We introduce a "Bayesian Meta-Agent" that maintains a belief distribution over hypotheses of partner policies. The policies that serve as initial hypotheses are generated using MAP-Elites to ensure behavioral diversity. We evaluate an "Adaptive" version of the agent, which selects a response policy based on the updated belief distribution, and a "Generalist" version, which selects a response based on the uniform prior. In short episodes of ten games with a consistent partner, the "Adaptive" version outperforms the "Generalist" when the training and evaluation populations are the same. This presents a first step toward an agent that can model its partner and adapt within a time frame that is compatible with human interaction. |
---|---|
ISSN: | 2475-1502 2475-1510 |
DOI: | 10.1109/TG.2022.3169168 |
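
The abstract's distinction between the "Adaptive" and "Generalist" versions can be illustrated with a minimal sketch. It assumes a plain Bayes-rule update over a small, fixed set of partner hypotheses and a table of expected scores for each response policy. The function names, the likelihood values, and the payoff numbers are hypothetical and chosen only for illustration; they are not taken from the article, and the article's actual update rule and response selection may differ.

```python
def update_belief(prior, likelihoods):
    """Bayes' rule: the posterior is proportional to prior times likelihood."""
    unnormalized = [p * l for p, l in zip(prior, likelihoods)]
    total = sum(unnormalized)
    if total == 0.0:
        # The observation is impossible under every hypothesis: keep the prior.
        return list(prior)
    return [u / total for u in unnormalized]


def best_response(belief, payoff):
    """Return the index of the response policy with the highest expected score.

    payoff[r][h] is the (assumed) expected score of response r when the
    partner follows hypothesis h.
    """
    expected = [sum(b * s for b, s in zip(belief, row)) for row in payoff]
    return max(range(len(payoff)), key=lambda r: expected[r])


# Three hypothetical partner policies with a uniform prior.
prior = [1 / 3, 1 / 3, 1 / 3]

# Illustrative payoff table: responses 0 and 1 are specialised to
# hypotheses 0 and 1, response 2 is a safe all-rounder.
payoff = [
    [22.0, 8.0, 8.0],
    [8.0, 22.0, 8.0],
    [15.0, 15.0, 15.0],
]

# Assumed probability of the partner's observed action under each hypothesis.
likelihoods = [0.1, 0.7, 0.2]

posterior = update_belief(prior, likelihoods)

generalist_choice = best_response(prior, payoff)    # ignores observations
adaptive_choice = best_response(posterior, payoff)  # uses the updated belief

print("generalist picks response", generalist_choice)
print("adaptive picks response", adaptive_choice)
```

With these made-up numbers the generalist settles on the all-rounder response, while the adaptive agent switches to the response specialised to the hypothesis that best explains the observed action.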