DouZero+: Improving DouDizhu AI by Opponent Modeling and Coach-guided Learning
Main authors: | , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Abstract: | Recent years have witnessed major breakthroughs of deep reinforcement
learning (DRL) in various perfect- and imperfect-information games. Among these
games, DouDizhu, a popular card game in China, is very challenging due to its
imperfect information, large state space, elements of collaboration, and
massive number of possible moves from turn to turn. Recently, a DouDizhu AI
system called DouZero was proposed. Trained using traditional Monte Carlo
methods with deep neural networks and self-play, without the
abstraction of human prior knowledge, DouZero has outperformed all existing
DouDizhu AI programs. In this work, we propose to enhance DouZero by
introducing opponent modeling. In addition, we propose a novel coach
network to further boost the performance of DouZero and accelerate its training
process. With these two techniques integrated into DouZero, our
DouDizhu AI system achieves better performance and ranks at the top of the Botzone
leaderboard among more than 400 AI agents, including DouZero. |
---|---|
DOI: | 10.48550/arxiv.2204.02558 |