DouZero+: Improving DouDizhu AI by Opponent Modeling and Coach-guided Learning

Recent years have witnessed the great breakthrough of deep reinforcement learning (DRL) in various perfect and imperfect information games. Among these games, DouDizhu, a popular card game in China, is very challenging due to the imperfect information, large state space, elements of collaboration an...

Bibliographic details
Main authors:	Zhao, Youpeng; Zhao, Jian; Hu, Xunhan; Zhou, Wengang; Li, Houqiang
Format:	Article
Language:	English
Subjects:	Computer Science - Artificial Intelligence; Computer Science - Learning
Online access:	Order full text

container_end_page
container_issue
container_start_page
container_title
container_volume
creator	Zhao, Youpeng ; Zhao, Jian ; Hu, Xunhan ; Zhou, Wengang ; Li, Houqiang
description	Recent years have witnessed great breakthroughs of deep reinforcement learning (DRL) in various perfect- and imperfect-information games. Among these games, DouDizhu, a popular card game in China, is very challenging due to its imperfect information, large state space, elements of collaboration, and massive number of possible moves from turn to turn. Recently, a DouDizhu AI system called DouZero has been proposed. Trained using a traditional Monte Carlo method with deep neural networks and a self-play procedure, without the abstraction of human prior knowledge, DouZero has outperformed all existing DouDizhu AI programs. In this work, we propose to enhance DouZero by introducing opponent modeling. In addition, we propose a novel coach network to further boost the performance of DouZero and accelerate its training process. With these two techniques integrated into DouZero, our DouDizhu AI system achieves better performance and ranks at the top of the Botzone leaderboard among more than 400 AI agents, including DouZero.
doi_str_mv	10.48550/arxiv.2204.02558
format	Article
fullrecord	<record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2204_02558</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2204_02558</sourcerecordid><originalsourceid>FETCH-LOGICAL-a678-3162fc265b5d3ef0d5c0c1c3545afcb9bd9bd42bb0e6d9135c3833e08f6ab1a83</originalsourceid><addsrcrecordid>eNotj01Lw0AYhPfiQao_wJN7l8T9yLtuvZXUj0C0l568hHe_2kC7G1ZTrL_etAoDAzPDwEPIDWdlpQHYPebv_lAKwaqSCQB9Sd6XafzwOd090mY_5HTo44ZO2bL_2Y500VBzpKthSNHHL_qWnN-dBhgdrRPabbEZe-cdbT3mODVX5CLg7tNf__uMrJ-f1vVr0a5emnrRFqgedCG5EsEKBQac9IE5sMxyK6ECDNbMjZtUCWOYV27OJVippfRMB4WGo5Yzcvt3ewbqhtzvMR-7E1h3BpO_CTJIeA</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>DouZero+: Improving DouDizhu AI by Opponent Modeling and Coach-guided Learning</title><source>arXiv.org</source><creator>Zhao, Youpeng ; Zhao, Jian ; Hu, Xunhan ; Zhou, Wengang ; Li, Houqiang</creator><creatorcontrib>Zhao, Youpeng ; Zhao, Jian ; Hu, Xunhan ; Zhou, Wengang ; Li, Houqiang</creatorcontrib><description>Recent years have witnessed the great breakthrough of deep reinforcement learning (DRL) in various perfect and imperfect information games. Among these games, DouDizhu, a popular card game in China, is very challenging due to the imperfect information, large state space, elements of collaboration and a massive number of possible moves from turn to turn. Recently, a DouDizhu AI system called DouZero has been proposed. Trained using traditional Monte Carlo method with deep neural networks and self-play procedure without the abstraction of human prior knowledge, DouZero has outperformed all the existing DouDizhu AI programs. In this work, we propose to enhance DouZero by introducing opponent modeling into DouZero. Besides, we propose a novel coach network to further boost the performance of DouZero and accelerate its training process. 
With the integration of the above two techniques into DouZero, our DouDizhu AI system achieves better performance and ranks top in the Botzone leaderboard among more than 400 AI agents, including DouZero.</description><identifier>DOI: 10.48550/arxiv.2204.02558</identifier><language>eng</language><subject>Computer Science - Artificial Intelligence ; Computer Science - Learning</subject><creationdate>2022-04</creationdate><rights>http://arxiv.org/licenses/nonexclusive-distrib/1.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,780,885</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2204.02558$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2204.02558$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Zhao, Youpeng</creatorcontrib><creatorcontrib>Zhao, Jian</creatorcontrib><creatorcontrib>Hu, Xunhan</creatorcontrib><creatorcontrib>Zhou, Wengang</creatorcontrib><creatorcontrib>Li, Houqiang</creatorcontrib><title>DouZero+: Improving DouDizhu AI by Opponent Modeling and Coach-guided Learning</title><description>Recent years have witnessed the great breakthrough of deep reinforcement learning (DRL) in various perfect and imperfect information games. Among these games, DouDizhu, a popular card game in China, is very challenging due to the imperfect information, large state space, elements of collaboration and a massive number of possible moves from turn to turn. Recently, a DouDizhu AI system called DouZero has been proposed. 
Trained using traditional Monte Carlo method with deep neural networks and self-play procedure without the abstraction of human prior knowledge, DouZero has outperformed all the existing DouDizhu AI programs. In this work, we propose to enhance DouZero by introducing opponent modeling into DouZero. Besides, we propose a novel coach network to further boost the performance of DouZero and accelerate its training process. With the integration of the above two techniques into DouZero, our DouDizhu AI system achieves better performance and ranks top in the Botzone leaderboard among more than 400 AI agents, including DouZero.</description><subject>Computer Science - Artificial Intelligence</subject><subject>Computer Science - Learning</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNotj01Lw0AYhPfiQao_wJN7l8T9yLtuvZXUj0C0l568hHe_2kC7G1ZTrL_etAoDAzPDwEPIDWdlpQHYPebv_lAKwaqSCQB9Sd6XafzwOd090mY_5HTo44ZO2bL_2Y500VBzpKthSNHHL_qWnN-dBhgdrRPabbEZe-cdbT3mODVX5CLg7tNf__uMrJ-f1vVr0a5emnrRFqgedCG5EsEKBQac9IE5sMxyK6ECDNbMjZtUCWOYV27OJVippfRMB4WGo5Yzcvt3ewbqhtzvMR-7E1h3BpO_CTJIeA</recordid><startdate>20220405</startdate><enddate>20220405</enddate><creator>Zhao, Youpeng</creator><creator>Zhao, Jian</creator><creator>Hu, Xunhan</creator><creator>Zhou, Wengang</creator><creator>Li, Houqiang</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20220405</creationdate><title>DouZero+: Improving DouDizhu AI by Opponent Modeling and Coach-guided Learning</title><author>Zhao, Youpeng ; Zhao, Jian ; Hu, Xunhan ; Zhou, Wengang ; Li, Houqiang</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a678-3162fc265b5d3ef0d5c0c1c3545afcb9bd9bd42bb0e6d9135c3833e08f6ab1a83</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Computer Science - Artificial 
Intelligence</topic><topic>Computer Science - Learning</topic><toplevel>online_resources</toplevel><creatorcontrib>Zhao, Youpeng</creatorcontrib><creatorcontrib>Zhao, Jian</creatorcontrib><creatorcontrib>Hu, Xunhan</creatorcontrib><creatorcontrib>Zhou, Wengang</creatorcontrib><creatorcontrib>Li, Houqiang</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Zhao, Youpeng</au><au>Zhao, Jian</au><au>Hu, Xunhan</au><au>Zhou, Wengang</au><au>Li, Houqiang</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>DouZero+: Improving DouDizhu AI by Opponent Modeling and Coach-guided Learning</atitle><date>2022-04-05</date><risdate>2022</risdate><abstract>Recent years have witnessed the great breakthrough of deep reinforcement learning (DRL) in various perfect and imperfect information games. Among these games, DouDizhu, a popular card game in China, is very challenging due to the imperfect information, large state space, elements of collaboration and a massive number of possible moves from turn to turn. Recently, a DouDizhu AI system called DouZero has been proposed. Trained using traditional Monte Carlo method with deep neural networks and self-play procedure without the abstraction of human prior knowledge, DouZero has outperformed all the existing DouDizhu AI programs. In this work, we propose to enhance DouZero by introducing opponent modeling into DouZero. Besides, we propose a novel coach network to further boost the performance of DouZero and accelerate its training process. With the integration of the above two techniques into DouZero, our DouDizhu AI system achieves better performance and ranks top in the Botzone leaderboard among more than 400 AI agents, including DouZero.</abstract><doi>10.48550/arxiv.2204.02558</doi><oa>free_for_read</oa></addata></record>
fulltext	fulltext_linktorsrc
identifier	DOI: 10.48550/arxiv.2204.02558
ispartof
issn
language	eng
recordid	cdi_arxiv_primary_2204_02558
source	arXiv.org
subjects	Computer Science - Artificial Intelligence ; Computer Science - Learning
title	DouZero+: Improving DouDizhu AI by Opponent Modeling and Coach-guided Learning
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-01T07%3A59%3A51IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=DouZero+:%20Improving%20DouDizhu%20AI%20by%20Opponent%20Modeling%20and%20Coach-guided%20Learning&rft.au=Zhao,%20Youpeng&rft.date=2022-04-05&rft_id=info:doi/10.48550/arxiv.2204.02558&rft_dat=%3Carxiv_GOX%3E2204_02558%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true