Last Round Convergence and No-Instant Regret in Repeated Games with Asymmetric Information

Proceedings of Machine Learning Research 2021. This paper considers repeated games in which one player has more information about the game than the other players. In particular, we investigate repeated two-player zero-sum games where only the column player knows the payoff matrix A of the game. Suppo...

Bibliographic details
Main authors:	Dinh, Le Cong; Tran-Thanh, Long; Nguyen, Tri-Dung; Zemkoho, Alain B
Format:	Article
Language:	eng
Subjects:	Computer Science - Computer Science and Game Theory

creator	Dinh, Le Cong ; Tran-Thanh, Long ; Nguyen, Tri-Dung ; Zemkoho, Alain B
description	Proceedings of Machine Learning Research 2021. This paper considers repeated games in which one player has more information about the game than the other players. In particular, we investigate repeated two-player zero-sum games where only the column player knows the payoff matrix A of the game. Suppose that while repeatedly playing this game, the row player chooses her strategy at each round by using a no-regret algorithm to minimize her (pseudo) regret. We develop a no-instant-regret algorithm for the column player to exhibit last round convergence to a minimax equilibrium. We show that our algorithm is efficient against a large set of popular no-regret algorithms of the row player, including the multiplicative weight update algorithm, the online mirror descent method/follow-the-regularized-leader, the linear multiplicative weight update algorithm, and the optimistic multiplicative weight update.
doi_str_mv	10.48550/arxiv.2003.11727
format	Article
fullrecord	<record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2003_11727</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2003_11727</sourcerecordid><originalsourceid>FETCH-LOGICAL-a677-a815ef92b20db8fc394af5b11cceb1ea05459652606231ea88b24cb10dfa2f993</originalsourceid><addsrcrecordid>eNotj0FOwzAURL1hgVoOwApfIMF24sRZVhGUSBGVUFdsom_nu1iqncoxhd6eUFjNPGk00iPknrO8VFKyR4jf7pwLxoqc81rUt-S9hznRt-kzjLSdwhnjAYNBCgu_TlkX5gRhGeAhYqIuLO2EkHCkW_A40y-XPuhmvniPKTpDu2Cn6CG5KazJjYXjjHf_uSL756d9-5L1u23XbvoMqrrOQHGJthFasFEra4qmBCs158ag5ghMlrKppKhYJYqFldKiNJqz0YKwTVOsyMPf7VVuOEXnIV6GX8nhKln8AJDrTUM</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Last Round Convergence and No-Instant Regret in Repeated Games with Asymmetric Information</title><source>arXiv.org</source><creator>Dinh, Le Cong ; Tran-Thanh, Long ; Nguyen, Tri-Dung ; Zemkoho, Alain B</creator><creatorcontrib>Dinh, Le Cong ; Tran-Thanh, Long ; Nguyen, Tri-Dung ; Zemkoho, Alain B</creatorcontrib><description>Proceedings of Machine Learning Research 2021 This paper considers repeated games in which one player has more information about the game than the other players. In particular, we investigate repeated two-player zero-sum games where only the column player knows the payoff matrix A of the game. Suppose that while repeatedly playing this game, the row player chooses her strategy at each round by using a no-regret algorithm to minimize her (pseudo) regret. We develop a no-instant-regret algorithm for the column player to exhibit last round convergence to a minimax equilibrium. 
We show that our algorithm is efficient against a large set of popular no-regret algorithms of the row player, including the multiplicative weight update algorithm, the online mirror descent method/follow-the-regularized-leader, the linear multiplicative weight update algorithm, and the optimistic multiplicative weight update.</description><identifier>DOI: 10.48550/arxiv.2003.11727</identifier><language>eng</language><subject>Computer Science - Computer Science and Game Theory</subject><creationdate>2020-03</creationdate><rights>http://arxiv.org/licenses/nonexclusive-distrib/1.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,780,885</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2003.11727$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2003.11727$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Dinh, Le Cong</creatorcontrib><creatorcontrib>Tran-Thanh, Long</creatorcontrib><creatorcontrib>Nguyen, Tri-Dung</creatorcontrib><creatorcontrib>Zemkoho, Alain B</creatorcontrib><title>Last Round Convergence and No-Instant Regret in Repeated Games with Asymmetric Information</title><description>Proceedings of Machine Learning Research 2021 This paper considers repeated games in which one player has more information about the game than the other players. In particular, we investigate repeated two-player zero-sum games where only the column player knows the payoff matrix A of the game. Suppose that while repeatedly playing this game, the row player chooses her strategy at each round by using a no-regret algorithm to minimize her (pseudo) regret. 
We develop a no-instant-regret algorithm for the column player to exhibit last round convergence to a minimax equilibrium. We show that our algorithm is efficient against a large set of popular no-regret algorithms of the row player, including the multiplicative weight update algorithm, the online mirror descent method/follow-the-regularized-leader, the linear multiplicative weight update algorithm, and the optimistic multiplicative weight update.</description><subject>Computer Science - Computer Science and Game Theory</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNotj0FOwzAURL1hgVoOwApfIMF24sRZVhGUSBGVUFdsom_nu1iqncoxhd6eUFjNPGk00iPknrO8VFKyR4jf7pwLxoqc81rUt-S9hznRt-kzjLSdwhnjAYNBCgu_TlkX5gRhGeAhYqIuLO2EkHCkW_A40y-XPuhmvniPKTpDu2Cn6CG5KazJjYXjjHf_uSL756d9-5L1u23XbvoMqrrOQHGJthFasFEra4qmBCs158ag5ghMlrKppKhYJYqFldKiNJqz0YKwTVOsyMPf7VVuOEXnIV6GX8nhKln8AJDrTUM</recordid><startdate>20200325</startdate><enddate>20200325</enddate><creator>Dinh, Le Cong</creator><creator>Tran-Thanh, Long</creator><creator>Nguyen, Tri-Dung</creator><creator>Zemkoho, Alain B</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20200325</creationdate><title>Last Round Convergence and No-Instant Regret in Repeated Games with Asymmetric Information</title><author>Dinh, Le Cong ; Tran-Thanh, Long ; Nguyen, Tri-Dung ; Zemkoho, Alain B</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a677-a815ef92b20db8fc394af5b11cceb1ea05459652606231ea88b24cb10dfa2f993</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Computer Science - Computer Science and Game Theory</topic><toplevel>online_resources</toplevel><creatorcontrib>Dinh, Le Cong</creatorcontrib><creatorcontrib>Tran-Thanh, Long</creatorcontrib><creatorcontrib>Nguyen, 
Tri-Dung</creatorcontrib><creatorcontrib>Zemkoho, Alain B</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Dinh, Le Cong</au><au>Tran-Thanh, Long</au><au>Nguyen, Tri-Dung</au><au>Zemkoho, Alain B</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Last Round Convergence and No-Instant Regret in Repeated Games with Asymmetric Information</atitle><date>2020-03-25</date><risdate>2020</risdate><abstract>Proceedings of Machine Learning Research 2021 This paper considers repeated games in which one player has more information about the game than the other players. In particular, we investigate repeated two-player zero-sum games where only the column player knows the payoff matrix A of the game. Suppose that while repeatedly playing this game, the row player chooses her strategy at each round by using a no-regret algorithm to minimize her (pseudo) regret. We develop a no-instant-regret algorithm for the column player to exhibit last round convergence to a minimax equilibrium. We show that our algorithm is efficient against a large set of popular no-regret algorithms of the row player, including the multiplicative weight update algorithm, the online mirror descent method/follow-the-regularized-leader, the linear multiplicative weight update algorithm, and the optimistic multiplicative weight update.</abstract><doi>10.48550/arxiv.2003.11727</doi><oa>free_for_read</oa></addata></record>
fulltext	fulltext_linktorsrc
identifier	DOI: 10.48550/arxiv.2003.11727
language	eng
recordid	cdi_arxiv_primary_2003_11727
source	arXiv.org
subjects	Computer Science - Computer Science and Game Theory
title	Last Round Convergence and No-Instant Regret in Repeated Games with Asymmetric Information
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-03T20%3A39%3A08IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Last%20Round%20Convergence%20and%20No-Instant%20Regret%20in%20Repeated%20Games%20with%20Asymmetric%20Information&rft.au=Dinh,%20Le%20Cong&rft.date=2020-03-25&rft_id=info:doi/10.48550/arxiv.2003.11727&rft_dat=%3Carxiv_GOX%3E2003_11727%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true