Last Round Convergence and No-Instant Regret in Repeated Games with Asymmetric Information
Proceedings of Machine Learning Research 2021. This paper considers repeated games in which one player has more information about the game than the other players. In particular, we investigate repeated two-player zero-sum games where only the column player knows the payoff matrix A of the game. Suppo...
Main authors: | Dinh, Le Cong; Tran-Thanh, Long; Nguyen, Tri-Dung; Zemkoho, Alain B |
---|---|
Format: | Article |
Language: | eng |
Subjects: | Computer Science - Computer Science and Game Theory |

container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | Dinh, Le Cong; Tran-Thanh, Long; Nguyen, Tri-Dung; Zemkoho, Alain B |
description | Proceedings of Machine Learning Research 2021. This paper considers repeated games in which one player has more information
about the game than the other players. In particular, we investigate repeated
two-player zero-sum games where only the column player knows the payoff matrix
A of the game. Suppose that while repeatedly playing this game, the row player
chooses her strategy at each round by using a no-regret algorithm to minimize
her (pseudo) regret. We develop a no-instant-regret algorithm for the column
player to exhibit last round convergence to a minimax equilibrium. We show that
our algorithm is efficient against a large set of popular no-regret algorithms
of the row player, including the multiplicative weight update algorithm, the
online mirror descent method/follow-the-regularized-leader, the linear
multiplicative weight update algorithm, and the optimistic multiplicative
weight update. |
doi_str_mv | 10.48550/arxiv.2003.11727 |
format | Article |
fullrecord | recordid: TN_cdi_arxiv_primary_2003_11727; recordtype: article; source: arXiv.org (Open Access Repository); creationdate: 2020-03-25; rights: http://arxiv.org/licenses/nonexclusive-distrib/1.0; oa: free_for_read; linktorsrc: https://arxiv.org/abs/2003.11727; backlink: https://doi.org/10.48550/arXiv.2003.11727; collection: arXiv Computer Science, arXiv.org |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.2003.11727 |
ispartof | |
issn | |
language | eng |
recordid | cdi_arxiv_primary_2003_11727 |
source | arXiv.org |
subjects | Computer Science - Computer Science and Game Theory |
title | Last Round Convergence and No-Instant Regret in Repeated Games with Asymmetric Information |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-03T20%3A39%3A08IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Last%20Round%20Convergence%20and%20No-Instant%20Regret%20in%20Repeated%20Games%20with%20Asymmetric%20Information&rft.au=Dinh,%20Le%20Cong&rft.date=2020-03-25&rft_id=info:doi/10.48550/arxiv.2003.11727&rft_dat=%3Carxiv_GOX%3E2003_11727%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |
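
The description above names the multiplicative weight update algorithm as one of the no-regret algorithms the row player may use. As a rough illustration only, the sketch below shows a generic textbook form of that update for a row player facing a fixed column strategy. It is not the paper's column-player algorithm, and several things in it are assumptions made for the example: the row player is taken to minimize the expected value of the payoff matrix `A`, and the function names, the step size `eta`, and the matching-pennies matrix are hypothetical.

```python
import math


def row_losses(A, y):
    """Expected loss of each pure row strategy against column mixed strategy y."""
    return [sum(a_ij * y_j for a_ij, y_j in zip(row, y)) for row in A]


def mwu_step(x, losses, eta):
    """One multiplicative weight update: scale by exp(-eta * loss), then renormalize."""
    scaled = [x_i * math.exp(-eta * loss) for x_i, loss in zip(x, losses)]
    total = sum(scaled)
    return [s / total for s in scaled]


def play(A, y, rounds, eta):
    """Run the row player's update for a number of rounds against a fixed y."""
    n = len(A)
    x = [1.0 / n] * n  # start from the uniform mixed strategy
    for _ in range(rounds):
        x = mwu_step(x, row_losses(A, y), eta)
    return x


# Hypothetical example: matching pennies, with the row player as the minimizer.
A = [[1.0, -1.0], [-1.0, 1.0]]
biased = play(A, y=[0.8, 0.2], rounds=50, eta=0.1)
balanced = play(A, y=[0.5, 0.5], rounds=50, eta=0.1)
```

Against the biased column strategy the row player's weight moves toward the second row, which has the lower expected loss. Against the uniform column strategy both rows have the same loss, so the row player's strategy stays uniform.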