Path Planning Problems with Side Observations-When Colonels Play Hide-and-Seek

Resource allocation games such as the famous Colonel Blotto (CB) and Hide-and-Seek (HS) games are often used to model a large variety of practical problems, but only in their one-shot versions. Indeed, due to their extremely large strategy space, it remains an open question how one can efficiently l...

Bibliographic details
Published in:	arXiv.org 2019-11
Main authors:	Dong Quan Vu ; Loiseau, Patrick ; Silva, Alonso ; Tran-Thanh, Long
Format:	Article
Language:	eng
Subjects:	Algorithms ; Computer & video games ; Path planning ; Resource allocation
Online access:	Full text

container_end_page
container_issue
container_start_page
container_title	arXiv.org
container_volume
creator	Dong Quan Vu ; Loiseau, Patrick ; Silva, Alonso ; Tran-Thanh, Long
description	Resource allocation games such as the famous Colonel Blotto (CB) and Hide-and-Seek (HS) games are often used to model a large variety of practical problems, but only in their one-shot versions. Indeed, due to their extremely large strategy space, it remains an open question how one can efficiently learn in these games. In this work, we show that the online CB and HS games can be cast as path planning problems with side-observations (SOPPP): at each stage, a learner chooses a path on a directed acyclic graph and suffers the sum of losses that are adversarially assigned to the corresponding edges; and she then receives semi-bandit feedback with side-observations (i.e., she observes the losses on the chosen edges plus some others). We propose a novel algorithm, EXP3-OE, the first-of-its-kind with guaranteed efficient running time for SOPPP without requiring any auxiliary oracle. We provide an expected-regret bound of EXP3-OE in SOPPP matching the order of the best benchmark in the literature. Moreover, we introduce additional assumptions on the observability model under which we can further improve the regret bounds of EXP3-OE. We illustrate the benefit of using EXP3-OE in SOPPP by applying it to the online CB and HS games.
format	Article
fullrecord	<record><control><sourceid>proquest</sourceid><recordid>TN_cdi_proquest_journals_2316661019</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2316661019</sourcerecordid><originalsourceid>FETCH-proquest_journals_23166610193</originalsourceid><addsrcrecordid>eNqNi8sKwjAUBYMgWLT_EHAdyMNGXRelKy0ouCwpvdrWeKNJq_j3VvADXB2YmTMikVRKsNVCygmJQ2g551IvZZKoiOxy09U0twaxwQvNvSst3AJ9NQM-NBXQfRnAP03XOAzsVAPS1FmHYMP39qbZEDGDFTsAXGdkfDY2QPzbKZlvN8c0Y3fvHj2Ermhd73FQhVRCay24WKv_qg-LrT2E</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2316661019</pqid></control><display><type>article</type><title>Path Planning Problems with Side Observations-When Colonels Play Hide-and-Seek</title><source>Free E- Journals</source><creator>Dong Quan Vu ; Loiseau, Patrick ; Silva, Alonso ; Tran-Thanh, Long</creator><creatorcontrib>Dong Quan Vu ; Loiseau, Patrick ; Silva, Alonso ; Tran-Thanh, Long</creatorcontrib><description>Resource allocation games such as the famous Colonel Blotto (CB) and Hide-and-Seek (HS) games are often used to model a large variety of practical problems, but only in their one-shot versions. Indeed, due to their extremely large strategy space, it remains an open question how one can efficiently learn in these games. In this work, we show that the online CB and HS games can be cast as path planning problems with side-observations (SOPPP): at each stage, a learner chooses a path on a directed acyclic graph and suffers the sum of losses that are adversarially assigned to the corresponding edges; and she then receives semi-bandit feedback with side-observations (i.e., she observes the losses on the chosen edges plus some others). We propose a novel algorithm, EXP3-OE, the first-of-its-kind with guaranteed efficient running time for SOPPP without requiring any auxiliary oracle. 
We provide an expected-regret bound of EXP3-OE in SOPPP matching the order of the best benchmark in the literature. Moreover, we introduce additional assumptions on the observability model under which we can further improve the regret bounds of EXP3-OE. We illustrate the benefit of using EXP3-OE in SOPPP by applying it to the online CB and HS games.</description><identifier>EISSN: 2331-8422</identifier><language>eng</language><publisher>Ithaca: Cornell University Library, arXiv.org</publisher><subject>Algorithms ; Computer & video games ; Path planning ; Resource allocation</subject><ispartof>arXiv.org, 2019-11</ispartof><rights>2019. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>776,780</link.rule.ids></links><search><creatorcontrib>Dong Quan Vu</creatorcontrib><creatorcontrib>Loiseau, Patrick</creatorcontrib><creatorcontrib>Silva, Alonso</creatorcontrib><creatorcontrib>Tran-Thanh, Long</creatorcontrib><title>Path Planning Problems with Side Observations-When Colonels Play Hide-and-Seek</title><title>arXiv.org</title><description>Resource allocation games such as the famous Colonel Blotto (CB) and Hide-and-Seek (HS) games are often used to model a large variety of practical problems, but only in their one-shot versions. Indeed, due to their extremely large strategy space, it remains an open question how one can efficiently learn in these games. 
In this work, we show that the online CB and HS games can be cast as path planning problems with side-observations (SOPPP): at each stage, a learner chooses a path on a directed acyclic graph and suffers the sum of losses that are adversarially assigned to the corresponding edges; and she then receives semi-bandit feedback with side-observations (i.e., she observes the losses on the chosen edges plus some others). We propose a novel algorithm, EXP3-OE, the first-of-its-kind with guaranteed efficient running time for SOPPP without requiring any auxiliary oracle. We provide an expected-regret bound of EXP3-OE in SOPPP matching the order of the best benchmark in the literature. Moreover, we introduce additional assumptions on the observability model under which we can further improve the regret bounds of EXP3-OE. We illustrate the benefit of using EXP3-OE in SOPPP by applying it to the online CB and HS games.</description><subject>Algorithms</subject><subject>Computer & video games</subject><subject>Path planning</subject><subject>Resource allocation</subject><issn>2331-8422</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2019</creationdate><recordtype>article</recordtype><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><recordid>eNqNi8sKwjAUBYMgWLT_EHAdyMNGXRelKy0ouCwpvdrWeKNJq_j3VvADXB2YmTMikVRKsNVCygmJQ2g551IvZZKoiOxy09U0twaxwQvNvSst3AJ9NQM-NBXQfRnAP03XOAzsVAPS1FmHYMP39qbZEDGDFTsAXGdkfDY2QPzbKZlvN8c0Y3fvHj2Ermhd73FQhVRCay24WKv_qg-LrT2E</recordid><startdate>20191121</startdate><enddate>20191121</enddate><creator>Dong Quan Vu</creator><creator>Loiseau, Patrick</creator><creator>Silva, Alonso</creator><creator>Tran-Thanh, Long</creator><general>Cornell University Library, 
arXiv.org</general><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>M7S</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope></search><sort><creationdate>20191121</creationdate><title>Path Planning Problems with Side Observations-When Colonels Play Hide-and-Seek</title><author>Dong Quan Vu ; Loiseau, Patrick ; Silva, Alonso ; Tran-Thanh, Long</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-proquest_journals_23166610193</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2019</creationdate><topic>Algorithms</topic><topic>Computer & video games</topic><topic>Path planning</topic><topic>Resource allocation</topic><toplevel>online_resources</toplevel><creatorcontrib>Dong Quan Vu</creatorcontrib><creatorcontrib>Loiseau, Patrick</creatorcontrib><creatorcontrib>Silva, Alonso</creatorcontrib><creatorcontrib>Tran-Thanh, Long</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science & Engineering Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering Database</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT 
USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering Collection</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Dong Quan Vu</au><au>Loiseau, Patrick</au><au>Silva, Alonso</au><au>Tran-Thanh, Long</au><format>book</format><genre>document</genre><ristype>GEN</ristype><atitle>Path Planning Problems with Side Observations-When Colonels Play Hide-and-Seek</atitle><jtitle>arXiv.org</jtitle><date>2019-11-21</date><risdate>2019</risdate><eissn>2331-8422</eissn><abstract>Resource allocation games such as the famous Colonel Blotto (CB) and Hide-and-Seek (HS) games are often used to model a large variety of practical problems, but only in their one-shot versions. Indeed, due to their extremely large strategy space, it remains an open question how one can efficiently learn in these games. In this work, we show that the online CB and HS games can be cast as path planning problems with side-observations (SOPPP): at each stage, a learner chooses a path on a directed acyclic graph and suffers the sum of losses that are adversarially assigned to the corresponding edges; and she then receives semi-bandit feedback with side-observations (i.e., she observes the losses on the chosen edges plus some others). We propose a novel algorithm, EXP3-OE, the first-of-its-kind with guaranteed efficient running time for SOPPP without requiring any auxiliary oracle. We provide an expected-regret bound of EXP3-OE in SOPPP matching the order of the best benchmark in the literature. Moreover, we introduce additional assumptions on the observability model under which we can further improve the regret bounds of EXP3-OE. 
We illustrate the benefit of using EXP3-OE in SOPPP by applying it to the online CB and HS games.</abstract><cop>Ithaca</cop><pub>Cornell University Library, arXiv.org</pub><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	EISSN: 2331-8422
ispartof	arXiv.org, 2019-11
issn	2331-8422
language	eng
recordid	cdi_proquest_journals_2316661019
source	Free E- Journals
subjects	Algorithms ; Computer & video games ; Path planning ; Resource allocation
title	Path Planning Problems with Side Observations-When Colonels Play Hide-and-Seek
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-23T18%3A16%3A28IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=Path%20Planning%20Problems%20with%20Side%20Observations-When%20Colonels%20Play%20Hide-and-Seek&rft.jtitle=arXiv.org&rft.au=Dong%20Quan%20Vu&rft.date=2019-11-21&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E2316661019%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2316661019&rft_id=info:pmid/&rfr_iscdi=true
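
The description field above outlines the SOPPP interaction: in each stage the learner picks a path on a directed acyclic graph, suffers the sum of the losses on that path's edges, and then observes the losses on the chosen edges plus some others. A minimal sketch of one such stage follows. It is not EXP3-OE and not the authors' code; the toy graph, the `REVEALS` observability map, and the function names are all hypothetical and chosen only for illustration.

```python
# Hypothetical toy DAG with source "s" and sink "t": edge name -> (tail, head).
EDGES = {"a": ("s", "u"), "b": ("s", "v"), "c": ("u", "t"), "d": ("v", "t")}

# The two s-t paths of the toy graph, written as tuples of edge names.
PATHS = [("a", "c"), ("b", "d")]

# Assumed side-observation model: choosing an edge also reveals these edges.
REVEALS = {"a": {"b"}, "b": {"a"}, "c": set(), "d": set()}


def path_loss(path, losses):
    """Loss suffered by the learner: sum of the losses on the chosen edges."""
    return sum(losses[e] for e in path)


def observed_edges(path):
    """Chosen edges plus every edge they reveal as a side observation."""
    observed = set(path)
    for e in path:
        observed |= REVEALS[e]
    return observed


def play_round(path, losses):
    """Return the suffered loss and the semi-bandit feedback with side observations."""
    feedback = {e: losses[e] for e in observed_edges(path)}
    return path_loss(path, losses), feedback
```

For example, with edge losses `{"a": 0.1, "b": 0.5, "c": 0.2, "d": 0.3}`, playing the path `("a", "c")` costs the learner the losses of edges `a` and `c`, while the feedback also includes the loss of edge `b`, because `a` reveals `b` under the assumed observability map.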