Quantile Multi-Armed Bandits: Optimal Best-Arm Identification and a Differentially Private Scheme

We study the best-arm identification problem in multi-armed bandits with stochastic rewards when the goal is to identify the arm with the highest quantile at a fixed, prescribed level. First, we propose a successive elimination algorithm for strictly optimal best-arm identification, show that it is...

Bibliographic Details
Published in: IEEE journal on selected areas in information theory 2021-06, Vol.2 (2), p.534-548
Main authors: Nikolakakis, Konstantinos E., Kalogerias, Dionysios S., Sheffet, Or, Sarwate, Anand D.
Format: Article
Language: eng
container_end_page 548
container_issue 2
container_start_page 534
container_title IEEE journal on selected areas in information theory
container_volume 2
creator Nikolakakis, Konstantinos E.
Kalogerias, Dionysios S.
Sheffet, Or
Sarwate, Anand D.
description We study the best-arm identification problem in multi-armed bandits with stochastic rewards when the goal is to identify the arm with the highest quantile at a fixed, prescribed level. First, we propose a successive elimination algorithm for strictly optimal best-arm identification, show that it is $\delta$-PAC and characterize its sample complexity. Further, we provide a lower bound on the expected number of pulls, showing that the proposed algorithm is essentially optimal up to logarithmic factors. Both upper and lower complexity bounds depend on a special definition of the associated suboptimality gap, designed in particular for the quantile bandit problem; as we show, when the gap approaches zero, best-arm identification is impossible. Second, motivated by applications where the rewards are private information, we provide a differentially private successive elimination algorithm whose sample complexity is finite even for distributions with infinite support and characterize its sample complexity. Our algorithms do not require prior knowledge of either the suboptimality gap or other statistical information related to the bandit problem at hand.
doi_str_mv 10.1109/JSAIT.2021.3081525
format Article
fullrecord <record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_proquest_journals_2542500318</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>9435774</ieee_id><sourcerecordid>2542500318</sourcerecordid><originalsourceid>FETCH-LOGICAL-c2545-4629c99729935e07fb6fbec4804ba1abb728d4b9f2583ce7f2ea246ab6d7f6243</originalsourceid><addsrcrecordid>eNpNkF1LwzAUhosoOHR_QG8CXncmp0nTeLfNr8lkyuZ1SNsTzOi6mbbC_r2pG-LVOfC-7_l4ouiK0RFjVN2-LMez1QgosFFCMyZAnEQDSDmLMynp6b_-PBo2zZpSCsC4zOQgMu-dqVtXIXntqtbFY7_BkkxMXbq2uSOLXes2piITbNpeI7MSg926wrRuW5PgI4bcO2vR94Kpqj158-7btEiWxSdu8DI6s6ZqcHisF9HH48Nq-hzPF0-z6XgeFyC4iHkKqlBKglKJQCptntocC55Rnhtm8lxCVvJcWRBZUqC0gAZ4avK0lDYFnlxEN4e5O7_96sK9er3tfB1W6rAABKUJy4ILDq7Cb5vGo9U7Hz70e82o7mnqX5q6p6mPNEPo-hByiPgXUDwRUvLkB6hGcD4</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2542500318</pqid></control><display><type>article</type><title>Quantile Multi-Armed Bandits: Optimal Best-Arm Identification and a Differentially Private Scheme</title><source>IEEE Electronic Library (IEL)</source><creator>Nikolakakis, Konstantinos E. ; Kalogerias, Dionysios S. ; Sheffet, Or ; Sarwate, Anand D.</creator><creatorcontrib>Nikolakakis, Konstantinos E. ; Kalogerias, Dionysios S. ; Sheffet, Or ; Sarwate, Anand D.</creatorcontrib><description>We study the best-arm identification problem in multi-armed bandits with stochastic rewards when the goal is to identify the arm with the highest quantile at a fixed, prescribed level. First, we propose a successive elimination algorithm for strictly optimal best-arm identification, show that it is &lt;inline-formula&gt; &lt;tex-math notation="LaTeX"&gt;\delta &lt;/tex-math&gt;&lt;/inline-formula&gt;-PAC and characterize its sample complexity. Further, we provide a lower bound on the expected number of pulls, showing that the proposed algorithm is essentially optimal up to logarithmic factors. 
Both upper and lower complexity bounds depend on a special definition of the associated suboptimality gap, designed in particular for the quantile bandit problem - as we show, when the gap approaches zero, best-arm identification is impossible. Second, motivated by applications where the rewards are private information, we provide a differentially private successive elimination algorithm whose sample complexity is finite even for distributions with infinite support and characterize its sample complexity. Our algorithms do not require prior knowledge of either the suboptimality gap or other statistical information related to the bandit problem at hand.</description><identifier>ISSN: 2641-8770</identifier><identifier>EISSN: 2641-8770</identifier><identifier>DOI: 10.1109/JSAIT.2021.3081525</identifier><identifier>CODEN: IJSTL5</identifier><language>eng</language><publisher>Piscataway: IEEE</publisher><subject>Algorithms ; Approximation algorithms ; best-arm identification ; Complexity ; Complexity theory ; differential privacy ; Estimation ; Information theory ; Lower bounds ; Multi-armed bandit problems ; Quantile bandits ; sequential estimation ; Strain ; Time measurement ; Upper bound ; value at risk</subject><ispartof>IEEE journal on selected areas in information theory, 2021-06, Vol.2 (2), p.534-548</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2021</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c2545-4629c99729935e07fb6fbec4804ba1abb728d4b9f2583ce7f2ea246ab6d7f6243</citedby><cites>FETCH-LOGICAL-c2545-4629c99729935e07fb6fbec4804ba1abb728d4b9f2583ce7f2ea246ab6d7f6243</cites><orcidid>0000-0001-5165-3317 ; 0000-0002-5182-0530 ; 0000-0002-3459-5044 ; 0000-0001-6123-5282</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/9435774$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,780,784,796,27924,27925,54758</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/9435774$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Nikolakakis, Konstantinos E.</creatorcontrib><creatorcontrib>Kalogerias, Dionysios S.</creatorcontrib><creatorcontrib>Sheffet, Or</creatorcontrib><creatorcontrib>Sarwate, Anand D.</creatorcontrib><title>Quantile Multi-Armed Bandits: Optimal Best-Arm Identification and a Differentially Private Scheme</title><title>IEEE journal on selected areas in information theory</title><addtitle>JSAIT</addtitle><description>We study the best-arm identification problem in multi-armed bandits with stochastic rewards when the goal is to identify the arm with the highest quantile at a fixed, prescribed level. First, we propose a successive elimination algorithm for strictly optimal best-arm identification, show that it is &lt;inline-formula&gt; &lt;tex-math notation="LaTeX"&gt;\delta &lt;/tex-math&gt;&lt;/inline-formula&gt;-PAC and characterize its sample complexity. Further, we provide a lower bound on the expected number of pulls, showing that the proposed algorithm is essentially optimal up to logarithmic factors. 
Both upper and lower complexity bounds depend on a special definition of the associated suboptimality gap, designed in particular for the quantile bandit problem - as we show, when the gap approaches zero, best-arm identification is impossible. Second, motivated by applications where the rewards are private information, we provide a differentially private successive elimination algorithm whose sample complexity is finite even for distributions with infinite support and characterize its sample complexity. Our algorithms do not require prior knowledge of either the suboptimality gap or other statistical information related to the bandit problem at hand.</description><subject>Algorithms</subject><subject>Approximation algorithms</subject><subject>best-arm identification</subject><subject>Complexity</subject><subject>Complexity theory</subject><subject>differential privacy</subject><subject>Estimation</subject><subject>Information theory</subject><subject>Lower bounds</subject><subject>Multi-armed bandit problems</subject><subject>Quantile bandits</subject><subject>sequential estimation</subject><subject>Strain</subject><subject>Time measurement</subject><subject>Upper bound</subject><subject>value at risk</subject><issn>2641-8770</issn><issn>2641-8770</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNpNkF1LwzAUhosoOHR_QG8CXncmp0nTeLfNr8lkyuZ1SNsTzOi6mbbC_r2pG-LVOfC-7_l4ouiK0RFjVN2-LMez1QgosFFCMyZAnEQDSDmLMynp6b_-PBo2zZpSCsC4zOQgMu-dqVtXIXntqtbFY7_BkkxMXbq2uSOLXes2piITbNpeI7MSg926wrRuW5PgI4bcO2vR94Kpqj158-7btEiWxSdu8DI6s6ZqcHisF9HH48Nq-hzPF0-z6XgeFyC4iHkKqlBKglKJQCptntocC55Rnhtm8lxCVvJcWRBZUqC0gAZ4avK0lDYFnlxEN4e5O7_96sK9er3tfB1W6rAABKUJy4ILDq7Cb5vGo9U7Hz70e82o7mnqX5q6p6mPNEPo-hByiPgXUDwRUvLkB6hGcD4</recordid><startdate>20210601</startdate><enddate>20210601</enddate><creator>Nikolakakis, Konstantinos E.</creator><creator>Kalogerias, Dionysios 
S.</creator><creator>Sheffet, Or</creator><creator>Sarwate, Anand D.</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0001-5165-3317</orcidid><orcidid>https://orcid.org/0000-0002-5182-0530</orcidid><orcidid>https://orcid.org/0000-0002-3459-5044</orcidid><orcidid>https://orcid.org/0000-0001-6123-5282</orcidid></search><sort><creationdate>20210601</creationdate><title>Quantile Multi-Armed Bandits: Optimal Best-Arm Identification and a Differentially Private Scheme</title><author>Nikolakakis, Konstantinos E. ; Kalogerias, Dionysios S. ; Sheffet, Or ; Sarwate, Anand D.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c2545-4629c99729935e07fb6fbec4804ba1abb728d4b9f2583ce7f2ea246ab6d7f6243</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Algorithms</topic><topic>Approximation algorithms</topic><topic>best-arm identification</topic><topic>Complexity</topic><topic>Complexity theory</topic><topic>differential privacy</topic><topic>Estimation</topic><topic>Information theory</topic><topic>Lower bounds</topic><topic>Multi-armed bandit problems</topic><topic>Quantile bandits</topic><topic>sequential estimation</topic><topic>Strain</topic><topic>Time measurement</topic><topic>Upper bound</topic><topic>value at risk</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Nikolakakis, Konstantinos E.</creatorcontrib><creatorcontrib>Kalogerias, Dionysios S.</creatorcontrib><creatorcontrib>Sheffet, Or</creatorcontrib><creatorcontrib>Sarwate, Anand D.</creatorcontrib><collection>IEEE All-Society Periodicals Package 
(ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>IEEE journal on selected areas in information theory</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Nikolakakis, Konstantinos E.</au><au>Kalogerias, Dionysios S.</au><au>Sheffet, Or</au><au>Sarwate, Anand D.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Quantile Multi-Armed Bandits: Optimal Best-Arm Identification and a Differentially Private Scheme</atitle><jtitle>IEEE journal on selected areas in information theory</jtitle><stitle>JSAIT</stitle><date>2021-06-01</date><risdate>2021</risdate><volume>2</volume><issue>2</issue><spage>534</spage><epage>548</epage><pages>534-548</pages><issn>2641-8770</issn><eissn>2641-8770</eissn><coden>IJSTL5</coden><abstract>We study the best-arm identification problem in multi-armed bandits with stochastic rewards when the goal is to identify the arm with the highest quantile at a fixed, prescribed level. First, we propose a successive elimination algorithm for strictly optimal best-arm identification, show that it is &lt;inline-formula&gt; &lt;tex-math notation="LaTeX"&gt;\delta &lt;/tex-math&gt;&lt;/inline-formula&gt;-PAC and characterize its sample complexity. 
Further, we provide a lower bound on the expected number of pulls, showing that the proposed algorithm is essentially optimal up to logarithmic factors. Both upper and lower complexity bounds depend on a special definition of the associated suboptimality gap, designed in particular for the quantile bandit problem - as we show, when the gap approaches zero, best-arm identification is impossible. Second, motivated by applications where the rewards are private information, we provide a differentially private successive elimination algorithm whose sample complexity is finite even for distributions with infinite support and characterize its sample complexity. Our algorithms do not require prior knowledge of either the suboptimality gap or other statistical information related to the bandit problem at hand.</abstract><cop>Piscataway</cop><pub>IEEE</pub><doi>10.1109/JSAIT.2021.3081525</doi><tpages>15</tpages><orcidid>https://orcid.org/0000-0001-5165-3317</orcidid><orcidid>https://orcid.org/0000-0002-5182-0530</orcidid><orcidid>https://orcid.org/0000-0002-3459-5044</orcidid><orcidid>https://orcid.org/0000-0001-6123-5282</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 2641-8770
ispartof IEEE journal on selected areas in information theory, 2021-06, Vol.2 (2), p.534-548
issn 2641-8770
2641-8770
language eng
recordid cdi_proquest_journals_2542500318
source IEEE Electronic Library (IEL)
subjects Algorithms
Approximation algorithms
best-arm identification
Complexity
Complexity theory
differential privacy
Estimation
Information theory
Lower bounds
Multi-armed bandit problems
Quantile bandits
sequential estimation
Strain
Time measurement
Upper bound
value at risk
title Quantile Multi-Armed Bandits: Optimal Best-Arm Identification and a Differentially Private Scheme
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-21T17%3A08%3A16IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Quantile%20Multi-Armed%20Bandits:%20Optimal%20Best-Arm%20Identification%20and%20a%20Differentially%20Private%20Scheme&rft.jtitle=IEEE%20journal%20on%20selected%20areas%20in%20information%20theory&rft.au=Nikolakakis,%20Konstantinos%20E.&rft.date=2021-06-01&rft.volume=2&rft.issue=2&rft.spage=534&rft.epage=548&rft.pages=534-548&rft.issn=2641-8770&rft.eissn=2641-8770&rft.coden=IJSTL5&rft_id=info:doi/10.1109/JSAIT.2021.3081525&rft_dat=%3Cproquest_RIE%3E2542500318%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2542500318&rft_id=info:pmid/&rft_ieee_id=9435774&rfr_iscdi=true
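
The successive-elimination idea summarized in the description field can be illustrated with a minimal Python sketch. This is not the authors' algorithm, and it does not reproduce their suboptimality gap or sample-complexity bounds. It assumes a generic textbook construction: confidence intervals for each arm's quantile derived from the Dvoretzky-Kiefer-Wolfowitz (DKW) inequality, with a union bound over arms and rounds. The names `quantile_bounds` and `successive_elimination`, the `max_rounds` cap, and the confidence schedule are all hypothetical choices made for illustration.

```python
import bisect
import math
import random


def quantile_bounds(sorted_xs, tau, eps):
    """Lower/upper confidence bounds on the tau-quantile.

    Assumes the empirical CDF is within eps of the true CDF (DKW event).
    Levels outside (0, 1] give the trivial bounds -inf / +inf.
    """
    n = len(sorted_xs)

    def q(p):
        if p <= 0:
            return -math.inf
        if p > 1:
            return math.inf
        # Smallest sample x with empirical CDF(x) >= p.
        return sorted_xs[max(math.ceil(p * n) - 1, 0)]

    return q(tau - eps), q(tau + eps)


def successive_elimination(arms, tau, delta, max_rounds=100_000):
    """Return (index of surviving arm or None, rounds used).

    `arms` is a list of zero-argument callables, each returning one reward.
    Every active arm is pulled once per round (illustrative schedule).
    """
    k = len(arms)
    active = list(range(k))
    samples = [[] for _ in arms]
    for t in range(1, max_rounds + 1):
        for i in active:
            bisect.insort(samples[i], arms[i]())  # keep samples sorted
        # DKW radius with failure probability delta / (2 * k * t^2) per arm
        # and round, so the total failure probability is at most delta.
        eps = math.sqrt(math.log(4 * k * t * t / delta) / (2 * t))
        bounds = {i: quantile_bounds(samples[i], tau, eps) for i in active}
        best_lower = max(lo for lo, _ in bounds.values())
        # Drop arms whose upper bound falls below the best lower bound.
        active = [i for i in active if bounds[i][1] >= best_lower]
        if len(active) == 1:
            return active[0], t
    return None, max_rounds


if __name__ == "__main__":
    random.seed(0)
    # Hypothetical arms: Gaussian rewards whose medians are 0, 1 and 3.
    demo_arms = [lambda m=m: random.gauss(m, 1.0) for m in (0.0, 1.0, 3.0)]
    print(successive_elimination(demo_arms, tau=0.5, delta=0.05))
```

The sketch needs no prior knowledge of the gap between arms: the confidence radius shrinks with the number of pulls, and an arm is discarded only once its interval separates from the leading arm's interval. A differentially private variant, as mentioned in the description, would additionally require a private quantile estimator, which this sketch does not attempt.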