Negatively correlated bandits

We analyse a two-player game of strategic experimentation with two-armed bandits. Either player has to decide in continuous time whether to use a safe arm with a known pay-off or a risky arm whose expected pay-off per unit of time is initially unknown. This pay-off can be high or low and is negative...

Bibliographic details
Published in: The Review of economic studies 2011-04, Vol.78 (2), p.693-732
Main authors: Klein, Nicolas; Rady, Sven
Format: Article
Language: English
Online access: Full text
container_end_page 732
container_issue 2
container_start_page 693
container_title The Review of economic studies
container_volume 78
creator Klein, Nicolas
Rady, Sven
description We analyse a two-player game of strategic experimentation with two-armed bandits. Either player has to decide in continuous time whether to use a safe arm with a known pay-off or a risky arm whose expected pay-off per unit of time is initially unknown. This pay-off can be high or low and is negatively correlated across players. We characterize the set of all Markov perfect equilibria in the benchmark case where the risky arms are known to be of opposite type and construct equilibria in cut-off strategies for arbitrary negative correlation. All strategies and pay-offs are in closed form. In marked contrast to the case where both risky arms are of the same type, there always exists an equilibrium in cut-off strategies, and there always exists an equilibrium exhibiting efficient long-run patterns of learning. These results extend to a three-player game with common knowledge that exactly one risky arm is of the high pay-off type.
doi_str_mv 10.1093/restud/rdq025
format Article
fullrecord <record><control><sourceid>jstor_proqu</sourceid><recordid>TN_cdi_proquest_miscellaneous_865526074</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><jstor_id>23015871</jstor_id><oup_id>10.1093/restud/rdq025</oup_id><sourcerecordid>23015871</sourcerecordid><originalsourceid>FETCH-LOGICAL-c473t-20f852f2eb6e2e0e4cd670a57be1b7e9dc4c181d02735aedaaea588ef021d3353</originalsourceid><addsrcrecordid>eNqF0EtLxDAQB_AgCq6PoxdhYfGil7qTpGnSoyy-YNGLgreQJlPp0t12k1bYb28kouDF0xzmxzz-hJxRuKZQ8rnHMIxu7t0WmNgjE5oXMiu5fNsnEwCeZ4VgxSE5CmEFAFQpOSHTJ3w3Q_OB7W5mO--xNQO6WWU2rhnCCTmoTRvw9Lsek9e725fFQ7Z8vn9c3Cwzm0s-ZAxqJVjNsCqQIWBuXSHBCFkhrSSWzuaWKuqASS4MOmPQCKWwBkYd54Ifk8s0t_fddoxv6HUTLLat2WA3Bq0KEU8HmUd58UeuutFv4nG6BM4og1JGlCVkfReCx1r3vlkbv9MU9FdUOkWlU1TRXyXfjf2_dJroKgyd_8GMAxVK0tg_T33X9L9ry1wJzvkngDt9pQ</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>903212097</pqid></control><display><type>article</type><title>Negatively correlated bandits</title><source>Jstor Complete Legacy</source><source>Oxford University Press Journals All Titles (1996-Current)</source><source>Business Source Complete</source><creator>Klein, Nicolas ; Rady, Sven</creator><creatorcontrib>Klein, Nicolas ; Rady, Sven</creatorcontrib><description>We analyse a two-player game of strategic experimentation with two-armed bandits. Either player has to decide in continuous time whether to use a safe arm with a known pay-off or a risky arm whose expected pay-off per unit of time is initially unknown. This pay-off can be high or low and is negatively correlated across players. We characterize the set of all Markov perfect equilibria in the benchmark case where the risky arms are known to be of opposite type and construct equilibria in cut-offf strategies for arbitrary negative correlation. All strategies and pay-offs are in closed form. 
In marked contrast to the case where both risky arms are of the same type, there always exists an equilibrium in cut-off strategies, and there always exists an equilibrium exhibiting efficient long-run patterns of learning. These results extend to a three-player game with common knowledge that exactly one risky arm is of the high pay-off type.</description><identifier>ISSN: 0034-6526</identifier><identifier>ISSN: 1467-937X</identifier><identifier>ISSN: 0034-6527</identifier><identifier>EISSN: 1467-937X</identifier><identifier>DOI: 10.1093/restud/rdq025</identifier><language>eng</language><publisher>Oxford: Review of Economic Studies Ltd., Blackwell Publishing</publisher><subject>Approximation ; Correlation ; Correlation analysis ; Correlations ; Economic transitions ; Efficient strategies ; Entscheidung ; Equilibrium ; Erwartungshaltung ; Experimentation ; Game theory ; Gewinn ; Laws of Motion ; Learning ; Lernen ; Markov analysis ; Markovian processes ; Markowscher Prozess ; Odes ; Opportunity costs ; Pay-off ; Risiko ; Risikomanagement ; Risk ; Spieltheorie ; Strategic behaviour ; Studies ; Time ; Uniqueness</subject><ispartof>The Review of economic studies, 2011-04, Vol.78 (2), p.693-732</ispartof><rights>The Review of Economic Studies Ltd 2011</rights><rights>The Author 2011. Published by Oxford University Press on behalf of The Review of Economic Studies Limited. 2011</rights><rights>Copyright Blackwell Publishing Ltd. 
Apr 2011</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c473t-20f852f2eb6e2e0e4cd670a57be1b7e9dc4c181d02735aedaaea588ef021d3353</citedby><cites>FETCH-LOGICAL-c473t-20f852f2eb6e2e0e4cd670a57be1b7e9dc4c181d02735aedaaea588ef021d3353</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.jstor.org/stable/pdf/23015871$$EPDF$$P50$$Gjstor$$H</linktopdf><linktohtml>$$Uhttps://www.jstor.org/stable/23015871$$EHTML$$P50$$Gjstor$$H</linktohtml><link.rule.ids>314,776,780,799,1578,27901,27902,57992,58225</link.rule.ids><backlink>$$Uhttp://www.fachportal-paedagogik.de/fis_bildung/suche/fis_set.html?FId=948533$$DAccess content in the German Education Portal$$Hfree_for_read</backlink></links><search><creatorcontrib>Klein, Nicolas</creatorcontrib><creatorcontrib>Rady, Sven</creatorcontrib><title>Negatively correlated bandits</title><title>The Review of economic studies</title><description>We analyse a two-player game of strategic experimentation with two-armed bandits. Either player has to decide in continuous time whether to use a safe arm with a known pay-off or a risky arm whose expected pay-off per unit of time is initially unknown. This pay-off can be high or low and is negatively correlated across players. We characterize the set of all Markov perfect equilibria in the benchmark case where the risky arms are known to be of opposite type and construct equilibria in cut-offf strategies for arbitrary negative correlation. All strategies and pay-offs are in closed form. In marked contrast to the case where both risky arms are of the same type, there always exists an equilibrium in cut-off strategies, and there always exists an equilibrium exhibiting efficient long-run patterns of learning. 
These results extend to a three-player game with common knowledge that exactly one risky arm is of the high pay-off type.</description><subject>Approximation</subject><subject>Correlation</subject><subject>Correlation analysis</subject><subject>Correlations</subject><subject>Economic transitions</subject><subject>Efficient strategies</subject><subject>Entscheidung</subject><subject>Equilibrium</subject><subject>Erwartungshaltung</subject><subject>Experimentation</subject><subject>Game theory</subject><subject>Gewinn</subject><subject>Laws of Motion</subject><subject>Learning</subject><subject>Lernen</subject><subject>Markov analysis</subject><subject>Markovian processes</subject><subject>Markowscher Prozess</subject><subject>Odes</subject><subject>Opportunity costs</subject><subject>Pay-off</subject><subject>Risiko</subject><subject>Risikomanagement</subject><subject>Risk</subject><subject>Spieltheorie</subject><subject>Strategic behaviour</subject><subject>Studies</subject><subject>Time</subject><subject>Uniqueness</subject><issn>0034-6526</issn><issn>1467-937X</issn><issn>0034-6527</issn><issn>1467-937X</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2011</creationdate><recordtype>article</recordtype><recordid>eNqF0EtLxDAQB_AgCq6PoxdhYfGil7qTpGnSoyy-YNGLgreQJlPp0t12k1bYb28kouDF0xzmxzz-hJxRuKZQ8rnHMIxu7t0WmNgjE5oXMiu5fNsnEwCeZ4VgxSE5CmEFAFQpOSHTJ3w3Q_OB7W5mO--xNQO6WWU2rhnCCTmoTRvw9Lsek9e725fFQ7Z8vn9c3Cwzm0s-ZAxqJVjNsCqQIWBuXSHBCFkhrSSWzuaWKuqASS4MOmPQCKWwBkYd54Ifk8s0t_fddoxv6HUTLLat2WA3Bq0KEU8HmUd58UeuutFv4nG6BM4og1JGlCVkfReCx1r3vlkbv9MU9FdUOkWlU1TRXyXfjf2_dJroKgyd_8GMAxVK0tg_T33X9L9ry1wJzvkngDt9pQ</recordid><startdate>20110401</startdate><enddate>20110401</enddate><creator>Klein, Nicolas</creator><creator>Rady, Sven</creator><general>Review of Economic Studies Ltd., Blackwell Publishing</general><general>Oxford University 
Press</general><scope>9S6</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>8BJ</scope><scope>FQK</scope><scope>JBE</scope></search><sort><creationdate>20110401</creationdate><title>Negatively correlated bandits</title><author>Klein, Nicolas ; Rady, Sven</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c473t-20f852f2eb6e2e0e4cd670a57be1b7e9dc4c181d02735aedaaea588ef021d3353</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2011</creationdate><topic>Approximation</topic><topic>Correlation</topic><topic>Correlation analysis</topic><topic>Correlations</topic><topic>Economic transitions</topic><topic>Efficient strategies</topic><topic>Entscheidung</topic><topic>Equilibrium</topic><topic>Erwartungshaltung</topic><topic>Experimentation</topic><topic>Game theory</topic><topic>Gewinn</topic><topic>Laws of Motion</topic><topic>Learning</topic><topic>Lernen</topic><topic>Markov analysis</topic><topic>Markovian processes</topic><topic>Markowscher Prozess</topic><topic>Odes</topic><topic>Opportunity costs</topic><topic>Pay-off</topic><topic>Risiko</topic><topic>Risikomanagement</topic><topic>Risk</topic><topic>Spieltheorie</topic><topic>Strategic behaviour</topic><topic>Studies</topic><topic>Time</topic><topic>Uniqueness</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Klein, Nicolas</creatorcontrib><creatorcontrib>Rady, Sven</creatorcontrib><collection>FIS Bildung Literaturdatenbank</collection><collection>CrossRef</collection><collection>International Bibliography of the Social Sciences (IBSS)</collection><collection>International Bibliography of the Social Sciences</collection><collection>International Bibliography of the Social Sciences</collection><jtitle>The Review of economic studies</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Klein, 
Nicolas</au><au>Rady, Sven</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Negatively correlated bandits</atitle><jtitle>The Review of economic studies</jtitle><date>2011-04-01</date><risdate>2011</risdate><volume>78</volume><issue>2</issue><spage>693</spage><epage>732</epage><pages>693-732</pages><issn>0034-6526</issn><issn>1467-937X</issn><issn>0034-6527</issn><eissn>1467-937X</eissn><abstract>We analyse a two-player game of strategic experimentation with two-armed bandits. Either player has to decide in continuous time whether to use a safe arm with a known pay-off or a risky arm whose expected pay-off per unit of time is initially unknown. This pay-off can be high or low and is negatively correlated across players. We characterize the set of all Markov perfect equilibria in the benchmark case where the risky arms are known to be of opposite type and construct equilibria in cut-offf strategies for arbitrary negative correlation. All strategies and pay-offs are in closed form. In marked contrast to the case where both risky arms are of the same type, there always exists an equilibrium in cut-off strategies, and there always exists an equilibrium exhibiting efficient long-run patterns of learning. These results extend to a three-player game with common knowledge that exactly one risky arm is of the high pay-off type.</abstract><cop>Oxford</cop><pub>Review of Economic Studies Ltd., Blackwell Publishing</pub><doi>10.1093/restud/rdq025</doi><tpages>40</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 0034-6526
ispartof The Review of economic studies, 2011-04, Vol.78 (2), p.693-732
issn 0034-6526
1467-937X
0034-6527
1467-937X
language eng
recordid cdi_proquest_miscellaneous_865526074
source Jstor Complete Legacy; Oxford University Press Journals All Titles (1996-Current); Business Source Complete
subjects Approximation
Correlation
Correlation analysis
Correlations
Economic transitions
Efficient strategies
Entscheidung
Equilibrium
Erwartungshaltung
Experimentation
Game theory
Gewinn
Laws of Motion
Learning
Lernen
Markov analysis
Markovian processes
Markowscher Prozess
Odes
Opportunity costs
Pay-off
Risiko
Risikomanagement
Risk
Spieltheorie
Strategic behaviour
Studies
Time
Uniqueness
title Negatively correlated bandits
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-01T19%3A06%3A53IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-jstor_proqu&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Negatively%20correlated%20bandits&rft.jtitle=The%20Review%20of%20economic%20studies&rft.au=Klein,%20Nicolas&rft.date=2011-04-01&rft.volume=78&rft.issue=2&rft.spage=693&rft.epage=732&rft.pages=693-732&rft.issn=0034-6526&rft.eissn=1467-937X&rft_id=info:doi/10.1093/restud/rdq025&rft_dat=%3Cjstor_proqu%3E23015871%3C/jstor_proqu%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=903212097&rft_id=info:pmid/&rft_jstor_id=23015871&rft_oup_id=10.1093/restud/rdq025&rfr_iscdi=true