Negatively correlated bandits

Bibliographic details
Published in: The Review of economic studies, 2011-04, Vol. 78 (2), p. 693-732
Main authors: Klein, Nicolas; Rady, Sven
Format: Article
Language: English
Description
Abstract: We analyse a two-player game of strategic experimentation with two-armed bandits. Either player has to decide in continuous time whether to use a safe arm with a known pay-off or a risky arm whose expected pay-off per unit of time is initially unknown. This pay-off can be high or low and is negatively correlated across players. We characterize the set of all Markov perfect equilibria in the benchmark case where the risky arms are known to be of opposite type and construct equilibria in cut-off strategies for arbitrary negative correlation. All strategies and pay-offs are in closed form. In marked contrast to the case where both risky arms are of the same type, there always exists an equilibrium in cut-off strategies, and there always exists an equilibrium exhibiting efficient long-run patterns of learning. These results extend to a three-player game with common knowledge that exactly one risky arm is of the high pay-off type.
ISSN: 0034-6527
1467-937X
DOI:10.1093/restud/rdq025