Binary Scoring Rules that Incentivize Precision

Bibliographic details
Published in: arXiv.org, 2021-05
Main authors: Neyman, Eric; Noarov, Georgy; Weinberg, S. Matthew
Format: Article
Language: English
Description
Summary: All proper scoring rules incentivize an expert to predict \emph{accurately} (report their true estimate), but not all proper scoring rules equally incentivize \emph{precision}. Rather than treating the expert's belief as exogenously given, we consider a model where a rational expert can endogenously refine their belief by repeatedly paying a fixed cost, and is incentivized to do so by a proper scoring rule. Specifically, our expert aims to predict the probability that a biased coin flipped tomorrow will land heads, and can flip the coin any number of times today at a cost of \(c\) per flip. Our first main result defines an \emph{incentivization index} for proper scoring rules, and proves that this index measures the expected error of the expert's estimate (where the number of flips today is chosen adaptively to maximize the predictor's expected payoff). Our second main result finds the unique scoring rule which optimizes the incentivization index over all proper scoring rules. We also consider extensions to minimizing the \(\ell^{th}\) moment of error, and again provide an incentivization index and optimal proper scoring rule. In some cases, the resulting scoring rule is differentiable, but not infinitely differentiable. In these cases, we further prove that the optimum can be uniformly approximated by polynomial scoring rules. Finally, we compare common scoring rules via our measure, and include simulations confirming the relevance of our measure even in domains outside where it provably applies.
ISSN: 2331-8422
DOI: 10.48550/arxiv.2002.10669
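
The coin-flipping model in the summary can be illustrated with a small simulation. The sketch below is not taken from the paper: it assumes a uniform prior on the coin's bias, uses the quadratic (Brier-type) rule as one example of a proper scoring rule, and lets the expert stop myopically, that is, as soon as one more flip is expected to raise the score by no more than the cost `cost` (the \(c\) of the summary). The function names `quadratic_expected_score`, `marginal_gain`, and `simulate_expert` are hypothetical, and the code does not compute the paper's incentivization index or its optimal scoring rule.

```python
import random


def quadratic_expected_score(q):
    """Expected quadratic score of an expert who believes q and reports q.

    Assumed rule: the score is 1 - (1 - q)^2 if the coin lands heads
    and 1 - q^2 if it lands tails.
    """
    return q * (1 - (1 - q) ** 2) + (1 - q) * (1 - q ** 2)


def marginal_gain(heads, flips, expected_score=quadratic_expected_score):
    """Expected increase in score from one more flip.

    Assumes a uniform prior on the bias, so the posterior mean after
    `heads` heads in `flips` flips is (heads + 1) / (flips + 2).
    """
    q = (heads + 1) / (flips + 2)        # current estimate
    q_heads = (heads + 2) / (flips + 3)  # estimate if the next flip is heads
    q_tails = (heads + 1) / (flips + 3)  # estimate if the next flip is tails
    after = q * expected_score(q_heads) + (1 - q) * expected_score(q_tails)
    return after - expected_score(q)


def simulate_expert(true_bias, cost, rng, max_flips=100_000):
    """Flip while the expected gain of one more flip exceeds `cost`.

    This myopic stopping rule is an assumption of the sketch.
    Returns (number of flips, final estimate).
    """
    heads = 0
    flips = 0
    while flips < max_flips and marginal_gain(heads, flips) > cost:
        if rng.random() < true_bias:
            heads += 1
        flips += 1
    return flips, (heads + 1) / (flips + 2)


if __name__ == "__main__":
    rng = random.Random(0)
    for cost in (1e-2, 1e-3, 1e-4):
        flips, estimate = simulate_expert(true_bias=0.3, cost=cost, rng=rng)
        print(f"cost={cost:g}: flips={flips}, estimate={estimate:.3f}")
```

Under these assumptions the expected gain from one more flip works out to \(q(1-q)/(n+3)^2\) after \(n\) flips with current estimate \(q\), so a cheaper flip leads the simulated expert to flip more often. Passing a different `expected_score` function to `marginal_gain` lets other proper scoring rules be compared in the same way.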