Online Learning Using Only Peer Prediction
Main authors: | , |
---|---|
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Order full text |
Summary: | This paper considers a variant of the classical online learning problem with
expert predictions. Our model's differences and challenges are due to lacking
any direct feedback on the loss each expert incurs at each time step $t$. We
propose an approach that uses peer prediction and identify conditions where it
succeeds. Our techniques revolve around a carefully designed peer score
function $s()$ that scores experts' predictions based on the peer consensus. We
show a sufficient condition, that we call \emph{peer calibration}, under which
standard online learning algorithms using loss feedback computed by the
carefully crafted $s()$ have bounded regret with respect to the unrevealed
ground truth values. We then demonstrate how suitable $s()$ functions can be
derived for different assumptions and models. |
---|---|
DOI: | 10.48550/arxiv.1910.04382 |
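
The idea in the summary, feeding a standard online learning algorithm with losses computed from peer consensus rather than from the ground truth, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the names peer_score and hedge_with_peer_scores are hypothetical, the multiplicative-weights (Hedge) update is one common choice of "standard online learning algorithm", and the score used (squared distance to the leave-one-out mean of the other experts) is only a stand-in, not the $s()$ constructed in the paper. It assumes at least two experts with predictions in [0, 1].

```python
import math
from typing import List, Sequence


def peer_score(predictions: Sequence[float], i: int) -> float:
    """Hypothetical surrogate loss for expert i (NOT the paper's s()).

    Squared distance between expert i's prediction and the mean
    prediction of all other experts (leave-one-out consensus).
    """
    peers = [p for j, p in enumerate(predictions) if j != i]
    consensus = sum(peers) / len(peers)
    return (predictions[i] - consensus) ** 2


def hedge_with_peer_scores(
    rounds: Sequence[Sequence[float]], eta: float = 1.0
) -> List[float]:
    """Run multiplicative weights using peer scores as the loss feedback.

    rounds[t][i] is expert i's prediction at time step t.
    The ground truth is never consulted.
    Returns the normalized expert weights after the last round.
    """
    n = len(rounds[0])
    weights = [1.0 / n] * n
    for predictions in rounds:
        # Loss feedback comes only from the peers' predictions.
        losses = [peer_score(predictions, i) for i in range(n)]
        # Standard exponential-weights update, then renormalize.
        weights = [w * math.exp(-eta * l) for w, l in zip(weights, losses)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return weights


if __name__ == "__main__":
    # Two experts roughly agree; the third is an outlier.
    history = [[0.9, 0.8, 0.1]] * 5
    print(hedge_with_peer_scores(history))
```

In this toy run the outlying expert loses weight because it disagrees with its peers. Whether such down-weighting also yields bounded regret against the unrevealed ground truth depends on a condition like the peer calibration property named in the summary, which this sketch does not check.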