No-Regret Learning with Unbounded Losses: The Case of Logarithmic Pooling
Abstract: For each of $T$ time steps, $m$ experts report probability distributions over $n$ outcomes; we wish to learn to aggregate these forecasts in a way that attains a no-regret guarantee. We focus on the fundamental and practical aggregation method known as logarithmic pooling -- a weighted average of log odds -- which is in a certain sense the optimal choice of pooling method if one is interested in minimizing log loss (as we take to be our loss function). We consider the problem of learning the best set of parameters (i.e., expert weights) in an online adversarial setting. We assume (by necessity) that the adversarial choices of outcomes and forecasts are consistent, in the sense that experts report calibrated forecasts. Imposing this constraint creates a (to our knowledge) novel semi-adversarial setting in which the adversary retains a large amount of flexibility. In this setting, we present an algorithm based on online mirror descent that learns expert weights in a way that attains $O(\sqrt{T} \log T)$ expected regret as compared with the best weights in hindsight.
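For reference, logarithmic pooling with expert weights $w_1, \dots, w_m$ (assumed here to sum to 1, as is standard) combines forecasts $p_1, \dots, p_m$ over the $n$ outcomes into the renormalized weighted geometric mean

$$p^*_{\mathbf{w}}(j) \;=\; \frac{\prod_{i=1}^{m} p_i(j)^{w_i}}{\sum_{k=1}^{n} \prod_{i=1}^{m} p_i(k)^{w_i}}, \qquad j = 1, \dots, n,$$

which in the binary case amounts to taking a weighted average of the experts' log odds.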
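As an illustration only, and not the paper's algorithm, the sketch below shows how online mirror descent with the negative-entropy mirror map (i.e., exponentiated gradient over the weight simplex) could update expert weights against the log loss of the pooled forecast. The step size `eta`, the toy data, and the flooring of forecasts away from zero (a crude stand-in for the calibration assumption the paper uses to control unbounded losses) are assumptions made for this example.

```python
import numpy as np

def log_pool(forecasts, weights):
    """Logarithmic pool: the renormalized weighted geometric mean of the
    experts' forecasts. `forecasts` has shape (m, n), `weights` shape (m,)."""
    log_p = weights @ np.log(forecasts)   # weighted sum of log-probabilities, shape (n,)
    log_p -= log_p.max()                  # subtract the max for numerical stability
    p = np.exp(log_p)
    return p / p.sum()

def omd_step(weights, forecasts, outcome, eta=0.1):
    """One exponentiated-gradient step (online mirror descent with the
    negative-entropy mirror map) on the log loss -log p*_w(outcome)."""
    pooled = log_pool(forecasts, weights)
    log_f = np.log(forecasts)             # shape (m, n)
    # gradient of -log p*_w(outcome) with respect to the weights
    grad = log_f @ pooled - log_f[:, outcome]
    new_w = weights * np.exp(-eta * grad)
    return new_w / new_w.sum()

# Toy run: m = 3 experts, n = 4 outcomes, forecasts floored away from zero.
rng = np.random.default_rng(0)
m, n, T = 3, 4, 1000
w = np.full(m, 1.0 / m)
for _ in range(T):
    f = rng.dirichlet(np.ones(n), size=m)
    f = np.clip(f, 1e-6, None)
    f = f / f.sum(axis=1, keepdims=True)  # re-normalize after flooring
    y = rng.integers(n)                   # toy outcome, not calibrated to the forecasts
    w = omd_step(w, f, y, eta=0.05)
print(w)
```

The entropic mirror map keeps the weights on the simplex via a multiplicative update followed by renormalization; the paper's analysis of this style of update in the semi-adversarial, calibrated-forecast setting is what yields the $O(\sqrt{T} \log T)$ regret bound.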
DOI: 10.48550/arxiv.2202.11219