Robust Online Classification: From Estimation to Denoising
Saved in:

Format: | Article |
---|---|
Language: | English |
Online access: | Order full text |
Abstract:

We study online classification of features into labels with general hypothesis classes. In our setting, true labels are determined by some function within the hypothesis class but are corrupted by unknown stochastic noise, and the features are generated adversarially. Predictions are made using observed noisy labels and noiseless features, while performance is measured via minimax risk when comparing against the true labels. The noise mechanism is modeled via a general noise kernel that specifies, for any individual data point, a set of distributions from which the actual noisy label distribution is chosen. We show that the minimax risk is tightly characterized (up to a logarithmic factor of the hypothesis class size) by the Hellinger gap of the noisy label distributions induced by the kernel, independent of other properties such as the means and variances of the noise. Our main technique is based on a novel reduction to an online comparison scheme of two hypotheses, along with a new conditional version of Le Cam-Birgé testing suitable for online settings. Our work provides the first comprehensive characterization of noisy online classification with guarantees with respect to the ground truth while addressing general noisy observations.
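The Hellinger gap in the abstract is defined in terms of the Hellinger distance between noisy label distributions. As a minimal illustration (not the paper's algorithm), the following sketch computes the Hellinger distance between two discrete label distributions; the example flip probabilities are hypothetical:

```python
import math

def hellinger(p, q):
    """Hellinger distance between two discrete distributions p and q.

    H(p, q) = sqrt(1 - sum_i sqrt(p_i * q_i)), i.e. sqrt(1 - Bhattacharyya
    coefficient). H = 0 for identical distributions, 1 for disjoint supports.
    """
    bc = sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))
    return math.sqrt(max(0.0, 1.0 - bc))  # clamp guards against rounding

# Two noisy label distributions over {0, 1}, e.g. induced by two hypotheses
# whose labels pass through noise kernels with flip rates 0.1 and 0.4:
p = [0.9, 0.1]
q = [0.6, 0.4]
print(hellinger(p, q))  # ≈ 0.255
```

A larger distance between the induced noisy label distributions makes the two hypotheses easier to distinguish from noisy observations, which is the intuition behind the Hellinger-gap characterization of the minimax risk.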
DOI: | 10.48550/arxiv.2309.01698 |
---|---|