Wasserstein Distributionally Robust Multiclass Support Vector Machine
We study the problem of multiclass classification for settings where data features \(\mathbf{x}\) and their labels \(\mathbf{y}\) are uncertain. We identify that distributionally robust one-vs-all (OVA) classifiers often struggle in settings with imbalanced data. To address this issue, we use Wasser...
Gespeichert in:
Veröffentlicht in: | arXiv.org 2024-09 |
---|---|
Hauptverfasser: | , , |
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | We study the problem of multiclass classification for settings where data features \(\mathbf{x}\) and their labels \(\mathbf{y}\) are uncertain. We identify that distributionally robust one-vs-all (OVA) classifiers often struggle in settings with imbalanced data. To address this issue, we use Wasserstein distributionally robust optimization to develop a robust version of the multiclass support vector machine (SVM) characterized by the Crammer-Singer (CS) loss. First, we prove that the CS loss is bounded from above by a Lipschitz continuous function for all \(\mathbf{x} \in \mathcal{X}\) and \(\mathbf{y} \in \mathcal{Y}\), then we exploit strong duality results to express the dual of the worst-case risk problem, and we show that the worst-case risk minimization problem admits a tractable convex reformulation due to the regularity of the CS loss. Moreover, we develop a kernel version of our proposed model to account for nonlinear class separation, and we show that it admits a tractable convex upper bound. We also propose a projected subgradient method algorithm for a special case of our proposed linear model to improve scalability. Our numerical experiments demonstrate that our model outperforms state-of-the art OVA models in settings where the training data is highly imbalanced. We also show through experiments on popular real-world datasets that our proposed model often outperforms its regularized counterpart as the first accounts for uncertain labels unlike the latter. |
---|---|
ISSN: | 2331-8422 |