Exploring the Algorithm-Dependent Generalization of AUPRC Optimization with List Stability
Stochastic optimization of the Area Under the Precision-Recall Curve (AUPRC) is a crucial problem for machine learning. Although various algorithms have been extensively studied for AUPRC optimization, the generalization is only guaranteed in the multi-query case. In this work, we present the first...
Gespeichert in:
Hauptverfasser: | , , , , |
---|---|
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | Stochastic optimization of the Area Under the Precision-Recall Curve (AUPRC)
is a crucial problem for machine learning. Although various algorithms have
been extensively studied for AUPRC optimization, the generalization is only
guaranteed in the multi-query case. In this work, we present the first trial in
the single-query generalization of stochastic AUPRC optimization. For sharper
generalization bounds, we focus on algorithm-dependent generalization. There
are both algorithmic and theoretical obstacles to our destination. From an
algorithmic perspective, we notice that the majority of existing stochastic
estimators are biased only when the sampling strategy is biased, and is
leave-one-out unstable due to the non-decomposability. To address these issues,
we propose a sampling-rate-invariant unbiased stochastic estimator with
superior stability. On top of this, the AUPRC optimization is formulated as a
composition optimization problem, and a stochastic algorithm is proposed to
solve this problem. From a theoretical perspective, standard techniques of the
algorithm-dependent generalization analysis cannot be directly applied to such
a listwise compositional optimization problem. To fill this gap, we extend the
model stability from instancewise losses to listwise losses and bridge the
corresponding generalization and stability. Additionally, we construct state
transition matrices to describe the recurrence of the stability, and simplify
calculations by matrix spectrum. Practically, experimental results on three
image retrieval datasets on speak to the effectiveness and soundness of our
framework. |
---|---|
DOI: | 10.48550/arxiv.2209.13262 |