Adaptive Stochastic Variance Reduction for Non-convex Finite-Sum Minimization
Format: Article
Language: English
Abstract: We propose an adaptive variance-reduction method, called AdaSpider, for minimization of $L$-smooth, non-convex functions with a finite-sum structure. In essence, AdaSpider combines an AdaGrad-inspired [Duchi et al., 2011, McMahan & Streeter, 2010], yet fairly distinct, adaptive step-size schedule with the recursive stochastic path-integrated estimator proposed in [Fang et al., 2018]. To our knowledge, AdaSpider is the first parameter-free non-convex variance-reduction method, in the sense that it does not require knowledge of problem-dependent parameters such as the smoothness constant $L$, the target accuracy $\epsilon$, or any bound on the gradient norms. As a result, AdaSpider computes an $\epsilon$-stationary point with $\tilde{O}\left(n + \sqrt{n}/\epsilon^2\right)$ oracle calls, which matches the corresponding lower bound up to logarithmic factors.
DOI: 10.48550/arxiv.2211.01851
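The abstract describes AdaSpider as the SPIDER recursive path-integrated gradient estimator [Fang et al., 2018] driven by an AdaGrad-style step size. The sketch below illustrates that combination only; the epoch length, mini-batch size, and the step size $1/\sqrt{1+\sum_s \|v_s\|^2}$ are illustrative assumptions and not necessarily the schedule used by AdaSpider itself.

```python
# Illustrative sketch (not the paper's exact algorithm): SPIDER-style
# recursive gradient estimation with a generic AdaGrad-style step size.
import numpy as np

def spider_adagrad_sketch(grad_i, x0, n, T, epoch_len=None, batch=None, seed=0):
    """grad_i(x, i): gradient of the i-th component at x (1-D ndarray x)."""
    rng = np.random.default_rng(seed)
    epoch_len = epoch_len or max(1, int(np.sqrt(n)))  # assumed epoch length
    batch = batch or max(1, int(np.sqrt(n)))          # assumed mini-batch size
    x, x_prev = x0.copy(), x0.copy()
    v = np.zeros_like(x0)
    accum = 0.0                                       # running sum of ||v_t||^2
    for t in range(T):
        if t % epoch_len == 0:
            # Periodic restart: full gradient over all n components.
            v = np.mean([grad_i(x, i) for i in range(n)], axis=0)
        else:
            # Recursive path-integrated update on a random mini-batch.
            S = rng.integers(0, n, size=batch)
            v = v + np.mean([grad_i(x, i) - grad_i(x_prev, i) for i in S], axis=0)
        accum += float(np.dot(v, v))
        step = 1.0 / np.sqrt(1.0 + accum)             # AdaGrad-style, parameter-free
        x_prev = x
        x = x - step * v
    return x
```

The point of the adaptive step size is that no smoothness constant, target accuracy, or gradient bound has to be supplied: the step shrinks automatically as the accumulated squared estimator norms grow.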