Lower bounds for non-convex stochastic optimization

Bibliographic details
Published in:	Mathematical programming 2023-05, Vol.199 (1-2), p.165-214
Main authors:	Arjevani, Yossi; Carmon, Yair; Duchi, John C.; Foster, Dylan J.; Srebro, Nathan; Woodworth, Blake
Format:	Article
Language:	English
Subjects:	Algorithms; Analysis; Calculus of Variations and Optimal Control Optimization; Combinatorics; Full Length Paper; Lower bounds; Mathematical and Computational Physics; Mathematical Methods in Physics; Mathematics; Mathematics and Statistics; Mathematics of Computing; Minimax technique; Numerical Analysis; Optimization; Queries; Smoothness; Theoretical
Online access:	Full text

Description
Abstract:	We lower bound the complexity of finding ϵ-stationary points (with gradient norm at most ϵ) using stochastic first-order methods. In a well-studied model where algorithms access smooth, potentially non-convex functions through queries to an unbiased stochastic gradient oracle with bounded variance, we prove that (in the worst case) any algorithm requires at least ϵ⁻⁴ queries to find an ϵ-stationary point. The lower bound is tight, and establishes that stochastic gradient descent is minimax optimal in this model. In a more restrictive model where the noisy gradient estimates satisfy a mean-squared smoothness property, we prove a lower bound of ϵ⁻³ queries, establishing the optimality of recently proposed variance reduction techniques.
ISSN:	0025-5610, 1436-4646
DOI:	10.1007/s10107-022-01822-7