Lower bounds for non-convex stochastic optimization

Bibliographic details
Published in: Mathematical Programming 2023-05, Vol. 199 (1-2), p. 165-214
Main authors: Arjevani, Yossi; Carmon, Yair; Duchi, John C.; Foster, Dylan J.; Srebro, Nathan; Woodworth, Blake
Format: Article
Language: English
Description
Summary: We lower bound the complexity of finding $\epsilon$-stationary points (with gradient norm at most $\epsilon$) using stochastic first-order methods. In a well-studied model where algorithms access smooth, potentially non-convex functions through queries to an unbiased stochastic gradient oracle with bounded variance, we prove that (in the worst case) any algorithm requires at least $\epsilon^{-4}$ queries to find an $\epsilon$-stationary point. The lower bound is tight, and establishes that stochastic gradient descent is minimax optimal in this model. In a more restrictive model where the noisy gradient estimates satisfy a mean-squared smoothness property, we prove a lower bound of $\epsilon^{-3}$ queries, establishing the optimality of recently proposed variance reduction techniques.
ISSN: 0025-5610, 1436-4646
DOI: 10.1007/s10107-022-01822-7
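
Illustration (not part of the catalog record or the article): the oracle model named in the summary can be made concrete with a small sketch. The toy objective f(x) = x^2 + 3 sin^2(x), the Gaussian noise, the step-size rule, and all function names below are assumptions chosen for illustration; none of them come from the paper. The step size follows the usual textbook analysis of SGD on L-smooth functions, under which the average squared gradient norm drops below $\epsilon^2$ after a number of queries on the order of $\epsilon^{-4}$.

```python
import math
import random

# Smoothness constant of the toy objective: |f''(x)| = |2 + 6*cos(2x)| <= 8.
L_SMOOTH = 8.0


def grad_f(x):
    """Exact gradient of the toy objective f(x) = x**2 + 3*sin(x)**2.

    f is smooth but non-convex (f'' is negative near x = pi/2).
    """
    return 2.0 * x + 3.0 * math.sin(2.0 * x)


def stochastic_grad(x, sigma, rng):
    """Unbiased gradient oracle: true gradient plus N(0, sigma^2) noise."""
    return grad_f(x) + rng.gauss(0.0, sigma)


def sgd(x0, eps, sigma, num_queries, seed=0):
    """Run plain SGD and return (final iterate, mean squared gradient norm).

    The step size eps^2 / (2 * L * sigma^2), capped at 1/L, is an assumed
    textbook-style choice, not one taken from the article.
    """
    rng = random.Random(seed)
    eta = min(1.0 / L_SMOOTH, eps ** 2 / (2.0 * L_SMOOTH * sigma ** 2))
    x = x0
    sq_norm_sum = 0.0
    for _ in range(num_queries):
        # Track the true gradient only to measure progress; the update
        # itself sees nothing but the noisy oracle output.
        sq_norm_sum += grad_f(x) ** 2
        x -= eta * stochastic_grad(x, sigma, rng)
    return x, sq_norm_sum / num_queries


if __name__ == "__main__":
    eps, sigma = 0.5, 1.0
    x_final, mean_sq = sgd(x0=2.0, eps=eps, sigma=sigma, num_queries=20000)
    print("final iterate:", x_final)
    print("mean squared gradient norm:", mean_sq, "target:", eps ** 2)
```

The quantity reported is the squared gradient norm averaged over all iterates, since that is what the standard upper-bound argument controls in expectation; an individual iterate can still exceed $\epsilon$ because of the noise.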