Distributed Learning in Non-Convex Environments - Part II: Polynomial Escape From Saddle-Points

Bibliographic Details
Published in: IEEE Transactions on Signal Processing, 2021, Vol. 69, pp. 1257-1270
Authors: Vlaski, Stefan; Sayed, Ali H.
Format: Article
Language: English
Abstract: The diffusion strategy for distributed learning from streaming data employs local stochastic gradient updates along with the exchange of iterates over neighborhoods. In Part I [3] of this work we established that agents cluster around a network centroid and proceeded to study the dynamics of this point. We established expected descent in non-convex environments in the large-gradient regime and introduced a short-term model to examine the dynamics over finite-time horizons. Using this model, we establish in this work that the diffusion strategy is able to escape from strict saddle-points in O(1/\mu) iterations, where \mu denotes the step-size; it is also able to return approximately second-order stationary points in a polynomial number of iterations. Relative to prior works on the polynomial escape from saddle-points, most of which focus on centralized perturbed or stochastic gradient descent, our approach requires less restrictive conditions on the gradient noise process.
ISSN: 1053-587X, 1941-0476
DOI: 10.1109/TSP.2021.3050840
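
As a rough illustration of the diffusion strategy described in the abstract, the sketch below implements an adapt-then-combine (ATC) update in Python on a toy quadratic problem. The ring topology, the combination matrix A, the quadratic local costs, and helper names such as stochastic_grad are illustrative assumptions, not the setting analyzed in the paper.

```python
# Minimal sketch of a diffusion (adapt-then-combine) update over a small network.
# The quadratic losses, the combination matrix, and the noise model below are
# illustrative assumptions, not the setting studied in the paper.
import numpy as np

rng = np.random.default_rng(0)

N, d, mu = 5, 3, 0.01           # number of agents, dimension, step-size
T = 2000                        # number of iterations

# Ring topology with self-loops; entries chosen so A is doubly stochastic.
A = np.zeros((N, N))
for k in range(N):
    for l in (k - 1, k, (k + 1) % N):
        A[l % N, k] = 1.0 / 3.0

# Each agent holds a local quadratic cost (1/2)*||x - c_k||^2 observed through noisy gradients.
C = rng.normal(size=(N, d))

def stochastic_grad(k, x):
    """Noisy gradient of agent k's local cost, emulating streaming data."""
    return (x - C[k]) + 0.1 * rng.normal(size=d)

W = rng.normal(size=(N, d))     # local iterates, one row per agent

for i in range(T):
    # Adapt step: each agent takes a local stochastic-gradient step.
    Psi = np.array([W[k] - mu * stochastic_grad(k, W[k]) for k in range(N)])
    # Combine step: each agent averages the intermediate iterates of its neighbors.
    W = np.array([sum(A[l, k] * Psi[l] for l in range(N)) for k in range(N)])

print("network centroid:        ", W.mean(axis=0))
print("true aggregate minimizer:", C.mean(axis=0))
```

The two steps mirror the description in the abstract: a local stochastic-gradient update followed by an averaging of intermediate iterates over the neighborhood, which is what drives the agents to cluster around the network centroid whose dynamics are studied in Part I and in this paper.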