From the Ravine method to the Nesterov method and vice versa: a dynamical system perspective
Format: Article
Language: English
Online access: Order full text
Abstract: We revisit the Ravine method of Gelfand and Tsetlin from a dynamical system perspective, study its convergence properties, and highlight its similarities to and differences from the Nesterov accelerated gradient method. The two methods are closely related: each can be deduced from the other by reversing the order of the extrapolation and gradient operations in its definition. Both enjoy similarly fast convergence of the values and convergence of the iterates for general convex objective functions. We also establish the high-resolution ODEs of the Ravine and Nesterov methods, revealing, for both methods, an additional geometric damping term driven by the Hessian. This allows us to prove fast convergence of the gradients towards zero, not only for the Ravine method but also, for the first time, for the Nesterov method. We further highlight connections to other algorithms stemming from more subtle discretization schemes, and finally describe a Ravine version of the proximal-gradient algorithms for general structured smooth + non-smooth convex optimization problems.
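The reversal of the extrapolation and gradient operations can be made concrete with a small numerical sketch. The Python comparison below, on a toy quadratic, is a minimal illustration assuming the common momentum schedule alpha_k = k/(k+3) and step size s = 1/L; the quadratic, the schedule, and the iteration count are illustrative choices, not the paper's exact setting.

```python
import numpy as np

# Toy strongly convex quadratic f(x) = 0.5 x^T A x - b^T x (illustrative assumption).
A = np.array([[3.0, 0.5],
              [0.5, 1.0]])
b = np.array([1.0, -2.0])

def grad_f(x):
    return A @ x - b

L = np.linalg.eigvalsh(A).max()  # Lipschitz constant of grad f
s = 1.0 / L                      # step size

def nesterov(x0, iters=200):
    """Nesterov: extrapolate first, then take a gradient step."""
    x_prev, x = x0.copy(), x0.copy()
    for k in range(1, iters + 1):
        alpha = k / (k + 3)               # momentum schedule (assumed, not from the paper)
        y = x + alpha * (x - x_prev)      # extrapolation
        x_prev, x = x, y - s * grad_f(y)  # gradient step at the extrapolated point
    return x

def ravine(y0, iters=200):
    """Ravine (Gelfand--Tsetlin): gradient step first, then extrapolate."""
    w_prev = y0.copy()
    y = y0.copy()
    for k in range(1, iters + 1):
        w = y - s * grad_f(y)             # gradient step
        alpha = k / (k + 3)
        y = w + alpha * (w - w_prev)      # extrapolation along the "ravine"
        w_prev = w
    return w

x_star = np.linalg.solve(A, b)
print("Nesterov error:", np.linalg.norm(nesterov(np.zeros(2)) - x_star))
print("Ravine error:  ", np.linalg.norm(ravine(np.zeros(2)) - x_star))
```

Read side by side, the two loops differ only in the order of the two operations; with matched parameters their sequences interleave, which is the correspondence the abstract alludes to.

As for the geometric damping term, the high-resolution ODE literature (e.g. Shi, Du, Jordan, and Su) writes a Hessian-driven correction of the following schematic form, where the positive coefficients alpha and beta depend on the derivation and are not taken from this paper:

```latex
\ddot{x}(t) + \frac{\alpha}{t}\,\dot{x}(t)
            + \beta\sqrt{s}\,\nabla^{2} f(x(t))\,\dot{x}(t)
            + \nabla f(x(t)) = 0
```

The term \beta\sqrt{s}\,\nabla^{2} f(x)\,\dot{x} damps the velocity in proportion to the local curvature, and it is this mechanism that underlies the fast convergence of the gradients towards zero mentioned above.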
DOI: 10.48550/arxiv.2201.11643