Fast convergence of inertial dynamics with Hessian-driven damping under geometry assumptions
Abstract: First-order optimization algorithms can be viewed as discretizations of ordinary differential equations (ODEs) \cite{su2014differential}. From this perspective, studying the properties of the corresponding trajectories may lead to convergence results that can be transferred to the numerical scheme. In this paper we analyse the following ODE, introduced by Attouch et al. in \cite{attouch2016fast}:
\begin{equation*}
\forall t\geqslant t_0,\quad \ddot{x}(t)+\frac{\alpha}{t}\dot{x}(t)+\beta H_F(x(t))\dot{x}(t)+\nabla F(x(t))=0,
\end{equation*}
where $\alpha>0$, $\beta>0$ and $H_F$ denotes the Hessian of $F$. This ODE can be used to derive numerical schemes which do not require $F$ to be twice differentiable, as shown in \cite{attouch2020first,attouch2021convergence}. We provide strong convergence results on the error $F(x(t))-F^*$ and integrability properties of $\|\nabla F(x(t))\|$ under geometry assumptions on $F$ such as quadratic growth around the set of minimizers. In particular, we show that the decay rate of the error for a strongly convex function is $O(t^{-\alpha-\varepsilon})$ for any $\varepsilon>0$. These results are briefly illustrated at the end of the paper.
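
Two remarks in the abstract can be made concrete with a minimal worked instance; the scalar quadratic below is an illustrative choice, not an example taken from the paper. First, along any trajectory the chain rule gives
\begin{equation*}
H_F(x(t))\dot{x}(t)=\frac{\mathrm{d}}{\mathrm{d}t}\nabla F(x(t)),
\end{equation*}
so a discretization may replace the Hessian term by finite differences of gradients, which is how schemes can avoid requiring $F$ to be twice differentiable. Second, for the strongly convex quadratic $F(x)=\frac{\mu}{2}x^2$ with $\mu>0$ (so $H_F=\mu$, $\nabla F(x)=\mu x$ and $F^*=0$), the ODE reduces to the linear equation
\begin{equation*}
\ddot{x}(t)+\Big(\frac{\alpha}{t}+\beta\mu\Big)\dot{x}(t)+\mu x(t)=0,
\end{equation*}
showing that the Hessian-driven term contributes a constant viscous damping $\beta\mu\,\dot{x}(t)$ on top of the vanishing damping $\frac{\alpha}{t}\dot{x}(t)$; in this case the error is simply $F(x(t))-F^*=\frac{\mu}{2}x(t)^2$.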
DOI: 10.48550/arxiv.2206.06853