Stochastic variance-reduced prox-linear algorithms for nonconvex composite optimization

Detailed description

Bibliographic details
Published in:	Mathematical Programming, 2022-09, Vol. 195 (1-2), pp. 649-691
Main authors:	Zhang, Junyu; Xiao, Lin
Format:	Article
Language:	English
Subjects:	Algorithms; Calculus of Variations and Optimal Control; Optimization; Combinatorics; Composite functions; Full Length Paper; Jacobians; Mapping; Mathematical and Computational Physics; Mathematical Methods in Physics; Mathematics; Mathematics and Statistics; Mathematics of Computing; Numerical Analysis; Optimization; Theoretical; Variance
Online access:	Full text

Description
Abstract:	We consider the problem of minimizing composite functions of the form $f(g(x)) + h(x)$, where $f$ and $h$ are convex functions (which can be nonsmooth) and $g$ is a smooth vector mapping. In addition, we assume that $g$ is the average of a finite number of component mappings or the expectation over a family of random component mappings. We propose a class of stochastic variance-reduced prox-linear algorithms for solving such problems and bound their sample complexities for finding an $\epsilon$-stationary point in terms of the total number of evaluations of the component mappings and their Jacobians. When $g$ is a finite average of $N$ components, we obtain sample complexity $O(N + N^{4/5}\epsilon^{-1})$ for both mapping and Jacobian evaluations. When $g$ is a general expectation, we obtain sample complexities of $O(\epsilon^{-5/2})$ and $O(\epsilon^{-3/2})$ for component mappings and their Jacobians, respectively. If in addition $f$ is smooth, then improved sample complexities of $O(N + N^{1/2}\epsilon^{-1})$ and $O(\epsilon^{-3/2})$ are derived for $g$ being a finite average and a general expectation, respectively, for both component mapping and Jacobian evaluations.
ISSN:	0025-5610 1436-4646
DOI:	10.1007/s10107-021-01709-z
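
Note:	For orientation, the problem class named in the abstract can be written out as below. This is a sketch only: the symbols $n$, $g_i$, $\xi$, $x^k$, and the proximal weight $M$ are notation assumed here and do not appear in this record, and the last line is the standard deterministic prox-linear step, not necessarily the exact update analysed in the article. The stochastic variants described in the abstract would presumably replace $g(x^k)$ and its Jacobian $g'(x^k)$ with sampled, variance-reduced estimates; those estimators are not specified in this record.

```latex
% Requires amsmath and amssymb. Notation is assumed for illustration only.
\begin{align*}
  &\text{Problem:}           && \min_{x \in \mathbb{R}^n} \; f(g(x)) + h(x) \\
  &\text{Finite average:}    && g(x) = \frac{1}{N} \sum_{i=1}^{N} g_i(x) \\
  &\text{Expectation:}       && g(x) = \mathbb{E}_{\xi}\bigl[ g_{\xi}(x) \bigr] \\
  &\text{Prox-linear step:}  && x^{k+1} = \operatorname*{argmin}_{x}
      \Bigl\{ f\bigl( g(x^k) + g'(x^k)(x - x^k) \bigr) + h(x)
      + \tfrac{M}{2} \lVert x - x^k \rVert^2 \Bigr\}
\end{align*}
```

Each step linearizes only the inner mapping $g$ and leaves the convex functions $f$ and $h$ intact, so the subproblem is convex even though the composite objective is not.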