Coordinate Descent Methods for DC Minimization: Optimality Conditions and Global Convergence
Format: Article
Language: English
Abstract: Difference-of-Convex (DC) minimization, referring to the problem of
minimizing the difference of two convex functions, has found rich applications
in statistical learning and has been studied extensively for decades. However,
existing methods are primarily based on multi-stage convex relaxation, which
only leads to the weak optimality of critical points. This paper proposes a
coordinate descent method for minimizing a class of DC functions based on
sequential nonconvex approximation. Our approach iteratively solves a nonconvex
one-dimensional subproblem globally, and it is guaranteed to converge to a
coordinate-wise stationary point. We prove that this new optimality condition
is always stronger than the standard critical point condition and the
directional point condition under a mild \textit{locally bounded nonconvexity
assumption}. For comparison, we also include in our study a naive variant of
coordinate descent methods based on sequential convex approximation. When the
objective function satisfies a \textit{globally bounded nonconvexity
assumption} and a \textit{Luo-Tseng error bound assumption}, the coordinate
descent methods achieve a \textit{Q-linear} convergence rate. Moreover, for
many applications of interest, we show that the nonconvex one-dimensional
subproblem can be computed exactly and efficiently using a breakpoint
searching method. Finally, we conduct extensive experiments on several
statistical learning tasks to demonstrate the superiority of our approach.
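
To make the procedure described in the abstract concrete, the following is a
minimal Python sketch, assuming one specific DC instance (capped-ell_1
regularized least squares) rather than the paper's general setting: each cyclic
coordinate update minimizes the nonconvex one-dimensional subproblem globally
by enumerating the breakpoints of its piecewise-quadratic structure. The
function names solve_1d and cd_dc, the chosen instance, and all parameter
values are illustrative assumptions, not the authors' implementation.

```python
# Sketch of coordinate descent with exact 1-D breakpoint search for
#   F(x) = 0.5*||A x - b||^2 + lam * sum_i min(|x_i|, theta),
# a DC problem, since min(|t|, theta) = |t| - max(|t| - theta, 0).
import numpy as np

def solve_1d(a, c, lam, theta):
    """Globally minimize phi(v) = 0.5*a*v^2 + c*v + lam*min(|v|, theta), a > 0.

    phi is piecewise quadratic with breakpoints at v = 0 and v = +/- theta.
    We collect the minimizer of each quadratic piece, clipped to its interval,
    plus the breakpoints themselves, and return the best candidate.
    """
    phi = lambda v: 0.5 * a * v * v + c * v + lam * min(abs(v), theta)
    cands = [0.0, theta, -theta]
    # Piece |v| <= theta: penalty lam*|v|, so a soft-thresholding step.
    v_soft = np.sign(-c) * max(abs(c) - lam, 0.0) / a
    cands.append(float(np.clip(v_soft, -theta, theta)))
    # Pieces |v| >= theta: penalty is the constant lam*theta.
    cands.append(max(-c / a, theta))
    cands.append(min(-c / a, -theta))
    return min(cands, key=phi)

def cd_dc(A, b, lam=0.1, theta=0.5, n_epochs=50):
    """Cyclic coordinate descent; assumes the columns of A are nonzero."""
    m, n = A.shape
    x = np.zeros(n)
    r = A @ x - b                   # running residual
    col_sq = (A * A).sum(axis=0)    # precomputed ||A_i||^2
    for _ in range(n_epochs):
        for i in range(n):
            a_i = A[:, i]
            r_no_i = r - x[i] * a_i          # residual without coordinate i
            c = a_i @ r_no_i                 # linear coefficient of the 1-D model
            x[i] = solve_1d(col_sq[i], c, lam, theta)
            r = r_no_i + x[i] * a_i
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((100, 20))
    x_true = np.zeros(20)
    x_true[:3] = [1.0, -2.0, 1.5]
    b = A @ x_true + 0.01 * rng.standard_normal(100)
    print(np.round(cd_dc(A, b), 3))
```

Because each update is a global minimum of F along one coordinate, the
objective is monotonically nonincreasing, which is what distinguishes this
scheme from a convex-surrogate update that may stop at a weaker critical point.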
Keywords: Coordinate Descent, DC Minimization, DC Programming,
Difference-of-Convex Programs, Nonconvex Optimization, Sparse Optimization,
Binary Optimization.
DOI: 10.48550/arxiv.2109.04228