Parallel Coordinate Descent Newton Method for Efficient ℓ1-Regularized Loss Minimization

Bibliographic Details
Published in: IEEE Transactions on Neural Networks and Learning Systems, 2019-01, Vol. 30 (11), p. 3233
Authors: Yatao An Bian, Xiong Li, Yuncai Liu, Ming-Hsuan Yang
Format: Article
Language: English
Online access: Full text
Abstract: Recent years have witnessed advances in parallel algorithms for large-scale optimization problems. Notwithstanding the demonstrated success, existing algorithms that parallelize over features are usually limited by divergence issues under high parallelism or require data preprocessing to alleviate these problems. In this paper, we propose a Parallel Coordinate Descent algorithm using approximate Newton steps (PCDN) that is guaranteed to converge globally without data preprocessing. The key component of the PCDN algorithm is the high-dimensional line search, which guarantees global convergence under high parallelism. The PCDN algorithm randomly partitions the feature set into b subsets/bundles of size P and sequentially processes each bundle by first computing the descent directions for all features in the bundle in parallel and then conducting a P-dimensional line search to compute the step size. We show that: 1) the PCDN algorithm is guaranteed to converge globally despite increasing parallelism and 2) the PCDN algorithm converges to the specified accuracy ε within a bounded iteration number T(ε), and T(ε) decreases with increasing parallelism. In addition, the data transfer and synchronization cost of the P-dimensional line search can be minimized by maintaining intermediate quantities. For concreteness, the proposed PCDN algorithm is applied to ℓ1-regularized logistic regression and ℓ1-regularized ℓ2-loss support vector machine problems. Experimental evaluations on seven benchmark data sets show that the PCDN algorithm exploits parallelism well and outperforms state-of-the-art methods.
ISSN: 2162-237X, 2162-2388
DOI: 10.1109/TNNLS.2018.2889976