Quasi-Newton Methods for Machine Learning: Forget the Past, Just Sample
Format: Article
Language: English
Abstract: We present two sampled quasi-Newton methods (sampled LBFGS and sampled LSR1) for solving empirical risk minimization problems that arise in machine learning. Unlike the classical variants of these methods, which sequentially build Hessian or inverse Hessian approximations as the optimization progresses, our proposed methods sample points randomly around the current iterate at every iteration to produce these approximations. As a result, the constructed approximations use more reliable (recent and local) information and do not depend on past iterate information that could be significantly stale. Our proposed algorithms are efficient in terms of accessed data points (epochs) and have enough concurrency to take advantage of parallel/distributed computing environments. We provide convergence guarantees for our proposed methods. Numerical tests on a toy classification problem, as well as on popular benchmark binary classification and neural network training tasks, show that the methods outperform their classical variants.
DOI: 10.48550/arxiv.1901.09997
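The abstract describes the key idea only at a high level: curvature pairs are regenerated from random samples around the current iterate at every iteration, rather than accumulated from past iterates. The sketch below illustrates one way this could look for the LBFGS variant. It is a minimal, hypothetical Python sketch, not the paper's exact S-LBFGS algorithm; the sampling radius, the number of pairs, the gradient-difference construction of the pairs, and the positive-curvature filter are all illustrative assumptions.

```python
import numpy as np

def sampled_lbfgs_direction(w, grad_fn, m=10, radius=1e-2, seed=None):
    """Return a descent direction from an L-BFGS two-loop recursion whose
    curvature pairs are sampled around the current iterate w (1-D array),
    instead of being carried over from past iterations.

    Sketch only: radius, m, and the pair-acceptance test are assumptions.
    """
    rng = np.random.default_rng(seed)
    g = grad_fn(w)
    S, Y = [], []
    for _ in range(m):
        s = radius * rng.standard_normal(w.shape)   # random displacement around w
        y = grad_fn(w + s) - g                      # local curvature information
        if s @ y > 1e-10:                           # keep only positive-curvature pairs
            S.append(s)
            Y.append(y)

    # Standard L-BFGS two-loop recursion on the freshly sampled pairs.
    q = g.copy()
    rhos = [1.0 / (y @ s) for s, y in zip(S, Y)]
    alphas = []
    for s, y, rho in zip(reversed(S), reversed(Y), reversed(rhos)):
        a = rho * (s @ q)
        alphas.append(a)
        q -= a * y
    if S:
        gamma = (S[-1] @ Y[-1]) / (Y[-1] @ Y[-1])   # initial Hessian scaling
        q *= gamma
    for (s, y, rho), a in zip(zip(S, Y, rhos), reversed(alphas)):
        b = rho * (y @ q)
        q += (a - b) * s
    return -q
```

Because the pairs are built fresh at each call from local samples, no history of past iterates needs to be stored, and the m gradient evaluations at the sampled points are independent and could be computed in parallel, which is consistent with the concurrency claim in the abstract.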