Cross-Entropy Optimization for Hyperparameter Optimization in Stochastic Gradient-based Approaches to Train Deep Neural Networks
Format: Article
Language: English
Online access: Order full text
Abstract: In this paper, we present a cross-entropy optimization method for hyperparameter optimization in stochastic gradient-based approaches to training deep neural networks. The value of a hyperparameter of a learning algorithm often has a great impact on the performance of a model, such as its convergence speed and generalization metrics. While in some cases the hyperparameters of a learning algorithm can be treated as part of the learnable parameters, in other scenarios the hyperparameters of a stochastic optimization algorithm such as Adam [5] and its variants are either fixed as constants or varied monotonically over time. We give an in-depth analysis of the presented method in the framework of expectation maximization (EM). The presented algorithm of cross-entropy optimization for hyperparameter optimization of a learning algorithm (CEHPO) is equally applicable to other areas of optimization problems in deep learning. We hope that the presented methods can provide different perspectives and offer some insights for optimization problems in different areas of machine learning and beyond.
DOI: 10.48550/arxiv.2409.09240
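To make the idea in the abstract concrete, the following is a minimal sketch of the generic cross-entropy method (CEM) applied to tuning a single hyperparameter, here the Adam learning rate searched in log space. It is not the paper's CEHPO algorithm: the function `evaluate_candidate` is a hypothetical placeholder that would normally train a model for a few steps and return a validation loss, and here uses a synthetic objective instead.

```python
import numpy as np

def evaluate_candidate(log_lr: float) -> float:
    """Hypothetical placeholder objective. In practice this would train a
    model briefly with Adam at learning rate 10**log_lr and return a
    validation loss; here it is a synthetic quadratic with its minimum
    near a learning rate of 1e-3."""
    return (log_lr - np.log10(1e-3)) ** 2

def cem_search(n_iters=10, n_samples=20, elite_frac=0.2, seed=0):
    """Generic cross-entropy method over one hyperparameter (log10 of the
    learning rate): sample candidates from a Gaussian, keep the elite
    fraction with the lowest objective, and refit the Gaussian to the
    elites before the next iteration."""
    rng = np.random.default_rng(seed)
    mu, sigma = -2.0, 1.0                      # initial search distribution in log10 space
    n_elite = max(1, int(elite_frac * n_samples))
    for _ in range(n_iters):
        samples = rng.normal(mu, sigma, size=n_samples)
        scores = np.array([evaluate_candidate(s) for s in samples])
        elites = samples[np.argsort(scores)[:n_elite]]
        mu, sigma = elites.mean(), elites.std() + 1e-6  # refit; small floor avoids collapse
    return 10.0 ** mu

if __name__ == "__main__":
    print(f"selected learning rate: {cem_search():.2e}")
```

The refitting step above is the standard CEM update; the abstract's EM-based analysis refers to the paper's own CEHPO formulation, which this sketch only approximates in spirit.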