BGADAM: Boosting based Genetic-Evolutionary ADAM for Neural Network Optimization

For various optimization methods, gradient descent-based algorithms can achieve outstanding performance and have been widely used in various tasks. Among those commonly used algorithms, ADAM owns many advantages such as fast convergence with both the momentum term and the adaptive learning rate. How...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Hauptverfasser:	Bai, Jiyang, Ren, Yuxiang, Zhang, Jiawei
Format:	Artikel
Sprache:	eng
Schlagworte:	Computer Science - Learning Computer Science - Neural and Evolutionary Computing
Online-Zugang:	Volltext bestellen
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	For various optimization methods, gradient descent-based algorithms can achieve outstanding performance and have been widely used in various tasks. Among those commonly used algorithms, ADAM owns many advantages such as fast convergence with both the momentum term and the adaptive learning rate. However, since the loss functions of most deep neural networks are non-convex, ADAM also shares the drawback of getting stuck in local optima easily. To resolve such a problem, the idea of combining genetic algorithm with base learners is introduced to rediscover the best solutions. Nonetheless, from our analysis, the idea of combining genetic algorithm with a batch of base learners still has its shortcomings. The effectiveness of genetic algorithm can hardly be guaranteed if the unit models converge to close or the same solutions. To resolve this problem and further maximize the advantages of genetic algorithm with base learners, we propose to implement the boosting strategy for input model training, which can subsequently improve the effectiveness of genetic algorithm. In this paper, we introduce a novel optimization algorithm, namely Boosting based Genetic ADAM (BGADAM). With both theoretic analysis and empirical experiments, we will show that adding the boosting strategy into the BGADAM model can help models jump out the local optima and converge to better solutions.
DOI:	10.48550/arxiv.1908.08015