Improved Jacobian Eigen-Analysis Scheme for Accelerating Learning in Feedforward Neural Networks

Bibliographic details
Published in:	Cognitive computation 2015-02, Vol.7 (1), p.86-102
Main authors:	Ampazis, N., Perantonis, S. J., Drivaliaris, D.
Format:	Article
Language:	English
Subjects:	Algorithms; Artificial Intelligence; Artificial neural networks; Back propagation; Back propagation networks; Biomedical and Life Sciences; Biomedicine; Closed form solutions; Computation by Abstract Devices; Computational Biology/Bioinformatics; Dynamic models; Dynamical systems; Eigenvalues; Eigenvectors; Equivalence; Exact solutions; Minima; Neural networks; Neurons; Neurosciences; Optimization; Perturbation theory; Propagation; Redundancy; System theory; Systems analysis
Online access:	Full text

Description
Abstract:	An important problem when training feedforward artificial neural networks is the occurrence of temporary minima, which considerably slow down learning convergence. In a series of previous works, we analyzed this problem by deriving a dynamical system model that is valid in the vicinity of temporary minima caused by redundancy of nodes in the hidden layer. We also demonstrated how to incorporate the characteristics of the dynamical model into a constrained optimization algorithm that allows prompt abandonment of temporary minima and acceleration of learning. In this work, we revisit the constrained optimization framework in order to develop a closed-form solution for the evolution of critical dynamical system model parameters during learning in the vicinity of temporary minima. We show that this formalism is equivalent to the matrix perturbation theory discussed in a previous work, but that the closed-form solution presented here yields a weight update rule whose cost is linear in the number of the network’s weights. In terms of computational complexity, this is equivalent to the simple back-propagation weight update rule. Simulations demonstrate the computational efficiency of this approach and its effectiveness in reducing the time spent in the vicinity of temporary minima, as suggested by the analysis.
ISSN:	1866-9956 1866-9964
DOI:	10.1007/s12559-014-9263-2