Optimized injection of noise in activation functions to improve generalization of neural networks

Bibliographic details
Published in: Chaos, solitons and fractals, 2024-01, Vol. 178, p. 114363, Article 114363
Main authors: Duan, Fabing; Chapeau-Blondeau, François; Abbott, Derek
Format: Article
Language: English
Description
Summary: This paper proposes a flexible probabilistic activation function that enhances the training and operation of artificial neural networks by intentionally injecting noise to gain additional control over the response of each neuron. During the learning phase, the level of injected noise is iteratively optimized by gradient descent, realizing a form of adaptive stochastic resonance. Starting from simple, non-differentiable hard-threshold neuronal responses, controlled injection of noise gives access to a wide range of useful activation functions that are sufficiently differentiable to enable gradient-descent learning for both the neurons and the injected-noise levels. Experimental results on function approximation show the injected noise generally converging to non-vanishing optimal levels that are associated with improved generalization of the neural networks. A theoretical explanation of the generalization improvement, based on the path-norm bound, is presented. With noise injected into a deep neural network, experiments on image classification likewise yield non-vanishing optimal noise levels that achieve better test accuracy. The proposed probabilistic activation functions show the potential of adaptive stochastic resonance for useful applications in machine learning.
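
The mechanism described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes zero-mean Gaussian noise with standard deviation `sigma` added to the input of a 0/1 hard-threshold neuron, and all function names are hypothetical. Under that assumption, the expected output of the noisy neuron is the Gaussian cumulative distribution function evaluated at `x / sigma`, which is smooth in both the input `x` and the noise level `sigma`, so both can in principle be adjusted by gradient descent.

```python
import math
import random


def hard_threshold(u):
    """Non-differentiable 0/1 threshold response."""
    return 1.0 if u > 0.0 else 0.0


def noisy_activation(x, sigma):
    """Expected output E[hard_threshold(x + sigma * n)] for n ~ N(0, 1).

    Assumption for illustration: Gaussian noise, so the result is the
    standard normal CDF evaluated at x / sigma.
    """
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))


def grad_x(x, sigma):
    """Derivative of noisy_activation with respect to the input x."""
    z = x / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def grad_sigma(x, sigma):
    """Derivative of noisy_activation with respect to the noise level sigma."""
    z = x / sigma
    return -(x / sigma ** 2) * math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def monte_carlo_activation(x, sigma, trials=20000, seed=0):
    """Empirical mean of the noisy hard-threshold neuron (sanity check)."""
    rng = random.Random(seed)
    hits = sum(hard_threshold(x + rng.gauss(0.0, sigma)) for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    x, sigma = 0.5, 1.0
    print("closed form :", noisy_activation(x, sigma))
    print("monte carlo :", monte_carlo_activation(x, sigma))
    print("d/dx        :", grad_x(x, sigma))
    print("d/dsigma    :", grad_sigma(x, sigma))
```

As `sigma` shrinks toward zero the averaged response approaches the original hard threshold, while larger values give a gentler slope; a learned, non-zero `sigma` therefore selects one member of a family of activation shapes.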
ISSN: 0960-0779, 1873-2887
DOI: 10.1016/j.chaos.2023.114363