Towards activation function search for long short-term model network: A differential evolution based approach


Bibliographic details
Published in: Journal of King Saud University. Computer and information sciences 2022-06, Vol.34 (6), p.2637-2650
Main authors: K., Vijayaprabakaran, K., Sathiyamurthy
Format: Article
Language: English
Online access: Full text
Description
Summary: In Deep Neural Networks (DNNs), several architectures have been proposed for complex tasks such as machine translation, natural language processing, and time series forecasting. Long Short-Term Memory (LSTM), a deep neural network architecture, has become popular for solving sequential and time series problems and has achieved remarkable results. When building an LSTM model, many hyper-parameters, such as the activation function, loss function, and optimizer, need to be set in advance. These hyper-parameters play a significant role in the performance of DNNs. This work concentrates on finding a novel activation function that can replace the existing activation functions, such as sigmoid and tanh, in the LSTM. A search methodology based on the Differential Evolution Algorithm (DEA) is proposed to discover the novel activation function for the LSTM network. The proposed methodology finds an optimal activation function that outperforms traditional activation functions such as sigmoid (σ), hyperbolic tangent (tanh), and Rectified Linear Unit (ReLU). The activation function discovered by the DEA methodology is sinh(x) + sinh⁻¹(x), named the Combined Hyperbolic Sine (comb-H-sine) function. The proposed comb-H-sine activation function outperforms the traditional functions in LSTM, with accuracies of 98.83%, 93.49%, and 78.38% on the MNIST, IMDB, and UCI HAR datasets, respectively.
ISSN: 1319-1578, 2213-1248
DOI:10.1016/j.jksuci.2020.04.015
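
The comb-H-sine formula given in the summary can be illustrated with a short NumPy sketch. This is not the authors' implementation: the function names `comb_h_sine` and `comb_h_sine_grad` are hypothetical, the use of NumPy is a choice made here, and the sketch assumes that "sinh-1(x)" in the record denotes the inverse hyperbolic sine (`arcsinh`), not the reciprocal 1/sinh(x).

```python
import numpy as np


def comb_h_sine(x):
    """Hypothetical helper: sinh(x) + asinh(x), applied elementwise.

    Assumes sinh^-1 in the abstract means the inverse hyperbolic sine.
    """
    x = np.asarray(x, dtype=np.float64)
    return np.sinh(x) + np.arcsinh(x)


def comb_h_sine_grad(x):
    """Analytic derivative of the sketch above: cosh(x) + 1 / sqrt(x^2 + 1)."""
    x = np.asarray(x, dtype=np.float64)
    return np.cosh(x) + 1.0 / np.sqrt(x * x + 1.0)


if __name__ == "__main__":
    xs = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
    print(comb_h_sine(xs))
    print(comb_h_sine_grad(xs))
```

Under this reading the function is odd, passes through the origin, and is unbounded, unlike sigmoid and tanh, whose outputs saturate. Its derivative is at least 2 everywhere, so the gradient does not vanish for large inputs, although it can grow quickly because of the cosh term.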