ELSTM: An improved long short‐term memory network language model for sequence learning


Bibliographic Details
Published in: Expert Systems, 2024-06, Vol. 41 (6), p. n/a
Authors: Li, Zhi; Wang, Qing; Wang, Jia‐Qiang; Qu, Han‐Bing; Dong, Jichang; Dong, Zhi
Format: Article
Language: English
Online access: Full text
Description
Abstract: The gated structure of the long short‐term memory (LSTM) network alleviates the vanishing and exploding gradient problems of the recurrent neural network (RNN), and it has received widespread attention in sequence learning tasks such as text analysis. Although LSTM handles long-range dependencies well, information loss often occurs in long-distance transmission. To address the computational complexity and gradient dispersion of the traditional LSTM, we propose a new model called ELSTM. The model simplifies the input gate of the LSTM, reducing time complexity by removing some components, and improves the output gate. By introducing an exponential linear unit (ELU) activation layer, the problem of gradient dispersion is alleviated. Compared with several existing models on language sequence prediction, the proposed model greatly reduces computation time and lowers perplexity, showing good performance.
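
The abstract does not give the exact ELSTM equations, so the following Python/PyTorch snippet is only an illustrative sketch of the idea it describes: an LSTM-style cell whose input gate is simplified away (here coupled to the forget gate as 1 - f) and whose output path applies an ELU activation. The class name ELSTMCell, the specific gate coupling, and all parameter shapes are assumptions for illustration, not the authors' formulation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class ELSTMCell(nn.Module):
    # Hypothetical ELSTM-style cell: the separate input gate is dropped
    # (coupled to the forget gate), and ELU is used on the output path
    # to ease gradient dispersion. Not the paper's exact equations.
    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        # One linear map produces the forget gate, output gate and candidate state.
        self.linear = nn.Linear(input_size + hidden_size, 3 * hidden_size)

    def forward(self, x, state):
        h_prev, c_prev = state
        z = self.linear(torch.cat([x, h_prev], dim=-1))
        f, o, g = z.chunk(3, dim=-1)
        f = torch.sigmoid(f)             # forget gate
        o = torch.sigmoid(o)             # output gate
        g = torch.tanh(g)                # candidate cell state
        c = f * c_prev + (1.0 - f) * g   # coupled input gate: i = 1 - f
        h = o * F.elu(c)                 # ELU activation on the output path
        return h, (h, c)

# Minimal usage example with random data.
cell = ELSTMCell(input_size=16, hidden_size=32)
h = torch.zeros(1, 32)
c = torch.zeros(1, 32)
for t in range(5):
    x_t = torch.randn(1, 16)
    y, (h, c) = cell(x_t, (h, c))
print(y.shape)  # torch.Size([1, 32])

Coupling the input gate to the forget gate is one common way to remove a gate's parameters from an LSTM; it is used here only to make the "simplified input gate" idea concrete.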
ISSN: 0266-4720, 1468-0394
DOI: 10.1111/exsy.13211