Convolutional-Recurrent Neural Networks With Multiple Attention Mechanisms for Speech Emotion Recognition
Published in: | IEEE Transactions on Cognitive and Developmental Systems 2022-12, Vol.14 (4), p.1564-1573 |
---|---|
Main authors: | , , , , |
Format: | Article |
Language: | English |
Keywords: | |
Summary: | Speech emotion recognition (SER) aims to endow machines with the intelligence in perceiving latent affective components from speech. However, the existing works on deep-learning-based SER make it difficult to jointly consider time-frequency and sequential information in speech due to their structures, which may lead to deficiencies in exploring reasonable local emotional representations. In this regard, we propose a convolutional-recurrent neural network with multiple attention mechanisms (CRNN-MAs) for SER in this article, including the paralleled convolutional neural network (CNN) and long short-term memory (LSTM) modules, using extracted Mel-spectrums and frame-level features, respectively, in order to acquire time-frequency and sequential information simultaneously. Furthermore, we set three strategies for the proposed CRNN-MA: 1) a multiple self-attention layer in the CNN module on frame-level weights; 2) a multidimensional attention layer as the input features of the LSTM; and 3) a fusion layer summarizing the features of the two modules. Experimental results on three conventional SER corpora demonstrate the effectiveness of the proposed approach through using the convolutional-recurrent and multiple-attention modules, compared with other related models and existing state-of-the-art approaches. |
---|---|
ISSN: | 2379-8920 2379-8939 |
DOI: | 10.1109/TCDS.2021.3123979 |
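
The summary describes two parallel branches whose outputs are weighted by attention and then merged in a fusion layer, but this record gives no implementation details. The following is only a toy sketch of that data flow in plain Python, not the authors' model. Everything in it is an assumption made for illustration: the dot-product form of the attention, the fixed query vectors, concatenation as the fusion rule, the random stand-in features, and all names and sizes (`attention_pool`, `fuse`, `T`, `D_CNN`, `D_LSTM`).

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(frames, query):
    """Weight each frame by softmax(frame . query) and sum the frames.

    frames: list of T vectors (each a list of D floats)
    query:  list of D floats (learned in a real model; fixed here)
    Returns (pooled vector of length D, list of T attention weights).
    """
    scores = [sum(f * q for f, q in zip(frame, query)) for frame in frames]
    weights = softmax(scores)
    dim = len(query)
    pooled = [
        sum(w * frame[d] for w, frame in zip(weights, frames))
        for d in range(dim)
    ]
    return pooled, weights


def fuse(cnn_vec, lstm_vec):
    """Merge the two branch summaries by concatenation (assumed fusion rule)."""
    return cnn_vec + lstm_vec


random.seed(0)
T, D_CNN, D_LSTM = 5, 4, 3  # hypothetical frame count and feature sizes

# Random stand-ins for the per-frame outputs of the two parallel branches.
cnn_frames = [[random.gauss(0, 1) for _ in range(D_CNN)] for _ in range(T)]
lstm_frames = [[random.gauss(0, 1) for _ in range(D_LSTM)] for _ in range(T)]

cnn_summary, cnn_weights = attention_pool(cnn_frames, [0.5] * D_CNN)
lstm_summary, lstm_weights = attention_pool(lstm_frames, [0.5] * D_LSTM)
fused = fuse(cnn_summary, lstm_summary)

print(len(fused))  # D_CNN + D_LSTM
```

In a trained network the query vectors and the branch outputs would be learned, and the fused vector would feed an emotion classifier. The sketch only shows how per-frame features from each branch can be collapsed by attention weights and then combined.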