Study on emotion recognition and companion Chatbot using deep neural network

Bibliographic details
Published in: Multimedia tools and applications, 2020-07, Vol. 79 (27-28), p. 19629-19657
Main authors: Lee, Ming-Che; Chiang, Shu-Yin; Yeh, Sheng-Cheng; Wen, Ting-Feng
Format: Article
Language: English
Description
Summary: As technology has developed, research on speech emotion recognition and semantic analysis has grown in importance. This research is applied primarily in companion robots, technology products, and medical settings. This study proposes a communication system with speech emotion recognition. For speech emotion recognition, the system pre-processes speech with a sound data enhancement method and transforms the sound into a spectrogram using MFCC (Mel Frequency Cepstral Coefficients). GoogLeNet, a CNN (Convolutional Neural Network), is then applied to recognize five emotions (peace, happy, sad, angry, and fear), with a top recognition accuracy of 79.81%. For semantic analysis, the training texts are divided into two categories, positive and negative, and the chat conversations are generated with the Seq2Seq framework, an RNN (Recurrent Neural Network) architecture. The system has two parts, a client and a server. The client is an Android application, and the server runs on Ubuntu Linux combined with a web server. With this two-terminal framework, users record their voice in the app on their cell phone and upload the voice file to the server. The voice then undergoes speech emotion recognition by the CNN and semantic analysis by the RNN, so that the system functions as a chatbot that responds positively or negatively based on the detected emotion and shows the results in the app on the user's cell phone. The main contributions of this research are: 1) it introduces Chinese word vectors to the robot dialogue system, effectively improving dialogue tolerance and semantic interpretation; 2) whereas the traditional method of emotion identification must first tokenize the Chinese words, analyze the clauses and parts of speech, and capture the emotional keywords before interpretation by an expert system, this study classifies the input directly with the convolutional neural network after the input sentence is converted into a spectrogram by MFCC; and 3) in addition to implementing the companion robot, the system can collect the user's emotional index for analysis by a back-end care organization. Moreover, compared with commercial humanoid companion robots, this study is delivered as an app, which is easier to use and more economical.
ISSN: 1380-7501, 1573-7721
DOI: 10.1007/s11042-020-08841-6
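
The summary describes converting speech into a spectrogram with MFCC before CNN classification. As a rough illustration of what that feature step computes, the following is a minimal standard-library sketch for a single audio frame. It is not the authors' implementation: the function names, the 16 kHz sample rate, the 256-sample frame, the 26 mel filters, and the 13 coefficients are all assumptions chosen for the example, and the naive DFT is used only to keep the code dependency-free. Stacking the coefficient vectors of consecutive frames would give a two-dimensional, image-like input of the kind a CNN can classify.

```python
import cmath
import math


def hz_to_mel(hz):
    # Common mel-scale formula; 1000 Hz maps to roughly 1000 mel.
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Apply a Hamming window, then return |DFT|^2 / N for bins 0..N/2."""
    n = len(frame)
    windowed = [s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        # Naive O(N^2) DFT; a real system would use an FFT library.
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Build triangular filters whose centers are evenly spaced in mel."""
    low, high = hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0)
    mels = [low + (high - low) * i / (n_filters + 1)
            for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate))
            for m in mels]
    bank = []
    for j in range(1, n_filters + 1):
        left, center, right = bins[j - 1], bins[j], bins[j + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(left, center):      # rising edge
            row[k] = (k - left) / (center - left)
        for k in range(center, right):     # peak and falling edge
            row[k] = (right - k) / (right - center)
        bank.append(row)
    return bank


def mfcc_frame(frame, sample_rate, n_filters=26, n_coeffs=13):
    """Log mel-band energies followed by a DCT-II (unnormalized)."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    # Small epsilon avoids log(0) for empty bands.
    log_energies = [math.log(sum(w * p for w, p in zip(row, spec)) + 1e-10)
                    for row in bank]
    return [sum(e * math.cos(math.pi * c * (m + 0.5) / n_filters)
                for m, e in enumerate(log_energies))
            for c in range(n_coeffs)]


# Illustrative input: one 256-sample frame of a 440 Hz tone at 16 kHz.
tone = [math.sin(2 * math.pi * 440.0 * i / 16000) for i in range(256)]
coeffs = mfcc_frame(tone, 16000)
```

Here `coeffs` holds 13 cepstral coefficients for the one frame. In practice a library routine would replace the hand-written DFT and filterbank, and the frame length, hop size, and filter count would be tuned to the recordings.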