Multimodal deep learning emotion classification method based on voice and video
Main authors: (not listed)
Format: Patent
Language: Chinese; English
Online access: Order full text
Abstract: The invention discloses a multimodal deep learning emotion classification method based on voice and video. The method comprises the following steps: Step 1, acquiring a voice-and-video dual-modality emotion data set; Step 2, preprocessing the voice data and the video data respectively; Step 3, taking the preprocessed spectrogram and the preprocessed video image as input, and extracting a voice emotion feature vector and a video emotion feature vector through a voice feature extraction network and a video emotion feature extraction network, respectively; and Step 4, concatenating the extracted voice emotion feature vector f_audio and the extracted video emotion feature vector f_video to obtain a fused emotion feature vector f_e, taking the fused feature f_e as input, and classifying emotions through a fully connected neural network to obtain emotion labels. According to the method, emotion features are extracted from the two different modalities of voice and video, the resulting features are concatenated, and finally emotion classification is performed on the fused feature.
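As a rough sketch of Steps 3 and 4 (feature extraction, concatenation, and fully connected classification), the PyTorch code below uses placeholder feature extractors and illustrative dimensions; the abstract does not specify the actual network architectures, layer sizes, or number of emotion classes, so all of those are assumptions here.

```python
import torch
import torch.nn as nn

class LateFusionEmotionClassifier(nn.Module):
    """Late-fusion sketch: extract f_audio and f_video, concatenate
    them into f_e, and classify f_e with a fully connected network.
    All dimensions and class counts are illustrative assumptions."""

    def __init__(self, audio_dim=256, video_dim=256, num_emotions=7):
        super().__init__()
        # Placeholder extractors; the patent's actual voice and video
        # feature extraction networks are not described in the abstract.
        self.audio_net = nn.Sequential(nn.Flatten(), nn.LazyLinear(audio_dim), nn.ReLU())
        self.video_net = nn.Sequential(nn.Flatten(), nn.LazyLinear(video_dim), nn.ReLU())
        # Step 4: classify the fused vector f_e with a fully connected net.
        self.classifier = nn.Sequential(
            nn.Linear(audio_dim + video_dim, 128),
            nn.ReLU(),
            nn.Linear(128, num_emotions),
        )

    def forward(self, spectrogram, video_image):
        f_audio = self.audio_net(spectrogram)       # voice emotion feature vector
        f_video = self.video_net(video_image)       # video emotion feature vector
        f_e = torch.cat([f_audio, f_video], dim=1)  # fused emotion feature vector
        return self.classifier(f_e)                 # logits over emotion labels

# Example: a batch of 8 spectrograms and 8 video frames (shapes assumed).
model = LateFusionEmotionClassifier()
logits = model(torch.randn(8, 1, 128, 128), torch.randn(8, 3, 224, 224))
print(logits.shape)  # torch.Size([8, 7])
```

Concatenation-based late fusion keeps the two branches independent until the classifier, so each extractor can be trained or pretrained on its own modality before the fused classifier is fit.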