Speech emotion recognition based on genetic algorithm–decision tree fusion of deep and acoustic features


Bibliographic details
Published in: ETRI Journal 2022, 44(3), pp. 462-475
Main authors: Sun, Linhui; Li, Qiu; Fu, Sheng; Li, Pingan
Format: Article
Language: English
Description
Summary: Although researchers have proposed numerous techniques for speech emotion recognition, its performance remains unsatisfactory in many application scenarios. In this study, we propose a speech emotion recognition model based on a genetic algorithm (GA)–decision tree (DT) fusion of deep and acoustic features. To express speech emotional information more comprehensively, frame-level deep and acoustic features are first extracted from a speech signal. Next, five statistics of these features are calculated to obtain utterance-level features. The Fisher feature selection criterion is then applied to select high-performance features and remove redundant information. In the feature fusion stage, the GA is used to adaptively search for the best feature fusion weight. Finally, the fused feature is fed to a DT support vector machine model to realize the proposed speech emotion recognition model. Experimental results on the Berlin speech emotion database and the Chinese emotion speech database indicate that the proposed model outperforms an average weight fusion method.
ISSN: 1225-6463, 2233-7326
DOI: 10.4218/etrij.2020-0458
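
Illustrative sketch: The pipeline outlined in the summary (utterance-level statistics, Fisher-based feature scoring, and a GA search for a fusion weight) can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation. The record does not name the five statistics, so mean, standard deviation, maximum, minimum, and median are assumed here. The Fisher score uses one common between-class over within-class form. The fusion is assumed to be a single scalar weight w applied to the deep features and 1 - w to the acoustic features. A nearest-centroid classifier stands in for the DT support vector machine. All function names are hypothetical.

```python
import numpy as np

def utterance_stats(frames):
    """Collapse (n_frames, n_feats) frame-level features into one vector.
    Assumed statistics: mean, std, max, min, median (not given in the record)."""
    return np.concatenate([frames.mean(axis=0), frames.std(axis=0),
                           frames.max(axis=0), frames.min(axis=0),
                           np.median(frames, axis=0)])

def fisher_scores(X, y):
    """Per-feature Fisher score: between-class scatter / within-class scatter."""
    mu = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for c in np.unique(y):
        Xc = X[y == c]
        between += len(Xc) * (Xc.mean(axis=0) - mu) ** 2
        within += len(Xc) * Xc.var(axis=0)
    return between / (within + 1e-12)  # small constant avoids division by zero

def fuse(deep, acoustic, w):
    """Assumed fusion rule: weight deep features by w, acoustic by (1 - w)."""
    return np.hstack([w * deep, (1.0 - w) * acoustic])

def centroid_accuracy(X_train, y_train, X_test, y_test):
    """Stand-in classifier (nearest class centroid), NOT a DT-SVM."""
    classes = np.unique(y_train)
    centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in classes])
    dists = ((X_test[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return float((classes[dists.argmin(axis=1)] == y_test).mean())

def ga_search_weight(fitness, pop_size=20, generations=15,
                     mutation_std=0.1, seed=0):
    """Tiny real-coded GA over a scalar weight in [0, 1]."""
    rng = np.random.default_rng(seed)
    pop = rng.random(pop_size)
    for _ in range(generations):
        scores = np.array([fitness(w) for w in pop])
        order = np.argsort(scores)[::-1]           # best individuals first
        parents = pop[order[:pop_size // 2]]       # truncation selection
        a = rng.choice(parents, pop_size - 1)
        b = rng.choice(parents, pop_size - 1)
        children = 0.5 * (a + b)                   # arithmetic crossover
        children += rng.normal(0.0, mutation_std, pop_size - 1)  # mutation
        # Elitism: carry the best individual over unchanged.
        pop = np.concatenate([[pop[order[0]]], np.clip(children, 0.0, 1.0)])
    scores = np.array([fitness(w) for w in pop])
    return float(pop[scores.argmax()])
```

In use, one would keep the top-scoring columns according to fisher_scores for each feature set, then call ga_search_weight with a fitness such as lambda w: centroid_accuracy(fuse(deep_tr, ac_tr, w), y_tr, fuse(deep_va, ac_va, w), y_va), where the training and validation arrays are placeholders for held-out splits. Comparing the result against w = 0.5 mirrors the average weight fusion baseline mentioned in the summary.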