Speech emotion recognition based on genetic algorithm–decision tree fusion of deep and acoustic features
Published in: | ETRI Journal 2022, 44(3), pp. 462-475 |
---|---|
Main authors: | , , , |
Format: | Article |
Language: | eng |
Abstract: | Although researchers have proposed numerous techniques for speech emotion recognition, its performance remains unsatisfactory in many application scenarios. In this study, we propose a speech emotion recognition model based on a genetic algorithm (GA)–decision tree (DT) fusion of deep and acoustic features. To express the emotional information in speech more comprehensively, frame-level deep and acoustic features are first extracted from the speech signal. Next, five kinds of statistical variables of these features are calculated to obtain utterance-level features. The Fisher feature selection criterion is then employed to select high-performance features and remove redundant information. In the feature fusion stage, the GA is used to adaptively search for the best feature fusion weight. Finally, using the fused feature, the proposed speech emotion recognition model is realized with a DT support vector machine model. Experimental results on the Berlin speech emotion database and the Chinese emotion speech database indicate that the proposed model outperforms an average weight fusion method. |
---|---|
ISSN: | 1225-6463 2233-7326 |
DOI: | 10.4218/etrij.2020-0458 |
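
The feature pipeline summarized in the abstract can be illustrated with a short NumPy sketch. It is not the authors' implementation, and several details are assumptions because the abstract does not give them: the five statistics (mean, standard deviation, maximum, minimum, median), the particular form of the Fisher score (between-class scatter over within-class scatter, per feature), and the fusion rule (concatenating the two feature blocks scaled by `w` and `1 - w`). All function names are hypothetical. The GA search over the fusion weight and the DT support vector machine classifier are left out.

```python
import numpy as np


def utterance_stats(frames):
    """Collapse a (n_frames, n_dims) array into one utterance-level vector.

    Assumption: the five statistics are mean, std, max, min and median;
    the abstract only says "five kinds of statistical variables".
    """
    frames = np.asarray(frames, dtype=float)
    return np.concatenate([
        frames.mean(axis=0),
        frames.std(axis=0),
        frames.max(axis=0),
        frames.min(axis=0),
        np.median(frames, axis=0),
    ])


def fisher_scores(X, y, eps=1e-12):
    """Per-feature Fisher score: between-class over within-class scatter.

    Assumption: this common form of the criterion; the paper may use a variant.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for label in np.unique(y):
        Xc = X[y == label]
        between += len(Xc) * (Xc.mean(axis=0) - overall_mean) ** 2
        within += len(Xc) * Xc.var(axis=0)
    # eps avoids division by zero for features that are constant within classes
    return between / (within + eps)


def select_top_k(X, y, k):
    """Return the column indices of the k highest-scoring features, sorted."""
    order = np.argsort(fisher_scores(X, y))[::-1]
    return np.sort(order[:k])


def fuse(deep, acoustic, w):
    """Weighted fusion of two feature blocks (assumed rule, see lead-in).

    w is the single fusion weight that a search procedure such as a GA
    would tune; w = 0.5 corresponds to an equal-weight baseline.
    """
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    deep = np.asarray(deep, dtype=float)
    acoustic = np.asarray(acoustic, dtype=float)
    return np.concatenate([w * deep, (1.0 - w) * acoustic], axis=-1)
```

In this sketch a weight search would repeatedly call `fuse` with candidate values of `w`, train a classifier on the fused vectors, and keep the weight with the best validation accuracy.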