Multi-Attribute Feature Extraction and Selection for Emotion Recognition from Speech through Machine Learning
Published in: Traitement du Signal, Vol. 40, No. 1, February 2023, pp. 265-275
Format: Article
Language: English
Online access: Full text
Abstract: Speech-based emotion recognition remains challenging due to its complexity, despite being widely used in emotion-related applications. In this paper, we developed a framework based on three feature groups: prosodic, wavelet, and spectral features. Under prosodic features, pitch and energy are considered, while under wavelet features, the approximation and detail sub-bands at four scales are considered. Mel-Frequency Cepstral Coefficients (MFCC), formants, and the Long-Term Average Spectrum (LTAS) are extracted from the speech signals as spectral features. The significant features are then selected based on nonlinear statistics, and dimensionality reduction is accomplished through the Fisher criterion. Spearman rank correlation is employed to obtain the nonlinear statistics in the correlation analysis. For classification, a Support Vector Machine and a Decision Tree are used. The proposed method is evaluated on the RAVDESS, SAVEE, EMOVO, and URDU databases, and the observed recognition rates are approximately 79.66%, 88.99%, 87.68%, and 95.78%, respectively.
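A minimal sketch of the pipeline the abstract describes: prosodic (pitch, energy), wavelet (sub-band energies at four scales), and MFCC-based spectral features, followed by Spearman-rank-correlation feature selection and an SVM classifier. The wavelet family (db4), the per-feature statistics, and the `top_k` cutoff are illustrative assumptions, and the formant, LTAS, and Fisher-criterion reduction steps are omitted; this is not the authors' exact implementation.

```python
import numpy as np
import librosa
import pywt
from scipy.stats import spearmanr
from sklearn.svm import SVC

def extract_features(y, sr):
    """Concatenate prosodic, wavelet, and spectral descriptors for one utterance."""
    # Prosodic: pitch contour (YIN) and short-time energy statistics
    f0 = librosa.yin(y, fmin=50, fmax=400, sr=sr)
    energy = librosa.feature.rms(y=y)[0]
    prosodic = [np.mean(f0), np.std(f0), np.mean(energy), np.std(energy)]

    # Wavelet: approximation and detail sub-bands at four scales (db4 assumed)
    coeffs = pywt.wavedec(y, 'db4', level=4)
    wavelet = [np.log(np.sum(c ** 2) + 1e-12) for c in coeffs]  # log sub-band energies

    # Spectral: MFCC statistics (formants and LTAS not included in this sketch)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
    spectral = np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

    return np.concatenate([prosodic, wavelet, spectral])

def select_features(X, labels, top_k=20):
    """Rank features by |Spearman rank correlation| with integer-encoded labels."""
    scores = [abs(spearmanr(X[:, j], labels)[0]) for j in range(X.shape[1])]
    return np.argsort(scores)[::-1][:top_k]

# Usage sketch, assuming X_train/X_test are built by stacking extract_features
# over a labelled corpus such as RAVDESS:
# keep = select_features(X_train, y_train)
# clf = SVC(kernel='rbf').fit(X_train[:, keep], y_train)
# print(clf.score(X_test[:, keep], y_test))
```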
ISSN: 0765-0019, 1958-5608
DOI: 10.18280/ts.400126