Mexican Emotional Speech Database (MESD)
Format: | Dataset |
Language: | English |
Summary: | The Mexican Emotional Speech Database (MESD) provides single-word utterances for anger, disgust, fear, happiness, neutral, and sadness affective prosodies with Mexican cultural shaping. The MESD was recorded by adult and child non-professional actors: 3 female, 2 male, and 6 child voices are available (female mean age ± SD = 23.33 ± 1.53, male mean age ± SD = 24 ± 1.41, and children mean age ± SD = 9.83 ± 1.17). Words for emotional and neutral utterances come from two corpora: corpus A, composed of nouns and adjectives that are repeated across emotional prosodies and types of voice (female, male, child), and corpus B, which consists of words controlled for age of acquisition, frequency of use, familiarity, concreteness, valence, arousal, and discrete emotion dimensionality ratings. In particular, words from corpus B are nouns and adjectives whose subjective age of acquisition is under 9 years. Words uttered with neutral prosody have valence and arousal ratings strictly greater than 4 and lower than 6 (on a 9-point scale). Words uttered with emotional prosody have valence and arousal ratings ranging from 1 to 4 or from 6 to 9. Furthermore, a word was uttered with an anger, disgust, fear, happiness, or sadness prosody only if its rating on the corresponding discrete emotion dimension was greater than 2.5 (on a 5-point scale). Finally, words from corpus B were selected so that emotional prosodies do not differ in frequency of use, familiarity, or concreteness.
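The rating thresholds for corpus B can be expressed as a few simple predicates. The following is only a sketch: the function names are hypothetical, and it assumes that valence and arousal are each checked independently against the stated ranges, which the description does not spell out.

```python
from typing import Dict, List


def is_neutral_word(valence: float, arousal: float) -> bool:
    """Neutral words: both ratings strictly between 4 and 6 (9-point scale)."""
    return 4 < valence < 6 and 4 < arousal < 6


def _in_emotional_range(rating: float) -> bool:
    """A single rating lies in [1, 4] or in [6, 9]."""
    return 1 <= rating <= 4 or 6 <= rating <= 9


def is_emotional_word(valence: float, arousal: float) -> bool:
    """Emotional words: assumed here to need BOTH ratings in [1, 4] or [6, 9]."""
    return _in_emotional_range(valence) and _in_emotional_range(arousal)


def eligible_emotions(discrete_ratings: Dict[str, float],
                      threshold: float = 2.5) -> List[str]:
    """Emotions whose discrete rating (5-point scale) exceeds the threshold."""
    return sorted(e for e, r in discrete_ratings.items() if r > threshold)
```

For instance, a word rated 3.1 for anger and 2.5 for fear would qualify only for the anger prosody under this reading, because the threshold is strict.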
The audio recordings took place in a professional studio with the following materials: (1) a Sennheiser e835 microphone with a flat frequency response (100 Hz to 10 kHz), (2) a Focusrite Scarlett 2i4 audio interface connected to the microphone with an XLR cable and to the computer, and (3) the digital audio workstation REAPER (Rapid Environment for Audio Production, Engineering, and Recording). Audio files were stored as 24-bit samples at a sample rate of 48,000 Hz.
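A downloaded file can be checked against the stated storage format with Python's standard `wave` module. This is a minimal sketch: the function name is hypothetical, and it assumes the files are uncompressed PCM WAV, since `wave.open` raises `wave.Error` for encodings it does not support.

```python
import wave

EXPECTED_SAMPLE_WIDTH_BYTES = 3   # 24 bits per sample
EXPECTED_SAMPLE_RATE_HZ = 48_000


def matches_mesd_format(path: str) -> bool:
    """Return True if the WAV file is 24-bit and sampled at 48,000 Hz."""
    with wave.open(path, "rb") as wav:
        return (wav.getsampwidth() == EXPECTED_SAMPLE_WIDTH_BYTES
                and wav.getframerate() == EXPECTED_SAMPLE_RATE_HZ)
```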
Utterances are shared as 864 audio files in WAV format. File names consist of four fields separated by underscores, in the following order:
(1) Emotion: Anger, Disgust, Fear, Happiness, Neutral, or Sadness
(2) Type of voice: F: female, M: male, C: child
(3) Corpus: A: corpus A, B: corpus B
(4) Word: entire word in lowercase letters
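The naming scheme lends itself to a small parser. The sketch below assumes the four fields appear in the order listed above and are separated by underscores; the function name, the `Utterance` type, and the example file name are hypothetical and are not taken from the dataset.

```python
from pathlib import Path
from typing import NamedTuple

EMOTIONS = {"Anger", "Disgust", "Fear", "Happiness", "Neutral", "Sadness"}
VOICES = {"F": "female", "M": "male", "C": "child"}
CORPORA = {"A", "B"}


class Utterance(NamedTuple):
    emotion: str
    voice: str    # "female", "male", or "child"
    corpus: str   # "A" or "B"
    word: str


def parse_mesd_filename(name: str) -> Utterance:
    """Split a file name into emotion, voice type, corpus, and word."""
    stem = Path(name).stem                 # drop directory and ".wav"
    parts = stem.split("_", 3)             # at most four fields
    if len(parts) != 4:
        raise ValueError(f"expected 4 underscore-separated fields: {name!r}")
    emotion, voice, corpus, word = parts
    if emotion not in EMOTIONS:
        raise ValueError(f"unknown emotion {emotion!r} in {name!r}")
    if voice not in VOICES:
        raise ValueError(f"unknown voice code {voice!r} in {name!r}")
    if corpus not in CORPORA:
        raise ValueError(f"unknown corpus {corpus!r} in {name!r}")
    return Utterance(emotion, VOICES[voice], corpus, word)


# Hypothetical file name, for illustration only.
print(parse_mesd_filename("Fear_C_B_palabra.wav"))
```

Validating each field against the documented values makes it easy to spot files that do not follow the pattern when iterating over the whole set.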
The MESD seems to be the first set of single-word emotional utterances that includes both adult and child voices for the Mexican population.
Citation
M. M. Duville, L. M. Alonso-Valerdi, and D. Ibarra-Zarat |
---|---|
DOI: | 10.17632/cy34mh68j9.3 |