Automatic music emotion classification model for movie soundtrack subtitling based on neuroscientific premises
The ability of music to induce emotions has been arousing a lot of interest in recent years, especially due to the boom in music streaming platforms and the use of automatic music recommenders. Music Emotion Recognition approaches are based on combining multiple audio features extracted from digital...
Gespeichert in:
Veröffentlicht in: | Applied intelligence (Dordrecht, Netherlands) Netherlands), 2023-11, Vol.53 (22), p.27096-27109 |
---|---|
Hauptverfasser: | , , , |
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | The ability of music to induce emotions has been arousing a lot of interest in recent years, especially due to the boom in music streaming platforms and the use of automatic music recommenders. Music Emotion Recognition approaches are based on combining multiple audio features extracted from digital audio samples and different machine learning techniques. In these approaches, neuroscience results on musical emotion perception are not considered. The main goal of this research is to facilitate the automatic subtitling of music. The authors approached the problem of automatic musical emotion detection in movie soundtracks considering these characteristics and using scientific musical databases, which have become a reference in neuroscience research. In the experiments, the Constant-Q-Transform spectrograms, the ones that best represent the relationships between musical tones from the point of view of human perception, are combined with Convolutional Neural Networks. Results show an efficient emotion classification model for 2-second musical audio fragments representative of intense basic feelings of happiness, sadness, and fear. Those emotions are the most interesting to be identified in the case of movie music captioning. The quality metrics have demonstrated that the results of the different models differ significantly and show no homogeneity. Finally, these results pave the way for an accessible and automatic captioning of music, which could automatically identify the emotional intent of the different segments of the movie soundtrack. |
---|---|
ISSN: | 0924-669X 1573-7497 |
DOI: | 10.1007/s10489-023-04967-w |