Music Mood Detection Based On Audio And Lyrics With Deep Neural Net
Main authors:
Format: Article
Language: English
Subjects:
Online access: Order full text
Abstract: We consider the task of multimodal music mood prediction based on the audio signal and the lyrics of a track. We reproduce the implementation of traditional feature-engineering-based approaches and propose a new model based on deep learning. We compare the performance of both approaches on a database containing 18,000 tracks with associated valence and arousal values and show that our approach outperforms classical models on the arousal detection task, and that both approaches perform equally on the valence prediction task. We also compare a posteriori fusion with fusion of modalities optimized simultaneously with each unimodal model, and observe a significant improvement in valence prediction. We release part of our database for comparison purposes.
DOI: 10.48550/arxiv.1809.07276
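The abstract describes a bimodal (audio + lyrics) model that predicts valence and arousal, and contrasts a posteriori fusion with fusing the modalities while both unimodal branches are optimized jointly. The following is a minimal PyTorch sketch of such a jointly trained fusion model, for illustration only: the layer sizes, the mel-spectrogram and word-index inputs, and the `BimodalMoodNet` name are assumptions, not the authors' actual architecture.

```python
# Hedged sketch of joint (mid-level) fusion for valence/arousal regression.
# All hyperparameters and input representations are illustrative assumptions.
import torch
import torch.nn as nn


class BimodalMoodNet(nn.Module):
    def __init__(self, n_mels=40, vocab_size=5000, embed_dim=100):
        super().__init__()
        # Audio branch: 1-D convolutions over the mel-spectrogram time axis.
        self.audio_branch = nn.Sequential(
            nn.Conv1d(n_mels, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv1d(32, 16, kernel_size=8, stride=4), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),          # -> (batch, 16)
        )
        # Lyrics branch: word embeddings followed by a GRU encoder.
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lyrics_rnn = nn.GRU(embed_dim, 16, batch_first=True)
        # Fusion head: both branches are trained jointly through this head.
        self.head = nn.Sequential(
            nn.Linear(16 + 16, 32), nn.ReLU(),
            nn.Linear(32, 2),                               # valence, arousal
        )

    def forward(self, mel, tokens):
        # mel: (batch, n_mels, frames); tokens: (batch, seq_len) of word ids.
        a = self.audio_branch(mel)
        _, h = self.lyrics_rnn(self.embed(tokens))
        lyr = h[-1]                             # last hidden state, (batch, 16)
        return self.head(torch.cat([a, lyr], dim=1))


# Smoke test with random inputs (shapes are illustrative only).
model = BimodalMoodNet()
mel = torch.randn(4, 40, 1292)             # e.g. ~30 s of audio at ~43 frames/s
tokens = torch.randint(1, 5000, (4, 200))
print(model(mel, tokens).shape)            # torch.Size([4, 2])
```

Training both branches through the shared head is what the abstract contrasts with a posteriori fusion, where each unimodal model would be trained separately and their predictions combined afterwards.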