Multimodal Music Emotion Recognition Method Based on the Combination of Knowledge Distillation and Transfer Learning
Published in: Scientific Programming, 2022-02, Vol. 2022, pp. 1-13
Format: Article
Language: English
Online access: Full text
Abstract: The main difficulty of music emotion recognition is the lack of sufficient labeled data; in practice, only labeled data with unbalanced categories are available to train the emotion recognition model. Accurate labeling of emotion categories is not only costly and time-consuming but also requires labelers with an extensive musical background. At the same time, the emotion of a piece of music is shaped by many factors: singing style, music genre, arrangement, lyrics, and other elements all influence how emotion is expressed. This paper proposes a multimodal method that combines knowledge distillation with music style transfer learning and verifies its effectiveness on 20,000 songs. Experiments show that, compared with traditional approaches such as audio-only, lyrics-only, and audio-plus-lyrics multimodal methods, the proposed method significantly improves both the accuracy of emotion recognition and its generalization ability.
ISSN: 1058-9244, 1875-919X
DOI: 10.1155/2022/2802573
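
The abstract describes combining knowledge distillation with transfer learning so that an emotion classifier trained on scarce, unbalanced labels can be guided by knowledge from another source (for example, a model trained on a different modality). The snippet below is a minimal sketch of the distillation idea only, written in PyTorch; the feature dimensions, number of emotion classes, temperature, and loss weighting are illustrative assumptions, not the paper's actual architecture.

```python
# Minimal knowledge-distillation sketch (assumed setup, not the paper's model):
# a frozen "teacher" (e.g., lyrics-based) produces soft targets that guide a
# "student" (e.g., audio-based) alongside the scarce hard emotion labels.
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_EMOTIONS = 4  # assumed number of emotion classes

teacher = nn.Sequential(nn.Linear(300, 128), nn.ReLU(), nn.Linear(128, NUM_EMOTIONS))
student = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, NUM_EMOTIONS))

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Blend soft-target KL loss (teacher guidance) with hard-label cross-entropy."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

# Toy batch: 8 songs, 300-d lyric features for the teacher, 128-d audio features for the student.
lyric_feats = torch.randn(8, 300)
audio_feats = torch.randn(8, 128)
labels = torch.randint(0, NUM_EMOTIONS, (8,))

with torch.no_grad():                      # teacher assumed pre-trained and frozen
    teacher_logits = teacher(lyric_feats)
student_logits = student(audio_feats)
loss = distillation_loss(student_logits, teacher_logits, labels)
loss.backward()                            # gradients flow only into the student
```

In this kind of setup, the temperature T softens the teacher's class distribution so the student also learns from the relative likelihoods of non-target emotions, which is one way distillation can compensate for small or unbalanced labeled sets.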