A multimodal fusion method for sarcasm detection based on late fusion
Information on social media is multi-modal, most of which contains the meaning of sarcasm. In recent years, many people have studied the problem of sarcasm detection. Many traditional methods have been proposed in this field, but the study of deep learning methods to detect sarcasm is still insuffic...
Gespeichert in:
Veröffentlicht in: | Multimedia tools and applications 2022-03, Vol.81 (6), p.8597-8616 |
---|---|
Hauptverfasser: | , , |
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | Information on social media is multi-modal, most of which contains the meaning of sarcasm. In recent years, many people have studied the problem of sarcasm detection. Many traditional methods have been proposed in this field, but the study of deep learning methods to detect sarcasm is still insufficient. It is necessary to comprehensively consider the information of the text,the changes of the tone of the audio signal,the facial expressions and the body posture in the image to detect sarcasm. This paper proposes a multi-level late-fusion learning framework with residual connections, a more reasonable experimental data-set split and two model variants based on different experimental settings. Extensive experiments on the MUStARD show that our methods are better than other fusion models. In our speaker-independent experimental split, the multi-modality has a 4.85% improvement over the single-modality, and the Error rate reduction has an improvement of 11.8%. The latest code will be updated to this URL later:
https://github.com/DingNing123/m_fusion |
---|---|
ISSN: | 1380-7501 1573-7721 |
DOI: | 10.1007/s11042-022-12122-9 |