Multi-level context extraction and attention-based contextual inter-modal fusion for multimodal sentiment analysis and emotion classification
Published in: International Journal of Multimedia Information Retrieval, 2020-06, Vol. 9 (2), p. 103-112
Main authors: , ,
Format: Article
Language: English
Subjects:
Online access: Full text
Abstract: Recent advances in Internet technology and its associated services have led users to post large amounts of multimodal data on social media websites, online shopping portals, video repositories, and similar platforms. With this huge volume of multimodal content available, multimodal sentiment classification and affective computing have become heavily researched topics. Extracting context among neighboring utterances and weighting the importance of inter-modal utterances before multimodal fusion are among the most important research issues in this field. This article presents a novel approach to extract context at multiple levels and to model the importance of inter-modal utterances for sentiment and emotion classification. Experiments are conducted on two publicly accepted datasets: CMU-MOSI for sentiment analysis and IEMOCAP for emotion classification. By incorporating utterance-level contextual information and the importance of inter-modal utterances, the proposed model outperforms standard baselines by over 3% in classification accuracy.
ISSN: 2192-6611, 2192-662X
DOI: 10.1007/s13735-019-00185-8
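
The abstract outlines a two-stage design: a contextual encoder that relates each utterance to its neighbors within every modality, followed by attention over pairs of modalities before fusion and classification. The sketch below is a minimal illustration of that idea, not the authors' implementation: it assumes a bidirectional GRU for utterance-level context and a bilinear attention score for pairwise inter-modal fusion, and all layer names, dimensions, and the choice of PyTorch are illustrative assumptions rather than details taken from the paper.

```python
# Hedged sketch of contextual inter-modal fusion (illustrative only, not the authors' code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class ContextualInterModalFusion(nn.Module):
    def __init__(self, text_dim, audio_dim, video_dim, hidden=128, n_classes=2):
        super().__init__()
        # Contextual encoders: relate each utterance to its neighboring utterances.
        self.text_rnn = nn.GRU(text_dim, hidden, batch_first=True, bidirectional=True)
        self.audio_rnn = nn.GRU(audio_dim, hidden, batch_first=True, bidirectional=True)
        self.video_rnn = nn.GRU(video_dim, hidden, batch_first=True, bidirectional=True)
        d = 2 * hidden
        # Bilinear weights score how strongly utterances of one modality attend to another's.
        self.bilinear_ta = nn.Parameter(torch.randn(d, d) * 0.01)
        self.bilinear_tv = nn.Parameter(torch.randn(d, d) * 0.01)
        self.bilinear_av = nn.Parameter(torch.randn(d, d) * 0.01)
        self.classifier = nn.Linear(3 * 2 * d, n_classes)

    def _fuse(self, x, y, w):
        # Attention-based pairwise fusion: x attends over y and y attends over x.
        scores = torch.matmul(torch.matmul(x, w), y.transpose(1, 2))   # (B, T, T)
        attn_xy = F.softmax(scores, dim=-1)
        attn_yx = F.softmax(scores.transpose(1, 2), dim=-1)
        return torch.cat([torch.bmm(attn_xy, y), torch.bmm(attn_yx, x)], dim=-1)

    def forward(self, text, audio, video):
        # Each input: (batch, n_utterances, feature_dim) for one modality.
        t, _ = self.text_rnn(text)
        a, _ = self.audio_rnn(audio)
        v, _ = self.video_rnn(video)
        fused = torch.cat([
            self._fuse(t, a, self.bilinear_ta),
            self._fuse(t, v, self.bilinear_tv),
            self._fuse(a, v, self.bilinear_av),
        ], dim=-1)
        return self.classifier(fused)  # per-utterance sentiment/emotion logits
```

In this sketch the three pairwise fusions (text-audio, text-video, audio-video) are simply concatenated per utterance before classification; the paper's actual fusion strategy, number of context levels, and hyperparameters should be taken from the article itself.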