Multi-level context extraction and attention-based contextual inter-modal fusion for multimodal sentiment analysis and emotion classification

Bibliographic Details
Published in: International Journal of Multimedia Information Retrieval, 2020-06, Vol. 9 (2), p. 103-112
Authors: Huddar, Mahesh G., Sannakki, Sanjeev S., Rajpurohit, Vijay S.
Format: Article
Language: English
Online access: Full text
Description
Abstract: Recent advancements in Internet technology and its associated services have led users to post large amounts of multimodal data to social media websites, online shopping portals, video repositories, etc. With this huge amount of multimodal content available, multimodal sentiment classification and affective computing have become heavily researched topics. Extracting context among neighboring utterances and weighing the importance of inter-modal utterances before multimodal fusion are the most important research issues in this field. This article presents a novel approach to extract context at multiple levels and to capture the importance of inter-modal utterances in sentiment and emotion classification. Experiments are conducted on two publicly accepted datasets: CMU-MOSI for sentiment analysis and IEMOCAP for emotion classification. By incorporating utterance-level contextual information and the importance of inter-modal utterances, the proposed model outperforms standard baselines by over 3% in classification accuracy.
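
The record gives only the abstract, so the following is a minimal, illustrative sketch of the general idea it describes: contextual encoding of the utterance sequence per modality, followed by attention-based inter-modal fusion before per-utterance classification. The BiLSTM context encoders, the bilinear pairwise attention, the layer sizes, and the CMU-MOSI-style feature dimensions (100 text / 74 audio / 35 visual) are assumptions made for illustration, not the authors' exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PairAttention(nn.Module):
    """Bilinear cross-modal attention between two sequences of contextual
    utterance features (an assumed fusion form, not the paper's exact one)."""
    def __init__(self, dim):
        super().__init__()
        self.w = nn.Parameter(torch.empty(dim, dim))
        nn.init.xavier_uniform_(self.w)

    def forward(self, x, y):
        # x, y: (batch, utterances, dim)
        scores = torch.matmul(torch.matmul(x, self.w), y.transpose(1, 2))   # (B, U, U)
        # each utterance of x attends over the utterances of y, and vice versa
        x_att = torch.matmul(F.softmax(scores, dim=-1), y)                  # (B, U, dim)
        y_att = torch.matmul(F.softmax(scores.transpose(1, 2), dim=-1), x)  # (B, U, dim)
        return torch.cat([x_att, y_att], dim=-1)                            # (B, U, 2*dim)

class ContextualInterModalFusion(nn.Module):
    """Per-modality contextual BiLSTMs + pairwise inter-modal attention + classifier."""
    def __init__(self, text_dim=100, audio_dim=74, visual_dim=35,
                 hidden=64, num_classes=2):
        super().__init__()
        d = 2 * hidden  # bidirectional output size
        # BiLSTMs capture context among neighboring utterances within each modality
        self.rnn_t = nn.LSTM(text_dim, hidden, batch_first=True, bidirectional=True)
        self.rnn_a = nn.LSTM(audio_dim, hidden, batch_first=True, bidirectional=True)
        self.rnn_v = nn.LSTM(visual_dim, hidden, batch_first=True, bidirectional=True)
        self.att_ta = PairAttention(d)
        self.att_tv = PairAttention(d)
        self.att_av = PairAttention(d)
        # unimodal contexts (3*d) concatenated with three attended pairs (3*2*d)
        self.classifier = nn.Linear(3 * d + 6 * d, num_classes)

    def forward(self, text, audio, visual):
        # each input: (batch, utterances, feature_dim)
        t, _ = self.rnn_t(text)
        a, _ = self.rnn_a(audio)
        v, _ = self.rnn_v(visual)
        fused = torch.cat([t, a, v,
                           self.att_ta(t, a),
                           self.att_tv(t, v),
                           self.att_av(a, v)], dim=-1)
        return self.classifier(fused)  # per-utterance class logits

# toy usage: a batch of 4 videos, 20 utterances each
model = ContextualInterModalFusion()
logits = model(torch.randn(4, 20, 100), torch.randn(4, 20, 74), torch.randn(4, 20, 35))
print(logits.shape)  # torch.Size([4, 20, 2])
```

The attention step lets each utterance in one modality weigh the contextual utterances of another modality before fusion, which is the "importance of inter-modal utterances" idea the abstract highlights; the exact scoring function and fusion order in the published model may differ.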
ISSN: 2192-6611, 2192-662X
DOI: 10.1007/s13735-019-00185-8