Sending or not? A multimodal framework for Danmaku comment prediction
Published in: Information processing & management, 2021-11, Vol. 58 (6), p. 102687, Article 102687
Main authors: , , , ,
Format: Article
Language: English
Online access: Full text
Summary:
• We propose a multimodal framework to predict the behavior of users sending Danmaku comments.
• We offer personalized analysis of multimodal data to precisely capture users' points of interest.
• Our model combines a multimodal fusion module and an attention mechanism to ensure high accuracy.
• Our research contributes to understanding the emerging behavior of users watching online videos.
Danmaku is an emerging comment design for videos that allows real-time, interactive comments from viewers. Danmaku increases viewers' interaction with other viewers and streamers, thereby raising viewers' loyalty and sense of belonging. Sending Danmaku comments demonstrates a higher degree of viewer involvement than traditional static comments below videos. It is therefore both necessary and meaningful to learn about viewers' preferences by observing this behavior, as it may benefit the platform as well as the streamers. However, research on how the multimodal environment affects viewers' behavior in sending Danmaku comments is quite limited. To fill this gap, we propose a new dataset and a deep neural network that integrates multimodal information, the Deep Multimodal network for Danmaku Forecasting (DMDF), to predict whether viewers will send Danmaku comments and to evaluate how the interaction of textual, audio, and visual features affects this behavior. A series of experiments on a real dataset of 249,657 samples from Bilibili (a leading Chinese video streaming website) demonstrates the effectiveness of the proposed DMDF and the helpfulness of all modalities, especially visual and acoustic features, in behavior forecasting. DMDF with our proposed multimodal squeeze-and-excitation (MSE) module achieves 90.14% accuracy and an 83.60% F1-score, and it reveals the extent to which a user-generated video can influence viewers to send Danmaku comments, which helps predict viewers' online viewing behavior. Furthermore, our model contributes to current work on the video understanding task.
ISSN: 0306-4573, 1873-5371
DOI: 10.1016/j.ipm.2021.102687
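This record names the paper's multimodal squeeze-and-excitation (MSE) module but does not describe its implementation. The PyTorch sketch below is an illustration only, assuming a standard squeeze-and-excitation gate adapted from channels to modalities: each of the three modality embeddings (text, audio, visual) is squeezed to a scalar descriptor, a bottleneck MLP produces per-modality gates, and the reweighted embeddings are concatenated for a binary "send or not" classifier. All class names, layer sizes, and the concatenation fusion are assumptions, not the authors' published architecture.

```python
# Hypothetical sketch of squeeze-and-excitation-style multimodal gating,
# loosely following the MSE idea named in the abstract. Names, dimensions,
# and the fusion-by-concatenation choice are assumptions.
import torch
import torch.nn as nn


class MultimodalSEGate(nn.Module):
    """Reweights per-modality embeddings with learned excitation gates."""

    def __init__(self, num_modalities: int, reduction: int = 4):
        super().__init__()
        hidden = max(1, num_modalities // reduction)
        # Excite: a small bottleneck MLP mapping modality descriptors
        # to one sigmoid gate per modality.
        self.excite = nn.Sequential(
            nn.Linear(num_modalities, hidden),
            nn.ReLU(inplace=True),
            nn.Linear(hidden, num_modalities),
            nn.Sigmoid(),
        )

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (batch, num_modalities, dim)
        squeezed = feats.mean(dim=-1)          # squeeze: (batch, num_modalities)
        gates = self.excite(squeezed)          # excite:  (batch, num_modalities)
        return feats * gates.unsqueeze(-1)     # rescale each modality embedding


class DanmakuSendClassifier(nn.Module):
    """Binary 'send or not' head over gated text/audio/visual embeddings."""

    def __init__(self, text_dim=768, audio_dim=128, visual_dim=512, dim=256):
        super().__init__()
        # Project each modality into a shared dimension before gating.
        self.proj = nn.ModuleList(
            nn.Linear(d, dim) for d in (text_dim, audio_dim, visual_dim)
        )
        self.gate = MultimodalSEGate(num_modalities=3)
        self.head = nn.Sequential(
            nn.Linear(3 * dim, dim), nn.ReLU(), nn.Linear(dim, 1)
        )

    def forward(self, text, audio, visual):
        feats = torch.stack(
            [p(x) for p, x in zip(self.proj, (text, audio, visual))], dim=1
        )                                      # (batch, 3, dim)
        fused = self.gate(feats).flatten(1)    # (batch, 3 * dim)
        return self.head(fused).squeeze(-1)    # logits; sigmoid gives P(send)
```

Gating per modality lets the model suppress uninformative modalities on a per-sample basis, which is consistent with the abstract's finding that the modalities contribute unequally (visual and acoustic features being especially helpful) to forecasting whether a viewer sends a Danmaku comment.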