Video recommendation method based on multi-modal video content and multi-task learning

Bibliographic details
Main authors: SHI JINGLUN, LIANG KEHONG, LIN YANGCHENG, FU QIANSHUAN, DENG LI
Format: Patent
Language: chi; eng
Description
Summary: The invention discloses a video recommendation method based on multi-modal video content and multi-task learning. The method comprises the following steps: extracting visual, audio and text features of a short video through a pre-trained model; fusing the multi-modal features of the video with an attention mechanism; learning a feature representation of the user's social relationships with the DeepWalk method; learning multi-domain feature representations with an attention-based deep neural network model; and feeding the features generated in the preceding steps into the shared layer of a multi-task model, which generates prediction results through a multi-layer perceptron. Because the attention mechanism is combined with the user features to fuse the video's multi-modal features, the recommendation as a whole is richer and more personalized; meanwhile, because of multi-domain features and with consideration of the importance of interaction features in r
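
The fusion and prediction steps named in the summary can be illustrated with a minimal sketch. It is not the patented implementation: the scaled dot-product scoring, the layer sizes, the random untrained weights, and the task names `finish` and `like` are all assumptions made for illustration, and the helper names are hypothetical. The sketch only shows how user-conditioned attention weights could combine visual, audio and text vectors before a shared layer feeds one output head per task.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def fuse_modalities(user_vec, modality_vecs):
    """User-conditioned attention (assumed scoring rule): weight each modality
    by its scaled dot product with the user vector, then sum the modalities."""
    scale = math.sqrt(len(user_vec))
    weights = softmax([dot(user_vec, m) / scale for m in modality_vecs])
    fused = [sum(w * m[i] for w, m in zip(weights, modality_vecs))
             for i in range(len(user_vec))]
    return fused, weights


def dense(vec, weight_rows, bias):
    """One fully connected layer: each weight row produces one output unit."""
    return [dot(row, vec) + b for row, b in zip(weight_rows, bias)]


def relu(vec):
    return [max(0.0, v) for v in vec]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def random_matrix(rng, rows, cols):
    """Untrained placeholder weights; a real model would learn these."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def predict(user_vec, modality_vecs, params):
    """Fuse modalities, pass user + fused features through a shared layer,
    then apply one single-unit head per task (hypothetical task names)."""
    fused, weights = fuse_modalities(user_vec, modality_vecs)
    shared = relu(dense(user_vec + fused, params["shared_w"], params["shared_b"]))
    outputs = {task: sigmoid(dense(shared, w, b)[0])
               for task, (w, b) in params["heads"].items()}
    return outputs, weights


rng = random.Random(0)
dim, hidden = 4, 3
params = {
    "shared_w": random_matrix(rng, hidden, 2 * dim),
    "shared_b": [0.0] * hidden,
    "heads": {
        "finish": (random_matrix(rng, 1, hidden), [0.0]),  # assumed task
        "like": (random_matrix(rng, 1, hidden), [0.0]),    # assumed task
    },
}

# Toy feature vectors standing in for pre-trained model outputs.
user = [0.2, -0.1, 0.4, 0.3]
visual = [0.5, 0.1, -0.2, 0.0]
audio = [-0.3, 0.2, 0.1, 0.4]
text = [0.1, 0.0, 0.6, -0.1]

predictions, attention = predict(user, [visual, audio, text], params)
print("attention weights:", attention)
print("task predictions:", predictions)
```

The attention weights always sum to 1, so the fused vector is a convex combination of the modality vectors. Both heads read the same shared representation, which is the property that makes this a multi-task arrangement in the sense the summary describes.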