A knowledge distilled attention-based latent information extraction network for sequential user behavior

Bibliographic details
Published in:	Multimedia Tools and Applications, 2023, Vol. 82 (1), pp. 1017-1043
Main authors:	Huang, Ruo; McIntyre, Shelby; Song, Meina; E, Haihong; Ou, Zhonghong
Format:	Article
Language:	English
Subjects:	Acceleration; Complexity; Compressive strength; Computer Communication Networks; Computer Science; Data Structures and Information Theory; Feature extraction; Information retrieval; Modules; Multimedia Information Systems; Recommender systems; Sampling methods; Sequences; Special Purpose and Application-Based Systems; Teachers; User behavior
Description
Summary:	When modeling user-item interaction sequences to extract sequential patterns, current recommender systems face the dual issues of a) long-distance dependencies in conjunction with b) high levels of noise. In addition, with the complexity of current recommendation model architectures there is a significant increase in computation time. Therefore, these models cannot meet the requirement of fast response needed in application scenarios such as online advertising. To deal with these issues, we propose a Knowledge Distilled Attention-based Latent Information Extraction Network for Sequential user behavior (KD-ALIENS). In this model structure, user and item attributes and history are utilized to model the latent information from high-order feature interactions in conjunction with user sequential historical behavior. With regard to the issues of long-distance dependency and noise, we have adopted the self-attention mechanism to learn the sequential patterns between items in a user-item interaction history. With regard to the issue of a complex model architecture which cannot meet the requirement of fast response, the use of model compression and acceleration is realized by: (a) use of a knowledge-distilled teacher and student module, wherein the complex teacher module extracts a user’s general preference from high-order feature interactions and sequential patterns of long history sequences; and (b) a sampling method to sample both the relatively long-term and short-term item histories. Experimental studies on two real-world datasets demonstrate considerable improvements for click-through rate (CTR) prediction accuracy relative to strong baseline models and also show the effectiveness of the student-model compression and acceleration for speed.
ISSN:	1380-7501 1573-7721
DOI:	10.1007/s11042-022-12513-y