Empowering Long-tail Item Recommendation through Cross Decoupling Network (CDN)

Bibliographic details
Main authors: Zhang, Yin; Wang, Ruoxi; Yao, Tiansheng; Yi, Xinyang; Hong, Lichan; Caverlee, James; Chi, Ed H; Cheng, Derek Zhiyuan
Format: Article
Language: English
Description
Summary: Industry recommender systems usually suffer from highly-skewed long-tail item distributions where a small fraction of the items receives most of the user feedback. This skew hurts recommender quality, especially for the item slices without much user feedback. While there have been many research advances made in academia, deploying these methods in production is very difficult and very few improvements have been made in industry. One challenge is that these methods often hurt overall performance; additionally, they could be complex and expensive to train and serve. In this work, we aim to improve tail item recommendations while maintaining the overall performance with less training and serving cost. We first find that the predictions of user preferences are biased under long-tail distributions. The bias comes from the differences between training and serving data from two perspectives: 1) the item distributions, and 2) the user's preference given an item. Most existing methods mainly attempt to reduce the bias from the item distribution perspective, ignoring the discrepancy from user preference given an item. This leads to a severe forgetting issue and results in sub-optimal performance. To address the problem, we design a novel Cross Decoupling Network (CDN) that (i) decouples the learning process of memorization and generalization on the item side through a mixture-of-expert architecture, and (ii) decouples the user samples from different distributions through a regularized bilateral branch network. Finally, a new adapter is introduced to aggregate the decoupled vectors and softly shift the training attention to tail items. Extensive experimental results show that CDN significantly outperforms state-of-the-art approaches on benchmark datasets. We also demonstrate its effectiveness by a case study of CDN in a large-scale recommendation system at Google.
DOI: 10.48550/arxiv.2210.14309