Empowering Long-tail Item Recommendation through Cross Decoupling Network (CDN)
Main authors: | , , , , , , , |
---|---|
Format: | Article |
Language: | eng |
Keywords: | |
Summary: | Industry recommender systems usually suffer from highly skewed long-tail item
distributions where a small fraction of the items receives most of the user
feedback. This skew hurts recommender quality, especially for the item slices
without much user feedback. While there have been many research advances made
in academia, deploying these methods in production is very difficult and very
few improvements have been made in industry. One challenge is that these
methods often hurt overall performance; additionally, they could be complex and
expensive to train and serve. In this work, we aim to improve tail item
recommendations while maintaining the overall performance with less training
and serving cost. We first find that the predictions of user preferences are
biased under long-tail distributions. The bias comes from the differences
between training and serving data from two perspectives: 1) the item
distributions, and 2) the user's preference given an item. Most existing methods
mainly attempt to reduce the bias from the item distribution perspective,
ignoring the discrepancy in user preference given an item. This leads to a
severe forgetting issue and results in sub-optimal performance.
To address the problem, we design a novel Cross Decoupling Network (CDN) that (i)
decouples the learning process of memorization and generalization on the item
side through a mixture-of-experts architecture; (ii) decouples the user samples
from different distributions through a regularized bilateral branch network.
Finally, a new adapter is introduced to aggregate the decoupled vectors and
softly shift the training attention to tail items. Extensive experimental
results show that CDN significantly outperforms state-of-the-art approaches on
benchmark datasets. We also demonstrate its effectiveness through a case study of
CDN in a large-scale recommendation system at Google. |
DOI: | 10.48550/arxiv.2210.14309 |
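
The summary describes the architecture only at a high level: two item-side experts (memorization and generalization), two user-side branches, and an adapter that aggregates the decoupled vectors while softly shifting attention to tail items. The following is a minimal illustrative sketch of that data flow, not the authors' implementation. Every function name, the frequency-based gate, the `midpoint` parameter, and the quadratic decay schedule are assumptions made for illustration; the record above does not specify them.

```python
def gate_weight(item_count, midpoint=100.0):
    """Hypothetical gate in [0, 1): close to 1 for popular items, close to 0 for tail items."""
    return item_count / (item_count + midpoint)


def mix_item_experts(memorization_vec, generalization_vec, item_count):
    """Blend the two item-side expert outputs using the frequency-based gate (assumed form)."""
    g = gate_weight(item_count)
    return [g * m + (1.0 - g) * z
            for m, z in zip(memorization_vec, generalization_vec)]


def adapter_alpha(epoch, total_epochs):
    """Hypothetical schedule: weight on the main branch decays from 1 to 0 over training."""
    return 1.0 - (epoch / total_epochs) ** 2


def aggregate_user_branches(main_vec, tail_vec, alpha):
    """Adapter step (assumed form): convex combination of the two user-side branch outputs."""
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(main_vec, tail_vec)]


def score(user_vec, item_vec):
    """Dot-product preference score between the aggregated user and item vectors."""
    return sum(u * v for u, v in zip(user_vec, item_vec))


# Usage: a tail item (few interactions) late in training.
item_vec = mix_item_experts([1.0, 0.0], [0.0, 1.0], item_count=5)
alpha = adapter_alpha(epoch=9, total_epochs=10)
user_vec = aggregate_user_branches([0.2, 0.4], [0.6, 0.8], alpha)
print(score(user_vec, item_vec))
```

In this sketch a tail item leans on the generalization expert because its gate value is small, and a small `alpha` late in training moves the user representation toward the tail-oriented branch, which is one plausible reading of "softly shift the training attention to tail items".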