An Incremental Update Framework for Online Recommenders with Data-Driven Prior
Main authors: | , , , , , , , , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | Online recommenders have attracted growing interest and generated substantial revenue
for businesses. Given numerous users and items, incremental update has become a
mainstream paradigm for learning large-scale models in industrial scenarios,
where only newly arrived data within a sliding window is fed into the model,
meeting strict requirements for quick response. However, this strategy is
prone to overfitting to newly arrived data. When the data distribution drifts
significantly, long-term information is discarded, which
harms recommendation performance. Conventional methods address this issue
through native model-based continual learning methods, without analyzing the
data characteristics for online recommenders. To address this
issue, we propose an incremental update framework for online recommenders with
Data-Driven Prior (DDP), which is composed of Feature Prior (FP) and Model
Prior (MP). The FP performs the click estimation for each specific value to
enhance the stability of the training process. The MP incorporates previous
model output into the current update while strictly following Bayes' rule,
resulting in a theoretically provable prior for the robust update. In this way,
both the FP and MP are well integrated into the unified framework, which is
model-agnostic and can accommodate various advanced interaction models.
Extensive experiments on two publicly available datasets as well as an
industrial dataset demonstrate the superior performance of the proposed
framework. |
---|---|
DOI: | 10.48550/arxiv.2312.15903 |