Wukong: Towards a Scaling Law for Large-Scale Recommendation
Main authors: | , , , , , , , , , , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Summary: | Scaling laws play an instrumental role in the sustainable improvement of
model quality. Unfortunately, recommendation models to date do not exhibit
laws similar to those observed in the domain of large language models, due to
the inefficiencies of their upscaling mechanisms. This limitation poses
significant challenges in adapting these models to increasingly complex
real-world datasets. In this paper, we propose an effective network
architecture based purely on stacked factorization machines, and a synergistic
upscaling strategy, collectively dubbed Wukong, to establish a scaling law in
the domain of recommendation. Wukong's unique design makes it possible to
capture diverse interactions of any order simply through taller and wider
layers. We conducted extensive evaluations on six public datasets, and our
results demonstrate that Wukong consistently outperforms state-of-the-art
models in quality. Further, we assessed Wukong's scalability on an internal,
large-scale dataset. The results show that Wukong retains its superiority in
quality over state-of-the-art models, while holding the scaling law across two
orders of magnitude in model complexity, extending beyond 100 GFLOP/example,
where prior art falls short. |
---|---|
DOI: | 10.48550/arxiv.2403.02545 |
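
The summary describes an architecture built from stacked factorization machines in which deeper layers capture higher-order interactions. The following is a minimal NumPy sketch of that general idea, not the paper's implementation: `fm_pairwise` is the standard second-order factorization machine term, while `interaction_layer`, its projection matrix `W`, and all shapes are hypothetical choices made only for illustration. Under these assumptions, one layer is a degree-2 function of its input embeddings, so stacking two layers gives a degree-4 function.

```python
import numpy as np


def fm_pairwise(x, V):
    """Second-order FM term sum_{i<j} <V[i], V[j]> x[i] x[j], computed in O(n*k)."""
    s = V.T @ x                  # (k,) per-factor weighted sums
    s2 = (V ** 2).T @ (x ** 2)   # (k,) per-factor sums of squares
    return 0.5 * float(np.sum(s ** 2 - s2))


def fm_pairwise_naive(x, V):
    """Same quantity via the explicit double loop, used only as a cross-check."""
    n = len(x)
    return float(sum((V[i] @ V[j]) * x[i] * x[j]
                     for i in range(n) for j in range(i + 1, n)))


def interaction_layer(E, W):
    """Hypothetical stackable layer (an assumption, not the paper's layer).

    E: (n, d) embeddings. All pairwise dot products E @ E.T are flattened
    and linearly projected by W of shape (n*n, n_out*d) into n_out new
    embeddings of width d, so layers can be chained.
    """
    n, d = E.shape
    inter = (E @ E.T).reshape(-1)   # (n*n,) second-order interactions
    return (inter @ W).reshape(-1, d)


rng = np.random.default_rng(0)
E = rng.normal(size=(4, 2))          # 4 embeddings of width 2
W1 = rng.normal(size=(16, 8))        # 4*4 interactions -> 4 embeddings
W2 = rng.normal(size=(16, 8))

layer1 = interaction_layer(E, W1)         # degree 2 in E
layer2 = interaction_layer(layer1, W2)    # degree 4 in E
```

In this sketch, doubling the input multiplies the one-layer output by 4 and the two-layer output by 16, which is one simple way to check that the interaction order doubles with each stacked layer.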