Exploring and Exploiting Data Heterogeneity in Recommendation
Main authors: | , , , , , , |
---|---|
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Order full text |
Abstract: | Massive amounts of data are the foundation of data-driven recommendation
models. As an inherent property of big data, data heterogeneity is widespread in
real-world recommendation systems. It reflects the differences in the
properties among sub-populations. Ignoring the heterogeneity in recommendation
data could limit the performance of recommendation models, hurt
sub-population robustness, and leave the models misled by biases. However,
data heterogeneity has not attracted substantial attention in the
recommendation community. This motivates us to adequately explore and
exploit heterogeneity to solve the above problems and assist data
analysis. In this work, we focus on exploring two representative categories of
heterogeneity in recommendation data, namely heterogeneity of the prediction
mechanism and of the covariate distribution, and propose an algorithm that explores the
heterogeneity through a bilevel clustering method. Furthermore, the uncovered
heterogeneity is exploited for two purposes in recommendation scenarios:
prediction with multiple sub-models and support for debiasing. Extensive
experiments on real-world data validate the existence of heterogeneity in
recommendation data and the effectiveness of exploring and exploiting data
heterogeneity in recommendation. |
---|---|
DOI: | 10.48550/arxiv.2305.15431 |