Iterative Collaborative Filtering for Sparse Matrix Estimation
Published in: | Operations Research, 2022-11, Vol. 70 (6), p. 3143-3175 |
Format: | Article |
Language: | English |
Online access: | Full text |
Abstract: | Matrix estimation or completion has served as a canonical mathematical model for recommendation systems. More recently, it has emerged as a fundamental building block for data analysis as a first step to denoise the observations and predict missing values. Since the dawn of e-commerce, similarity-based collaborative filtering has been used as a heuristic for matrix estimation. At its core, it encodes typical human behavior: you ask your friends to recommend what you may like or dislike. Algorithmically, friends are similar "rows" or "columns" of the underlying matrix. The traditional heuristic for computing similarities between rows has costly requirements on the density of observed entries. In "Iterative Collaborative Filtering for Sparse Matrix Estimation" by Christian Borgs, Jennifer T. Chayes, Devavrat Shah, and Christina Lee Yu, the authors introduce an algorithm that computes similarities in sparse datasets by comparing expanded local neighborhoods in the associated data graph: in effect, you ask friends of your friends to recommend what you may like or dislike. This work provides bounds on the max entry-wise error of their estimate for low-rank and approximately low-rank matrices, which is stronger than the aggregate mean squared error bounds found in classical works. The algorithm is also interpretable, scalable, and amenable to distributed implementation.
We consider sparse matrix estimation, where the goal is to estimate an n-by-n matrix from noisy observations of a small subset of its entries. We analyze the estimation error of the popularly used collaborative filtering algorithm for the sparse regime. Specifically, we propose a novel iterative variant of the algorithm, adapted to handle the setting of sparse observations. We establish that as long as the number of entries observed at random scales logarithmically larger than linear in n, the estimation error with respect to the entry-wise max norm decays to zero as n goes to infinity, assuming the underlying matrix of interest has constant rank r. Our result is robust to model misspecification in that if the underlying matrix is approximately rank r, then the estimation error decays to the approximation error with respect to the max norm. In the process, we establish the algorithm's ability to handle arbitrary bounded noise in the observations.
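The sampling regime in the abstract (a number of observed entries only logarithmically larger than linear in n) can be illustrated with a short sketch. The constant c and the size n = 500 below are arbitrary illustrative choices, not values from the paper.

```python
import math
import random

random.seed(0)  # fixed seed so the sketch is reproducible

def sparse_observation_mask(n, c=2.0):
    """Observe each entry independently with probability p = c * log(n) / n,
    so roughly c * n * log(n) of the n**2 entries are seen: a number of
    observations logarithmically larger than linear in n."""
    p = min(1.0, c * math.log(n) / n)
    return [[random.random() < p for _ in range(n)] for _ in range(n)]

n = 500
mask = sparse_observation_mask(n)
observed = sum(sum(row) for row in mask)
# The expected count is about c * n * log(n), a vanishing fraction of
# the n**2 = 250000 total entries as n grows.
```

As n grows, the observed fraction c * log(n) / n tends to zero, which is exactly the sparse regime in which classical overlap-based similarity computations break down.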
Funding: | This work was supported in part by Microsoft Research New England; the National Science Foundation's Division of Computing and Communicat |
ISSN: | 0030-364X 1526-5463 |
DOI: | 10.1287/opre.2021.2193 |