Iterative Collaborative Filtering for Sparse Matrix Estimation

Bibliographic details
Published in: Operations Research, 2022-11, Vol. 70 (6), p. 3143-3175
First author: Borgs, Christian
Format: Article
Language: English
Online access: Full text
Description
Abstract: Matrix estimation or completion has served as a canonical mathematical model for recommendation systems. More recently, it has emerged as a fundamental building block for data analysis as a first step to denoise the observations and predict missing values. Since the dawn of e-commerce, similarity-based collaborative filtering has been used as a heuristic for matrix estimation. At its core, it encodes typical human behavior: you ask your friends to recommend what you may like or dislike. Algorithmically, friends are similar “rows” or “columns” of the underlying matrix. The traditional heuristic for computing similarities between rows has costly requirements on the density of observed entries. In “Iterative Collaborative Filtering for Sparse Matrix Estimation” by Christian Borgs, Jennifer T. Chayes, Devavrat Shah, and Christina Lee Yu, the authors introduce an algorithm that computes similarities in sparse datasets by comparing expanded local neighborhoods in the associated data graph: in effect, you ask friends of your friends to recommend what you may like or dislike. This work provides bounds on the maximum entry-wise error of their estimate for low-rank and approximately low-rank matrices, which is stronger than the aggregate mean squared error bounds found in classical works. The algorithm is also interpretable, scalable, and amenable to distributed implementation. We consider sparse matrix estimation where the goal is to estimate an n-by-n matrix from noisy observations of a small subset of its entries. We analyze the estimation error of the widely used collaborative filtering algorithm for the sparse regime. Specifically, we propose a novel iterative variant of the algorithm, adapted to handle the setting of sparse observations.
We establish that as long as the number of entries observed at random scales logarithmically larger than linear in n, the estimation error with respect to the entry-wise max norm decays to zero as n goes to infinity, assuming the underlying matrix of interest has constant rank r. Our result is robust to model misspecification in that if the underlying matrix is approximately rank r, then the estimation error decays to the approximation error with respect to the max-norm. In the process, we establish the algorithm’s ability to handle arbitrary bounded noise in the observations. Funding: This work was supported in part by Microsoft Research New England; the National Science Foundation’s Division of Computing and Communicat
ISSN: 0030-364X, 1526-5463
DOI: 10.1287/opre.2021.2193
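
Illustration: the abstract contrasts comparing rows on their directly shared columns with comparing "expanded local neighborhoods in the associated data graph". The toy sketch below may help make that contrast concrete. It is not the authors' algorithm or estimator: it only builds a bipartite graph from observed entries and shows that two rows with no commonly observed column can still share a neighbor two hops away. The function names `build_graph` and `boundary` and the sample data are hypothetical, chosen for this illustration only.

```python
from collections import defaultdict


def build_graph(observed):
    """Bipartite data graph (an illustrative assumption, not the paper's code).

    Row u becomes node ("r", u), column i becomes node ("c", i), and an edge
    joins them whenever entry (u, i) is observed.
    """
    graph = defaultdict(set)
    for (u, i) in observed:
        graph[("r", u)].add(("c", i))
        graph[("c", i)].add(("r", u))
    return graph


def boundary(graph, start, depth):
    """Return the nodes exactly `depth` hops from `start` (breadth-first)."""
    seen = {start}
    frontier = {start}
    for _ in range(depth):
        next_frontier = set()
        for node in frontier:
            # Only keep nodes that were not reached at a smaller depth.
            next_frontier |= graph.get(node, set()) - seen
        seen |= next_frontier
        frontier = next_frontier
    return frontier


# Hypothetical sparse observations: (row, column) -> observed value.
observed = {
    (0, 0): 5.0,
    (1, 0): 4.0,
    (1, 1): 2.0,
    (2, 1): 1.0,
    (2, 2): 3.0,
}
graph = build_graph(observed)

# Depth 1: the columns each row observed directly. Rows 0 and 2 share none,
# so an overlap-based similarity between them is undefined.
direct_overlap = boundary(graph, ("r", 0), 1) & boundary(graph, ("r", 2), 1)

# Depth 2: "friends of friends". Both rows reach row 1, so there is now
# common ground on which a similarity could be computed.
expanded_overlap = boundary(graph, ("r", 0), 2) & boundary(graph, ("r", 2), 2)

print(direct_overlap)
print(expanded_overlap)
```

In this toy example the direct overlap is empty while the two-hop overlap contains row 1, which is the intuition behind asking "friends of your friends" when the data are too sparse for direct comparison. How the expanded neighborhoods are weighted and compared to produce an estimate is specified in the article itself and is not reproduced here.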