Fast Matrix Factorization With Nonuniform Weights on Missing Data

Bibliographic Details
Published in: IEEE Transactions on Neural Networks and Learning Systems, 2020-08, Vol. 31 (8), pp. 2791-2804
Authors: He, Xiangnan; Tang, Jinhui; Du, Xiaoyu; Hong, Richang; Ren, Tongwei; Chua, Tat-Seng
Format: Article
Language: English
Description
Abstract: Matrix factorization (MF) has been widely used to discover the low-rank structure of a data matrix and to predict its missing entries. In many real-world learning systems, the data matrix can be very high dimensional but sparse. This poses an imbalanced learning problem, since the number of missing entries is usually much larger than the number of observed entries, yet the missing entries cannot be ignored because they carry valuable negative signal. For efficiency reasons, existing work typically applies a uniform weight to the missing entries to enable a fast learning algorithm. However, this simplification decreases modeling fidelity, resulting in suboptimal performance for downstream applications. In this paper, we weight the missing data nonuniformly and, more generally, allow any weighting strategy on the missing data. To address the efficiency challenge, we propose a fast learning method whose time complexity is determined by the number of observed entries in the data matrix rather than by the matrix size. The key idea is twofold: 1) we apply truncated singular value decomposition to the weight matrix to obtain a more compact representation of the weights, and 2) we learn the MF parameters with elementwise alternating least squares (eALS), memoizing key intermediate variables to avoid repeating unnecessary computations. We conduct extensive experiments on two recommendation benchmarks, demonstrating the correctness, efficiency, and effectiveness of our fast eALS method.
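The two ideas in the abstract can be illustrated concretely. Below is a minimal NumPy sketch (not the authors' implementation); the sizes, the popularity-style weight matrix, and all variable names are hypothetical. It shows how a truncated SVD compresses nonuniform weights on the missing data, and how cached factor products let a weighted sum over all m * n entries be computed without materializing the full prediction matrix:

# A minimal NumPy sketch (not the authors' code) of the two ideas in the
# abstract; all sizes and the weight matrix below are hypothetical.
import numpy as np

rng = np.random.default_rng(0)
m, n, d, r = 1000, 500, 8, 5  # users, items, latent dim, weight rank

# Hypothetical nonuniform weights on the missing entries, e.g. the product
# of a per-user activity level and a per-item popularity score.
W = np.outer(rng.random(m), rng.random(n))

# Idea 1: truncated SVD gives a compact representation W ~= P @ Q.T,
# with P of shape (m, r) and Q of shape (n, r).
U, s, Vt = np.linalg.svd(W, full_matrices=False)
P = U[:, :r] * s[:r]   # absorb the singular values into the left factor
Q = Vt[:r].T

# Idea 2: with model factors X (users) and Y (items), the weighted sum of
# all m * n predicted scores can be rearranged through small r-by-d caches,
# costing O((m + n) * d * r) instead of O(m * n * d).
X, Y = rng.random((m, d)), rng.random((n, d))
naive = np.sum(W * (X @ Y.T))           # touches every entry of the matrix
fast = np.sum((P.T @ X) * (Q.T @ Y))    # cache-based rearrangement
assert np.isclose(naive, fast)

The last two lines show where the speedup comes from: once the caches P.T @ X and Q.T @ Y are formed, no computation scales with the full matrix size, mirroring the complexity claim in the abstract; eALS additionally memoizes such caches across iterations.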
ISSN: 2162-237X, 2162-2388
DOI: 10.1109/TNNLS.2018.2890117