SGD_Tucker: A Novel Stochastic Optimization Strategy for Parallel Sparse Tucker Decomposition

Bibliographic Details
Published in: IEEE Transactions on Parallel and Distributed Systems, 2021-07, Vol. 32 (7), pp. 1828-1841
Main Authors: Li, Hao, Li, Zixuan, Li, Kenli, Rellermeyer, Jan S., Chen, Lydia, Li, Keqin
Format: Article
Language: English
Description
Summary: Sparse Tucker Decomposition (STD) algorithms learn a core tensor and a group of factor matrices to obtain an optimal low-rank representation feature for the High-Order, High-Dimension, and Sparse Tensor (HOHDST). However, existing STD algorithms face the problem of intermediate-variable explosion, which results from the fact that the formation of those variables, i.e., the Khatri-Rao product, the Kronecker product, and matrix-matrix multiplication, depends on all the elements of the sparse tensor. This problem prevents a deep fusion of efficient computation and big-data platforms. To overcome the bottleneck, a novel stochastic optimization strategy (SGD_Tucker) is proposed for STD which can automatically divide the high-dimension intermediate variables into small batches of intermediate matrices. Specifically, SGD_Tucker only follows randomly selected small samples rather than all the elements, while maintaining the overall accuracy and convergence rate. In practice, SGD_Tucker features two distinct advancements over the state of the art. First, SGD_Tucker can prune the communication overhead for the core tensor in distributed settings. Second, the low data-dependence of SGD_Tucker enables fine-grained parallelization, which lets SGD_Tucker obtain lower computational overhead with the same accuracy. Experimental results show that SGD_Tucker …
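To make the stochastic strategy concrete, below is a minimal sketch (not the authors' released implementation) of SGD for a third-order sparse Tucker model: each step samples a small batch of observed entries, and the gradients of the involved factor-matrix rows and of the small core tensor are formed only from those samples, so no full Khatri-Rao or Kronecker intermediate is ever materialized. The function name sgd_sparse_tucker and the hyperparameters lr, reg, epochs, and batch are illustrative assumptions, not the paper's API.

    import numpy as np

    def sgd_sparse_tucker(coords, vals, shape, ranks, lr=0.01, reg=0.01,
                          epochs=20, batch=1024, seed=0):
        """coords: (nnz, 3) index array; vals: (nnz,) observed tensor entries."""
        rng = np.random.default_rng(seed)
        # Factor matrices A^(n) of size I_n x R_n and core tensor G of size R1 x R2 x R3.
        factors = [0.1 * rng.standard_normal((shape[n], ranks[n])) for n in range(3)]
        core = 0.1 * rng.standard_normal(ranks)

        for _ in range(epochs):
            order = rng.permutation(len(vals))
            for start in range(0, len(vals), batch):
                idx = order[start:start + batch]
                for (i, j, k), x in zip(coords[idx], vals[idx]):
                    a = factors[0][i].copy()
                    b = factors[1][j].copy()
                    c = factors[2][k].copy()
                    # Prediction of one entry: sum_{p,q,r} G[p,q,r] * a[p] * b[q] * c[r].
                    err = np.einsum('pqr,p,q,r->', core, a, b, c) - x
                    # Updates touch only three factor rows and the (small) core tensor.
                    factors[0][i] -= lr * (err * np.einsum('pqr,q,r->p', core, b, c) + reg * a)
                    factors[1][j] -= lr * (err * np.einsum('pqr,p,r->q', core, a, c) + reg * b)
                    factors[2][k] -= lr * (err * np.einsum('pqr,p,q->r', core, a, b) + reg * c)
                    core -= lr * (err * np.einsum('p,q,r->pqr', a, b, c) + reg * core)
        return core, factors

A call such as sgd_sparse_tucker(coords, vals, shape=(1000, 800, 600), ranks=(10, 10, 10)) would be a typical, purely hypothetical usage. The paper's actual contribution additionally covers batching of the high-dimension intermediate matrices and distributed/parallel scheduling, which this sketch omits.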
ISSN: 1045-9219, 1558-2183
DOI: 10.1109/TPDS.2020.3047460