High Performance GPU Tensor Completion With Tubal-Sampling Pattern

Bibliographic details
Published in:	IEEE Transactions on Parallel and Distributed Systems, 2020-07, Vol. 31 (7), pp. 1724-1739
Main authors:	Zhang, Tao; Liu, Xiao-Yang; Wang, Xiaodong
Format:	Article
Language:	English
Subjects:	Accuracy; Algorithms; alternating minimization; Big Data; Cameras; Computer vision; Convolution; Data compression; Data mining; Data transfer (computers); GPU; Graphics processing units; Internet of Things; Least squares; Libraries; low-tubal-rank tensor model; Mathematical analysis; Performance evaluation; Recovery; Sampling; Signal processing; Tensile stress; Tensor completion; Tensors; tubal-sampling; Video transmission; Wireless communication; Wireless sensor networks

Description
Abstract:	Data completion is a problem of filling missing or unobserved elements of partially observed datasets. Data completion algorithms have received wide attention and achievements in diverse domains including data mining, signal processing, and computer vision. We observe a ubiquitous tubal-sampling pattern in big data and Internet of Things (IoT) applications, which is introduced by many reasons such as high data acquisition cost, downsampling for data compression, sensor node failures, and packet losses in low-power wireless transmissions. To meet the time and accuracy requirements of applications, data completion methods are expected to be accurate as well as fast. However, the existing methods for data completion with the tubal-sampling pattern are either accurate or fast, but not both. In this article, we propose high-performance graphics processing unit (GPU) tensor completion for data completion with the tubal-sampling pattern. First, by exploiting the convolution theorem, we split a tensor least-squares minimization problem into multiple least-squares sub-problems in the frequency domain. In this way, massive parallelisms are exposed for many-core GPU architectures while still preserving high recovery accuracy. Second, we propose computing slice-level and tube-level tasks in batches to improve GPU utilization. Third, we reduce the data transfer cost by eliminating the accesses to the CPU memory inside algorithm loop structures. The experimental results show that the proposed tensor completion is both fast and accurate. Using synthetic data of varying sizes, the proposed GPU tensor completion achieves maximum 248.18×, 7,403.27×, and 33.27× speedups over the CPU MATLAB implementation, GPU elem
ISSN:	1045-9219, 1558-2183
DOI:	10.1109/TPDS.2020.2975196
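
The first step in the abstract, splitting a tensor least-squares problem into independent sub-problems in the frequency domain, can be illustrated with a small CPU sketch. This is not the authors' GPU implementation. It assumes the t-product model (circular convolution along the third axis) that is commonly used with low-tubal-rank tensors, and the names `t_product`, `solve_tubal_ls`, and `mask` are hypothetical. Because a tubal-sampling mask is constant along the third axis, it commutes with the FFT along that axis, so each (column, frequency) pair becomes a small ordinary least-squares problem.

```python
import numpy as np

def t_product(A, B):
    """t-product of A (n1 x r x n3) and B (r x n2 x n3).

    By the convolution theorem, circular convolution along axis 2 becomes
    a plain matrix product for each frequency slice after an FFT.
    """
    A_hat = np.fft.fft(A, axis=2)
    B_hat = np.fft.fft(B, axis=2)
    C_hat = np.einsum("irk,rjk->ijk", A_hat, B_hat)
    return np.fft.ifft(C_hat, axis=2).real

def solve_tubal_ls(T_obs, mask, A):
    """Solve min_X || mask * (T - A * X) ||_F for a tubal-sampling mask.

    mask is an n1 x n2 boolean array: True means the whole tube (i, j, :)
    is observed. Each (column j, frequency k) pair is an independent
    least-squares problem; a GPU version would batch these, here they are
    solved in plain loops for clarity.
    """
    n1, r, n3 = A.shape
    n2 = T_obs.shape[1]
    A_hat = np.fft.fft(A, axis=2)
    T_hat = np.fft.fft(T_obs, axis=2)  # mask commutes with the FFT along axis 2
    X_hat = np.zeros((r, n2, n3), dtype=complex)
    for j in range(n2):
        rows = np.flatnonzero(mask[:, j])  # observed tubes in column j
        for k in range(n3):
            X_hat[:, j, k] = np.linalg.lstsq(
                A_hat[rows, :, k], T_hat[rows, j, k], rcond=None
            )[0]
    return np.fft.ifft(X_hat, axis=2).real

# Small synthetic check (sizes chosen arbitrarily for illustration).
rng = np.random.default_rng(0)
n1, n2, n3, r = 8, 6, 4, 2
A = rng.standard_normal((n1, r, n3))
X_true = rng.standard_normal((r, n2, n3))
T = t_product(A, X_true)

mask = rng.random((n1, n2)) < 0.7
mask[:r, :] = True                   # keep at least r observed tubes per column
T_obs = T * mask[:, :, None]         # unobserved tubes are zeroed out

X_rec = solve_tubal_ls(T_obs, mask, A)
recovered = np.allclose(X_rec, X_true)
```

In this sketch the inner solves share no data across frequencies or columns, which is the source of the parallelism the abstract refers to. A full completion method would alternate such a solve between the two factors, which is not shown here.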