Dynamic Tensor Linearization and Time Slicing for Efficient Factorization of Infinite Data Streams

Soh Y, Helal AE, Checconi F, Laukemann J, Tithi JJ, Ranadive T, Petrini F, Choi JW (2023)


Publication Type: Conference contribution

Publication year: 2023

Conference Proceedings Title: 2023 IEEE International Parallel and Distributed Processing Symposium (IPDPS)

Event location: St. Petersburg, FL US

DOI: 10.1109/IPDPS54959.2023.00048

Abstract

Streaming tensor factorization is an effective tool for unsupervised analysis of time-evolving sparse data, which emerge in many critical domains such as cybersecurity and trend analysis. In contrast to traditional tensors, time-evolving tensors demonstrate extreme sparsity and sparsity variation over time, resulting in irregular memory access and inefficient use of parallel computing resources. Additionally, due to the prohibitive cost of dynamically generating compressed sparse tensor formats, the state-of-the-art approaches process streaming tensors in a raw form that fails to capture data locality and suffers from high synchronization cost. To address these challenges, we propose a new dynamic tensor linearization framework that quickly encodes streaming multi-dimensional data on-the-fly in a compact representation, which has substantially lower memory usage and higher data reuse and parallelism than the original raw data. This is achieved by using a spatial sketching algorithm that keeps all incoming nonzero elements but remaps them into a tensor sketch with considerably reduced multi-dimensional image space. Moreover, we present a dynamic time slicing mechanism that uses variable-width time slices (instead of the traditional fixed-width) to balance the frequency of factor updates and the utilization of computing resources. We demonstrate the efficacy of our framework by accelerating two high-performance streaming tensor algorithms, namely, CP-stream and spCP-stream, and significantly improve their performance for a range of real-world streaming tensors. On a modern 56-core CPU, our framework achieves 10.3–11× and 6.4–7.2× geometric-mean speedup for the CP-stream and spCP-stream algorithms, respectively.
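
The two ideas in the abstract, remapping sparse coordinates into a compact linearized form and cutting the stream into variable-width time slices, can be illustrated with a small sketch. The Python below is an illustration under stated assumptions, not the paper's algorithm: `ModeRemapper`, `linearize`, and `variable_width_slices` are hypothetical names, the remapping is a simple first-seen dense renumbering per mode, the linearization is plain bit packing, and the slicing rule (close a slice once it holds at least `target_nnz` nonzeros, without splitting a timestamp) is an assumed policy.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

Coord = Tuple[int, ...]
Nonzero = Tuple[int, Coord, float]  # (timestamp, spatial coordinates, value)


class ModeRemapper:
    """Hypothetical helper: maps raw indices of each mode to dense ids 0, 1, 2, ..."""

    def __init__(self, num_modes: int) -> None:
        self.maps: List[Dict[int, int]] = [dict() for _ in range(num_modes)]

    def remap(self, coord: Coord) -> Coord:
        # setdefault assigns the next free dense id the first time a raw index is seen.
        return tuple(m.setdefault(i, len(m)) for m, i in zip(self.maps, coord))


def linearize(coord: Coord, bits: Tuple[int, ...]) -> int:
    """Pack per-mode indices into one integer key, with mode 0 in the highest bits."""
    key = 0
    for idx, b in zip(coord, bits):
        if idx >= (1 << b):
            raise ValueError("index does not fit in the bits reserved for its mode")
        key = (key << b) | idx
    return key


def variable_width_slices(
    stream: Iterable[Nonzero], target_nnz: int
) -> Iterator[List[Nonzero]]:
    """Group a time-ordered stream into slices of at least target_nnz nonzeros.

    Assumed policy: a slice is closed only at a timestamp boundary, so sparse
    periods yield wide slices and dense periods yield narrow ones.
    """
    current: List[Nonzero] = []
    for item in stream:
        timestamp = item[0]
        if len(current) >= target_nnz and timestamp != current[-1][0]:
            yield current
            current = []
        current.append(item)
    if current:
        yield current


# Toy stream: raw indices are large and scattered, but only a few are distinct.
stream = [
    (0, (1000, 7), 1.0),
    (0, (5, 7), 2.0),
    (1, (1000, 9), 1.0),
    (5, (5, 9), 3.0),
    (5, (42, 7), 1.0),
    (9, (42, 9), 1.0),
]

remapper = ModeRemapper(num_modes=2)
slices = list(variable_width_slices(stream, target_nnz=2))
# 2 bits cover the 3 distinct mode-0 indices, 1 bit covers the 2 mode-1 indices.
keys = [
    [linearize(remapper.remap(coord), bits=(2, 1)) for _, coord, _ in s]
    for s in slices
]
```

In this toy run every nonzero is kept, but its coordinates shrink from the raw index ranges to a single small integer key, which is the general effect the abstract attributes to the tensor sketch. The slices cover the time ranges 0, 1 to 5, and 9, so their widths follow the density of the stream rather than a fixed interval.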

How to cite

APA:

Soh, Y., Helal, A.E., Checconi, F., Laukemann, J., Tithi, J.J., Ranadive, T., ... Choi, J.W. (2023). Dynamic Tensor Linearization and Time Slicing for Efficient Factorization of Infinite Data Streams. In 2023 IEEE International Parallel and Distributed Processing Symposium (IPDPS). St. Petersburg, FL, US.

MLA:

Soh, Yongseok, et al. "Dynamic Tensor Linearization and Time Slicing for Efficient Factorization of Infinite Data Streams." 2023 IEEE International Parallel and Distributed Processing Symposium (IPDPS), St. Petersburg, FL, 2023.