Efficient query processing with optimistically compressed hash tables & strings in the USSR

Gubner T, Leis V, Boncz P (2020)


Publication Type: Conference contribution

Publication year: 2020

Publisher: IEEE Computer Society

Book Volume: 2020-April

Page Range: 301-312

Conference Proceedings Title: Proceedings - International Conference on Data Engineering

Event location: Dallas, TX, US

ISBN: 9781728129037

DOI: 10.1109/ICDE48307.2020.00033

Abstract

Modern query engines rely heavily on hash tables for query processing. Overall query performance and memory footprint are often determined by how hash tables and the tuples within them are represented. In this work, we propose three complementary techniques to improve this representation: Domain-Guided Prefix Suppression bit-packs keys and values tightly to reduce hash table record width. Optimistic Splitting decomposes values (and operations on them) into (operations on) frequently-accessed and infrequently-accessed value slices. By removing the infrequently-accessed value slices from the hash table record, it improves cache locality. The Unique Strings Self-aligned Region (USSR) accelerates handling frequently-occurring strings, which are very common in real-world data sets, by creating an on-the-fly dictionary of the most frequent strings. This allows executing many string operations with integer logic and reduces memory pressure. We integrated these techniques into Vectorwise. On the TPC-H benchmark, our approach reduces peak memory consumption by 2-4× and improves performance by up to 1.5×. On a real-world BI workload, we measured a 2× improvement in performance and in micro-benchmarks we observed speedups of up to 25×.

How to cite

APA:

Gubner, T., Leis, V., & Boncz, P. (2020). Efficient query processing with optimistically compressed hash tables & strings in the USSR. In Proceedings - International Conference on Data Engineering (pp. 301-312). Dallas, TX, US: IEEE Computer Society.

MLA:

Gubner, Tim, Viktor Leis, and Peter Boncz. "Efficient query processing with optimistically compressed hash tables & strings in the USSR." Proceedings of the 36th IEEE International Conference on Data Engineering, ICDE 2020, Dallas, TX. IEEE Computer Society, 2020. 301-312.
