Robust and Efficient Sorting with Offset-Value Coding

09/17/2022
by Thanh Do, et al.

Sorting and searching are large parts of database query processing, e.g., in the forms of index creation, index maintenance, and index lookup; and comparing pairs of keys is a substantial part of the effort in sorting and searching. We have worked on simple, efficient implementations of decades-old, neglected, effective techniques for fast comparisons and fast sorting, in particular offset-value coding. In the process, we happened upon its mutually beneficial relationship with prefix truncation in run files as well as the duality of compression techniques in row- and column-format storage structures, namely prefix truncation and run-length encoding of leading key columns. We also found a beneficial relationship with consumers of sorted streams, e.g., merging parallel streams, in-stream aggregation, and merge join. We report on our implementation in the context of Google's Napa and F1 Query systems as well as an experimental evaluation of performance and scalability.
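To make the central idea concrete, here is a minimal sketch of one common textbook formulation of ascending offset-value coding. It is not the Napa or F1 Query implementation, which the abstract does not show. The names `DOMAIN`, `ovc`, and `compare` are hypothetical, and the sketch assumes fixed-arity keys whose columns are integers in [0, DOMAIN). A key is coded relative to a base key that sorts no later than it: the offset is the number of leading columns shared with the base, and the value is the key's column at that offset.

```python
DOMAIN = 100  # assumption: every column value is an integer in [0, DOMAIN)


def ovc(base, key):
    """Ascending offset-value code of `key` relative to `base`.

    Assumes `base` sorts no later than `key` and both have the same arity.
    A longer shared prefix (larger offset) yields a smaller code.
    """
    arity = len(key)
    for offset in range(arity):
        if base[offset] != key[offset]:
            return (arity - offset) * DOMAIN + key[offset]
    return 0  # key is a duplicate of base


def compare(a, code_a, b, code_b):
    """Compare two keys coded relative to the same base.

    Returns (winner, loser, loser_code), where loser_code is the loser's
    code relative to the winner.
    """
    if code_a != code_b:
        # Different codes decide the comparison without touching any column,
        # and the loser's old code is already valid relative to the winner.
        return (a, b, code_b) if code_a < code_b else (b, a, code_a)
    # Equal codes: both keys agree up to and including the coded column,
    # so column comparisons resume just past that offset.
    arity = len(a)
    offset = arity - code_a // DOMAIN
    for i in range(offset + 1, arity):
        if a[i] != b[i]:
            if a[i] < b[i]:
                return a, b, (arity - i) * DOMAIN + b[i]
            return b, a, (arity - i) * DOMAIN + a[i]
    return a, b, 0  # the two keys are duplicates


base = (1, 2, 3)
a, b = (1, 2, 5), (1, 4, 0)
winner, loser, loser_code = compare(a, ovc(base, a), b, ovc(base, b))
```

In this example the two codes differ, so a single integer comparison picks `a` as the winner, and the code that `b` carries forward equals `ovc(a, b)` without being recomputed. Column values are only inspected when the codes tie.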

research
09/30/2022

Offset-value coding in database query processing

Recent work shows how offset-value coding speeds up database query execu...
research
10/01/2020

Sort-based grouping and aggregation

Database query processing requires algorithms for duplicate removal, gro...
research
05/08/2023

Parallel External Sorting of ASCII Records Using Learned Models

External sorting is at the core of many operations in large-scale databa...
research
09/21/2020

Space/time-efficient RDF stores based on circular suffix sorting

In recent years, RDF has gained popularity as a format for the standardi...
research
12/10/2021

FLiMS: a Fast Lightweight 2-way Merger for Sorting

In this paper, we present FLiMS, a highly-efficient and simple parallel ...
research
07/12/2023

WiscSort: External Sorting For Byte-Addressable Storage

We present WiscSort, a new approach to high-performance concurrent sorti...
research
09/14/2022

Multiway Powersort

Powersort (Munro & Wild, ESA 2018) has recently replaced Timsort's subo...
