Non-Empty Bins with Simple Tabulation Hashing

10/31/2018
by Anders Aamand, et al.

We consider the hashing of a set X ⊆ U with |X| = m using a simple tabulation hash function h: U → [n] = {0, ..., n-1} and analyse the number of non-empty bins, that is, the size of h(X). We show that the expected size of h(X) matches that with fully random hashing to within low-order terms. We also provide concentration bounds. The number of non-empty bins is a fundamental measure in the balls and bins paradigm, and it is critical in applications such as Bloom filters and Filter hashing. For example, Bloom filters are normally proportioned for a desired low false-positive probability assuming fully random hashing (see <en.wikipedia.org/wiki/Bloom_filter>). Our results imply that if we implement the hashing with simple tabulation, we obtain the same low false-positive probability for any possible input.
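
To make the quantity in the abstract concrete, the following is a minimal sketch, not the authors' code. It builds a simple tabulation hash function in the usual way (split the key into c characters, look each one up in its own table of random values, and XOR the results) and counts the non-empty bins |h(X)|. It then prints the textbook expectation for fully random hashing, n(1 - (1 - 1/n)^m), for comparison. The function names, the choice of 32-bit keys split into four 8-bit characters, the seed, and the restriction to n being a power of two (so that XOR stays inside [n]) are all assumptions made for illustration.

```python
import random


def make_simple_tabulation(c, char_bits, n, rng):
    """Return h: key -> [n] built from c random lookup tables.

    Assumption: n is a power of two, so XOR of table entries stays in [n].
    """
    assert n > 0 and n & (n - 1) == 0, "n must be a power of two"
    # One table per character position, each filled with random values in [n].
    tables = [[rng.randrange(n) for _ in range(1 << char_bits)]
              for _ in range(c)]
    mask = (1 << char_bits) - 1

    def h(x):
        out = 0
        for i in range(c):
            # Extract the i-th character of the key and XOR in its table entry.
            out ^= tables[i][(x >> (i * char_bits)) & mask]
        return out

    return h


def expected_nonempty_fully_random(m, n):
    """Expected number of non-empty bins when m balls land uniformly in n bins."""
    return n * (1.0 - (1.0 - 1.0 / n) ** m)


rng = random.Random(2018)        # fixed seed, chosen arbitrarily
n, m = 1 << 10, 1000             # illustrative sizes, not from the paper
h = make_simple_tabulation(c=4, char_bits=8, n=n, rng=rng)
keys = rng.sample(range(1 << 32), m)   # m distinct 32-bit keys
nonempty = len({h(x) for x in keys})   # |h(X)|
expected = expected_nonempty_fully_random(m, n)
print(nonempty, expected)
```

A single run on random keys only illustrates the definitions. The paper's claim concerns every input set X, including adversarially structured ones, which an experiment of this kind does not establish.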


research
12/19/2013

Sparse similarity-preserving hashing

In recent years, a lot of attention has been devoted to efficient neares...
research
05/01/2019

Fast hashing with Strong Concentration Bounds

Previous work on tabulation hashing of Pǎtraşcu and Thorup from STOC'11 ...
research
02/09/2018

Convolutional Hashing for Automated Scene Matching

We present a powerful new loss function and training scheme for learning...
research
09/24/2020

A Case for Partitioned Bloom Filters

In a partitioned Bloom Filter the m bit vector is split into k disjoint ...
research
10/05/2020

Note on Generalized Cuckoo Hashing with a Stash

Cuckoo hashing is a common hashing technique, guaranteeing constant-time...
research
10/10/2018

CRH: A Simple Benchmark Approach to Continuous Hashing

In recent years, the distinctive advancement of handling huge data promo...
research
07/16/2018

A Lyra2 FPGA Implementation for Lyra2REv2-Based Cryptocurrencies

Lyra2REv2 is a hashing algorithm that consists of a chain of individual ...