Distributed mining of time-faded heavy hitters

12/01/2018
by Marco Pulimeno, et al.

We present P2PTFHH (Peer-to-Peer Time-Faded Heavy Hitters) which, to the best of our knowledge, is the first distributed algorithm for mining time-faded heavy hitters on unstructured P2P networks. P2PTFHH is based on the sequential FDCMSS (Forward Decay Count-Min Space-Saving) algorithm and exploits an averaging gossip protocol: in each interaction, the peers involved merge their underlying data structures. We formally prove the convergence and correctness of our distributed algorithm and show that it is fast and simple to implement. Extensive experimental results confirm that P2PTFHH retains the very high accuracy and the error bound of FDCMSS while scaling well. Our contributions are threefold: (i) we prove that the averaging gossip protocol can be used jointly with our augmented sketch data structure to mine time-faded heavy hitters; (ii) we prove error bounds on frequency estimation; (iii) we show experimentally that P2PTFHH is extremely accurate and fast, allowing near-real-time processing of large datasets.
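The general idea of combining a forward-decayed sketch with averaging gossip can be illustrated with a small simulation. This is only a minimal sketch under stated assumptions, not the P2PTFHH algorithm itself: it uses a plain Count-Min matrix of floats rather than the augmented FDCMSS cells, a polynomial decay function g(n) = n^2, a known peer count, and synchronous random pairings. All names (`update`, `gossip_round`, `run_demo`) and parameter values are hypothetical.

```python
import random
import zlib

D, W = 4, 64  # sketch depth and width (illustrative values, not from the paper)

def cell(row, item):
    # Deterministic per-row hash; crc32 is used only for reproducibility.
    return zlib.crc32(f"{row}:{item}".encode()) % W

def new_sketch():
    return [[0.0] * W for _ in range(D)]

def update(sketch, item, timestamp, landmark=0.0):
    # Forward decay: weight each arrival by g(t_i - L), here g(n) = n^2 (assumed).
    weight = (timestamp - landmark) ** 2
    for r in range(D):
        sketch[r][cell(r, item)] += weight

def estimate(sketch, item, now, landmark=0.0):
    # Count-Min point query, normalized by g(now - L) to obtain the decayed count.
    raw = min(sketch[r][cell(r, item)] for r in range(D))
    return raw / (now - landmark) ** 2

def gossip_round(sketches, rng):
    # Averaging gossip: randomly paired peers replace every cell by the pair's mean.
    order = list(range(len(sketches)))
    rng.shuffle(order)
    for a, b in zip(order[::2], order[1::2]):
        for r in range(D):
            for c in range(W):
                avg = (sketches[a][r][c] + sketches[b][r][c]) / 2.0
                sketches[a][r][c] = avg
                sketches[b][r][c] = avg

def run_demo(peers=8, rounds=30, stream_len=400, seed=1):
    rng = random.Random(seed)
    sketches = [new_sketch() for _ in range(peers)]
    central = new_sketch()  # reference: one sketch that sees the whole stream
    for t in range(1, stream_len + 1):
        item = "hot" if rng.random() < 0.5 else f"item{rng.randrange(50)}"
        update(sketches[rng.randrange(peers)], item, t)
        update(central, item, t)
    for _ in range(rounds):
        gossip_round(sketches, rng)
    # Each cell converges to the network average, so scale by the peer count.
    local = [estimate(s, "hot", stream_len) * peers for s in sketches]
    return local, estimate(central, "hot", stream_len)

if __name__ == "__main__":
    local, central = run_demo()
    print(central, local[:3])
```

Because averaging preserves the sum of each cell across the network, every peer's scaled estimate approaches the value a single centralized sketch would report as the number of gossip rounds grows.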

research
06/18/2018

Mining frequent items in unstructured P2P networks

Large scale decentralized systems, such as P2P, sensor or IoT device net...
research
11/04/2021

Count-Less: A Counting Sketch for the Data Plane of High Speed Switches

Demands are increasing to measure per-flow statistics in the data plane ...
research
01/17/2021

Data stream fusion for accurate quantile tracking and analysis

UDDSKETCH is a recent algorithm for accurate tracking of quantiles in da...
research
02/10/2023

Count-min sketch with variable number of hash functions: an experimental study

Conservative Count-Min, an improved version of Count-Min sketch [Cormode...
research
09/13/2018

A Self-Stabilizing Hashed Patricia Trie

While a lot of research in distributed computing has covered solutions f...
research
04/18/2018

Nearly Optimal Separation Between Partially And Fully Retroactive Data Structures

Since the introduction of retroactive data structures at SODA 2004, a ma...