GraphTango: A Hybrid Representation Format for Efficient Streaming Graph Updates and Analysis

by Alif Ahmed, et al.

Streaming graph processing involves performing updates and analytics on a time-evolving graph. The underlying representation format largely determines the throughput of the update and analytics phases. Existing formats usually employ variations of hash tables or adjacency lists. However, adjacency-list-based approaches perform poorly on heavy-tailed graphs, and hash-based approaches suffer on short-tailed graphs. We propose GraphTango, a hybrid format that provides excellent update and analytics throughput regardless of the graph's degree distribution. GraphTango switches among three different formats based on a vertex's degree: i) low-degree vertices store the edges directly with the neighborhood metadata, confining accesses to a single cache line, ii) medium-degree vertices use adjacency lists, and iii) high-degree vertices use hash tables as well as adjacency lists. In the high-degree case, the adjacency list provides fast traversal during the analytics phase, while the hash table provides constant-time lookups during the update phase. We further optimized performance by designing an open-addressing-based hash table that fully utilizes every fetched cache line. In addition, we developed a thread-local lock-free memory pool that allows the adjacency lists and hash tables to grow and shrink quickly in a multi-threaded environment. We evaluated GraphTango using the SAGA-Bench framework and compared it with four other representation formats. On average, GraphTango provides 4.5x higher insertion throughput, 3.2x higher deletion throughput, and 1.1x higher analytics throughput than the next-best format. Furthermore, we integrated GraphTango with the state-of-the-art graph processing frameworks DZiG and RisGraph. Compared to vanilla DZiG and vanilla RisGraph, [GraphTango + DZiG] and [GraphTango + RisGraph] reduce the average batch processing time by 2.3x and 1.5x, respectively.
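
The degree-based switching described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `Neighborhood`, the thresholds `INLINE_MAX` and `HASH_MIN`, and the swap-with-last deletion are all assumptions chosen for illustration, and a Python list and dict stand in for the cache-line-aware adjacency list and open-addressing hash table. The memory pool is not modeled.

```python
# Illustrative thresholds only; the abstract does not state the actual values.
INLINE_MAX = 4    # hypothetical: number of edges that fit beside the metadata
HASH_MIN = 64     # hypothetical: degree at which a hash index is added


class Neighborhood:
    """Edge set of one vertex; its layout depends on the current degree."""

    def __init__(self):
        self.edges = []     # neighbor IDs, kept contiguous for fast traversal
        self.index = None   # neighbor ID -> position in edges (high degree only)

    def mode(self):
        """Report which of the three layouts is currently in use."""
        if self.index is not None:
            return "hash+list"
        return "inline" if len(self.edges) <= INLINE_MAX else "list"

    def _find(self, dst):
        """Return the position of dst in edges, or -1 if absent."""
        if self.index is not None:
            return self.index.get(dst, -1)   # constant-time lookup
        for i, v in enumerate(self.edges):   # linear scan for small degrees
            if v == dst:
                return i
        return -1

    def insert(self, dst):
        """Add edge to dst; return False if it already exists."""
        if self._find(dst) != -1:
            return False
        self.edges.append(dst)
        if self.index is not None:
            self.index[dst] = len(self.edges) - 1
        elif len(self.edges) >= HASH_MIN:
            # Degree crossed the threshold: build the hash index once.
            self.index = {v: i for i, v in enumerate(self.edges)}
        return True

    def delete(self, dst):
        """Remove edge to dst; return False if it does not exist."""
        pos = self._find(dst)
        if pos == -1:
            return False
        last = self.edges.pop()
        if pos < len(self.edges):
            # Fill the hole with the former last edge to keep edges contiguous.
            self.edges[pos] = last
            if self.index is not None:
                self.index[last] = pos
        if self.index is not None:
            del self.index[dst]
            if len(self.edges) < HASH_MIN:
                self.index = None   # degree dropped: fall back to the list
        return True

    def neighbors(self):
        """Traverse edges through the contiguous list, never the hash index."""
        return iter(self.edges)
```

In this sketch, updates on a high-degree vertex consult the index while analytics iterate over `edges` only, mirroring the division of labor the abstract describes. A real design would likely add hysteresis around the thresholds so that a vertex hovering near `HASH_MIN` does not repeatedly build and discard its index; that is omitted here for brevity.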



A Dynamic Hash Table for the GPU

We design and implement a fully concurrent dynamic hash table for GPUs w...

A High Throughput Parallel Hash Table on FPGA using XOR-based Memory

Hash table is a fundamental data structure for quick search and retrieva...

Lock-Free Transactional Adjacency List

Adjacency lists are frequently used in graphing or map based application...

IcebergHT: High Performance PMEM Hash Tables Through Stability and Low Associativity

Modern hash table designs strive to minimize space while maximizing spee...

A+ Indexes: Lightweight and Highly Flexible Adjacency Lists for Graph Database Management Systems

Graph database management systems (GDBMSs) are highly optimized to perfo...

RisGraph: A Real-Time Streaming System for Evolving Graphs

Graphs in the real world are constantly changing and of large scale. In ...

A Closer Look at Lightweight Graph Reordering

Graph analytics power a range of applications in areas as diverse as fin...
