Graphsurge: Graph Analytics on View Collections Using Differential Computation

04/11/2020
by   Siddhartha Sahu, et al.

This paper presents the design and implementation of Graphsurge, a new open-source view-based graph analytics system. Graphsurge is designed to support applications that analyze multiple snapshots or views of a large-scale graph. Users program Graphsurge through a declarative graph view definition language (GVDL) to create views over input graphs, and through a Differential Dataflow-based programming API to write analytics computations. A key feature of GVDL is the ability to organize views into view collections, which allows Graphsurge to share computation across views by performing computations differentially. We then introduce two optimization problems that naturally arise in our setting. The first is the view ordering problem: determining the order of views that minimizes the differences across consecutive views. We prove that this problem is NP-hard and present a constant-factor approximation algorithm drawn from the literature. The second is the collection splitting problem: deciding which views to compute differentially and which to compute from scratch, for which we present an adaptive solution that makes decisions at runtime. Graphsurge is implemented on top of the Timely Dataflow and Differential Dataflow systems. We present extensive experiments that demonstrate the benefits of running computations differentially for view collections, as well as the benefits of our view ordering and collection splitting optimizations.
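To make the view ordering problem concrete, the following is a minimal Python sketch, not Graphsurge's implementation. It assumes each view is a set of edges, and that the cost of an ordering is the size of the first view plus the number of edges added or removed between consecutive views; that cost model is an assumption for illustration. The names `diff_size`, `total_diffs`, and `greedy_order` are hypothetical, and the greedy nearest-neighbor heuristic shown here is a simple stand-in, not the constant-factor approximation algorithm the paper refers to.

```python
def diff_size(a, b):
    """Number of edges that must be added or removed to turn view a into view b."""
    return len(a ^ b)


def total_diffs(views, order):
    """Cost of processing views in the given order (assumed cost model):
    the first view is loaded in full, each later view as a difference
    against its predecessor."""
    cost = len(views[order[0]])
    for prev, cur in zip(order, order[1:]):
        cost += diff_size(views[prev], views[cur])
    return cost


def greedy_order(views):
    """Illustrative heuristic only: start from the smallest view, then
    repeatedly pick the unvisited view closest to the last one chosen.
    Ties are broken by view index to keep the result deterministic."""
    remaining = set(range(len(views)))
    current = min(remaining, key=lambda i: (len(views[i]), i))
    order = [current]
    remaining.remove(current)
    while remaining:
        last = views[order[-1]]
        nxt = min(remaining, key=lambda i: (diff_size(last, views[i]), i))
        order.append(nxt)
        remaining.remove(nxt)
    return order


# Four hypothetical views over the same input graph, given as edge sets.
views = [
    frozenset({(1, 2), (2, 3), (3, 4), (4, 5)}),
    frozenset({(1, 2)}),
    frozenset({(1, 2), (2, 3), (3, 4)}),
    frozenset({(1, 2), (2, 3)}),
]

given_cost = total_diffs(views, [0, 1, 2, 3])
reordered = greedy_order(views)
reordered_cost = total_diffs(views, reordered)
```

In this toy collection the views are nested, so processing them from smallest to largest touches each edge once, whereas the given order repeatedly removes and re-adds edges. Real view collections are rarely this regular, which is why the choice of ordering matters.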
