A Foundation of Lazy Streaming Graphs
A streaming graph system continuously processes a stream of operations over a large graph. This performance-critical paradigm is becoming increasingly relevant in the big data processing ecosystem. Lazy processing comprises a family of important optimization techniques for streaming graphs, but designing lazy streaming graph systems that are correct, expressive, and efficient is challenging. In this paper, we lay a foundation for lazy streaming graph processing. The resulting DG Calculus features fine-grained in-data lazy processing, endowed with expressive optimizations such as batching, fusion, and splicing. We establish the soundness of DG Calculus through bisimulation with a system for eager graph processing. To the best of our knowledge, DG Calculus is the first foundational calculus for streaming graphs.
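To make the contrast between eager and lazy processing concrete, the following is a minimal Python sketch. It is not DG Calculus and does not reflect its formal semantics: the class names `EagerGraph` and `LazyGraph`, the `pending` buffer, and the "last operation per edge wins" fusion rule are all assumptions chosen for illustration. The lazy variant buffers streamed edge operations, fuses redundant ones, and applies the rest in one batch only when a query forces the result.

```python
from collections import defaultdict


class EagerGraph:
    """Applies every streamed operation as soon as it arrives."""

    def __init__(self):
        self.adj = defaultdict(set)

    def add_edge(self, u, v):
        self.adj[u].add(v)

    def remove_edge(self, u, v):
        self.adj[u].discard(v)

    def neighbors(self, u):
        return set(self.adj[u])


class LazyGraph:
    """Buffers streamed operations; applies them only when a query needs them."""

    def __init__(self):
        self.adj = defaultdict(set)
        self.pending = []   # buffered (op, u, v) tuples, in arrival order
        self.applied = 0    # number of operations actually executed

    def add_edge(self, u, v):
        self.pending.append(("add", u, v))

    def remove_edge(self, u, v):
        self.pending.append(("remove", u, v))

    def _flush(self):
        # Fusion (illustrative rule): with set semantics, only the last
        # buffered operation on an edge determines its final state.
        last = {}
        for op, u, v in self.pending:
            last[(u, v)] = op
        # Batching: apply the surviving operations in a single pass.
        for (u, v), op in last.items():
            if op == "add":
                self.adj[u].add(v)
            else:
                self.adj[u].discard(v)
            self.applied += 1
        self.pending.clear()

    def neighbors(self, u):
        self._flush()
        return set(self.adj[u])


def replay(graph, ops):
    """Feed a stream of (op, u, v) tuples into a graph."""
    for op, u, v in ops:
        if op == "add":
            graph.add_edge(u, v)
        else:
            graph.remove_edge(u, v)
    return graph
```

Replaying the same stream through both classes should give identical query answers while the lazy variant executes fewer operations. This loosely echoes, in a toy setting, the idea of relating a lazy system to an eager one; it is an informal check, not a bisimulation proof.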