Sub-O(log n) Out-of-Order Sliding-Window Aggregation

by Kanat Tangwongsan, et al.

Sliding-window aggregation summarizes the most recent information in a data stream. Users specify how that summary is computed, usually as an associative binary operator because this is the most general known form for which it is possible to avoid naively scanning every window. For strictly in-order arrivals, there are algorithms with O(1) time per window change assuming associative operators. Meanwhile, it is common in practice for streams to have data arriving slightly out of order, for instance, due to clock drift or communication delays. Unfortunately, for out-of-order streams, one has to resort to latency-prone buffering or pay O(log n) time per insert or evict, where n is the window size. This paper presents the design, analysis, and implementation of FiBA, a novel sliding-window aggregation algorithm with an amortized upper bound of O(log d) time per insert or evict, where d is the distance of the inserted or evicted value to the closer end of the window. This means O(1) time for in-order arrivals and nearly O(1) time for slightly out-of-order arrivals, with a smooth transition towards O(log n) as d approaches n. We also prove a matching lower bound on running time, showing optimality. Our algorithm is as general as the prior state-of-the-art: it requires associativity, but neither invertibility nor commutativity. At the heart of the algorithm is a careful combination of finger-searching techniques, lazy rebalancing, and position-aware partial aggregates. We further show how to answer range queries that aggregate subwindows for window sharing. Finally, our experimental evaluation shows that FiBA performs well in practice and supports the theoretical findings.
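To make the in-order baseline mentioned above concrete, here is a minimal sketch of the classic two-stacks technique, which gives amortized O(1) insert, evict, and query for any associative operator. It is not FiBA and does not handle out-of-order arrivals; the class name `TwoStacksWindow` and its method names are hypothetical, chosen only for illustration. Like the setting described in the abstract, it assumes associativity but neither invertibility nor commutativity.

```python
from typing import Callable, Generic, List, Tuple, TypeVar

T = TypeVar("T")


class TwoStacksWindow(Generic[T]):
    """In-order sliding-window aggregation via two stacks (illustrative only).

    Assumes `combine` is associative and `identity` is its neutral element.
    This is NOT the FiBA structure from the paper.
    """

    def __init__(self, combine: Callable[[T, T], T], identity: T) -> None:
        self._combine = combine
        self._identity = identity
        # Each entry is (value, partial aggregate).
        # Front stack: top is the oldest element; its aggregate covers
        # that element and every newer element in the front stack.
        self._front: List[Tuple[T, T]] = []
        # Back stack: top is the newest element; its aggregate covers
        # every element in the back stack, oldest to newest.
        self._back: List[Tuple[T, T]] = []

    def insert(self, value: T) -> None:
        """Append the newest value (in-order arrival)."""
        prev = self._back[-1][1] if self._back else self._identity
        self._back.append((value, self._combine(prev, value)))

    def evict(self) -> None:
        """Remove the oldest value."""
        if not self._front:
            # Flip: move everything from back to front, rebuilding
            # aggregates so that operand order is preserved.
            while self._back:
                value, _ = self._back.pop()
                newer = self._front[-1][1] if self._front else self._identity
                self._front.append((value, self._combine(value, newer)))
        if not self._front:
            raise IndexError("evict from empty window")
        self._front.pop()

    def query(self) -> T:
        """Aggregate of the whole window, oldest to newest."""
        front = self._front[-1][1] if self._front else self._identity
        back = self._back[-1][1] if self._back else self._identity
        return self._combine(front, back)
```

Each element is pushed and popped at most twice, which is where the amortized constant cost comes from. String concatenation is a convenient test operator because it is associative but neither commutative nor invertible, so any ordering mistake shows up immediately in the result.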


In-Order Sliding-Window Aggregation in Worst-Case Constant Time

Sliding-window aggregation is a widely-used approach for extracting insi...

Out-of-Order Sliding-Window Aggregation with Efficient Bulk Evictions and Insertions (Extended Version)

Sliding-window aggregation is a foundational stream processing primitive...

Smoothness of Schatten Norms and Sliding-Window Matrix Streams

Large matrices are often accessed as a row-order stream. We consider the...

Low-Latency Sliding Window Algorithms for Formal Languages

Low-latency sliding window algorithms for regular and context-free langu...

Real-time Stream-based Monitoring

We introduce RTLola, a new stream-based specification language for the d...

Implementing Window Functions in a Column-Store with Late Materialization (Extended Version)

A window function is a generalization of the aggregation operation. Unli...

Differentially Private L_2-Heavy Hitters in the Sliding Window Model

The data management of large companies often prioritize more recent data...
