Approximate Convex Hull of Data Streams
Given a finite set of points P ⊆ R^d, we would like to find a small subset S ⊆ P such that the convex hull of S approximately contains P. More formally, every point in P is within distance ϵ of the convex hull of S. Such a subset S is called an ϵ-hull. Computing an ϵ-hull is an important problem in computational geometry, machine learning, and approximation algorithms. In many real-world applications, the set P is too large to fit in memory. We consider the streaming model, in which the algorithm receives the points of P sequentially and strives to use a minimal amount of memory. Existing streaming algorithms for computing an ϵ-hull require O(ϵ^{-(d-1)/2}) space, which is optimal for a worst-case input. However, this bound ignores the structure of the data. The minimal size of an ϵ-hull of P, which we denote by OPT, can be much smaller. A natural question is whether a streaming algorithm can compute an ϵ-hull using only O(OPT) space. We begin with lower bounds showing that no single-pass streaming algorithm can compute an ϵ-hull with O(OPT) space. We instead propose three relaxations of the problem for which we can compute ϵ-hulls using space near-linear in the optimal size. Our first algorithm, for points in R^2 that arrive in random order, uses O(log n · OPT) space. Our second algorithm, for points in R^2, makes O(log(1/ϵ)) passes before outputting the ϵ-hull and requires O(OPT) space. Our third algorithm, for points in R^d for any fixed dimension d, outputs an ϵ-hull for all but a δ-fraction of directions and requires O(OPT · log OPT) space.
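To make the ϵ-hull definition concrete, here is a minimal offline sketch for points in R^2 that checks whether a candidate subset S is an ϵ-hull of P. It is a brute-force checker that holds all points in memory, not any of the streaming algorithms described above, and the function names (`convex_hull`, `dist_to_hull`, `is_eps_hull`) are hypothetical names chosen for illustration.

```python
import math

def cross(o, a, b):
    # Signed area of the parallelogram spanned by (a - o) and (b - o).
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    # Andrew's monotone chain; returns hull vertices in counter-clockwise order.
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def dist_to_segment(p, a, b):
    # Euclidean distance from p to the closed segment ab.
    dx, dy = b[0] - a[0], b[1] - a[1]
    denom = dx * dx + dy * dy
    if denom == 0:
        t = 0.0
    else:
        t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / denom
        t = max(0.0, min(1.0, t))
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))

def dist_to_hull(p, hull):
    # Distance from p to the convex polygon (or segment, or point) `hull`.
    n = len(hull)
    if n == 1:
        return math.hypot(p[0] - hull[0][0], p[1] - hull[0][1])
    edges = [(hull[i], hull[(i + 1) % n]) for i in range(n)]
    # For a counter-clockwise polygon, p is inside iff it lies on the
    # left of (or on) every edge.
    if n >= 3 and all(cross(a, b, p) >= 0 for a, b in edges):
        return 0.0
    return min(dist_to_segment(p, a, b) for a, b in edges)

def is_eps_hull(S, P, eps):
    # True iff every point of P is within distance eps of conv(S).
    hull = convex_hull(S)
    if not hull:
        return not P
    return all(dist_to_hull(p, hull) <= eps for p in P)
```

For example, with S equal to the four corners of the unit square and P equal to S plus the point (0.5, 1.05), the extra point lies 0.05 above the top edge, so S is an ϵ-hull of P for ϵ = 0.1 but not for ϵ = 0.01.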