Coresets for Data Discretization and Sine Wave Fitting

03/06/2022
by Alaa Maalouf, et al.

In the monitoring problem, the input is an unbounded stream P = p_1, p_2, … of integers in [N] := {1, …, N} that are obtained from a sensor (such as a GPS receiver or a human heartbeat monitor). The goal (e.g., for anomaly detection) is to approximate the n points received so far in P by a single-frequency sine wave, e.g., min_{c ∈ C} cost(P, c) + λ(c), where cost(P, c) = ∑_{i=1}^{n} sin^2((2π/N) · p_i · c), C ⊆ [N] is a feasible set of solutions, and λ is a given regularization function. For any approximation error ε > 0, we prove that every set P of n integers has a weighted subset S ⊆ P (sometimes called a core-set) of cardinality |S| ∈ O(log(N)^{O(1)}) that approximates cost(P, c), for every c ∈ [N], up to a multiplicative factor of 1 ± ε. Using known coreset techniques, this implies streaming algorithms that use only O((log(N) log(n))^{O(1)}) memory. Our results hold for a large family of functions. Experimental results and open source code are provided.
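To make the objective concrete, the following is a minimal Python sketch of the cost function from the abstract, a brute-force minimizer over a candidate set C, and a check of the 1 ± ε guarantee for a given weighted subset. It is not the authors' coreset construction, which the abstract does not describe. The function names (`cost`, `best_frequency`, `merge_duplicates`, `max_relative_error`) are hypothetical, and `merge_duplicates` is only a trivial stand-in that reproduces the cost exactly by weighting each distinct value by its multiplicity.

```python
import math
from collections import Counter


def cost(points, c, N, weights=None):
    """Weighted cost: sum_i w_i * sin^2((2*pi/N) * p_i * c); unit weights by default."""
    if weights is None:
        weights = [1.0] * len(points)
    return sum(w * math.sin(2 * math.pi / N * p * c) ** 2
               for p, w in zip(points, weights))


def best_frequency(points, N, candidates, reg=lambda c: 0.0, weights=None):
    """Brute-force argmin over the feasible set of cost(P, c) + lambda(c)."""
    return min(candidates, key=lambda c: cost(points, c, N, weights) + reg(c))


def merge_duplicates(points):
    """Trivial exact weighted subset (NOT the paper's coreset):
    keep each distinct value once, weighted by how often it occurs."""
    counts = Counter(points)
    values = sorted(counts)
    return values, [float(counts[v]) for v in values]


def max_relative_error(points, subset, weights, N):
    """Largest relative gap between full and weighted cost over all c in [N].
    Queries where the full cost is numerically zero are skipped."""
    worst = 0.0
    for c in range(1, N + 1):
        full = cost(points, c, N)
        approx = cost(subset, c, N, weights)
        if full > 1e-12:
            worst = max(worst, abs(approx - full) / full)
    return worst
```

A weighted subset S with weights w satisfies the guarantee in the abstract when `max_relative_error(P, S, w, N)` is at most ε. The duplicate-merging stand-in achieves this with an error that is zero up to floating-point rounding, but its size can be as large as N, whereas the result above promises a subset of size polylogarithmic in N.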
