New Nearly-Optimal Coreset for Kernel Density Estimation

07/15/2020
by Wai Ming Tai, et al.

Given a point set $P \subset \mathbb{R}^d$, the kernel density estimate for the Gaussian kernel is defined as $\mathcal{G}_P(x) = \frac{1}{|P|} \sum_{p \in P} e^{-\|x - p\|^2}$ for any $x \in \mathbb{R}^d$. We study how to construct a small subset $Q$ of $P$ such that the kernel density estimate of $P$ can be approximated by the kernel density estimate of $Q$. Such a subset $Q$ is called a coreset. The primary technique in this work is to construct a $\pm 1$ coloring of the point set $P$ using discrepancy theory and to apply this coloring algorithm recursively. Our result leverages Banaszczyk's Theorem. When $d > 1$ is constant, our construction gives a coreset of size $O\left(\frac{1}{\varepsilon} \sqrt{\log\log\frac{1}{\varepsilon}}\right)$, as opposed to the best previously known result of $O\left(\frac{1}{\varepsilon} \sqrt{\log\frac{1}{\varepsilon}}\right)$. This is the first result to break the $\sqrt{\log}$-factor barrier, even when $d = 2$.
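The definition and the recursive halving idea can be illustrated with a minimal Python sketch. It evaluates $\mathcal{G}_P(x)$ directly from the formula above, then repeatedly colors the current set with $\pm 1$ and keeps only the $+1$ points. The coloring used here is a uniformly random balanced coloring, which is only a placeholder assumption: the paper's construction relies on a discrepancy-based coloring via Banaszczyk's Theorem, and that is not implemented here. All function names (`gaussian_kde`, `random_balanced_coloring`, `recursive_halving`, `max_error`) are hypothetical and chosen for illustration.

```python
import math
import random


def gaussian_kde(points, x):
    """Evaluate G_P(x) = (1/|P|) * sum_{p in P} exp(-||x - p||^2)."""
    total = 0.0
    for p in points:
        sq_dist = sum((xi - pi) ** 2 for xi, pi in zip(x, p))
        total += math.exp(-sq_dist)
    return total / len(points)


def random_balanced_coloring(n, rng):
    """Return a +/-1 coloring of n items with ceil(n/2) entries equal to +1.

    ASSUMPTION: a random coloring stands in for the discrepancy-based
    coloring described in the abstract; it does not give the same guarantee.
    """
    colors = [1] * ((n + 1) // 2) + [-1] * (n // 2)
    rng.shuffle(colors)
    return colors


def recursive_halving(points, target_size, rng):
    """Apply the coloring repeatedly, keeping the +1 points each round."""
    if target_size < 1:
        raise ValueError("target_size must be at least 1")
    subset = list(points)
    while len(subset) > target_size:
        colors = random_balanced_coloring(len(subset), rng)
        subset = [p for p, c in zip(subset, colors) if c == 1]
    return subset


def max_error(points, subset, queries):
    """Largest |G_P(x) - G_Q(x)| over a finite set of query points."""
    return max(
        abs(gaussian_kde(points, x) - gaussian_kde(subset, x)) for x in queries
    )


if __name__ == "__main__":
    rng = random.Random(0)
    P = [(rng.uniform(-2, 2), rng.uniform(-2, 2)) for _ in range(1024)]
    Q = recursive_halving(P, 64, rng)
    queries = [(rng.uniform(-2, 2), rng.uniform(-2, 2)) for _ in range(200)]
    print(len(Q), max_error(P, Q, queries))
```

The error is measured only on a finite sample of query points, so it is an empirical lower bound on the worst-case error over all $x \in \mathbb{R}^d$, not a certificate. Swapping `random_balanced_coloring` for a low-discrepancy coloring is the step where the paper's analysis applies.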
