Approximating Text-to-Pattern Distance via Dimensionality Reduction

02/09/2020
by Przemysław Uznański, et al.

Text-to-pattern distance is a fundamental problem in string matching: given a pattern of length m and a text of length n, both over an integer alphabet, we are asked to compute the distance between the pattern and the text at every location. The distance function can be, e.g., the Hamming distance or the ℓ_p distance for some parameter p > 0. Almost all state-of-the-art exact and approximate algorithms developed in the past ∼40 years have used FFT as a black box. In this work we present O(n/ε^2)-time algorithms for (1±ε)-approximation of ℓ_2 distances, and an O(n/ε^3)-time algorithm for approximation of Hamming and ℓ_1 distances, all without the use of FFT. This is independent of the very recent development by Chan et al. [STOC 2020], where an O(n/ε^2) algorithm for Hamming distances not using FFT was presented. Although their algorithm is much more "combinatorial", our techniques apply to norms other than Hamming.
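
To make the problem statement concrete, here is a minimal sketch of the naive O(nm) baseline that computes the exact distance at every alignment. It is not the algorithm of the paper, only an illustration of the quantity being approximated. The function names are hypothetical, and the ℓ_p distance is assumed to follow the usual convention (Σ |t − p|^p)^(1/p), which the abstract does not spell out.

```python
from typing import List, Sequence


def hamming_distances(text: Sequence[int], pattern: Sequence[int]) -> List[int]:
    """Exact Hamming distance between the pattern and every length-m window of the text."""
    n, m = len(text), len(pattern)
    return [
        sum(1 for j in range(m) if text[i + j] != pattern[j])
        for i in range(n - m + 1)
    ]


def lp_distances(text: Sequence[int], pattern: Sequence[int], p: float) -> List[float]:
    """Exact l_p distance at every alignment, assuming the (sum |.|^p)^(1/p) convention."""
    n, m = len(text), len(pattern)
    return [
        sum(abs(text[i + j] - pattern[j]) ** p for j in range(m)) ** (1.0 / p)
        for i in range(n - m + 1)
    ]


if __name__ == "__main__":
    text = [1, 2, 3, 4, 5]
    pattern = [1, 2, 4]
    print(hamming_distances(text, pattern))
    print(lp_distances(text, pattern, 1))
    print(lp_distances(text, pattern, 2))
```

A (1±ε)-approximation algorithm would return, for each alignment, a value within a factor of (1±ε) of the corresponding entry produced by this baseline.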
