Optimal Document Exchange and New Codes for Small Number of Insertions and Deletions

04/10/2018
by Bernhard Haeupler, et al.

This paper gives a communication-optimal document exchange protocol and an efficient, near-optimal derandomization. This also implies drastically improved error-correcting codes for a small number of adversarial insertions and deletions. For any n and k < n, our randomized hashing scheme takes any n-bit file F and computes an O(k log(n/k))-bit summary from which one can reconstruct F given a related file F' with edit distance ED(F,F') ≤ k. The size of our summary is information-theoretically order-optimal for all values of k, positively answering a question of Orlitsky. It is also the first non-trivial solution when a small constant fraction of symbols have been edited, producing an optimal summary of size O(H(δ)n) for k = δn. This concludes a long series of better-and-better protocols which produce larger summaries for sub-linear values of k. In particular, the recent breakthrough of [Belazzougui, Zhang; STOC'16] assumes that k < n^ϵ and produces a summary of size O(k log^2 k + k log n). We also give an efficient derandomization with near-optimal summary size O(k log^2(n/k)), improving, for every k, over a deterministic O(k^2 + k log^2 n) document exchange scheme by Belazzougui. This directly leads to near-optimal systematic error-correcting codes which can efficiently recover from any k insertions and deletions while having Θ(k log^2(n/k)) bits of redundancy. For the setting of k = ϵn, this O(ϵ log^2(1/ϵ) · n)-bit redundancy is near optimal and a quadratic improvement over the binary codes of Guruswami and Li and of Haeupler, Shahrasbi and Vitercik, which have redundancy Θ(√ϵ log^O(1)(1/ϵ) · n).
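To get a feel for how the summary sizes quoted in the abstract compare, the following minimal sketch evaluates the bounds numerically. It is only an illustration: every hidden constant in the O(·) notation is assumed to be 1, logarithms are taken base 2, and the function names are hypothetical and do not come from the paper. It also checks the standard fact that, for δ ≤ 1/2, H(δ)n lies between k log(n/k) and 2k log(n/k) when k = δn, which is why the two forms of the optimal bound agree up to a constant factor.

```python
import math


def binary_entropy(delta: float) -> float:
    """Binary entropy H(delta) in bits, for 0 < delta < 1."""
    return -delta * math.log2(delta) - (1 - delta) * math.log2(1 - delta)


def randomized_bits(n: int, k: int) -> float:
    """k * log2(n/k): order of the randomized summary (constant assumed 1)."""
    return k * math.log2(n / k)


def derandomized_bits(n: int, k: int) -> float:
    """k * log2(n/k)^2: order of the derandomized summary (constant assumed 1)."""
    return k * math.log2(n / k) ** 2


def belazzougui_zhang_bits(n: int, k: int) -> float:
    """k * log2(k)^2 + k * log2(n): order of the STOC'16 bound (constant assumed 1)."""
    return k * math.log2(k) ** 2 + k * math.log2(n)


if __name__ == "__main__":
    n = 2 ** 20  # file length in bits (illustrative choice)
    for k in (2 ** 4, 2 ** 10, 2 ** 16):
        delta = k / n
        print(
            f"k={k:>6}  "
            f"k log(n/k)={randomized_bits(n, k):>10.0f}  "
            f"H(delta) n={binary_entropy(delta) * n:>10.0f}  "
            f"k log^2(n/k)={derandomized_bits(n, k):>10.0f}  "
            f"k log^2 k + k log n={belazzougui_zhang_bits(n, k):>10.0f}"
        )
```

Because the constants are dropped, the printed numbers indicate growth rates only and should not be read as actual summary sizes in bits.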
