Faster Attractor-Based Indexes

11/30/2018
by Gonzalo Navarro, et al.

String attractors are a novel combinatorial object encompassing most known compressibility measures for highly repetitive texts. Recently, the first index building on an attractor of size γ of a text T[1..n] was obtained. It uses O(γ log(n/γ)) space and finds the occ occurrences of a pattern P[1..m] in time O(m log n + occ log^ε n) for any constant ε > 0. We now show how to reduce the search time to O(m + (occ+1) log^ε n) within the same space, and ultimately obtain the optimal O(m + occ) time within O(γ log(n/γ) log n) space. Further, we show how to count the number of occurrences of P in time O(m + log^(3+ε) n) within O(γ log(n/γ)) space, or the optimal O(m) time within O(γ log(n/γ) log n) space. These turn out to be the first optimal-time indexes within grammar- and Lempel-Ziv-bounded space. As a byproduct of independent interest, we show how to build, in O(n log n) expected time and without knowing the size γ of the smallest attractor, a run-length context-free grammar of size O(γ log(n/γ)) generating (only) T.
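
To make the notion of a string attractor concrete, here is a minimal brute-force sketch in Python. It relies on the standard definition, which the abstract does not spell out: a set of positions is an attractor of T if every substring of T has at least one occurrence that covers one of those positions. The function name `is_attractor` and the 0-based positions are illustrative assumptions (the abstract indexes T from 1), and this check is unrelated to the index or the grammar construction described above; it is only practical for tiny inputs.

```python
def is_attractor(text, positions):
    """Return True if `positions` (0-based) form a string attractor of `text`.

    Illustrative brute force only: it enumerates every substring, so it is
    far too slow for real texts. Names and indexing are assumptions.
    """
    n = len(text)
    marked = set(positions)

    # Collect every distinct substring that has at least one occurrence
    # text[start:end] covering a marked position.
    covered = set()
    for start in range(n):
        for end in range(start + 1, n + 1):
            if any(start <= p < end for p in marked):
                covered.add(text[start:end])

    # The set is an attractor iff every substring of the text is covered.
    return all(
        text[start:end] in covered
        for start in range(n)
        for end in range(start + 1, n + 1)
    )


if __name__ == "__main__":
    # A highly repetitive text needs very few attractor positions.
    print(is_attractor("abababab", {0, 1}))
    print(is_attractor("abababab", {0}))
```

In the repetitive text "abababab", two positions suffice because every substring has a copy starting at position 0 or 1, whereas position 0 alone leaves the substring "b" uncovered. This is the sense in which a small γ reflects high repetitiveness.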


research
09/08/2018

Fully-Functional Suffix Trees and Optimal Text Searching in BWT-runs Bounded Space

Indexing highly repetitive texts --- such as genomic databases, software...
research
12/05/2022

Space-efficient conversions from SLPs

Given a straight-line program with g rules for a text T [1..n], we can b...
research
06/01/2022

Near-Optimal Search Time in δ-Optimal Space

Two recent lower bounds on the compressibility of repetitive sequences, δ...
research
02/28/2018

Fast Lempel-Ziv Decompression in Linear Space

We consider the problem of decompressing the Lempel-Ziv 77 representatio...
research
05/08/2021

Construction of Sparse Suffix Trees and LCE Indexes in Optimal Time and Space

The notions of synchronizing and partitioning sets are recently introduc...
research
02/16/2018

Online LZ77 Parsing and Matching Statistics with RLBWTs

Lempel-Ziv 1977 (LZ77) parsing, matching statistics and the Burrows-Whee...
research
10/18/2022

Computing MEMs on Repetitive Text Collections

We consider the problem of computing the Maximal Exact Matches (MEMs) of...
