Optimal-Time Queries on BWT-runs Compressed Indexes

06/09/2020
by   Takaaki Nishimoto, et al.

Although a significant number of compressed indexes for highly repetitive strings have been proposed thus far, developing compressed indexes that support faster queries remains a challenge. The run-length Burrows-Wheeler transform (RLBWT) is a lossless data compression method that applies a reversible permutation to an input string followed by run-length encoding, and it has become a popular research topic in string processing. R-index [Gagie et al., ACM'20] is an efficient compressed index on RLBWT whose space usage depends not on the string length but on the number of runs in the RLBWT, and it supports locate queries in optimal time with ω(r) words, where r is the number of runs in the RLBWT of the input string. Following this line of research, in this paper we present the first compressed index on RLBWT, which we call r-index-f, that supports various queries, including locate, count, and extract queries, decompression, and prefix search, in optimal time with a smaller working space of O(r) words for small alphabets. We present efficient data structures for computing two important functions, LF and ϕ^-1, in constant time with O(r) words of space, which is a big step forward in computation time from the previous best result of O(loglog n) time for string length n with O(r) words of space. Finally, we present algorithms that answer queries on RLBWT in optimal time with O(r) words of space by leveraging those two data structures.
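To make the terms in the abstract concrete, the following is a minimal sketch of the BWT, its run-length encoding, and the LF function on a toy string. It is not the r-index-f data structure: the function names (`bwt`, `run_length_encode`, `lf_mapping`, `invert_bwt`) and the `$` sentinel are assumptions chosen for illustration, the construction is the naive sorted-rotations method, and LF is stored as a plain array of n entries rather than in O(r) words.

```python
def bwt(text: str, sentinel: str = "$") -> str:
    """Naive BWT via sorted rotations (illustration only, not space-efficient).

    Assumes `sentinel` does not occur in `text` and sorts before every
    character of `text`.
    """
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rot[-1] for rot in rotations)


def run_length_encode(s: str) -> list[tuple[str, int]]:
    """Collapse maximal runs of equal characters into (char, length) pairs."""
    runs: list[tuple[str, int]] = []
    for ch in s:
        if runs and runs[-1][0] == ch:
            runs[-1] = (ch, runs[-1][1] + 1)
        else:
            runs.append((ch, 1))
    return runs


def lf_mapping(last: str) -> list[int]:
    """LF[i] = (# chars in `last` smaller than last[i]) + (# of last[i] in last[:i])."""
    counts: dict[str, int] = {}
    for ch in last:
        counts[ch] = counts.get(ch, 0) + 1
    smaller: dict[str, int] = {}
    total = 0
    for ch in sorted(counts):
        smaller[ch] = total
        total += counts[ch]
    seen: dict[str, int] = {}
    lf = []
    for ch in last:
        lf.append(smaller[ch] + seen.get(ch, 0))
        seen[ch] = seen.get(ch, 0) + 1
    return lf


def invert_bwt(last: str) -> str:
    """Recover the text (without the sentinel) by following LF from row 0."""
    lf = lf_mapping(last)
    i = 0  # row 0 is the rotation that starts with the sentinel
    out = []
    for _ in range(len(last) - 1):
        out.append(last[i])  # characters come out in reverse text order
        i = lf[i]
    return "".join(reversed(out))


L = bwt("banana")
runs = run_length_encode(L)
r = len(runs)  # number of BWT runs, the parameter r in the abstract
restored = invert_bwt(L)
```

On repetitive inputs, r is typically much smaller than n, which is why indexes whose space is bounded in terms of r are attractive; the sketch above only shows what LF computes, not how to evaluate it in constant time within O(r) words.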

research
06/09/2020

Faster Queries on BWT-runs Compressed Indexes

Although a significant number of compressed indexes for highly repetitiv...
research
02/16/2022

An Optimal-Time RLBWT Construction in BWT-runs Bounded Space

The compression of highly repetitive strings (i.e., strings with many re...
research
02/14/2019

Conversion from RLBWT to LZ77

Converting a compressed format of a string into another compressed forma...
research
08/07/2023

Collapsing the Hierarchy of Compressed Data Structures: Suffix Arrays in Optimal Compressed Space

In the last decades, the necessity to process massive amounts of textual...
research
04/03/2020

Enumeration of LCP values, LCP intervals and Maximal repeats in BWT-runs Bounded Space

Lcp-values, lcp-intervals, and maximal repeats are powerful tools in var...
research
01/29/2019

Fully-functional bidirectional Burrows-Wheeler indexes

Given a string T on an alphabet of size σ, we describe a bidirectional B...
research
04/11/2020

Grammar-compressed Self-index with Lyndon Words

We introduce a new class of straight-line programs (SLPs), named the Lyn...
