Error-correcting Codes for Short Tandem Duplication and Substitution Errors

11/11/2020
by   Yuanyuan Tang, et al.

Due to its high data density and longevity, DNA is considered a promising medium for satisfying ever-increasing data storage needs. However, the diversity of errors that occur in DNA sequences makes efficient error correction a challenging task. This paper addresses the simultaneous correction of two types of errors, namely short tandem duplication and substitution errors. We focus on tandem repeats of length at most 3 and design codes for correcting an arbitrary number of duplication errors and one substitution error. Because a substituted symbol can be duplicated many times (as part of substrings of various lengths), a single substitution can affect an unbounded substring of the retrieved word. However, we show that with appropriate preprocessing, the effect may be limited to a substring of finite length, thus making efficient error correction possible. We construct a code for correcting the aforementioned errors and provide lower bounds for its rate. Compared to optimal codes correcting only duplication errors, numerical results show that the asymptotic cost of protecting against an additional substitution is only 0.003 bits/symbol when the alphabet has size 4, an important case corresponding to data storage in DNA.
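To make the error model concrete, the following is a minimal Python sketch of the two error types described above and of one plausible form of preprocessing, namely removing tandem repeats of length at most 3. The function names (`tandem_duplicate`, `substitute`, `dedup_root`) are hypothetical and chosen for illustration only; this is not the code construction from the paper, and the deduplication step is an assumption about what "preprocessing" could look like.

```python
def tandem_duplicate(s: str, i: int, k: int) -> str:
    """Insert a copy of s[i:i+k] immediately after itself (tandem duplication)."""
    return s[:i + k] + s[i:i + k] + s[i + k:]


def substitute(s: str, i: int, c: str) -> str:
    """Replace the symbol at position i with c (substitution error)."""
    return s[:i] + c + s[i + 1:]


def dedup_root(s: str, max_len: int = 3) -> str:
    """Repeatedly delete one copy of any tandem repeat of length <= max_len
    until none remains. Illustrative preprocessing, not the paper's decoder."""
    changed = True
    while changed:
        changed = False
        for k in range(1, max_len + 1):
            for i in range(len(s) - 2 * k + 1):
                if s[i:i + k] == s[i + k:i + 2 * k]:
                    # Drop the second copy of the repeated block.
                    s = s[:i + k] + s[i + 2 * k:]
                    changed = True
                    break
            if changed:
                break
    return s


x = "ACG"
# Duplications alone are undone by deduplication.
assert dedup_root(tandem_duplicate(x, 1, 2)) == x

# A substitution inside a duplicated block changes the deduplicated word.
y = substitute(tandem_duplicate(x, 0, 3), 4, "T")   # "ACGATG"
# Further duplications of the substituted symbol do not change it again.
assert dedup_root(tandem_duplicate(y, 4, 1)) == dedup_root(y)
```

In this toy run, duplications by themselves leave the deduplicated word unchanged, while a single substitution inside a duplicated block yields a different deduplicated word. Handling that difference is the kind of problem the abstract describes.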


research
08/18/2020

Error-correcting Codes for Noisy Duplication Channels

Because of its high data density and longevity, DNA is emerging as a pro...
research
08/05/2019

Addressing multiple bit/symbol errors in DRAM subsystem

As DRAM technology continues to evolve towards smaller feature sizes and...
research
08/03/2022

Low-redundancy codes for correcting multiple short-duplication and edit errors

Due to its higher data density, longevity, energy efficiency, and ease o...
research
04/07/2023

Iterative Soft Decoding Algorithm for DNA Storage Using Quality Score and Redecoding

Ever since deoxyribonucleic acid (DNA) was considered as a next-generati...
research
11/13/2019

Single-Error Detection and Correction for Duplication and Substitution Channels

Motivated by mutation processes occurring in in-vivo DNA-storage applica...
research
02/27/2015

Error-Correcting Factorization

Error Correcting Output Codes (ECOC) is a successful technique in multi-...
research
04/06/2017

Associative content-addressable networks with exponentially many robust stable states

The brain must robustly store a large number of memories, corresponding ...
