LinearCoFold and LinearCoPartition: Linear-Time Algorithms for Secondary Structure Prediction of Interacting RNA Molecules

10/26/2022
by He Zhang, et al.

Many ncRNAs function through RNA-RNA interactions, so fast and reliable RNA structure prediction that accounts for these interactions is useful. Some existing tools are less accurate because they omit the competition between intermolecular and intramolecular base pairs, or because they focus on predicting the binding region rather than the complete secondary structure of the two interacting strands. Vienna RNAcofold, which reduces the problem to classical single-sequence folding by concatenating the two strands, scales cubically with the combined sequence length and is slow for long sequences. To address these issues, we present LinearCoFold, which predicts the complete minimum free energy structure of two strands in linear runtime, and LinearCoPartition, which calculates the cofolding partition function and base pairing probabilities in linear runtime. LinearCoFold and LinearCoPartition follow the concatenation strategy of RNAcofold, but are orders of magnitude faster. For example, on a sequence pair with a combined length of 26,190 nt, LinearCoFold is 86.8x faster than RNAcofold in MFE mode (0.6 minutes vs. 52.1 minutes), and LinearCoPartition is 642.3x faster than RNAcofold in partition function mode (1.8 minutes vs. 1156.2 minutes). Unlike local algorithms, LinearCoFold and LinearCoPartition are global cofolding algorithms with no restriction on base pair length. Surprisingly, the predictions of LinearCoFold and LinearCoPartition have higher PPV and sensitivity for intermolecular base pairs. Furthermore, we apply LinearCoFold to predict the RNA-RNA interaction between SARS-CoV-2 gRNA and human U4 snRNA, which has been studied experimentally, and observe that LinearCoFold's prediction correlates better with the wet lab results.
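The abstract reports PPV and sensitivity for intermolecular base pairs. As a rough illustration of what those two numbers measure, here is a minimal Python sketch. It assumes structures are given in dot-bracket notation with `&` separating the two strands (an assumed input convention), that pairs are scored by exact match, and that the helper names `parse_pairs`, `intermolecular`, and `ppv_sensitivity` are hypothetical. The paper's own evaluation protocol may differ, and the structures below are toy data, not results from the paper.

```python
def parse_pairs(dotbracket):
    """Return the set of base pairs (i, j), 0-based, from a dot-bracket string.

    The '&' strand separator is skipped, so indices refer to positions
    in the concatenated sequence.
    """
    stack, pairs = [], set()
    idx = 0
    for ch in dotbracket:
        if ch == "&":
            continue  # separator only; it does not occupy a position
        if ch == "(":
            stack.append(idx)
        elif ch == ")":
            pairs.add((stack.pop(), idx))
        idx += 1
    return pairs


def intermolecular(pairs, len_first):
    """Keep only pairs with one end in each strand."""
    return {(i, j) for (i, j) in pairs if i < len_first <= j}


def ppv_sensitivity(predicted, reference):
    """PPV = TP / |predicted|, sensitivity = TP / |reference| (exact match)."""
    tp = len(predicted & reference)
    ppv = tp / len(predicted) if predicted else 0.0
    sens = tp / len(reference) if reference else 0.0
    return ppv, sens


# Toy example: two strands of 8 nt each (made-up structures).
reference = "((((....&....))))"
predicted = "((.(...)&......))"
len_first = reference.index("&")

ref_inter = intermolecular(parse_pairs(reference), len_first)
pred_inter = intermolecular(parse_pairs(predicted), len_first)
ppv, sens = ppv_sensitivity(pred_inter, ref_inter)
print(f"PPV = {ppv:.2f}, sensitivity = {sens:.2f}")
```

In this toy case both predicted intermolecular pairs are correct (PPV 1.00), but only two of the four reference intermolecular pairs are recovered (sensitivity 0.50). The predicted hairpin inside the first strand is intramolecular and is excluded from both counts.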


