LinearTurboFold: Linear-Time Global Prediction of Conserved Structures for RNA Homologs with Applications to SARS-CoV-2.

Sizhen Li, He Zhang, Liang Zhang, Kaibo Liu, Boxiang Liu, David H Mathews, Liang Huang.

Abstract

The constant emergence of COVID-19 variants reduces the effectiveness of existing vaccines and test kits. It is therefore critical to identify conserved structures in SARS-CoV-2 genomes as potential targets for variant-proof diagnostics and therapeutics. However, the algorithms that predict these conserved structures, which simultaneously fold and align multiple RNA homologs, scale at best cubically with sequence length and are thus infeasible for coronaviruses, which possess the longest genomes (∼30,000 nt) among RNA viruses. As a result, existing efforts to model SARS-CoV-2 structures resort to single-sequence folding and to local folding methods with short window sizes, which inevitably neglect the long-range interactions that are crucial to RNA function. Here we present LinearTurboFold, an efficient algorithm for folding RNA homologs that scales linearly with sequence length, enabling unprecedented global structural analysis of SARS-CoV-2. Surprisingly, on a group of SARS-CoV-2 and SARS-related genomes, LinearTurboFold's purely in silico prediction is not only close to experimentally guided models for local structures, but also goes far beyond them by capturing the end-to-end pairs between the 5' and 3' UTRs (∼29,800 nt apart), which match perfectly with a purely experimental work. Furthermore, LinearTurboFold identifies novel conserved structures and conserved accessible regions as potential targets for designing efficient and mutation-insensitive small-molecule drugs, antisense oligonucleotides, siRNAs, CRISPR-Cas13 guide RNAs and RT-PCR primers. LinearTurboFold is a general technique that can also be applied to other RNA viruses and to full-length genome studies, and will be a useful tool in fighting current and future pandemics.

SIGNIFICANCE STATEMENT: Conserved RNA structures are critical for designing diagnostic and therapeutic tools for many diseases, including COVID-19. However, existing algorithms are far too slow to model the global structures of full-length RNA viral genomes. We present LinearTurboFold, a linear-time algorithm that is orders of magnitude faster, making it the first method to simultaneously fold and align whole genomes of variants of SARS-CoV-2, the longest known RNA virus (∼30 kilobases). Our work enables unprecedented global structural analysis and captures long-range interactions that are out of reach for existing algorithms but crucial for RNA function. LinearTurboFold is a general technique for full-length genome studies and can help fight current and future pandemics.
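The point about short-window local folding missing long-range interactions can be illustrated with a toy sketch. This is not LinearTurboFold or any published tool: the helper names `base_pairs` and `split_by_span`, the dot-bracket string, and the 12-nt window are all hypothetical and chosen only for illustration. It simply separates the pairs of a given structure into those a window-limited method could in principle report and those it cannot.

```python
def base_pairs(dot_bracket):
    """Return 0-based (i, j) pairs encoded by a dot-bracket string."""
    stack, pairs = [], []
    for j, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(j)          # remember the opening position
        elif ch == ")":
            i = stack.pop()          # match with the most recent opening
            pairs.append((i, j))
    return pairs


def split_by_span(pairs, max_span):
    """Split pairs into those within max_span nucleotides and those beyond."""
    local = [(i, j) for i, j in pairs if j - i <= max_span]
    long_range = [(i, j) for i, j in pairs if j - i > max_span]
    return local, long_range


# Toy structure (an assumption): an inner hairpin enclosed by a helix
# that pairs the two ends of the sequence.
toy = "((((....((((....))))....))))"

# Hypothetical 12-nt window: only pairs with j - i <= 12 are reachable.
local, long_range = split_by_span(base_pairs(toy), max_span=12)
print("within window:", local)
print("beyond window:", long_range)
```

In this toy case the inner hairpin falls inside the window while the end-to-end helix does not, which mirrors, at a much smaller scale, the 5'/3' UTR pairing described in the abstract.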

Year:  2021        PMID: 34816262      PMCID: PMC8609897          DOI: 10.1101/2020.11.23.393488

Source DB:  PubMed          Journal:  bioRxiv


