RNAbor: a web server for RNA structural neighbors.

Eva Freyhult, Vincent Moulton, Peter Clote.

Abstract

RNAbor provides a new tool for researchers in the biological and related sciences to explore important aspects of RNA secondary structure and folding pathways. RNAbor computes statistics concerning delta-neighbors of a given input RNA sequence and structure (the structure can, for example, be the minimum free energy (MFE) structure). A delta-neighbor is a structure that differs from the input structure by exactly delta base pairs, that is, it can be obtained from the input structure by adding and/or removing exactly delta base pairs. For each distance delta RNAbor computes the density of delta-neighbors, the number of delta-neighbors, and the MFE structure, or MFE (delta) structure, among all delta-neighbors. RNAbor can be used to study possible folding pathways, to determine alternate low-energy structures, to predict potential nucleation sites and to explore structural neighbors of an intermediate, biologically active structure. The web server is available at http://bioinformatics.bc.edu/clotelab/RNAbor.

Year:  2007        PMID: 17526527      PMCID: PMC1933207          DOI: 10.1093/nar/gkm255

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

RNA plays a surprising and previously unsuspected role in many biological processes, such as post-transcriptional regulation, conformational switches, expansion of the genetic code (such as selenocysteine insertion), ribosomal frameshifting, metabolite binding and chemical modification of specific nucleotides in the ribosome. Apart from its catalytic role as a ribonucleic enzyme (ribozyme) (1), RNA can regulate genes in several ways. For example, by hybridizing to a portion of messenger RNA, small (∼22 nt) RNA molecules perform post-transcriptional gene regulation by RNA interference (RNAi), a process so important that the 2006 Nobel Prize in Physiology or Medicine was awarded to A. Z. Fire and C. C. Mello for its discovery. In addition, by very different means, RNA can perform transcriptional and translational gene regulation by allostery, where a portion of the 5′ untranslated region (5′ UTR) of mRNA, known as a riboswitch (2,3), undergoes a conformational change upon binding a specific ligand such as adenine, guanine or lysine.
As the field of RNomics matures, many sophisticated computational tools, e.g. for RNA structure prediction, alignment and gene finding, have been developed; see (4,5) for recent overviews. Recently developed programs that are of most relevance here include Sfold (6,7), which computes a low energy ensemble of structures by sampling from the partition function (8), and the earlier program RNAsubopt (9), which computes all suboptimal structures within a user-specified number of kcal/mol of the minimum free energy (MFE). In addition, the program RNAshapes (10–12) provides a useful description of RNA branching structure by computing the Boltzmann probability of various shapes and also the MFE structure for each shape. Here, an RNA shape is an equivalence class of secondary structures, describing the overall branching; for instance, the shape of a typical cloverleaf tRNA would be [ [ ] [ ] [ ] ].
In this article, we describe the web server RNAbor, which computes the Boltzmann probability of, and the MFE structure among, all structures that differ by exactly δ base pairs from a given initial structure. Unlike most of the tools just described, which focus on the MFE structure or a low energy ensemble, RNAbor yields information concerning the secondary structure folding landscape. Potential applications of RNAbor include the design of RNA aptamers (see (13) for a suggestion of how RNA might be designed to inhibit the function of viral enzymes such as HIV-1 reverse transcriptase and hepatitis C NS3 protease), detection of conformational switches, understanding the role played by biologically active structural intermediates and improvement in secondary structure prediction.

MATERIALS AND METHODS

Let $s = s_1, \ldots, s_n$ denote a given RNA nucleotide sequence, and let S be any given secondary structure of s. The structure S could be the MFE structure of s, it could be the secondary structure obtained from the 3-dimensional X-ray conformation or by comparative sequence analysis, or it could be an arbitrary intermediate structure of particular biological significance. For an integer δ, a secondary structure S′ of s is a δ-neighbor of S if S and S′ differ by exactly δ base pairs (14). In (Freyhult, E., Moulton, V. and Clote, P. Boltzmann probability of RNA structural neighbors and riboswitch detection, submitted for publication), we describe new algorithms, which compute the number Nδ of δ-neighbors, the partition function Zδ for δ-neighbors, and the MFEδ and the corresponding MFEδ structure over all δ-neighbors of a fixed structure S.
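The notion of a δ-neighbor can be made concrete with a short sketch. It assumes that structures are written in dot-bracket notation (as used by the Vienna RNA Package) and that "differ by exactly δ base pairs" means the size of the symmetric difference of the two base-pair sets; the function names are illustrative and are not part of RNAbor.

```python
def parse_pairs(structure):
    """Return the set of base pairs (i, j), 0-based with i < j, of a dot-bracket string."""
    stack, pairs = [], set()
    for pos, ch in enumerate(structure):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % pos)
            pairs.add((stack.pop(), pos))
        elif ch != ".":
            raise ValueError("unexpected character %r" % ch)
    if stack:
        raise ValueError("unbalanced '(' at position %d" % stack[-1])
    return pairs


def base_pair_distance(struct_a, struct_b):
    """Number of base pairs present in exactly one of the two structures."""
    if len(struct_a) != len(struct_b):
        raise ValueError("structures must have the same length")
    return len(parse_pairs(struct_a) ^ parse_pairs(struct_b))


def is_delta_neighbor(struct_a, struct_b, delta):
    """True if struct_b can be reached from struct_a by adding/removing exactly delta pairs."""
    return base_pair_distance(struct_a, struct_b) == delta
```

For example, `base_pair_distance("((...))", "(.....)")` is 1, since only the inner pair differs, so the second structure is a 1-neighbor of the first.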

Computing structural neighbors

To give the reader a feeling for how the algorithms work, we present the recurrence relations to compute the number Nδ of δ-neighbors of S. Let 1 ≤ i ≤ j ≤ n. If $N^{\delta}_{i,j}$ denotes the number of δ-neighbors of the substructure S[i,j], the restriction of S to the interval [i,j] of s, then the number of δ-neighbors of S, $N^{\delta} = N^{\delta}_{1,n}$, can be computed by the following recursion:

$$N^{\delta}_{i,j} = N^{\delta - b_0}_{i,j-1} + \sum_{\substack{i \le k < j \\ s_k s_j \in \mathbb{B}}} \; \sum_{w + w' = \delta - b} N^{w}_{i,k-1} \cdot N^{w'}_{k+1,j-1}$$

where $\mathbb{B}$ = {AU, UA, GC, CG, GU, UG} (i.e. the set of Watson-Crick base pairs together with wobbles), b0 = 1 if j is base-paired in S[i,j] and 0 otherwise, and b is the base pair distance between S[i,j] and a structure on the same interval [i,j] where a base pair between k and j has been added (taking into account all the base pairs in S[i,j] that need to be broken to allow the addition of this base pair). The first term counts the neighbors in which j is unpaired; the double sum counts those in which j pairs with some k, splitting the remaining distance δ − b between the two sub-intervals.

This approach for computing Nδ can be extended to compute the partition function contribution, Zδ, of the set of δ-neighbors and also to compute the MFEδ and the MFEδ structure. Computations are made with respect to the Turner energy model (15,16); treatment of dangles is similar to that in the Vienna RNA Package (option -d2). The algorithms employ dynamic programming, and run in O(Δ · n^3) time and O(Δ · n^2) space, where n is the sequence length and Δ is the maximum value of δ. Since Δ can be at most n, the run time cannot be worse than O(n^4) and the space no worse than O(n^3), even if the user does not specify a value of Δ. Full details of the algorithms are given in (Freyhult, E., Moulton, V. and Clote, P. Boltzmann probability of RNA structural neighbors and riboswitch detection, submitted for publication).
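The counting recursion described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the RNAbor implementation: indices are 0-based, structures are given in dot-bracket notation, the empty interval is taken to have exactly one structure (at distance 0), and a minimum hairpin loop size of 3 unpaired bases (`THETA`) is assumed even though the text above does not state it. The plain memoized version below also makes no attempt to reach the stated time and space bounds.

```python
from functools import lru_cache

# Watson-Crick pairs together with GU wobbles
PAIRABLE = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
THETA = 3  # ASSUMPTION: minimum number of unpaired bases in a hairpin loop


def parse_pairs(structure):
    """Set of base pairs (i, j), 0-based with i < j, of a dot-bracket string."""
    stack, pairs = [], set()
    for pos, ch in enumerate(structure):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            pairs.add((stack.pop(), pos))
    return pairs


def count_neighbors(seq, structure, max_delta=None):
    """Return a list whose entry d is the number of d-neighbors of `structure`."""
    n = len(seq)
    if max_delta is None:
        max_delta = n  # the distance can never exceed n
    S = parse_pairs(structure)

    def inside(i, j):
        """Number of base pairs of S lying entirely within [i, j]."""
        return sum(1 for (x, y) in S if i <= x and y <= j)

    @lru_cache(maxsize=None)
    def N(i, j):
        if j < i:  # empty interval: only the empty structure, at distance 0
            return (1,) + (0,) * max_delta
        counts = [0] * (max_delta + 1)
        # Case 1: j is unpaired in the neighbor; costs b0 if j is paired in S[i,j].
        b0 = 1 if any(y == j and x >= i for (x, y) in S) else 0
        prev = N(i, j - 1)
        for d in range(b0, max_delta + 1):
            counts[d] += prev[d - b0]
        # Case 2: j pairs with k in the neighbor.
        for k in range(i, j - THETA):
            if (seq[k], seq[j]) not in PAIRABLE:
                continue
            has_kj = 1 if (k, j) in S else 0
            kept = inside(i, k - 1) + inside(k + 1, j - 1)
            # pairs of S[i,j] that must be broken, plus 1 if (k, j) is new
            b = inside(i, j) - kept - has_kj + (1 - has_kj)
            left, right = N(i, k - 1), N(k + 1, j - 1)
            for w, lw in enumerate(left):
                if lw == 0:
                    continue
                for w2, rw in enumerate(right):
                    d = b + w + w2
                    if d > max_delta:
                        break
                    counts[d] += lw * rw
        return tuple(counts)

    return list(N(0, n - 1))
```

With these assumptions, `count_neighbors("GAAAC", "(...)")` returns `[1, 1, 0, 0, 0, 0]`: the only structures on this sequence are the input itself (distance 0) and the empty structure (distance 1).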

Web server

The web server, available at http://bioinformatics.bc.edu/clotelab/RNAbor, runs on a Linux cluster with 20 computational nodes, each with dual processors of between 1300 and 3000 MHz and 2 GB RAM (6 Dell PowerEdge 1650, 2 × 1300 MHz Pentium III, 2 GB RAM; 11 Dell PowerEdge 1850, 2 × 2800+ MHz Xeon EM64T, 2 GB RAM; 5 Dell PowerEdge 1850, 2 × 3000 MHz Xeon EM64T, 2 GB RAM).

RESULTS

Due to the time and space constraints of the algorithm, RNA sequences may be of length up to 300 nucleotides. Sequences of length up to 60 are processed interactively and the output is displayed in the user's browser window. For sequences of length 61–300, the computation is done off-line and the results are returned to the user by email; for this, an email address is required. The user can either paste an input sequence (with optional secondary structure), or upload a file containing the same. The full input consists of up to four lines. The only required input is an RNA sequence of length at most 300 nucleotides; the FASTA comment, initial secondary structure and upper bound Δ are optional inputs. The temperature is set to a default value of 37 °C; however, the user can enter any integer temperature between 0 and 100. If no secondary structure is given, then the initial structure is taken to be the MFE structure, as computed by RNAfold -d2. If the optional input Δ is missing, then Δ is defined to equal the length n of the input sequence; otherwise Δ is the minimum of the input value and n.

For each 0 ≤ δ ≤ Δ, RNAbor computes the Boltzmann probability pδ = Zδ/Z, where the partition function Zδ is defined by

$$Z^{\delta} = \sum_{S'} \exp\left(-\frac{E(S')}{RT}\right)$$

where E(S′) is the free energy of the structure S′, R is the universal gas constant and T is the absolute temperature in kelvin. Here, the summation is made over all secondary structures S′ of the input sequence that are δ-neighbors of the initial structure. The full partition function Z = ∑δ Zδ is computed by McCaskill's algorithm (8) if Δ ≥ n. In addition to computing the probability pδ, RNAbor computes the number Nδ of δ-neighbors of the initial structure, the MFEδ over all δ-neighbors and the MFEδ secondary structure. Tables of the values Nδ and pδ, as well as their graphs, are made available as downloadable files. The five-column text file output, consisting of δ, pδ, Nδ, MFEδ and the MFEδ structure, is depicted in Figure 1.
Figure 1.

Text output of RNAbor on the 51 nt 3′ UTR of an mRNA with NCBI accession number MUSGBPS. The five columns in the entire text output from RNAbor are given by the following, in order: (i) value of δ, (ii) Boltzmann probability pδ, (iii) number of δ-neighbors, (iv) MFE of δ-neighbors, denoted MFEδ, (v) MFE secondary structure among all δ-neighbors. In this case, the initial structure is the MFE structure, as determined by RNAfold -d2.
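To illustrate the definition pδ = Zδ/Z, the following sketch groups structures by their distance δ from the initial structure and normalizes the Boltzmann weights. It assumes the caller already has a list of (δ, free energy in kcal/mol) pairs for every structure of interest; in RNAbor these quantities come from dynamic programming over the Turner energy model, which is not reproduced here. The function name and the input layout are hypothetical.

```python
import math
from collections import defaultdict

R_KCAL = 0.0019872  # universal gas constant in kcal/(mol*K)


def neighbor_probabilities(structures, temp_celsius=37.0):
    """Map each distance delta to p_delta = Z_delta / Z.

    `structures` is an iterable of (delta, energy) pairs, one per structure,
    with the energy in kcal/mol (ASSUMED input layout, for illustration only).
    """
    rt = R_KCAL * (temp_celsius + 273.15)  # RT with T in kelvin
    z_delta = defaultdict(float)
    for delta, energy in structures:
        z_delta[delta] += math.exp(-energy / rt)  # Boltzmann weight
    z_total = sum(z_delta.values())  # full partition function Z
    return {d: z / z_total for d, z in sorted(z_delta.items())}
```

When all structures have the same energy, the probabilities reduce to relative counts: three equal-energy structures at distances 0, 1 and 1 give p0 = 1/3 and p1 = 2/3.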

EXAMPLES

RNAbor can be used to generate alternative low energy structures, which differ markedly from the MFE structure, or from any initially given structure. Figure 1 shows the RNAbor output for a short 3′ UTR sequence of an mRNA with NCBI accession number MUSGBPS. The input structure in this example is the MFE structure (as predicted by RNAfold -d2). The RNAbor output indicates two ranges of δ that show higher probabilities than the rest, 0–9 and 20–24. The MFEδ structures at distance δ between 0 and 9 from the MFE structure all have very similar folds, and the probability of finding the RNA in a structure at δ between 0 and 9 is 0.63. The probability of finding a structure at δ between 20 and 24 is also relatively high, 0.35, and the MFEδ structures in this range are similar to each other but completely different from the MFE structure. Thus the two highly probable δ ranges represent two possible alternative folds of the RNA. Analyzing the same sequence with Sfold gives similar results. Sfold finds three types of structures (three clusters), with probabilities 0.65, 0.22 and 0.13, respectively. One cluster contains the MFE structure, corresponding to the folds at δ values from 0 to 9, another cluster has a centroid structure resembling the structures at δ between 20 and 24, and the third cluster has a centroid structure similar to the MFE19 structure. RNAshapes, on the other hand, is less successful for this example, since the alternative folds predicted by RNAbor have the same shape [ ], even though the folds are very different.
Figure 2 displays the MFE structure and the MFE30 structure of the 101 nt SAM riboswitch with EMBL accession number AP004597.1/118941-119041, with sequence taken from Rfam (17). The MFE structure over all 30-neighbors, the MFE30 structure, is clearly much closer to the real structure than the global MFE structure. Figure 3 displays the Boltzmann probability density, showing a peak for the value δ = 30.
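Range probabilities such as the values quoted for δ in 0–9 and 20–24 in the MUSGBPS example are sums of pδ over the range. The sketch below shows one way to compute them from the five-column text output. The exact file layout is an assumption (whitespace-separated columns δ, pδ, Nδ, MFEδ, structure, one row per δ), the function names are hypothetical, and the sample rows in the usage note are invented for illustration rather than taken from RNAbor.

```python
def parse_rows(text):
    """Parse five-column rows: delta, p_delta, N_delta, MFE_delta, structure.

    ASSUMPTION: columns are whitespace-separated; lines that do not fit
    (headers, blank lines) are skipped.
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 5:
            continue
        try:
            rows.append({
                "delta": int(fields[0]),
                "p": float(fields[1]),
                "count": fields[2],  # kept as text: number format is unknown
                "mfe": float(fields[3]),
                "structure": fields[4],
            })
        except ValueError:
            continue  # header or malformed line
    return rows


def range_probability(rows, lo, hi):
    """Total Boltzmann probability of structures with lo <= delta <= hi."""
    return sum(r["p"] for r in rows if lo <= r["delta"] <= hi)
```

With three invented rows `"0 0.50 1 -5.0 ((...))"`, `"1 0.25 2 -4.0 (.....)"` and `"2 0.25 1 0.0 ......."` joined by newlines, `range_probability(parse_rows(text), 0, 1)` gives 0.75.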
Figure 2.

Two alternative low energy secondary structures for the 101 nt SAM riboswitch with EMBL accession number AP004597.1 from position 118941 to position 119041. This riboswitch has nucleotide sequence UACUUAUCAA GAGAGGUGGA GGGACUGGCC CGCUGAAACC UCAGCAACAG AACGCAUCUG UCUGUGCUAA AUCCUGCAAG CAAUAGCUUG AAAGAUAAGU U. Panel (a) displays the MFE structure, with free energy −33.30 kcal/mol, while (b) displays the MFE30 structure (the minimum free energy structure among all 30-neighbors of the MFE structure), with free energy −32.10 kcal/mol. The only significant Boltzmann probabilities were for values around δ = 0 and δ = 30, where p0 = 0.056238 and p30 = 0.151751. Note that the MFE30 structure more closely resembles the expected structure for a riboswitch shown in (c), as determined from the Rfam (17) consensus structure.

Figure 3.

Boltzmann probability density plot for the 101 nt SAM riboswitch (EMBL accession number AP004597.1/118941-119041). The curve shows the probability, pδ = Zδ / Z, for all secondary structures of the RNA sequence having base pair distance δ from the MFE structure.

DISCUSSION

In this article, we have introduced the web server RNAbor, which computes the Boltzmann probability and MFE structure over all δ-neighbors for a given RNA sequence and initial secondary structure. The underlying algorithms, described in the forthcoming paper (Freyhult, E., Moulton, V. and Clote, P. Boltzmann probability of RNA structural neighbors and riboswitch detection, submitted for publication), use dynamic programming, involve the Turner energy model (15,16), and require considerable resources: O(Δ · n^3) time and O(Δ · n^2) space. Figures 2 and 3 illustrate the use of RNAbor in better understanding structural aspects of a SAM riboswitch, and indicate that RNAbor should provide a useful complementary tool to programs such as Sfold and RNAshapes for analyzing the ensemble of possible secondary structures of a given RNA sequence.
  17 in total

1.  Metrics on RNA secondary structures.

Authors:  V Moulton; M Zuker; M Steel; R Pointon; D Penny
Journal:  J Comput Biol       Date:  2000 Feb-Apr       Impact factor: 1.479

Review 2.  Nucleic acid and polypeptide aptamers: a powerful approach to ligand discovery.

Authors:  W James
Journal:  Curr Opin Pharmacol       Date:  2001-10       Impact factor: 5.547

3.  A statistical sampling algorithm for RNA secondary structure prediction.

Authors:  Ye Ding; Charles E Lawrence
Journal:  Nucleic Acids Res       Date:  2003-12-15       Impact factor: 16.971

Review 4.  The chemical repertoire of natural ribozymes.

Authors:  Jennifer A Doudna; Thomas R Cech
Journal:  Nature       Date:  2002-07-11       Impact factor: 49.962

5.  Rfam: an RNA family database.

Authors:  Sam Griffiths-Jones; Alex Bateman; Mhairi Marshall; Ajay Khanna; Sean R Eddy
Journal:  Nucleic Acids Res       Date:  2003-01-01       Impact factor: 16.971

6.  Abstract shapes of RNA.

Authors:  Robert Giegerich; Björn Voss; Marc Rehmsmeier
Journal:  Nucleic Acids Res       Date:  2004-09-15       Impact factor: 16.971

7.  The equilibrium partition function and base pair binding probabilities for RNA secondary structure.

Authors:  J S McCaskill
Journal:  Biopolymers       Date:  1990 May-Jun       Impact factor: 2.505

8.  Thermodynamic parameters for an expanded nearest-neighbor model for formation of RNA duplexes with Watson-Crick base pairs.

Authors:  T Xia; J SantaLucia; M E Burkard; R Kierzek; S J Schroeder; X Jiao; C Cox; D H Turner
Journal:  Biochemistry       Date:  1998-10-20       Impact factor: 3.162

9.  An mRNA structure that controls gene expression by binding FMN.

Authors:  Wade C Winkler; Smadar Cohen-Chalamish; Ronald R Breaker
Journal:  Proc Natl Acad Sci U S A       Date:  2002-11-27       Impact factor: 11.205

10.  Complete probabilistic analysis of RNA shapes.

Authors:  Björn Voss; Robert Giegerich; Marc Rehmsmeier
Journal:  BMC Biol       Date:  2006-02-15       Impact factor: 7.431
