Literature DB >> 25215508

Vfold: a web server for RNA structure and folding thermodynamics prediction.

Xiaojun Xu1, Peinan Zhao1, Shi-Jie Chen1.   

Abstract

BACKGROUND: The ever increasing discovery of non-coding RNAs leads to unprecedented demand for the accurate modeling of RNA folding, including the predictions of two-dimensional (base pair) and three-dimensional all-atom structures and folding stabilities. Accurate modeling of RNA structure and stability has far-reaching impact on our understanding of RNA functions in human health and our ability to design RNA-based therapeutic strategies.
RESULTS: The Vfold server offers a web interface to predict (a) RNA two-dimensional structure from the nucleotide sequence, (b) three-dimensional structure from the two-dimensional structure and the sequence, and (c) folding thermodynamics (heat capacity melting curve) from the sequence. To predict the two-dimensional structure (base pairs), the server generates an ensemble of structures, including loop structures with the different intra-loop mismatches, and evaluates the free energies using the experimental parameters for the base stacks and the loop entropy parameters given by a coarse-grained RNA folding model (the Vfold model) for the loops. To predict the three-dimensional structure, the server assembles the motif scaffolds using structure templates extracted from the known PDB structures and refines the structure using all-atom energy minimization.
CONCLUSIONS: The Vfold-based web server provides a user friendly tool for the prediction of RNA structure and stability. The web server and the source codes are freely accessible for public use at "http://rna.physics.missouri.edu".

Entities:  

Mesh:

Substances:

Year:  2014        PMID: 25215508      PMCID: PMC4162592          DOI: 10.1371/journal.pone.0107504

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

The increasing discoveries of noncoding RNAs demand more than ever the information about RNA structures [1]–[5]. However, laborious, time-consuming X-ray crystallographic or NMR spectroscopic measurements alone cannot catch up the pace with the rapidly increasing number of biologically significant RNAs such as noncoding regulatory RNAs. As a result, RNA structural genomics cannot just rely on the experimental determination of the structures. This underscores the request for accurate computational models of RNA structure prediction. RNA structures can be described at the two-dimensional (2D) and three-dimensional (3D) levels, respectively. A 2D structure is defined by the base pairs contained in the structure. Helices and loops, as defined by the base pairs contained in the structure, can be diagrammatically depicted by an RNA 2D structure. The 2D structure of an RNA provides the structural constraints to the formation of the 3D structure [6]–[9], where helices and loops are assembled in the 3D space. RNA free energy landscape can have multiple free energy minima [10]–[13]. Therefore, an RNA can often adopt multiple stable and metastable structures. Computational prediction for RNA 2D structures falls into two general categories [14]: sequence comparison [15]–[18] and free energy minimization [19]–[25]. Sequence comparison-based methods rely on base covariation and can usually only infer the information about the canonical base pairs. The inclusion of non-canonical base pairs can cause covariation analysis much more convoluted [26]. However, non-canonical base pairs such as mismatched base pairs in the loop regions, may be crucial for folding stability and 3D structure folding. For example, non-canonical base pairs can influence the loop and junction structures and thus play an critical role in determining helix orientations. The accuracy of computational prediction is usually better for methods that consider “fold recognition” [27]: structure is usually more conserved than sequence and the functional core regions are usually more conserved at all levels. Therefore, computational methods are highly useful and reliable for structures with known homologous folds or structures with sufficient auxiliary structural data. However, these methods depend on the availability of homologous sequences, which significantly limits their applicability. Structure prediction algorithms based on free energy minimization search for the structure or suboptimal structures with the lowest free energy from an ensemble of possible structures. Most of the algorithms employ the same empirical thermodynamic parameters (the Turner parameters [28]) for the different secondary structural elements based on the nearest-neighbor model. However, unlike the entropy (free-energy) parameters for simple loops (hairpin, bulge, and internal loops), which have been determined from thermodynamic experiments [28], quantitative understanding of many other interactions remains very limited. Moreover, because of the possible conformational coupling between the loops, the loop entropies are not additive for tertiary motifs such as loop-loop kissing contacts [29], [30]. For such cases, thermodynamic experiments alone are not sufficient to directly provide loop entropies and free energies due to the complexity of the problem. Current RNA folding algorithms for 3D structures are generally limited to simple (short) structures. Further development of the models is hampered by several challenges including conformational sampling and evaluation of the energies for the tertiary contacts. Combined with discrete molecular dynamics (DMD) [31], coarse-grained approaches [31]–[33] can be used to predict structures as well as folding mechanisms with knowledge-based potentials derived from known structures. Structure assembly approaches [26], [34]–[36], based on the assumption that 3D fold can be recognized by the alignment of sequences and secondary structure patterns, have shown promising results in RNA 3D structure predictions. However, one of the common limitations to the structure assembly approaches is the degree of divergence of the fragment library [37], [38]. The recently developed Vfold model is a statistical mechanics-based RNA folding model [36], [39]–[42] that can predict RNA 2D and 3D structures as well as RNA folding thermodynamic stabilities from RNA sequence. In this report, we briefly describe the underlying algorithm and the practical usage of a web server for the Vfold model (http://rna.physics.missouri.edu). The server provides predictions for the structure and melting thermodynamics for user-provided RNA sequences. The results from the server, in combination with experimental data, may offer useful insights into RNA structure and function.

Methods

The Vfold model was first reported in 2005 for RNA secondary structure prediction [39]. Since then, the model has been extended to predict the structures and folding thermodynamics of H-type pseudoknots and RNA/RNA complexes [40]–[42]. Furthermore, Vfold was developed to predict 3D all-atom structures using a physics-based de novo method [36]. Below we describe several unique features of the Vfold model. The detailed underlining algorithms can be found in the published papers [36], [39]–[42] and in the Supporting Information (file Data S1) of this paper.

Features of the Vfold algorithm

One of the unique features of the Vfold model for 2D structure (base pairs) prediction is its ability to compute the RNA motif-based loop entropies. Using the virtual bonds to represent the backbone conformations, the model samples fluctuations of loops/junction conformations in the 3D space through conformational enumeration [39] (see Figure S1 in Data S1 for details). By calculating the probability of loop formation, the model can give the conformational entropy parameters for the formation of the different types of loops such as pseudoknot loops. The model has the advantage of accounting for chain connectivity, exclude volume and the completeness of conformational ensemble. Studies by us and other groups show that an accurate entropy parameter improves the prediction of RNA secondary structures and thermodynamic stabilities [39]–[43]. Another notable feature of Vfold model is its ability to model intraloop mismatched base pairs for RNA loops (see Figure S2 in Data S1 for details). By enumerating all the possible (sequence-dependent) intra-loop mismatches, the Vfold model can partially account for the sequence-dependence of the loop free energy. Therefore, the Vfold-predicted loop free energy is not only loop size-dependent but also sequence-dependent. The model provides a unique tool for predicting many important information that cannot be obtained through traditional methods. For example, the model can calculate the dramatic decrease in loop entropy upon the formation of mismatched base pairs in a loop. The model can predict the populational distribution of the different loop conformations that contain the different intra-loop mismatches. The predicted mismatched base pairs provide constraints to otherwise flexible loop structures. For a given 2D structure, the Vfold-based 3D structure prediction method [36] searches for the appropriate template for each loop/junction in the structure, and assembles the 3D template structures into a scaffold for further structure refinement. In comparison with other template-based (structure assembly) methods such as FARNA/FARFAR [34], [35] and MC-Sym [26], which sample structures from small fragments of the known RNA structures, the Vfold-based method uses motif-based instead of fragment-based templates. The main advantage of the multi-scale approach used in the Vfold 3D modeling [36] is that the virtual bond tertiary structure as the initial state may already lie in the free energy basin, so the structure refinement can avoid large structural rearrangements for the effective prediction of the final native structure.

Energy parameters

The Vfold model provides pre-tabulated entropy parameters (available in the Vfold web server) for hairpin loops [39], internal/bulge loops [39], H-type pseudoknots with/without inter-helix junction [40], [41] and hairpin-hairpin kissing motifs [42]. For free energy-based RNA structure modeling, the predicted structures and thermodynamic stabilities could be sensitive to the choice of energy parameters. Therefore, the server provides predictions based on two different sets of the thermodynamic parameters for base stacks, including mismatched base stacks: (1) from the Turner parameters 04 version [28], and (2) from the MFOLD 2.3 version [20].

3D template library

To construct the template library, Vfold classifies all the known structures into different motifs, such as helices, hairpin loops, internal/bulge loops, pseudoknots, N-way junctions (N3) (see Figure.1). The motif-based template library was built from 2621 PDB structures, including all the PDB entries released before January of 2014. It includes RNA-involved complexes except RNA/DNA hybrids. The redundant templates for those with root mean square deviation (RMSD) 1.5 for the same motif, same size and identical sequence are removed. The complete list of the non-redundant 3D template list can be found in the Vfold web server.
Figure 1

RNA 3D template structure database.

Results

The Vfold server contains three parts: (a) Vfold2D predicts the RNA 2D structure (pseudoknotted or non-pseudoknotted) from the sequence, (b) VfoldThermal predicts the melting curve (folding thermodynamics) from the sequence, and (c) Vfold3D predicts RNA 3D structure for a given 2D structure and the sequence. The computational time scales with the chain length N as O(N 6) and the memory scales as O(N 2) for Vfold2D and VfoldThermal. To avoid long computational time, the current version of the Vfold server restricts the RNA sequence length up to 140 nts.

Vfold2D: Predicting RNA 2D structures from the sequence

The input of Vfold2D is the sequence in plain text form (see the snapshot of Vfold2D web server in Fig. 2). The default temperature for Vfold2D is 37°C. Users have the option to change the temperature to other values. Users have the option to use the base stacking energy parameters either from Turner's parameters or from the MFOLD. Users also have the option to choose the type of structures:
Figure 2

An example of Vfold2D prediction: the input information highlighted in the snapshot of the Vfold2D web server are the sequence (32 nts in this example), the temperature (25°C), the energy parameters used for base stacks (from MFOLD in this example) the structural type (non-pseudoknotted in this example).

(1) Vfold2D gives a list of base pair probabilities P (in txt format) between nucleotides i and j. For example, the probability of forming G1-C12 base pair is 0.22. (2) The most probable 2D structure is derived from the base pairs with P>0.5. In this example, the predicted most probable 2D structure (plotted by VARNA in the figure) has the probability of 0.78. (3) Vfold2D also predicts all the possible helices from the predicted base pair probabilities. (4) Possible alternative structures can be found from the helix and base pair probabilities ( in this example).

An example of Vfold2D prediction: the input information highlighted in the snapshot of the Vfold2D web server are the sequence (32 nts in this example), the temperature (25°C), the energy parameters used for base stacks (from MFOLD in this example) the structural type (non-pseudoknotted in this example).

(1) Vfold2D gives a list of base pair probabilities P (in txt format) between nucleotides i and j. For example, the probability of forming G1-C12 base pair is 0.22. (2) The most probable 2D structure is derived from the base pairs with P>0.5. In this example, the predicted most probable 2D structure (plotted by VARNA in the figure) has the probability of 0.78. (3) Vfold2D also predicts all the possible helices from the predicted base pair probabilities. (4) Possible alternative structures can be found from the helix and base pair probabilities ( in this example). Excluding pseudoknot: Only non-pseudoknotted secondary structures are included in the structure prediction; Including pseudoknots with inter-helix junction length 1 nt: All the possible non-pseudoknotted secondary structures and H-type pseudoknots with inter-helix junction of length 1 nt are considered in the calculation. It may take a much longer computational time than the pseudoknot-free calculations. Including pseudoknots with longer inter-helix junctions: all the possible non-pseudoknotted secondary structures and H-type pseudoknots with inter-helix junction of any length are considered in the calculation. The computation may take much longer time than the calculation with pseudoknots of inter-helix junction length 1 nt. The Vfold2D server generates three files: Base pair probabilities (in txt format). Probabilities for the formation of the possible helices (including the native and alternative helices) (in txt format). Predicted 2D structures (in eps format) plotted by VARNA [44]. We recommend users to consider the possible alternative structures from the base pair probabilities and helix probabilities (the first two output files above). Fig. 2 shows an example of Vfold2D prediction for a 32-nt sequence [45]. With conformational sampling for the non-pseudoknotted structures, Vfold2D predicts the possible (including the alternative) helices from the base pair probabilities P based on the premise that base pairs (helices) in the same structure have the same level of probabilities of formation. The dominant 2D structure is identified from the base pairs of the largest probability. Fig. 2 shows an RNA that has two sets of helices. One set shown in magenta has the probability of 0.78. This is the most probable structure. Another set of helices in cyan with probability 0.22 gives an alternative structure. The predicted bistable structures agree with the NMR results [45].

VfoldThermal: predicting RNA melting curves

VfoldThermal predicts the heat capacity C(T) melting curves from the temperature-dependence of the partition function Q(T) for the conformational ensemble chosen by the user. The server provides the results in text format as well as in eps format plotted by Gnuplot. The input of VfoldThermal is the same as those for the Vfold2D, except for the temperature range in VfoldThermal (see the snapshot of VfoldThermal web server in Fig. 3).
Figure 3

An example of the VfoldThermal prediction: the inputs highlighted in the snapshot of VfoldThermal web server are the sequence (32 nts in this example) with the temperature range of 0°C–100°C, the energy parameters used for base stacks (from MFOLD in this example) and the structure type (non-pseudoknotted in this example).

From the temperature dependence of the partition function Q(T), VfoldThermal gives a list of temperature-dependent heat capacity C(T), with temperature interval of 0.5°C. The eps format of melting curve is generated by Gnuplot.

An example of the VfoldThermal prediction: the inputs highlighted in the snapshot of VfoldThermal web server are the sequence (32 nts in this example) with the temperature range of 0°C–100°C, the energy parameters used for base stacks (from MFOLD in this example) and the structure type (non-pseudoknotted in this example).

From the temperature dependence of the partition function Q(T), VfoldThermal gives a list of temperature-dependent heat capacity C(T), with temperature interval of 0.5°C. The eps format of melting curve is generated by Gnuplot. For the example shown in Fig. 3, with the same input as for Vfold2D in Fig. 2, VfoldThermal calculates the partition function Q(T) for all the non-pseudoknotted structures for temperature range 0°C–100°C with the temperature step of 0.5°C. The predicted heat capacity (melting curve) shows two peaks around 60 and 90°C, respectively. The peaks correspond to the melting of the two helices in the predicted structures in Fig. 2, respectively.

Vfold3D: Predicting RNA 3D structure

The input data of Vfold3D are the RNA sequence and the 2D structure (base pairs) (see the snapshot of the Vfold3D web server in Fig. 4). The output of Vfold3D is a PDB file for the predicted all-atom 3D structure(s). Because the current version of Vfold3D is template-based, no 3D structure will be predicted if a proper template cannot be found.
Figure 4

An example of the Vfold3D prediction: the snapshot of Vfold3D web server highlights the input sequence (32 nts for this example) and the 2D structures as defined by the base pairs.

(a) For the most probable 2D structure shown in Fig. 2, Vfold3D predicts 3D structure based on the templates from the known structures. (b) For the predicted alternative structure shown in Fig. 2, Vfold3D cannot predict the 3D structure due to the lack of the available template for the single-stranded chain between the helices.

An example of the Vfold3D prediction: the snapshot of Vfold3D web server highlights the input sequence (32 nts for this example) and the 2D structures as defined by the base pairs.

(a) For the most probable 2D structure shown in Fig. 2, Vfold3D predicts 3D structure based on the templates from the known structures. (b) For the predicted alternative structure shown in Fig. 2, Vfold3D cannot predict the 3D structure due to the lack of the available template for the single-stranded chain between the helices. Currently, due to the limited structural template database, the current version of Vfold3D can only predict the 3D structures with hairpin loops, internal/bulge loops, N-way (2 For the RNA in Fig. 2, Vfold2D predicts two alternative 2D structures. As shown in Fig. 4, for the most probable 2D structure, Vfold3D predicts one 3D structure. For the alternative 2D structure, which consists of two hairpins connected by a single-strand loop, Vfold3D yields no 3D structure because of the lack of the templates for the UUCG single-stranded open junction between the two hairpins.

Vfold output

Once a calculation is submitted, a notification page containing the job information (job name, e-mail address (optional) and the job status) is displayed. When the calculation is completed, the Vfold web server sends out an e-mail (if provided) notification with the predicted results attached. It is recommended to bookmark the job-specific notification page for later check of the job status and for downloading Vfold predicted results, since Vfold2D and VfoldThermal might take a long computational time (hours or even longer) depending on the sequence length. An online README file about the interpretation of the Vfold predictions is available on the Vfold web server.

Conclusion

The Vfold package is developed to predict RNA structures and folding thermodynamics. The web server will be updated continuously with the development of new Vfold-based algorithms for RNA folding. In the future development, we plan to add structure predictions for the formation of RNA-RNA complexes. We will also add the effect of the ion-dependent electrostatic free energies and the heat capacity effect, which can cause the temperature-dependence of the enthalpy and entropy parameters for the loop and base stack formations, to the melting curve calculations and structure predictions. (PDF) Click here for additional data file.
  44 in total

Review 1.  Prediction of RNA secondary structure by free energy minimization.

Authors:  David H Mathews; Douglas H Turner
Journal:  Curr Opin Struct Biol       Date:  2006-05-19       Impact factor: 6.809

Review 2.  Bridging the gap in RNA structure prediction.

Authors:  Bruce A Shapiro; Yaroslava G Yingling; Wojciech Kasprzak; Eckart Bindewald
Journal:  Curr Opin Struct Biol       Date:  2007-03-23       Impact factor: 6.809

3.  The role of site accessibility in microRNA target recognition.

Authors:  Michael Kertesz; Nicola Iovino; Ulrich Unnerstall; Ulrike Gaul; Eran Segal
Journal:  Nat Genet       Date:  2007-09-23       Impact factor: 38.330

4.  The MC-Fold and MC-Sym pipeline infers RNA structure from sequence data.

Authors:  Marc Parisien; François Major
Journal:  Nature       Date:  2008-03-06       Impact factor: 49.962

5.  Automated de novo prediction of native-like RNA tertiary structures.

Authors:  Rhiju Das; David Baker
Journal:  Proc Natl Acad Sci U S A       Date:  2007-08-28       Impact factor: 11.205

6.  iFoldRNA: three-dimensional RNA structure prediction and folding.

Authors:  Shantanu Sharma; Feng Ding; Nikolay V Dokholyan
Journal:  Bioinformatics       Date:  2008-06-25       Impact factor: 6.937

7.  Ab initio RNA folding by discrete molecular dynamics: from structure prediction to folding mechanisms.

Authors:  Feng Ding; Shantanu Sharma; Poornima Chalasani; Vadim V Demidov; Natalia E Broude; Nikolay V Dokholyan
Journal:  RNA       Date:  2008-05-02       Impact factor: 4.942

8.  Discrete state model and accurate estimation of loop entropy of RNA secondary structures.

Authors:  Jian Zhang; Ming Lin; Rong Chen; Wei Wang; Jie Liang
Journal:  J Chem Phys       Date:  2008-03-28       Impact factor: 3.488

9.  lncRNAs transactivate STAU1-mediated mRNA decay by duplexing with 3' UTRs via Alu elements.

Authors:  Chenguang Gong; Lynne E Maquat
Journal:  Nature       Date:  2011-02-10       Impact factor: 49.962

10.  RNA FRABASE version 1.0: an engine with a database to search for the three-dimensional fragments within RNA structures.

Authors:  Mariusz Popenda; Marek Blazewicz; Marta Szachniuk; Ryszard W Adamiak
Journal:  Nucleic Acids Res       Date:  2007-10-05       Impact factor: 16.971

View more
  73 in total

1.  Predicting 3D Structure, Flexibility, and Stability of RNA Hairpins in Monovalent and Divalent Ion Solutions.

Authors:  Ya-Zhou Shi; Lei Jin; Feng-Hua Wang; Xiao-Long Zhu; Zhi-Jie Tan
Journal:  Biophys J       Date:  2015-12-15       Impact factor: 4.033

2.  Quantitative Understanding of SHAPE Mechanism from RNA Structure and Dynamics Analysis.

Authors:  Travis Hurst; Xiaojun Xu; Peinan Zhao; Shi-Jie Chen
Journal:  J Phys Chem B       Date:  2018-04-27       Impact factor: 2.991

3.  A New Method to Predict Ion Effects in RNA Folding.

Authors:  Li-Zhen Sun; Shi-Jie Chen
Journal:  Methods Mol Biol       Date:  2017

4.  VfoldLA: A web server for loop assembly-based prediction of putative 3D RNA structures.

Authors:  Xiaojun Xu; Chenhan Zhao; Shi-Jie Chen
Journal:  J Struct Biol       Date:  2019-06-04       Impact factor: 2.867

5.  A Method to Predict the 3D Structure of an RNA Scaffold.

Authors:  Xiaojun Xu; Shi-Jie Chen
Journal:  Methods Mol Biol       Date:  2015

6.  Unified energetics analysis unravels SpCas9 cleavage activity for optimal gRNA design.

Authors:  Dong Zhang; Travis Hurst; Dongsheng Duan; Shi-Jie Chen
Journal:  Proc Natl Acad Sci U S A       Date:  2019-04-15       Impact factor: 11.205

7.  Predicting RNA Scaffolds with a Hybrid Method of Vfold3D and VfoldLA.

Authors:  Xiaojun Xu; Shi-Jie Chen
Journal:  Methods Mol Biol       Date:  2021

8.  Modeling Structure, Stability, and Flexibility of Double-Stranded RNAs in Salt Solutions.

Authors:  Lei Jin; Ya-Zhou Shi; Chen-Jie Feng; Ya-Lan Tan; Zhi-Jie Tan
Journal:  Biophys J       Date:  2018-08-30       Impact factor: 4.033

9.  IsRNA1: De Novo Prediction and Blind Screening of RNA 3D Structures.

Authors:  Dong Zhang; Jun Li; Shi-Jie Chen
Journal:  J Chem Theory Comput       Date:  2021-02-09       Impact factor: 6.006

10.  A Method to Predict the Structure and Stability of RNA/RNA Complexes.

Authors:  Xiaojun Xu; Shi-Jie Chen
Journal:  Methods Mol Biol       Date:  2016
View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.