
SimRNAweb: a web server for RNA 3D structure modeling with optional restraints.

Marcin Magnus, Michał J Boniecki, Wayne Dawson, Janusz M Bujnicki.

Abstract

RNA function in many biological processes depends on the formation of three-dimensional (3D) structures. However, RNA structure is difficult to determine experimentally, which has prompted the development of predictive computational methods. Here, we introduce a user-friendly online interface for modeling RNA 3D structures using SimRNA, a method that uses a coarse-grained representation of RNA molecules, utilizes the Monte Carlo method to sample the conformational space, and relies on a statistical potential to describe the interactions in the folding process. SimRNAweb makes SimRNA accessible to users who do not normally use high-performance computational facilities or are unfamiliar with command-line tools. The simplest input consists of an RNA sequence, which is folded de novo. Alternatively, a user can provide a 3D structure in the PDB format, for instance a preliminary model built with some other technique, to jump-start the modeling close to the expected final outcome. The user can optionally provide secondary structure and distance restraints, and can freeze a part of the starting 3D structure. SimRNAweb can be used to model single RNA sequences and RNA-RNA complexes (up to 52 chains). The webserver is available at http://genesilico.pl/SimRNAweb.
© The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

Year:  2016        PMID: 27095203      PMCID: PMC4987879          DOI: 10.1093/nar/gkw279

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

Ribonucleic acid (RNA) plays fundamental roles in biology, including the transmission of genetic information, regulation of gene expression and catalysis of biochemical reactions (1). Many RNA molecules or their parts (domains or motifs) fold into stable three-dimensional (3D) structures that define, at least partially, their ability to interact with other molecules and carry out their tasks within the cell (2). However, experimental determination of RNA 3D structures is laborious and challenging, and the majority of known RNAs remain structurally uncharacterized. Although a limited number of experimentally obtained RNA 3D structures are available in the Protein Data Bank (3), there is a multitude of RNA sequences [e.g. in the RNAcentral database (4)] for which no 3D structure exists. Experimental information about RNA secondary structure (base pairing pattern) has been accumulating at increasing speed, thanks to the introduction of high-throughput techniques (5,6). However, 2D diagrams that describe base-pairing interactions alone provide only part of the story. Often, one must look at the 3D structure to appreciate the mutual position of chemical groups that may be functionally relevant and interact with other molecules. This additional information about the 3D RNA structure is neither obvious nor intuitive from the sequence or secondary structure alone. To address the paucity of 3D structural information, computational structure prediction methods have been developed that either utilize information derived from known structures of other RNA molecules, by way of template-based modeling, or attempt to simulate the physical process of RNA structure formation, by way of template-free modeling (7–9). Some methods work on an all-atom representation of the RNA molecule, while others simplify it by coarse-graining (10).
These methods have different strengths that make them useful for studying the structural basis of biological function. For instance, a number of approaches enable 3D structure prediction starting from RNA sequence alone. Some of these methods are also available as web services, such as RNAComposer (11), Vfold3D (12) and iFoldRNA (13); however, they do not permit folding of RNA molecules composed of multiple chains or with complex restraints, such as utilizing a pre-defined 3D structure. We have recently introduced SimRNA, a method for RNA folding simulation and 3D structure prediction that uses a coarse-grained representation of five atoms per residue and a statistical potential methodology (14). It can predict RNA 3D structure from sequence alone and, if available, can use additional structural information in the form of secondary structure restraints and distance restraints that define the local arrangement of certain atoms; it can also jump-start the simulation with a 3D structure provided in a PDB file. Thus far, SimRNA has been available as a stand-alone package that required the user to have good computer skills and a powerful computer. Here, we introduce a web service that simplifies the steps of the stand-alone package, does not require the user to supply computing power and memory, provides a simple interface for the user, and displays the progress of the simulation in real time. This makes the approach available to an individual who is not necessarily an expert in RNA structure and does not have access to state-of-the-art 3D molecular modeling facilities, but who needs a model of the RNA 3D structure, for instance to design biochemical experiments, or who may want to observe the conformational changes of the RNA as it folds.

METHODS

Workflow implemented as SimRNAweb

The primary purpose of SimRNAweb is to carry out all the tasks normally required in a SimRNA calculation and to provide the computing resources needed to obtain an output structure. SimRNAweb also provides special features, such as showing the progress of the simulation and displaying the structure as the simulation proceeds. During the simulation and at the end, the best-scored conformations are clustered and all-atom models of representatives of the three largest clusters are generated. In the output, structures of these three models are displayed using JSmol (15), and their coordinates are made available for download. Therefore, the user does not have to run the simulation, process the resulting files, cluster the results, or convert the data from a five-atom representation to an all-atom representation; all this is done remotely by the server (Figure 1). Thus, even though SimRNA uses a coarse-grained representation for the folding simulation, the user can submit PDB-formatted files in all-atom or SimRNA representation (files containing only the atoms used by SimRNA) and receives the results as all-atom PDB-formatted files. All log files and intermediate data (in SimRNA representation) are also made available to users who need them for more advanced analyses or who are simply curious about the details of the calculations.
Figure 1.

The workflow of the SimRNAweb server.

Required input files—sequence and/or 3D structure

The smallest set of input data consists of an RNA sequence in a single line. Alternatively, a 3D structure in the PDB format can be submitted. Starting a simulation from a pre-defined 3D structure can be used in a variety of applications, such as sampling the conformational space in the vicinity of a given conformation, testing the ability of a given RNA sequence to maintain different 3D architectures or refining a structural model built with some other technique such as comparative/homology modeling. Regardless of whether the user submits a sequence or a PDB file, the user can, in a separate window, supplement the input with RNA secondary structure and with additional information, such as a file containing distance restraints between atoms. In the case of submitting a PDB file, the user can specify positions in the 3D structure that need to be frozen (kept rigid) in the course of the simulation. All of this information can be submitted simultaneously. If a PDB file is supplied, then all the labels should be ATOM and any non-RNA species should be removed from the file. SimRNA relies only on five atoms per nucleotide: P, C4′, N1, C2, C4 (for pyrimidine bases) or P, C4′, N9, C2, C6 (for purine bases). All other atoms are ignored (and do not need to be removed). As long as these five atoms are present for a given RNA residue, SimRNA can process the information. SimRNA can directly read HETATM tags, but the names of the residue and of the five atoms mentioned above must be set to those of standard residues (A, C, G or U). In principle, SimRNA can read PDB files with chemically modified residues and interpret them as A, C, G or U, but this requires manual editing of the residue and atom names and we recommend this feature to expert users only. Please refer to the online manual for more detailed information.
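
The five-atom requirement described above can be checked before submission. The following Python sketch is only an illustration and is not part of SimRNA or SimRNAweb: the function name missing_simrna_atoms is hypothetical, the column positions follow the standard fixed-width PDB ATOM record, and writing the prime as an ASCII apostrophe (C4') is an assumption about the input file (older files may use C4* instead).

```python
# Atom names SimRNA relies on, as listed in the text above.
# Assumption: the prime is written as an ASCII apostrophe in the PDB file.
SIMRNA_ATOMS = {
    "C": {"P", "C4'", "N1", "C2", "C4"},   # pyrimidines
    "U": {"P", "C4'", "N1", "C2", "C4"},
    "A": {"P", "C4'", "N9", "C2", "C6"},   # purines
    "G": {"P", "C4'", "N9", "C2", "C6"},
}


def missing_simrna_atoms(pdb_text):
    """Return {(chain, residue_number, residue_name): set of missing atom names}.

    Only residues named A, C, G or U are inspected; residues with all five
    atoms present are not reported.
    """
    found = {}
    for line in pdb_text.splitlines():
        if not line.startswith(("ATOM", "HETATM")) or len(line) < 26:
            continue
        atom = line[12:16].strip()      # PDB columns 13-16: atom name
        resname = line[17:20].strip()   # PDB columns 18-20: residue name
        chain = line[21]                # PDB column 22: chain identifier
        resnum = int(line[22:26])       # PDB columns 23-26: residue number
        if resname not in SIMRNA_ATOMS:
            continue                    # modified or non-RNA residue: skipped here
        found.setdefault((chain, resnum, resname), set()).add(atom)

    missing = {}
    for key, atoms in found.items():
        lacking = SIMRNA_ATOMS[key[2]] - atoms
        if lacking:
            missing[key] = lacking
    return missing
```

An empty result means every standard residue in the file carries the five atoms; any other atoms in the file are simply ignored by the check, mirroring the behaviour described in the text.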
Many PDB files have missing terminal and internal residues, and the sequence of residues inferred from the coordinates in the PDB file is not always identical to the sequence that should be modeled. When an incomplete PDB file is submitted to SimRNAweb as a starting structure, a complete sequence can also be submitted to fill in missing residues and model small insertions. When a PDB file is submitted together with a sequence, SimRNAweb will automatically try to align the submitted sequence with the sequence in the PDB file, according to the numbering of residues in the PDB file. The alignment rules are currently strict: the nth residue in the sequence must correspond to residue number n in the PDB file, and it is the responsibility of the user to provide structures and sequences that correspond in residue identity and number. Any submission in which the input structure differs from or is misaligned with the input sequence may be processed erroneously. It should also be emphasized that many PDB files have arbitrary numbering of residues that does not correspond to the actual sequence; e.g. the first residue in the PDB file can have an index smaller or larger than 1 (in the case of engineered molecules or sequence fragments), while the first residue in the sequence always has an index of 1. Hence, the combined input of sequence and structure is a preliminary, experimental feature and should be used with caution. We recommend it only to advanced users who can edit PDB files with third-party programs of their choice. At this stage SimRNAweb does not provide any support for inferring, verifying, or correcting alignments between the input sequence and structure files, and it does not support homology modeling (replacement of one sequence in the input file with another sequence). For such operations we recommend the ModeRNA server (16).
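
Because the server does not verify the sequence-to-structure correspondence, a user may want to test the strict rule (the nth residue of the sequence corresponds to residue number n in the PDB file) locally. The sketch below is a hypothetical helper, not SimRNAweb code; it assumes the residue numbers and one-letter names have already been extracted from a single chain of the PDB file into a dictionary.

```python
def check_alignment(sequence, pdb_residues):
    """Check the strict rule: sequence position n must match PDB residue number n.

    sequence      -- the full RNA sequence to be modeled (first residue is number 1)
    pdb_residues  -- {residue_number: one-letter residue name} for one chain
    Returns a list of (residue_number, name_in_pdb, name_in_sequence) problems;
    name_in_sequence is None when the residue number falls outside the sequence.
    """
    problems = []
    for resnum, name in sorted(pdb_residues.items()):
        if resnum < 1 or resnum > len(sequence):
            problems.append((resnum, name, None))
        elif sequence[resnum - 1].upper() != name.upper():
            problems.append((resnum, name, sequence[resnum - 1]))
    return problems
```

Residues missing from the PDB file are not reported as problems, since the text states that gaps can be filled in from the submitted sequence; only out-of-range numbers and identity mismatches are flagged.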

Secondary structure restraints

When information on the secondary structure (including pseudoknots) is available, it can be provided as restraints. A secondary structure without pseudoknots can be entered as a single line in dot-bracket format; for pseudoknots, multiple lines demarking overarching helices should be used (see the online documentation).
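
To make the dot-bracket convention concrete, the following sketch converts one or more dot-bracket lines into a list of base pairs. It is an illustration under stated assumptions, not the server's parser: the function names are hypothetical, only the characters '(', ')' and '.' are accepted, and the multi-line treatment of pseudoknots (one nested layer per line, all lines of equal length) is an interpretation of the description above; the authoritative format is given in the online documentation.

```python
def parse_dot_bracket(line):
    """Return sorted base pairs (i, j), 1-based, from one dot-bracket line."""
    stack, pairs = [], []
    for position, symbol in enumerate(line, start=1):
        if symbol == "(":
            stack.append(position)
        elif symbol == ")":
            if not stack:
                raise ValueError("unmatched ')' at position %d" % position)
            pairs.append((stack.pop(), position))
        elif symbol != ".":
            raise ValueError("unexpected character %r" % symbol)
    if stack:
        raise ValueError("unmatched '(' at position %d" % stack[-1])
    return sorted(pairs)


def parse_layers(lines):
    """Merge base pairs from several equal-length lines (assumed pseudoknot layers)."""
    if len({len(line) for line in lines}) > 1:
        raise ValueError("all lines must have the same length")
    pairs = set()
    for line in lines:
        pairs.update(parse_dot_bracket(line))
    return sorted(pairs)
```

For example, the two lines "((...))..." and "...((...))" describe helices whose pairs cross each other, which a single line of round brackets cannot express.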

Distance restraints

Distance restraints can represent any type of pairwise interaction as long as it can be defined in terms of any pair of the five atoms (P, C4′ and N1, C2, C4 for pyrimidines or N9, C2, C6 for purines) or a virtual point at the middle of the base called MB. Distance restraints come in two forms: SLOPE and WELL. A typical SLOPE restraint has the form ‘SLOPE A/23/C4′ C/45/P 5.5 8.5 0.2’, where the last number indicates a penalty in SimRNA energy units (these units do not correspond to standard energy units; they depend on the statistical force field). It says that atom C4′ of residue 23 on chain A should be between 5.5 and 8.5 Å from atom P of residue 45 on chain C to avoid an energy penalty, and when this distance is <5.5 Å or >8.5 Å, the penalty increases by 0.2 units of energy per Å (so at a distance of 10.5 or 3.5 Å, i.e. 2 Å outside the allowed range, the penalty will be 0.4 units). A typical WELL restraint has the form ‘WELL A/23/C4′ C/45/P 6.5 7.5 0.5’, where the last number indicates a reward in SimRNA energy units (depth of the well). It says that the structure should essentially fall into a ‘well’ that has -0.5 units of energy when the atoms A/23/C4′ and C/45/P are between 6.5 and 7.5 Å apart, and outside of this region there is no effect on the energy. Together, these two restraints are intended to encourage this pair of atoms (C4′ and P) to locate themselves at ∼7 Å from each other, irrespective of anything else we know about chains A or C. In general, restraints should be used to encourage the structure to form in a certain way. We recommend using ranges of distance to represent the uncertainty of the distance, and the energy units as a relative weight of restraints. Further explanation can be found in the online manual.
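
The two restraint types can be written as simple functions of the distance. The sketch below only restates the description above and is not taken from the SimRNA source code: the function names are hypothetical, the linear growth of the SLOPE penalty follows the "0.2 units per Å" wording, and treating the range boundaries as inside the allowed region is an assumption.

```python
def slope_penalty(distance, lower, upper, weight):
    """SLOPE: zero inside [lower, upper], growing by `weight` per Angstrom outside."""
    if distance < lower:
        return (lower - distance) * weight
    if distance > upper:
        return (distance - upper) * weight
    return 0.0


def well_energy(distance, lower, upper, depth):
    """WELL: a reward of -depth inside [lower, upper], no effect outside."""
    return -depth if lower <= distance <= upper else 0.0


def combined_energy(distance):
    """Sum of the two example restraints from the text for one pair of atoms."""
    return (slope_penalty(distance, 5.5, 8.5, 0.2)
            + well_energy(distance, 6.5, 7.5, 0.5))
```

With the example values from the text, a distance of 10.5 Å gives a SLOPE penalty of 2 × 0.2 = 0.4 units, and a distance of 7 Å gives a combined energy of -0.5 units, so the pair is drawn toward roughly 7 Å.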

PDB restraints

A separate field in the submission form permits the user to specify which parts of the structure are frozen by writing the chain and the frozen residues, e.g. ‘A:1–10, 15–20; B:1–10’, where A and B are the chains and the residue indices are numbered according to the PDB file. This feature can be used to keep some parts of the model unchanged while refolding other parts, with or without other restraints.
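
The freeze specification can be expanded into explicit residue lists to confirm that it covers the intended residues. The parser below is a hypothetical sketch based only on the example string above, not the server's implementation; it accepts both a hyphen and an en dash as the range separator and assumes non-negative residue numbers.

```python
def parse_frozen(spec):
    """Expand e.g. 'A:1-10, 15-20; B:1-10' into {chain: sorted residue numbers}."""
    frozen = {}
    for part in spec.split(";"):            # one chain per ';'-separated part
        part = part.strip()
        if not part:
            continue
        chain, ranges = part.split(":", 1)
        residues = frozen.setdefault(chain.strip(), set())
        for item in ranges.split(","):      # ranges or single residues
            item = item.strip().replace("\u2013", "-")  # en dash to hyphen
            if not item:
                continue
            if "-" in item:
                start, end = item.split("-", 1)
                residues.update(range(int(start), int(end) + 1))
            else:
                residues.add(int(item))
    return {chain: sorted(numbers) for chain, numbers in frozen.items()}
```

For the example in the text, chain A expands to residues 1 to 10 and 15 to 20, and chain B to residues 1 to 10.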

Parameters of the RNA folding simulation

In addition to supplying the main input files, the user can reduce the simulation length from 500 frames (default value) to a shorter run. Particularly when restraints are used, a solution is typically reached within 300 frames for a sequence of up to 75 nucleotides in length. Other options are to stop the simulation in the middle of a run, to specify which parts of the PDB structure are left free, and to specify which parts of the PDB structure are joined with an input sequence.

Training (parametrization) and testing datasets

The structure prediction engine and the default parameters of the simulation are currently (February 2016) identical to those of the published version of SimRNA, version 3.20 (14). Only one minor correction has been introduced, which defines the set of atoms that are frozen when a residue is specified as frozen. SimRNA v.3.20/1 was benchmarked against a variety of test structures including the dataset from Ding et al. (17), the 2008 Das & Baker dataset (18), the 2010 Das & Baker dataset of RNA structural motifs (19), the Seetin & Mathews dataset (20) and published structures available from RNA Puzzles (21,22). With the exception of the Das & Baker motifs dataset, which required restraints on the bordering parts of the crystal structure, all the sequences were tested both in de novo simulations and using various levels of restraints including secondary structure and distance restraints. According to these benchmarks, SimRNA is competitive with other methods, including FARNA and iFoldRNA. It often predicts 3D structures for RNA molecules of up to 100 nt correctly, and secondary structure and tertiary restraints improve these predictions and allow for folding much longer sequences.

RESULTS

SimRNAweb server

The SimRNAweb server is an automated and user-friendly implementation of SimRNA, a method for RNA 3D structure modeling developed in our laboratory (14) and used in practice in many cases, including participation in the RNA Puzzles experiment (21,22). The SimRNAweb server has been running since September 2015. So far, it has processed several hundred predictions (folding de novo and under restraints), and an increasing number of individuals outside our group have been involved in running and testing it. Most recently, SimRNAweb has entered the new category of fully automated RNA 3D structure prediction in the RNA Puzzles competition.

Example applications

Figure 2A illustrates an example of RNA 3D structure prediction with SimRNAweb, based on sequence information alone (without secondary structure or any other restraints). The experimentally determined structure of an RNA tertiary domain essential to HCV IRES-mediated translation initiation (PDB id: 1kh6) comprises two chains. SimRNAweb run with default parameters folded it to a model that matched the reference structure very well (correct secondary structure, correct orientation of all structural elements, and RMSD of 5.9 Å). The additional application of secondary structure restraints improves the RMSD of this model to 4.6 Å (data not shown).
Figure 2.

Examples of RNA 3D structure prediction with SimRNAweb. Green indicates the reference structure (PDB ids: A – 1kh6, B – 3l0u, C – 3w3s), red indicates models generated with SimRNAweb (representatives of the first cluster).

Figure 2B shows a predicted 3D structure for tRNA(Phe) from Escherichia coli (73 residues), for an input comprising sequence, secondary structure, and distance restraints. If only secondary structure restraints are used, the resulting model has an RMSD of about 12 Å to the crystal structure. However, when known tertiary interactions between residues 8–14 and 48–50 are accounted for with just six pairwise distance restraints, the server generates a model with a native-like topology and local contacts, and with an RMSD of 6.1 Å. Finally, Figure 2C shows the results of refining a homology model using the SimRNAweb server. Here, the starting structure of Aquifex aeolicus tRNA(Sec) (99 residues) was homology-modeled based on the crystal structure of A. aeolicus tRNA(Met) using the ModeRNA server (16). Confidently modeled regions, including the tRNA core, were frozen, while the loops and the extra arm (not present in the template) were allowed to refold. Comparison of the starting structure and the locally refolded model with the experimentally determined structure of A. aeolicus tRNA(Sec) reveals an improvement in RMSD from 5.6 to 4.1 Å in the process.

DISCUSSION

SimRNAweb offers a new user-friendly interface for the SimRNA program, delivering RNA 3D structure modeling to any user. The server accepts an RNA sequence and can fold it with a mixture of various types of restraints, including secondary structure, 3D structure and distance restraints. The user can also submit sequences or structures containing multiple RNA chains (up to 52), so SimRNAweb can be used for studying RNA-RNA interactions. Starting a simulation from a pre-defined 3D structure can be used in a variety of applications, such as refinement of models built with other methods, sampling the conformational space in the vicinity of a given conformation, and testing the ability of a given RNA sequence to maintain different 3D architectures. Advanced applications of SimRNAweb, which make use of folding with restraints, include docking pre-folded RNA molecules/domains/motifs to each other, and optimizing homology models, with confident parts of the structures completely frozen or highly restrained, and with other regions allowed to change conformation upon folding and/or binding.

AVAILABILITY

The web server is available at http://genesilico.pl/SimRNAweb. This website is free and open to all users and there is no login requirement.
  18 in total

1.  RNA-Puzzles: a CASP-like evaluation of RNA three-dimensional structure prediction.

Authors:  José Almeida Cruz; Marc-Frédérick Blanchet; Michal Boniecki; Janusz M Bujnicki; Shi-Jie Chen; Song Cao; Rhiju Das; Feng Ding; Nikolay V Dokholyan; Samuel Coulbourn Flores; Lili Huang; Christopher A Lavender; Véronique Lisi; François Major; Katarzyna Mikolajczak; Dinshaw J Patel; Anna Philips; Tomasz Puton; John Santalucia; Fredrick Sijenyi; Thomas Hermann; Kristian Rother; Magdalena Rother; Alexander Serganov; Marcin Skorupski; Tomasz Soltysinski; Parin Sripakdeevong; Irina Tuszynska; Kevin M Weeks; Christina Waldsich; Michael Wildauer; Neocles B Leontis; Eric Westhof
Journal:  RNA       Date:  2012-02-23       Impact factor: 4.942

2.  Automated de novo prediction of native-like RNA tertiary structures.

Authors:  Rhiju Das; David Baker
Journal:  Proc Natl Acad Sci U S A       Date:  2007-08-28       Impact factor: 11.205

3.  iFoldRNA: three-dimensional RNA structure prediction and folding.

Authors:  Shantanu Sharma; Feng Ding; Nikolay V Dokholyan
Journal:  Bioinformatics       Date:  2008-06-25       Impact factor: 6.937

4.  Ab initio RNA folding by discrete molecular dynamics: from structure prediction to folding mechanisms.

Authors:  Feng Ding; Shantanu Sharma; Poornima Chalasani; Vadim V Demidov; Natalia E Broude; Nikolay V Dokholyan
Journal:  RNA       Date:  2008-05-02       Impact factor: 4.942

5.  An RNA Mapping DataBase for curating RNA structure mapping experiments.

Authors:  Pablo Cordero; Julius B Lucks; Rhiju Das
Journal:  Bioinformatics       Date:  2012-09-12       Impact factor: 6.937

6.  Automated RNA tertiary structure prediction from secondary structure and low-resolution restraints.

Authors:  Matthew G Seetin; David H Mathews
Journal:  J Comput Chem       Date:  2011-04-21       Impact factor: 3.376

Review 7.  Computational approaches to 3D modeling of RNA.

Authors:  Christian Laing; Tamar Schlick
Journal:  J Phys Condens Matter       Date:  2010-06-15       Impact factor: 2.333

Review 8.  RNA and protein 3D structure modeling: similarities and differences.

Authors:  Kristian Rother; Magdalena Rother; Michał Boniecki; Tomasz Puton; Janusz M Bujnicki
Journal:  J Mol Model       Date:  2011-01-22       Impact factor: 1.810

9.  RNA-Puzzles Round II: assessment of RNA structure prediction programs applied to three large RNA structures.

Authors:  Zhichao Miao; Ryszard W Adamiak; Marc-Frédérick Blanchet; Michal Boniecki; Janusz M Bujnicki; Shi-Jie Chen; Clarence Cheng; Grzegorz Chojnowski; Fang-Chieh Chou; Pablo Cordero; José Almeida Cruz; Adrian R Ferré-D'Amaré; Rhiju Das; Feng Ding; Nikolay V Dokholyan; Stanislaw Dunin-Horkawicz; Wipapat Kladwang; Andrey Krokhotin; Grzegorz Lach; Marcin Magnus; François Major; Thomas H Mann; Benoît Masquida; Dorota Matelska; Mélanie Meyer; Alla Peselis; Mariusz Popenda; Katarzyna J Purzycka; Alexander Serganov; Juliusz Stasiewicz; Marta Szachniuk; Arpit Tandon; Siqi Tian; Jian Wang; Yi Xiao; Xiaojun Xu; Jinwei Zhang; Peinan Zhao; Tomasz Zok; Eric Westhof
Journal:  RNA       Date:  2015-04-16       Impact factor: 4.942

10.  RNAcentral: an international database of ncRNA sequences.

Authors:  Anton I Petrov; Simon J E Kay; Richard Gibson; Eugene Kulesha; Dan Staines; Elspeth A Bruford; Mathew W Wright; Sarah Burge; Robert D Finn; Paul J Kersey; Guy Cochrane; Alex Bateman; Sam Griffiths-Jones; Jennifer Harrow; Patricia P Chan; Todd M Lowe; Christian W Zwieb; Jacek Wower; Kelly P Williams; Corey M Hudson; Robin Gutell; Michael B Clark; Marcel Dinger; Xiu Cheng Quek; Janusz M Bujnicki; Nam-Hai Chua; Jun Liu; Huan Wang; Geir Skogerbø; Yi Zhao; Runsheng Chen; Weimin Zhu; James R Cole; Benli Chai; Hsien-Da Huang; His-Yuan Huang; J Michael Cherry; Artemis Hatzigeorgiou; Kim D Pruitt
Journal:  Nucleic Acids Res       Date:  2014-10-28       Impact factor: 16.971

