Eckart Bindewald, Tanner Kluth, Bruce A Shapiro.
Abstract
Computational RNA secondary structure prediction approaches differ in how they handle RNA pseudoknot interactions. For reasons of computational efficiency, most approaches allow only a limited class of pseudoknot interactions or do not consider them at all. Here we present a computational method for RNA secondary structure prediction that is not restricted in terms of pseudoknot complexity. The approach simulates a folding process in a coarse-grained manner by choosing helices based on established energy rules. The steric feasibility of the chosen set of helices is checked during the folding process using a highly coarse-grained 3D model of the RNA structures. Using two data sets of 26 and 241 RNA sequences, we find that this approach is competitive with the existing RNA secondary structure prediction programs pknotsRG, HotKnots and UNAFold. The key advantages of the new method are that there is no algorithmic restriction in terms of pseudoknot complexity and that steric feasibility is tested. AVAILABILITY: The program is available as a web server at http://cylofold.abcc.ncifcrf.gov.
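To make the notion of a pseudoknot interaction concrete: in the usual definition, two base pairs (i, j) and (k, l) form a pseudoknot when they interleave, i.e. i < k < j < l. The following minimal Python sketch detects such crossing pairs. It is an illustration of that standard definition only, not code from CyloFold; the function names and the example pairings are hypothetical.

```python
from itertools import combinations


def crossing_pairs(pairs):
    """Return the set of base pairs that cross at least one other pair.

    `pairs` is an iterable of (i, j) index tuples (0-based positions).
    Two pairs (i, j) and (k, l) with i < k cross when i < k < j < l.
    """
    normalized = sorted(tuple(sorted(p)) for p in pairs)
    crossed = set()
    for (i, j), (k, l) in combinations(normalized, 2):
        # The list is sorted, so i <= k holds for every combination.
        if i < k < j < l:
            crossed.add((i, j))
            crossed.add((k, l))
    return crossed


def has_pseudoknot(pairs):
    """True if any two base pairs interleave."""
    return bool(crossing_pairs(pairs))


# Hypothetical H-type pseudoknot: two stems whose pairs interleave.
stem1 = [(0, 12), (1, 11), (2, 10)]
stem2 = [(6, 18), (7, 17), (8, 16)]

# Hypothetical pseudoknot-free structure: every pair is nested or disjoint.
nested = [(0, 12), (1, 11), (3, 8)]

print(has_pseudoknot(stem1 + stem2))
print(has_pseudoknot(nested))
```

The pairwise comparison is quadratic in the number of base pairs, which is adequate for sequences of the lengths listed in this record.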
Year: 2010 PMID: 20501603 PMCID: PMC2896150 DOI: 10.1093/nar/gkq432
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. A flow-chart depicting the algorithm for predicting RNA secondary structures.
Figure 2. Scheme for mapping between an RNA secondary structure (a) and the 3D coarse-grained representation used (b). Each helix is represented as a capped cylinder (‘capsule’). Single-stranded regions between helices are represented as distance constraints. An RNA secondary structure can be a (partial) solution of a secondary structure prediction only if the algorithm succeeds in placing the corresponding capped cylinders such that they neither collide nor violate the distance constraints.
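The feasibility test described in the Figure 2 caption can be illustrated with a small geometric sketch. Two capsules overlap exactly when the shortest distance between their axis segments is smaller than the sum of their radii, and a single-stranded linker can be treated as an upper bound on the distance between the helix ends it connects. The Python below is a sketch under stated assumptions, not the CyloFold implementation: the function names, the radius of 10.0 and the per-nucleotide span of 6.0 are hypothetical placeholders, and the segment distance routine is the standard clamped closest-point method.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _clamp01(x):
    return max(0.0, min(1.0, x))


def segment_distance(p1, q1, p2, q2, eps=1e-12):
    """Shortest distance between 3D segments p1-q1 and p2-q2."""
    d1, d2, r = _sub(q1, p1), _sub(q2, p2), _sub(p1, p2)
    a, e, f = _dot(d1, d1), _dot(d2, d2), _dot(d2, r)
    if a <= eps and e <= eps:          # both segments are points
        s = t = 0.0
    elif a <= eps:                     # first segment is a point
        s, t = 0.0, _clamp01(f / e)
    else:
        c = _dot(d1, r)
        if e <= eps:                   # second segment is a point
            s, t = _clamp01(-c / a), 0.0
        else:
            b = _dot(d1, d2)
            denom = a * e - b * b      # zero when the segments are parallel
            s = _clamp01((b * f - c * e) / denom) if denom > eps else 0.0
            t = (b * s + f) / e
            if t < 0.0:                # clamp t, then recompute s
                s, t = _clamp01(-c / a), 0.0
            elif t > 1.0:
                s, t = _clamp01((b - c) / a), 1.0
    c1 = tuple(p1[i] + s * d1[i] for i in range(3))
    c2 = tuple(p2[i] + t * d2[i] for i in range(3))
    diff = _sub(c1, c2)
    return math.sqrt(_dot(diff, diff))


def capsules_collide(axis_a, axis_b, radius_a=10.0, radius_b=10.0):
    """True if two capsules overlap. Radii are placeholder values."""
    return segment_distance(*axis_a, *axis_b) < radius_a + radius_b


def linker_satisfied(end_a, start_b, n_unpaired, max_span_per_nt=6.0):
    """True if a linker of n_unpaired residues can bridge the two helix ends.

    max_span_per_nt is an assumed upper bound, not a value from the source.
    """
    return math.dist(end_a, start_b) <= (n_unpaired + 1) * max_span_per_nt


# Two hypothetical helices with parallel axes, 25 units apart.
helix_a = ((0.0, 0.0, 0.0), (0.0, 0.0, 30.0))
helix_b = ((25.0, 0.0, 0.0), (25.0, 0.0, 30.0))
print(capsules_collide(helix_a, helix_b))
print(linker_satisfied(helix_a[1], helix_b[1], n_unpaired=4))
```

A candidate helix set would pass this sketch's feasibility test only if no pair of capsules collides and every linker constraint holds; how the actual program searches for such a placement is not specified in this record.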
Figure 3. Screenshot of a typical prediction result returned by the CyloFold web server. The shown sequence corresponds to the bacteriophage T2 gene 32 mRNA pseudoknot (PDB 2TPK).
Prediction results corresponding to 26 RNA structures that are available in the Protein Data Bank
| PDB | Description | L | PKF | CF MCC | CF SNS | CF PPV | PK MCC | PK SNS | PK PPV | HK MCC | HK SNS | HK PPV | UF MCC | UF SNS | UF PPV |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1A60 | TYMV tRNA-like structure | 44 | 13.6 | 0.74 | 0.77 | 0.71 | 0.96 | 1.00 | 0.93 | 0.83 | 0.77 | 0.91 | 0.83 | 0.77 | 0.91 |
| 1CX0 | HDV ribozyme | 72 | 22.2 | –0.01 | 0.00 | 0.00 | –0.01 | 0.00 | 0.00 | –0.01 | 0.00 | 0.00 | –0.01 | 0.00 | 0.00 |
| 1E95 | SRV-1 pseudoknot | 36 | 33.3 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.70 | 0.50 | 1.00 |
| 1HVU | HIV RT bind. pseudoknot | 30 | 26.6 | 0.95 | 1.00 | 0.91 | 0.95 | 1.00 | 0.91 | 0.56 | 0.40 | 0.80 | 0.56 | 0.40 | 0.80 |
| 1KAJ | MMTV RNA pseudoknot | 32 | 25.0 | 0.85 | 1.00 | 0.73 | 0.85 | 1.00 | 0.73 | 0.85 | 1.00 | 0.73 | 0.53 | 0.50 | 0.57 |
| 1KH6 | HCV IRES domain | 42 | 0.0 | 0.74 | 0.77 | 0.71 | 0.55 | 0.54 | 0.58 | 0.53 | 0.54 | 0.54 | 0.92 | 0.93 | 0.93 |
| 1KPY | PEMV-1 P1P2 pseudoknot | 27 | 22.2 | 0.89 | 1.00 | 0.80 | 0.94 | 1.00 | 0.89 | 0.79 | 0.62 | 1.00 | 0.79 | 0.63 | 1.00 |
| 1KXK | GroupII self-splic. intron | 70 | 0.0 | 0.91 | 0.87 | 0.95 | 0.81 | 0.83 | 0.79 | 0.81 | 0.83 | 0.79 | 0.96 | 0.96 | 0.96 |
| 1L2X | Viral RNA pseudoknot | 27 | 22.2 | 0.94 | 1.00 | 0.89 | 0.94 | 1.00 | 0.89 | 0.79 | 0.63 | 1.00 | 0.79 | 0.63 | 1.00 |
| 1Q9A | 23S rRNA sarcin/ricin | 27 | 0.0 | 0.91 | 0.83 | 1.00 | 0.77 | 0.83 | 0.71 | 0.86 | 1.00 | 0.75 | 0.77 | 0.83 | 0.71 |
| 1U8D | Guanine riboswitch | 67 | 11.9 | 0.87 | 0.87 | 0.87 | 0.88 | 0.78 | 1.00 | 0.88 | 0.78 | 1.00 | 0.88 | 0.78 | 1.00 |
| 2A43 | Luteoviral pseudoknot | 26 | 23.0 | 0.93 | 1.00 | 0.88 | 0.93 | 1.00 | 0.88 | 0.75 | 0.57 | 1.00 | 0.75 | 0.57 | 1.00 |
| 2G1W | tmRNA pseudoknot | 22 | 18.1 | 0.81 | 1.00 | 0.67 | 0.86 | 1.00 | 0.75 | 0.81 | 0.67 | 1.00 | 0.81 | 0.67 | 1.00 |
| 2GIS | SAM- riboswitch | 94 | 8.5 | 0.80 | 0.76 | 0.85 | 0.80 | 0.76 | 0.85 | 0.86 | 0.86 | 0.86 | 0.55 | 0.55 | 0.55 |
| 2HOO | thi-box riboswitch | 83 | 0.0 | 0.70 | 0.67 | 0.74 | 0.58 | 0.62 | 0.54 | 0.58 | 0.62 | 0.54 | 0.58 | 0.62 | 0.54 |
| 2K95 | P2B-P3 telomerase RNA | 48 | 37.5 | 0.89 | 0.80 | 1.00 | 0.89 | 0.80 | 1.00 | 0.75 | 0.80 | 0.71 | 0.54 | 0.40 | 0.75 |
| 2OIU | L1 Ribozyme Ligase adduct | 71 | 0.0 | 0.86 | 0.78 | 0.95 | 0.98 | 0.96 | 1.00 | 0.98 | 0.96 | 1.00 | 0.98 | 1.00 | 1.00 |
| 2QUS | Hammerhead Ribozyme | 68 | 2.9 | 0.95 | 0.91 | 1.00 | 0.95 | 0.91 | 1.00 | 0.95 | 0.91 | 1.00 | 0.95 | 1.00 | 1.00 |
| 2QWY | SAM-II riboswitch | 52 | 26.9 | 0.48 | 0.46 | 0.50 | 0.48 | 0.46 | 0.50 | 0.34 | 0.31 | 0.40 | 0.35 | 0.31 | 0.40 |
| 2RP0 | PEMV1 mRNA pseudoknot | 26 | 15.3 | 0.88 | 1.00 | 0.78 | 0.88 | 1.00 | 0.78 | 0.84 | 0.71 | 1.00 | 0.84 | 0.71 | 1.00 |
| 2TPK | T2 gene 32 mRNA p.k. | 36 | 27.7 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.71 | 0.58 | 0.88 | 0.63 | 0.58 | 0.70 |
| 361D | Domain E of 5S rRNA | 19 | 0.0 | 0.86 | 1.00 | 0.75 | 0.86 | 1.00 | 0.75 | 0.83 | 0.83 | 0.83 | 0.83 | 0.83 | 0.83 |
| 3DIG | Lysine Riboswitch | 173 | 30.1 | 0.89 | 0.85 | 0.93 | 0.74 | 0.72 | 0.76 | 0.74 | 0.72 | 0.76 | 0.74 | 0.72 | 0.76 |
| 3FU2 | class-I preQ1 riboswitch | 32 | 18.8 | 0.79 | 0.63 | 1.00 | 0.79 | 0.63 | 1.00 | 0.79 | 0.63 | 1.00 | 0.79 | 0.63 | 1.00 |
| 3PHP | TYMV p.k. hairpin | 23 | 0.0 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| 437D | rib. frame-shifting p.k. | 27 | 22.2 | 0.94 | 1.00 | 0.89 | 0.94 | 1.00 | 0.89 | 0.79 | 0.63 | 1.00 | 0.79 | 0.63 | 1.00 |
| Mean | All | | | 0.83 | 0.85 | 0.83 | 0.82 | 0.84 | 0.81 | 0.75 | 0.71 | 0.83 | 0.73 | 0.65 | 0.83 |
| Mean | No pseudoknots | | <5.0 | 0.87 | 0.85 | 0.89 | 0.81 | 0.84 | 0.80 | 0.82 | 0.84 | 0.81 | 0.87 | 0.90 | 0.87 |
| Mean | Pseudoknots | | >5.0 | 0.81 | 0.84 | 0.80 | 0.82 | 0.84 | 0.82 | 0.73 | 0.65 | 0.84 | 0.66 | 0.55 | 0.80 |
L, sequence length; PKF, fraction of pseudoknot interactions. For each of the four prediction methods (CF, CyloFold; PK, pknotsRG; HK, HotKnots 2.0; UF, UNAFold) we report three measures of prediction quality (MCC, Matthews correlation coefficient; SNS, sensitivity; PPV, positive predictive value).
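The three quality measures can be computed from the sets of predicted and reference base pairs. The Python sketch below uses the textbook definitions: SNS = TP / (TP + FN), PPV = TP / (TP + FP), and the full Matthews correlation coefficient. It assumes that true negatives are counted over all L(L-1)/2 possible position pairs; that counting convention, the function name and the example pairings are assumptions for illustration, not details confirmed by this record.

```python
import math


def pair_metrics(predicted, reference, length):
    """Return (SNS, PPV, MCC) for predicted versus reference base pairs.

    predicted, reference: iterables of (i, j) index tuples.
    length: sequence length L, used to count true negatives (assumption:
    every unordered pair of positions is a possible base pair).
    """
    pred = {tuple(sorted(p)) for p in predicted}
    ref = {tuple(sorted(p)) for p in reference}
    tp = len(pred & ref)           # correctly predicted pairs
    fp = len(pred - ref)           # predicted but not in the reference
    fn = len(ref - pred)           # reference pairs that were missed
    tn = length * (length - 1) // 2 - tp - fp - fn
    sns = tp / (tp + fn) if tp + fn else 0.0
    ppv = tp / (tp + fp) if tp + fp else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sns, ppv, mcc


# Hypothetical 20-nt example: three of four reference pairs are recovered,
# plus one spurious pair.
reference = [(0, 19), (1, 18), (2, 17), (3, 16)]
predicted = [(0, 19), (1, 18), (2, 17), (5, 12)]
sns, ppv, mcc = pair_metrics(predicted, reference, length=20)
print(f"SNS={sns:.2f} PPV={ppv:.2f} MCC={mcc:.2f}")
```

Under this formula, a prediction that shares no pair with the reference gets SNS and PPV of zero and a slightly negative MCC, which is consistent with the table row that reports an MCC of –0.01 next to SNS and PPV of 0.00. Whether the authors counted true negatives exactly this way is not stated here.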
Prediction results for a set of 241 RNA sequences that are part of PseudoBase for the programs CyloFold, pknotsRG (7), HotKnots 2.0 (8) and UNAFold (23)
| Program | MCC | SNS | PPV |
|---|---|---|---|
| CyloFold | 0.752 | 0.763 | 0.747 |
| pknotsRG | 0.748 | 0.753 | 0.756 |
| HotKnots 2.0 | 0.611 | 0.565 | 0.684 |
| UNAFold | 0.597 | 0.532 | 0.692 |
MCC, Matthews correlation coefficient; SNS, sensitivity of predicted base pairs; PPV, positive predictive value.