Jana Sperschneider, Amitava Datta, Michael J Wise.
Abstract
MOTIVATION: Laboratory RNA structure determination is demanding and costly, so computational structure prediction is an important task. Single-sequence methods for RNA secondary structure prediction are limited by the accuracy of the underlying folding model; if a structure is supported by a family of evolutionarily related sequences, one can be more confident that the prediction is accurate. RNA pseudoknots are functional elements that have highly conserved structures. However, few comparative structure prediction methods can handle pseudoknots because of the computational complexity.
Year: 2012 PMID: 23044552 PMCID: PMC3516145 DOI: 10.1093/bioinformatics/bts575
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. For two unaligned RNA sequences, DotKnot-PW produces structure element dictionaries derived from the probability dot plot. Similarity scores and P-values are computed to detect conserved elements.
Fig. 2. A set of edges with positive scores is given between nodes p1,…,p7 in the first sequence and p1,…,p6 in the second sequence. The goal is to find the best set of non-overlapping structure elements in the two sequences such that the interval ordering is preserved. The optimal structure element alignment that preserves the interval ordering includes structure elements p1, p4, p7 in the first sequence and p1, p4, p6 in the second sequence.
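The selection problem described in Fig. 2 can be illustrated with a small dynamic program. The sketch below is not taken from DotKnot-PW; it is a generic weighted-chaining routine written under the assumption that each structure element is a closed interval `(start, end)` and that "preserving the interval ordering" means every chosen element must end before the next chosen element starts, in both sequences. The names `Edge` and `best_chain`, the interval coordinates and the scores are all hypothetical and were chosen only so that the toy input reproduces the p1, p4, p7 / p1, p4, p6 outcome from the caption.

```python
from typing import List, NamedTuple, Tuple

Interval = Tuple[int, int]  # (start, end), both inclusive


class Edge(NamedTuple):
    i: int        # index of a structure element in the first sequence
    j: int        # index of a structure element in the second sequence
    score: float  # positive similarity score (hypothetical values below)


def best_chain(elems_a: List[Interval], elems_b: List[Interval],
               edges: List[Edge]) -> Tuple[float, List[Tuple[int, int]]]:
    """Return the best total score and the chosen (i, j) element pairs."""
    if not edges:
        return 0.0, []
    # Process edges by the end of their first-sequence interval so that every
    # admissible predecessor has already been scored.
    order = sorted(range(len(edges)), key=lambda k: elems_a[edges[k].i][1])
    best = [0.0] * len(edges)
    prev = [-1] * len(edges)
    for pos, k in enumerate(order):
        e = edges[k]
        best[k] = e.score
        for m in order[:pos]:
            f = edges[m]
            ends_before_a = elems_a[f.i][1] < elems_a[e.i][0]
            ends_before_b = elems_b[f.j][1] < elems_b[e.j][0]
            if ends_before_a and ends_before_b and best[m] + e.score > best[k]:
                best[k] = best[m] + e.score
                prev[k] = m
    # Trace back from the highest-scoring chain end.
    k = max(range(len(edges)), key=best.__getitem__)
    total = best[k]
    chain = []
    while k != -1:
        chain.append((edges[k].i, edges[k].j))
        k = prev[k]
    return total, chain[::-1]


# Toy input: 7 elements in the first sequence, 6 in the second (0-based indices).
elems_a = [(1, 10), (5, 15), (12, 24), (22, 30), (25, 35), (32, 40), (42, 50)]
elems_b = [(1, 9), (6, 14), (11, 19), (21, 29), (27, 36), (40, 48)]
edges = [Edge(0, 0, 5.0), Edge(1, 1, 4.0), Edge(2, 2, 1.0), Edge(3, 3, 6.0),
         Edge(4, 4, 3.0), Edge(5, 4, 2.0), Edge(6, 5, 4.0)]

total, chain = best_chain(elems_a, elems_b, edges)
print(total, chain)
```

With these made-up values the chain pairs elements 0, 3, 6 of the first sequence with elements 0, 3, 5 of the second, i.e. p1, p4, p7 and p1, p4, p6 in the caption's 1-based naming. The quadratic loop over predecessor edges is kept for readability; it is adequate for the small element dictionaries this kind of toy example uses.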
Prediction results using a test set of different RNA classes
The first four method columns (DotKnot-PW, CARNAC, Tfold, hxmatch) give the prediction of common structure; the last four (DotKnot, ProbKnot, IPknot, RNAfold) give the reference structure prediction.

| Type | ID | Info | Family | Measure | DotKnot-PW, first (average) | CARNAC | Tfold | hxmatch | DotKnot | ProbKnot | IPknot | RNAfold |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Frameshift | SARS-CoV | NMR | RF00507 | Sensitivity | 92.3 (79.2) | 38.5 | 57.7 | 38.5 | 92.3 | 69.2 | 73.1 | 73.1 |
| | | 82 nt | 82 nt | PPV | 100 (90) | 90.9 | 93.8 | 58.8 | 100 | 72 | 79.2 | 73.1 |
| | | 1 PK | 62% | MCC | | 0.59 | 0.73 | 0.47 | | 0.7 | 0.76 | 0.73 |
| | VMV | NMR | RF01840 | Sensitivity | 100 (100) | 0 | 50 | 42.9 | 100 | 50 | 50 | 50 |
| | | 68 nt | 55 nt | PPV | 100 (100) | 0 | 50 | 54.5 | 100 | 41.2 | 41.2 | 41.2 |
| | | 1 PK | 89% | MCC | | 0 | 0.49 | 0.48 | | 0.44 | 0.44 | 0.44 |
| Ribozyme | HDV | X-ray | RF00094 | Sensitivity | 93.8 (90.7) | 21.9 | 21.9 | 12.5 | 93.8 | 40.6 | 62.5 | 37.5 |
| | | 87 nt | 90 nt | PPV | 100 (97.1) | 100 | 30.4 | 57.1 | 100 | 48.1 | 80 | 42.9 |
| | | 1 PK | 74% | MCC | | 0.46 | 0.24 | 0.26 | | 0.43 | 0.7 | 0.39 |
| | glmS-Ba | SC, MG | RF00234 | Sensitivity | 76.4 (71.7) | 7.3 | 50.9 | 43.6 | 63.6 | 76.4 | 72.7 | 69.1 |
| | | 151 nt | 178 nt | PPV | 95.5 (92.9) | 44.4 | 87.5 | 80 | 67.3 | 82.4 | 85.1 | 82.6 |
| | | 2 PKs | 55% | MCC | | 0.18 | 0.66 | 0.59 | 0.65 | 0.79 | 0.78 | 0.75 |
| | EC-RNaseP | X-ray | RF00010 | Sensitivity | 55.3 (51) | 26.8 | 29.3 | 46.3 | 53.7 | 74 | 69.9 | 60.2 |
| | | 377 nt | 380 nt | PPV | 68.7 (64.4) | 86.8 | 65.5 | 83.8 | 56.4 | 77.8 | 86 | 64.9 |
| | | 2 PKs | 64% | MCC | 0.61 (0.57) | 0.48 | 0.44 | 0.62 | 0.55 | 0.76 | | 0.62 |
| Untranslated regions | BCV | SP, MG | RF00165 | Sensitivity | 100 (95.6) | 55.6 | 61.1 | 0 | 100 | 55.6 | 94.4 | 55.6 |
| | | 63 nt | 63 nt | PPV | 100 (96.7) | 100 | 84.6 | 0 | 81.8 | 66.7 | 100 | 66.7 |
| | | 1 PK | 78% | MCC | | 0.74 | 0.71 | -0.01 | 0.9 | 0.6 | 0.97 | 0.6 |
| | BaMV | SP, MG | RF00290 | Sensitivity | 59.5 (53.8) | 0 | * | 50 | 40.5 | 64.3 | 50 | 40.5 |
| | | 170 nt | 145 nt | PPV | 69.4 (62.4) | 0 | * | 65.6 | 41.5 | 69.2 | 67.7 | 45.9 |
| | | 1 PK | 99% | MCC | 0.64 (0.58) | 0 | * | 0.57 | 0.4 | | 0.58 | 0.43 |
| | HPeV1 | MG | RF00499 | Sensitivity | 95.3 (86) | 11.6 | 37.2 | 79.1 | 83.7 | 88.4 | 86 | 79.1 |
| | | 116 nt | 112 nt | PPV | 95.3 (88.2) | 100 | 100 | 91.9 | 83.7 | 92.7 | 97.4 | 94.4 |
| | | 1 PK | 88% | MCC | | 0.34 | 0.61 | 0.85 | 0.83 | 0.9 | 0.91 | 0.86 |
| Telomerase | Tthe-telo | SP | RF00025 | Sensitivity | 81.6 (61.1) | 68.4 | 60.5 | 57.9 | 73.7 | 65.8 | 71.1 | 60.5 |
| | | 159 nt | 160 nt | PPV | 91.2 (70.7) | 86.7 | 79.3 | 84.6 | 70 | 55.6 | 67.5 | 54.8 |
| | | 1 PK | 72% | MCC | | 0.77 | 0.69 | 0.7 | 0.72 | 0.6 | 0.69 | 0.57 |
| Riboswitch | preQ1 | NMR | RF00522 | Sensitivity | 100 (73.4) | 55.6 | * | 55.6 | 55.6 | 55.6 | 55.6 | 55.6 |
| | | 34 nt | 44 nt | PPV | 100 (100) | 100 | * | 100 | 100 | 55.6 | 100 | 71.4 |
| | | 1 PK | 80% | MCC | | 0.74 | * | 0.74 | 0.74 | 0.53 | 0.74 | 0.61 |
| | SAM-I | X-ray | RF00162 | Sensitivity | 73.5 (53.8) | 55.9 | 55.9 | 64.7 | 47.1 | 73.5 | 64.7 | 64.7 |
| | | 94 nt | 102 nt | PPV | 100 (83.7) | 79.2 | 100 | 91.7 | 57.1 | 100 | 100 | 88 |
| | | 1 PK | 73% | MCC | | 0.66 | 0.74 | 0.77 | 0.51 | | 0.8 | 0.75 |
| IRES | PSIV | X-ray | RF00458 | Sensitivity | 65.5 (63.8) | 39.7 | 94.8 | 77.6 | 70.7 | 70.7 | 63.8 | 72.4 |
| | | 194 nt | 198 nt | PPV | 66.7 (73) | 100 | 96.5 | 91.8 | 69.5 | 67.2 | 78.7 | 71.2 |
| | | 2 PKs | 57% | MCC | 0.66 (0.68) | 0.63 | | 0.84 | 0.7 | 0.69 | 0.71 | 0.72 |
| | CSFV | SP, MG | RF00209 | Sensitivity | 39 (44.9) | 45.1 | 68.3 | 74.4 | 62.2 | 75.6 | 65.9 | 68.3 |
| | | 244 nt | 272 nt | PPV | 42.1 (53.1) | 97.4 | 80 | 85.9 | 68 | 80.5 | 68.4 | 71.8 |
| | | 1 PK | 83% | MCC | 0.4 (0.49) | 0.66 | 0.74 | | 0.65 | 0.78 | 0.67 | 0.7 |
| mRNA | S15 | SP, MG | RF00114 | Sensitivity | 100 (81.8) | 52.9 | 52.9 | 41.2 | 100 | 58.8 | 58.8 | 58.8 |
| | | 74 nt | 112 nt | PPV | 100 (83.4) | 52.9 | 40.9 | 46.7 | 100 | 47.6 | 52.6 | 52.6 |
| | | 1 PK | 77% | MCC | | 0.52 | 0.46 | 0.43 | | 0.52 | 0.55 | 0.55 |
| | repZ | SP, MG | RF01087 | Sensitivity | 73.8 (71.4) | 9.5 | 31 | 83.3 | 71.4 | 57.1 | 66.7 | 57.1 |
| | | 149 nt | 149 nt | PPV | 86.1 (80.2) | 44.4 | 61.9 | 85.4 | 85.7 | 70.6 | 82.4 | 64.9 |
| | | 1 PK | 90% | MCC | 0.8 (0.76) | 0.2 | 0.43 | | 0.78 | 0.63 | 0.74 | 0.6 |
| tmRNA | Ec-tmRNA | NMR, SC | RF00023 | Sensitivity | 37.5 (50) | 6.7 | 34.6 | 27.9 | 75 | 56.7 | 66.3 | 50 |
| | | 363 nt | 377 nt | PPV | 45.9 (58.4) | 100 | 67.9 | 61.7 | 79.6 | 60.8 | 76.7 | 47.7 |
| | | 4 PKs | 57% | MCC | 0.41 (0.54) | 0.26 | 0.48 | 0.41 | | 0.59 | 0.71 | 0.49 |
| Average | | | | Sensitivity | | 31 | 50.4 | 49.7 | 74 | 64.5 | 67 | 59.5 |
| Average | | | | PPV | | 73.9 | 74.2 | 71.2 | 78.8 | 68 | 78.9 | 64.6 |
| Average | | | | MCC | | 0.45 | 0.6 | 0.59 | 0.76 | 0.66 | 0.72 | 0.61 |
Note: Each reference structure is given by its ID (see Supplementary Material for dot-bracket notation). The Info column gives the method of experimental support (NMR, NMR spectroscopy; X-ray, X-ray crystallography; SC, sequence comparison; MG, mutagenesis; SP, structure probing), the length of the sequence and the number of pseudoknots. For each reference structure, the Family column gives the corresponding RFAM family ID, the average sequence length and the average pairwise sequence identity. The * symbol means that the method failed to run. The ‘first’ prediction for DotKnot-PW is the pairwise prediction with the highest combined free energy and similarity score.
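To make the accuracy measures in the table concrete, the following sketch computes sensitivity and PPV from two dot-bracket strings by comparing their base-pair sets. It rests on several assumptions that are not stated in the source: pseudoknots are written with additional bracket types (`[]`, `{}`, `<>`), a predicted pair counts as correct only if both positions match the reference exactly, and MCC is approximated by the geometric mean of sensitivity and PPV. That approximation is commonly used for RNA base-pair evaluation, but it is not necessarily the exact formula behind the tabulated values. The function names and the example structures are hypothetical.

```python
import math
from typing import Set, Tuple

# Opening bracket -> closing bracket; extra bracket types encode pseudoknots.
BRACKETS = {"(": ")", "[": "]", "{": "}", "<": ">"}
CLOSERS = {close: open_ for open_, close in BRACKETS.items()}


def base_pairs(dot_bracket: str) -> Set[Tuple[int, int]]:
    """Return the set of (i, j) base pairs encoded by a dot-bracket string."""
    stacks = {open_: [] for open_ in BRACKETS}  # one stack per bracket type
    pairs = set()
    for pos, ch in enumerate(dot_bracket):
        if ch in BRACKETS:
            stacks[ch].append(pos)
        elif ch in CLOSERS:
            pairs.add((stacks[CLOSERS[ch]].pop(), pos))
    return pairs


def accuracy(reference: str, predicted: str) -> Tuple[float, float, float]:
    """Return (sensitivity %, PPV %, approximate MCC) for a prediction."""
    ref, pred = base_pairs(reference), base_pairs(predicted)
    true_pos = len(ref & pred)
    sens = 100.0 * true_pos / len(ref) if ref else 0.0
    ppv = 100.0 * true_pos / len(pred) if pred else 0.0
    # Assumption: geometric-mean approximation, not the full MCC formula.
    mcc = math.sqrt(sens * ppv) / 100.0
    return sens, ppv, mcc


# Hypothetical 20 nt example: an H-type pseudoknot and a prediction that
# recovers the first stem but misses the pseudoknotted stem.
reference = "((((..[[[..))))..]]]"
predicted = "((((.......))))....."

sens, ppv, mcc = accuracy(reference, predicted)
print(f"sensitivity={sens:.1f} PPV={ppv:.1f} MCC~{mcc:.2f}")
```

In this toy case every predicted pair is correct, so PPV is 100, while sensitivity drops because the three pseudoknotted pairs are missed. The same pattern appears in the table wherever a method has a high PPV but low sensitivity.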
Fig. 3. Pairwise prediction results for the S15 mRNA pseudoknot (RFAM family RF00114) with the top two combined free energy and similarity scores. The reference structure is shown at the top and folds into two conformations in dynamic equilibrium: an H-type pseudoknot or a series of hairpins. For the pairwise prediction with the highest score, the pseudoknot structure is returned. For the second-best pairwise prediction, the alternative hairpin loop structure is returned.