Abstract
BACKGROUND: Improving accuracy and efficiency of computational methods that predict pseudoknotted RNA secondary structures is an ongoing challenge. Existing methods based on free energy minimization tend to be very slow and are limited in the types of pseudoknots that they can predict. Incorporating known structural information can improve prediction accuracy; however, there are not many methods for prediction of pseudoknotted structures that can incorporate structural information as input. There is even less understanding of the relative robustness of these methods with respect to partial information.
Year: 2014 PMID: 24884954 PMCID: PMC4064103 DOI: 10.1186/1471-2105-15-147
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Pseudoknotted and pseudoknot-free secondary structures. Examples of loops and canonical base pairs in a pseudoknotted and a pseudoknot-free secondary structure. The blue base pairs belong to the G structure and the green base pairs belong to the G structure, as defined in Section ‘Definition of G and G’. This figure was produced using the VARNA software [55].
Energy parameters
| Description | Value |
|---|---|
| Exterior pseudoloop initiation penalty | −1.38 |
| Penalty for introducing pseudoknot inside a multiloop | 10.07 |
| Penalty for introducing pseudoknot inside a pseudoloop | 15.00 |
| Band penalty | 2.46 |
| Penalty for unpaired base in a pseudoloop | 0.06 |
| Penalty for closed subregion inside a pseudoloop | 0.96 |
| Energy of a hairpin loop closed by | |
| Energy of stacked pair closed by | |
| Energy of stacked pair that spans a band | 0.89× |
| Energy of a pseudoknot-free internal loop | |
| Energy of internal loop that spans a band | 0.74× |
| Multiloop initiation penalty | 3.39 |
| Multiloop base pair penalty | 0.03 |
| Penalty for unpaired base in a multiloop | 0.02 |
| Penalty for introducing a multiloop that spans a band | 3.41 |
| Base pair penalty for a multiloop that spans a band | 0.56 |
| Penalty for unpaired base in a multiloop that spans a band | 0.12 |
This table provides the names, description and values of the energy parameters and functions that we used in our methods. The names and definitions are the same as in our original HFold [48], and the values were updated based on the work of Andronescu et al. [36]. These parameters were derived for a temperature of 37°C and 1 M salt concentration.
Figure 2. Comparison of robustness of HFold and Iterative HFold. Robustness results for pseudoknotted structures of the HK-PK data set (2A), pseudoknot-free structures of the HK-PK-free data set (2B) and all structures (2C). The X axes show the percentage of the G structure that is available as input, and the Y axes show bootstrap 95% percentile confidence intervals for average F-measure. Dashed lines show the borders of the bootstrap 95% percentile confidence interval for average F-measure and solid lines show the average F-measure itself.
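The bootstrap percentile interval reported in these figures and tables can be illustrated with a short sketch. This is a generic percentile bootstrap of a mean, not necessarily the exact procedure used in the paper: the function name `bootstrap_percentile_ci`, the number of resamples, the seed, and the sample F-measure values are all assumptions made for illustration.

```python
import random


def bootstrap_percentile_ci(values, n_resamples=10_000, confidence=0.95, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`.

    Assumption: resample with replacement, compute the mean of each
    resample, and read the interval off the sorted resample means.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    means = sorted(
        sum(rng.choices(values, k=n)) / n for _ in range(n_resamples)
    )
    alpha = (1.0 - confidence) / 2.0
    lo_idx = int(alpha * n_resamples)
    hi_idx = min(n_resamples - 1, int((1.0 - alpha) * n_resamples))
    return means[lo_idx], means[hi_idx]


# Hypothetical per-structure F-measures (percent); not data from the paper.
f_measures = [62.5, 100.0, 89.5, 47.1, 69.2, 83.2, 84.9, 89.4]
low, high = bootstrap_percentile_ci(f_measures)
print(f"95% CI for average F-measure: ({low:.2f}, {high:.2f})")
```

A wider interval indicates that the average F-measure depends more strongly on which structures happen to be in the data set.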
Comparison of bootstrap 95% percentile confidence interval of average F-measure of different versions of HFold when given SimFold structure as input vs. when given HotKnots hotspots structures as input
| HK-PK | (55.54, 71.06) | (73.35, 83.53) | (72.83, 83.37) | (50.57, 63.53) | (50.69, 63.54) | (51.42, 64.39) |
| HK-PK-free | (31.37, 38.52) | (75.53, 80.79) | (74.93, 80.26) | (78.42, 83.21) | (78.33, 83.27) | (78.31, 83.17) |
Comparison of bootstrap 95% percentile confidence interval of average F-measure with existing methods
| HK-PK | (72.83, 83.37) | (73.60, 83.35) | (45.34, 57.73) | (54.56, 66.25) |
| HK-PK-free | (74.93, 80.26) | (76.74, 81.95) | (78.78, 83.55) | (77.31, 81.79) |
Comparison of bootstrap 95% percentile confidence interval of average F-measure with existing methods on the DK-pk16 and the IP-pk168 data sets
| DK-pk16 | (68.05, 81.85) | (69.11, 83.81) | (65.42, 75.81) |
| IP-pk168 | (72.65, 79.86) | (65.51, 72.96) | (58.20, 66.09) |
Figure 3. Time comparison. Comparison of running times of Iterative HFold and HotKnots in a log plot. The X axis shows log10(time) for HotKnots data points and the Y axis shows log10(time) for Iterative HFold. Time is measured in seconds.
Comparison of Iterative HFold F-measure with ShapeKnots on SHAPE data
| Pre-Q1 riboswitch, B. subtilis | 34 | 1 | 62.5 | 100 | 76.9 | 100 | 100 | 100 |
| Telomerase pseudoknot, human | 47 | 1 | 100 | 100 | 100 | 100 | 100 | 100 |
| tRNA(asp), yeast | 75 | 0 | 81.0 | 100 | 89.5 | 95.2 | 95.2 | 95.2 |
| TPP riboswitch, E. coli | 79 | 0 | 46.5 | 47.6 | 47.1 | 95.4 | 87.5 | 91.3 |
| SARS corona virus pseudoknot | 82 | 1 | 69.2 | 86.3 | 69.2 | 84.6 | 88.0 | 86.3 |
| cyclic-di-GMP riboswitch, V. cholerae | 97 | 0 | 85.5 | 81.0 | 83.2 | 89.3 | 86.2 | 87.7 |
| SAM I riboswitch, T. tengcongenis | 118 | 1 | 79.5 | 91.2 | 84.9 | 92.3 | 97.3 | 94.7 |
| M-Box riboswitch, B. subtilis | 154 | 0 | 87.5 | 91.3 | 89.4 | 87.5 | 91.3 | 89.3 |
| P546 domain, bI3 group I intron | 155 | 0 | 55.4 | 57.4 | 56.4 | 94.6 | 96.4 | 95.5 |
| Lysine riboswitch, T. maritima | 174 | 1 | 85.7 | 94.7 | 90.0 | 87.3 | 88.7 | 88.0 |
| Group I intron, Azoarcus sp. | 214 | 1 | 52.4 | 54.1 | 53.2 | 92.1 | 95.1 | 93.5 |
| Signal recognition particle RNA, human | 301 | 0 | 70.0 | 73.7 | 71.8 | 55.0 | 53.9 | 54.4 |
| Hepatitis C virus IRES domain | 336 | 1 | 71.2 | 74.0 | 72.5 | 92.3 | 96.0 | 94.1 |
| RNase P, B. subtilis | 405 | 1 | 55.7 | 59.3 | 57.4 | 75.6 | 79.8 | 77.7 |
| Group II intron, O. iheyensis | 412 | 1 | 87.9 | 95.9 | 91.7 | 93.2 | 97.6 | 95.3 |
| Group I intron, T. thermophila | 425 | 1 | 83.2 | 85.2 | 84.2 | 93.9 | 91.2 | 92.5 |
| 5’ domain of 23S rRNA, E. coli | 511 | 0 | 84.0 | 72.5 | 77.8 | 92.4 | 76.4 | 83.6 |
| 5’ domain of 16S rRNA, E. coli | 530 | 0 | 73.6 | 69.0 | 71.2 | 89.9 | 80.6 | 84.9 |
| | | | ||||||
| Fluoride riboswitch, P. syringae | 66 | 1 | 100 | 100 | 100 | 93.7 | 93.7 | 93.7 |
| Adenine riboswitch, V. vulnificus | 71 | 0 | 100 | 100 | 100 | 100 | 100 | 100 |
| tRNA(phe), E. coli | 76 | 0 | 100 | 100 | 100 | 100 | 84.0 | 91.3 |
| 5S rRNA, E. coli | 120 | 0 | 91.4 | 91.4 | 91.4 | 85.7 | 76.9 | 81.1 |
| 5’ domain of 16S rRNA, H. volcanii | 473 | 0 | 90.3 | 82.3 | 86.1 | 89.6 | 82.7 | 86.0 |
| HIV-1 5’ pseudoknot domain | 500 | 1 | 45.4 | 50.4 | 47.7 | 100 | 100 | 100 |
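The third value of each triple in the table above matches the harmonic mean of the first two, which is the standard definition of F-measure from sensitivity and positive predictive value (PPV). The sketch below assumes that definition; the column headers were lost in this copy, so which column is sensitivity and which is PPV is not asserted here (the formula is symmetric in the two). The function name `f_measure` is illustrative.

```python
def f_measure(sensitivity, ppv):
    """Harmonic mean of sensitivity and PPV, both given in percent."""
    if sensitivity + ppv == 0:
        return 0.0  # convention for the degenerate case with no correct pairs
    return 2.0 * sensitivity * ppv / (sensitivity + ppv)


# First triple of the Pre-Q1 riboswitch row: 62.5 and 100 give 76.9.
print(round(f_measure(62.5, 100.0), 1))
```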
Comparison of bootstrap 95% percentile confidence interval of average F-measure between the minimum energy structures and the maximum accuracy structures of the HK-PK and the HK-PK-free data sets
| Iter. HFold - hotspots PKed | (72.83, 83.37) | (78.56, 87.05) | Not significant |
| Iter. HFold - hotspots PK-free | (74.93, 80.26) | (87.70, 90.57) | Significant |
| Iter. HFold - 50 suboptimals PKed | (67.70, 79.57) | (80.41, 88.14) | Significant |
| Iter. HFold - 50 suboptimals PK-free | (76.27, 81.46) | (90.05, 93.00) | Significant |
| HotKnots PKed | (73.60, 83.35) | (84.50, 91.48) | Significant |
| HotKnots PK-free | (76.74, 81.95) | (88.32, 91.08) | Significant |
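The significance labels in the table above are consistent with a simple rule: a difference is labelled significant when the two 95% confidence intervals do not overlap. The sketch below assumes that rule, which is inferred from the table rows rather than stated in this excerpt; the helper names are illustrative.

```python
def intervals_overlap(a, b):
    """True if closed intervals a=(lo, hi) and b=(lo, hi) share any point."""
    return a[0] <= b[1] and b[0] <= a[1]


def significance_label(a, b):
    """Assumed rule: non-overlapping confidence intervals are 'Significant'."""
    return "Not significant" if intervals_overlap(a, b) else "Significant"


# Intervals copied from the table above (two of the six rows).
rows = {
    "Iter. HFold - hotspots PKed": ((72.83, 83.37), (78.56, 87.05)),
    "Iter. HFold - hotspots PK-free": ((74.93, 80.26), (87.70, 90.57)),
}
for name, (min_energy_ci, max_accuracy_ci) in rows.items():
    print(name, "->", significance_label(min_energy_ci, max_accuracy_ci))
```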