Agnieszka Rybarczyk, Natalia Szostak, Maciej Antczak, Tomasz Zok, Mariusz Popenda, Ryszard Adamiak, Jacek Blazewicz, Marta Szachniuk.
Abstract
BACKGROUND: The function of RNA is strongly dependent on its structure, so an appropriate recognition of this structure, on every level of organization, is of great importance. One particular concern is the assessment of base-base interactions, described as the secondary structure, the knowledge of which greatly facilitates an interpretation of RNA function and allows for structure analysis on the tertiary level. The RNA secondary structure can be predicted from a sequence using in silico methods often adjusted with experimental data, or assessed from 3D structure atom coordinates. Computational approaches typically consider only canonical, Watson-Crick and wobble base pairs. Handling of non-canonical interactions, important for a full description of RNA structure, is still very difficult.
Year: 2015 PMID: 26329823 PMCID: PMC4557229 DOI: 10.1186/s12859-015-0718-6
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1 Workflow in the RC/Rp pipeline
Quality of non-canonical base pair prediction for RNA STRAND-deposited structures (best values in bold)
| Measure | Method | Length 1–50 nts | Length 51–100 nts | Length 101–200 nts | Length 201–500 nts |
|---|---|---|---|---|---|
| Number of reference structures that include non-canonical base pairs | | 319 | 126 | 188 | 455 |
| Total number of non-canonical base pairs observed in reference structures | | 641 | 300 | 607 | 2252 |
| (a) Results for non-canonical base pairs predicted from sequence | | | | | |
| Number (and percentage) of correctly predicted non-canonical base pairs present in the reference structures | RNAwolf | 171 (26.68) | 38 (12.67) | 44 (7.25) | 149 (6.62) |
| | MC-Fold-DP | | 94 (31.33) | 157 (25.86) | 636 (28.24) |
| | MC-Fold | 363 (56.63) | 82 (27.33) | 167 (27.51) | n/a |
| | RC/Rp-1 | 369 (57.57) | | | |
| | RC/Rp-2 | 311 (48.52) | 79 (26.33) | 244 (40.20) | 618 (27.44) |
| | RC/Rp-3 | 312 (48.67) | 81 (27.00) | 225 (37.07) | 654 (29.04) |
| Total number of predicted non-canonical base pairs | RNAwolf | 893 | 501 | 1334 | 8616 |
| | MC-Fold-DP | 1099 | 1040 | 2891 | 20123 |
| | MC-Fold | 816 | 699 | 1825 | n/a |
| | RC/Rp-1 | 1493 | 1462 | 4453 | 26050 |
| | RC/Rp-2 | 1418 | 1235 | 4041 | 27282 |
| | RC/Rp-3 | 949 | 698 | 2968 | 14756 |
| (b) Results for non-canonical base pairs predicted from canonical secondary structure | | | | | |
| Number (and percentage) of correctly predicted non-canonical base pairs present in the reference structures | RNAwolf | 214 (33.39) | 67 (22.33) | 268 (44.15) | 772 (34.28) |
| | MC-Fold-DP | n/a | n/a | n/a | n/a |
| | MC-Fold | 334 (52.11) | 136 (45.33) | 279 (45.96) | n/a |
| | RC/Rp-1 | | | | |
| | RC/Rp-2 | 398 (62.09) | 131 (43.67) | 290 (47.78) | 974 (43.25) |
| | RC/Rp-3 | 408 (63.65) | 145 (48.33) | 261 (43.00) | 1051 (46.67) |
| Total number of predicted non-canonical base pairs | RNAwolf | 352 | 154 | 461 | 2183 |
| | MC-Fold-DP | n/a | n/a | n/a | n/a |
| | MC-Fold | 335 | 137 | 279 | n/a |
| | RC/Rp-1 | 1404 | 1470 | 4145 | 26479 |
| | RC/Rp-2 | 1273 | 1191 | 3978 | 26287 |
| | RC/Rp-3 | 969 | 672 | 2754 | 15011 |
Fig. 2 Percentage of reference non-canonical base pairs predicted from (a) sequence and (b) canonical secondary structure
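The percentages reported for correctly predicted non-canonical base pairs appear to be the number of correct predictions divided by the total number of non-canonical base pairs in the reference structures for the same length bin. The following is a minimal sketch of that arithmetic, using the RNAwolf sequence-based counts from the table; the function and variable names are illustrative and not part of the original work.

```python
def recovery_percentage(correct: int, reference_total: int) -> float:
    """Percentage of reference non-canonical base pairs predicted correctly."""
    if reference_total <= 0:
        raise ValueError("reference_total must be positive")
    return round(100.0 * correct / reference_total, 2)


# Counts copied from the table (RNAwolf, prediction from sequence).
reference_totals = {"1-50": 641, "51-100": 300, "101-200": 607, "201-500": 2252}
rnawolf_correct = {"1-50": 171, "51-100": 38, "101-200": 44, "201-500": 149}

rnawolf_percent = {
    length_bin: recovery_percentage(rnawolf_correct[length_bin], total)
    for length_bin, total in reference_totals.items()
}
print(rnawolf_percent)
# {'1-50': 26.68, '51-100': 12.67, '101-200': 7.25, '201-500': 6.62}
```

The computed values match the percentages given in parentheses in the RNAwolf row, which supports reading the denominator as the reference total rather than the number of predicted pairs.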
Average computing times (and standard deviation) for RNA STRAND-deposited structures (in seconds)
| Method | Length 1–50 nts | Length 51–100 nts | Length 101–200 nts | Length 201–500 nts |
|---|---|---|---|---|
| (a) Results for sequence-based prediction | | | | |
| RNAwolf | 9.51 (0.25) | 9.80 (0.26) | 10.61 (0.61) | 37.44 (19.15) |
| MC-Fold-DP | 1.62 (0.38) | 1.67 (0.48) | 1.87 (0.48) | 6.63 (2.03) |
| MC-Fold | 6.50 (5.82) | 142.26 (124.27) | 1376.01 (992.24) | n/a |
| RC/Rp-1 | 12.15 (2.78) | 20.94 (4.88) | 33.38 (4.74) | 92.81 (23.39) |
| RC/Rp-2 | 12.22 (2.82) | 21.20 (4.89) | 34.00 (4.81) | 97.27 (24.41) |
| RC/Rp-3 | 12.17 (2.81) | 20.99 (4.90) | 33.56 (4.73) | 93.40 (23.53) |
| (b) Results for sequence and canonical secondary structure-based prediction | | | | |
| RNAwolf | 5.71 (4.05) | 3.38 (3.93) | 5.71 (4.05) | 15.06 (58.73) |
| MC-Fold-DP | n/a | n/a | n/a | n/a |
| MC-Fold | 1.87 (2.83) | 35.72 (71.92) | 825.68 (1033.23) | n/a |
| RC/Rp-1 | 10.64 (3.30) | 17.47 (4.19) | 29.44 (4.19) | 83.97 (23.05) |
| RC/Rp-2 | 10.72 (3.35) | 17.73 (4.23) | 29.99 (4.21) | 86.58 (24.03) |
| RC/Rp-3 | 10.67 (3.34) | 17.53 (4.22) | 29.62 (4.21) | 84.44 (23.13) |
The accuracy of secondary structure models predicted from the sequence of K-turn–GNRA construct (best values in bold)
| Method | PPV | TPR | MCC |
|---|---|---|---|
| Variant I: Canonical and non-canonical base pairs | | | |
| RNAwolf | 0.67 | 0.44 | 0.54 |
| MC-Fold-DP | 0.85 | 0.61 | 0.72 |
| MC-Fold | 0.77 | 0.56 | 0.65 |
| RC/Rp-1 | | 0.67 | 0.82 |
| RC/Rp-2 | | | |
| RC/Rp-3 | | 0.67 | 0.82 |
| Variant II: Canonical base pairs only | | | |
| RNAwolf | 0.70 | 0.78 | 0.74 |
| MC-Fold-DP | 0.69 | | 0.83 |
| MC-Fold | 0.89 | 0.89 | 0.89 |
| RC/Rp-1 | | | |
| RC/Rp-2 | | | |
| RC/Rp-3 | | 0.89 | 0.94 |
| Variant III: Non-canonical base pairs only, regardless of classification | | | |
| RNAwolf | 0.50 | 0.11 | 0.24 |
| MC-Fold-DP | n/a | 0 | n/a |
| MC-Fold | | 0.11 | 0.33 |
| RC/Rp-1 | | 0.33 | 0.58 |
| RC/Rp-2 | | | |
| RC/Rp-3 | 0.75 | 0.33 | 0.50 |
| Variant IV: Non-canonical base pairs only, classification dependent | | | |
| RNAwolf | | 0.11 | 0.33 |
| MC-Fold-DP | n/a | n/a | n/a |
| MC-Fold | | 0.11 | 0.33 |
| RC/Rp-1 | 0.67 | 0.25 | 0.41 |
| RC/Rp-2 | 0.75 | | |
| RC/Rp-3 | 0.67 | 0.25 | 0.41 |
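For readers who want to reproduce scores of this kind, the sketch below computes PPV, TPR and MCC from two sets of base pairs. PPV = TP / (TP + FP) and TPR = TP / (TP + FN) are the standard definitions. Approximating MCC by the geometric mean of PPV and TPR is an assumption here, not a statement of the authors' exact procedure, although the fully reported rows are consistent with it (for RNAwolf in Variant I, sqrt(0.67 × 0.44) ≈ 0.54). The function name and the toy pair lists are hypothetical.

```python
from math import sqrt


def pair_set_scores(predicted, reference):
    """Return (PPV, TPR, MCC) for two collections of base pairs (i, j).

    MCC is approximated as sqrt(PPV * TPR); this is an assumption that is
    common in RNA structure evaluation. A score is None when undefined,
    e.g. PPV when nothing was predicted.
    """
    # Normalise pair orientation so that (3, 8) and (8, 3) compare equal.
    predicted = {tuple(sorted(pair)) for pair in predicted}
    reference = {tuple(sorted(pair)) for pair in reference}

    true_pos = len(predicted & reference)
    false_pos = len(predicted - reference)
    false_neg = len(reference - predicted)

    ppv = true_pos / (true_pos + false_pos) if predicted else None
    tpr = true_pos / (true_pos + false_neg) if reference else None
    mcc = sqrt(ppv * tpr) if ppv is not None and tpr is not None else None
    return ppv, tpr, mcc


# Toy data, not taken from the paper.
reference_pairs = [(1, 10), (2, 9), (3, 8), (4, 7)]
predicted_pairs = [(1, 10), (9, 2), (3, 7)]

ppv, tpr, mcc = pair_set_scores(predicted_pairs, reference_pairs)
print(f"PPV={ppv:.2f} TPR={tpr:.2f} MCC={mcc:.2f}")
```

With an empty prediction the sketch returns an undefined PPV and MCC alongside a TPR of 0, which mirrors the "n/a | 0 | n/a" pattern seen for MC-Fold-DP in Variant III.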
The accuracy of secondary structure models predicted from tyrosyl-tRNA sequence (best values in bold)
| Method | PPV | TPR | MCC |
|---|---|---|---|
| Variant I: Canonical and non-canonical base pairs | | | |
| RNAwolf | 0.71 | 0.44 | 0.56 |
| MC-Fold-DP | 0.41 | 0.33 | 0.37 |
| MC-Fold | 0.57 | 0.44 | 0.50 |
| RC/Rp-1 | 0.94 | 0.74 | 0.83 |
| RC/Rp-2 | | | |
| RC/Rp-3 | 0.94 | | 0.85 |
| Variant II: Canonical base pairs only | | | |
| RNAwolf | 0.80 | 0.76 | 0.78 |
| MC-Fold-DP | 0.28 | 0.43 | 0.35 |
| MC-Fold | 0.56 | 0.71 | 0.63 |
| RC/Rp-1 | 0.95 | | 0.98 |
| RC/Rp-2 | | | |
| RC/Rp-3 | | | |
| Variant III: Non-canonical base pairs only, regardless of classification | | | |
| RNAwolf | 0.25 | 0.06 | 0.12 |
| MC-Fold-DP | n/a | 0 | n/a |
| MC-Fold | 0.67 | 0.11 | 0.27 |
| RC/Rp-1 | 0.89 | 0.44 | 0.63 |
| RC/Rp-2 | | | |
| RC/Rp-3 | 0.82 | | 0.64 |
| Variant IV: Non-canonical base pairs only, classification dependent | | | |
| RNAwolf | 0.25 | 0.06 | 0.12 |
| MC-Fold-DP | n/a | 0 | n/a |
| MC-Fold | 0.33 | 0.06 | 0.14 |
| RC/Rp-1 | 0.78 | 0.39 | 0.55 |
| RC/Rp-2 | | | |
| RC/Rp-3 | 0.55 | 0.33 | 0.43 |
Fig. 3 (a) The reference secondary structure of the K-turn–GNRA construct with LW-annotated non-canonical base pairs, and its dot-bracket notation. Base pairs that are close to a particular LW interaction but do not meet the strict criteria for membership are connected by gray dashed lines. (b–e) Secondary structures predicted by (b) RC/Rp-1, (c) RC/Rp-2, (d) RC/Rp-3, and (e) RNAwolf, with arc diagrams comparing the dot-bracket representation of each predicted model with the reference structure
Fig. 4 (a) The reference secondary structure of archaeal tyrosyl-tRNA with LW-annotated non-canonical base pairs, and its dot-bracket notation. Base pairs that are close to a particular LW interaction but do not meet the strict criteria for membership are connected by gray dashed lines. (b–e) Secondary structures predicted by (b) RC/Rp-1, (c) RC/Rp-2, (d) RC/Rp-3, and (e) RNAwolf, with arc diagrams comparing the dot-bracket representation of each predicted model with the reference structure. Orange arcs show the pseudoknot interaction
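Comparing a predicted model with a reference in dot-bracket form first requires turning each string into a list of paired positions. The sketch below shows one way to do that. It assumes the widespread convention that additional bracket types such as `[` and `]` mark pseudoknotted pairs; the function name and the example string are hypothetical and are not taken from the figures.

```python
def dot_bracket_to_pairs(structure: str) -> list[tuple[int, int]]:
    """Convert a dot-bracket string into sorted 1-based base pairs (i, j).

    Each bracket type is matched on its own stack, so pairs written with
    different bracket types may cross (assumed pseudoknot convention).
    """
    openers = {"(": ")", "[": "]", "{": "}", "<": ">"}
    closers = {close: open_ for open_, close in openers.items()}
    stacks = {open_: [] for open_ in openers}
    pairs = []

    for position, symbol in enumerate(structure, start=1):
        if symbol in openers:
            stacks[symbol].append(position)
        elif symbol in closers:
            stack = stacks[closers[symbol]]
            if not stack:
                raise ValueError(f"unmatched '{symbol}' at position {position}")
            pairs.append((stack.pop(), position))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol '{symbol}' at position {position}")

    if any(stacks.values()):
        raise ValueError("unmatched opening bracket(s) in structure")
    return sorted(pairs)


# Toy structure: a two-pair helix crossed by a two-pair pseudoknot.
example = "((..[[..))..]]"
print(dot_bracket_to_pairs(example))
# [(1, 10), (2, 9), (5, 14), (6, 13)]
```

The resulting pair lists for a prediction and a reference can then be compared as sets, which is the kind of comparison the arc diagrams visualize.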