Zhang Kai, Wang Yuting, Lv Yulin, Liu Jun, He Juanjuan.
Abstract
BACKGROUND: RNA pseudoknot structures play an important role in biological processes. However, existing RNA secondary structure prediction algorithms cannot predict pseudoknot structures efficiently. Although random matching can increase the number of base pairs, such non-consecutive base pairs do not help reduce the free energy. RESULT: To improve the efficiency of the search procedure, our algorithm takes consecutive base pairs as its basic components. First, the algorithm computes all runs of consecutive base pairs and archives each run in a triplet data structure if its length is greater than the given minimum stem length. Second, an annealing schedule is adapted to select the optimal solution, that is, the one with minimum free energy. Finally, the proposed algorithm is evaluated on real instances from PseudoBase.
Keywords: Minimum free energy; Pseudoknot; RNA secondary structure; Simulated annealing algorithm
Year: 2019 PMID: 31881969 PMCID: PMC6933665 DOI: 10.1186/s12864-019-6300-2
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
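The first step described in the abstract, archiving runs of consecutive base pairs as triplets, can be illustrated with a minimal sketch. It is not the authors' implementation: the triplet layout `(i, j, k)`, the function name `find_stems`, the set of allowed pairs (Watson-Crick plus G-U wobble), and the default values of `min_stem` and `min_loop` are all assumptions made for illustration. The sketch keeps a run when it has at least `min_stem` pairs.

```python
from typing import List, Tuple

# Allowed pairs: Watson-Crick plus the G-U wobble pair.
# This set is an assumption; the abstract does not list the allowed pairs.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def find_stems(seq: str, min_stem: int = 3, min_loop: int = 3) -> List[Tuple[int, int, int]]:
    """Return maximal stems as hypothetical triplets (i, j, k).

    A triplet means that bases i+t and j-t are paired for t = 0 .. k-1
    (0-based indices). At least min_loop unpaired bases must remain
    between the innermost pair of a stem.
    """
    n = len(seq)
    stems = []
    for i in range(n):
        for j in range(i + min_loop + 1, n):
            if (seq[i], seq[j]) not in PAIRS:
                continue
            # Keep only maximal stems: skip a pair that merely continues
            # a stem starting one position further out.
            if i > 0 and j + 1 < n and (seq[i - 1], seq[j + 1]) in PAIRS:
                continue
            # Extend the stem inward while bases pair and the loop stays long enough.
            k = 0
            while (j - k) - (i + k) > min_loop and (seq[i + k], seq[j - k]) in PAIRS:
                k += 1
            if k >= min_stem:
                stems.append((i, j, k))
    return stems


print(find_stems("GGGAAAACCC"))  # [(0, 9, 3)]
```

Working with whole stems rather than single pairs keeps the candidate set small, which is the motivation the abstract gives for using consecutive base pairs as the basic components of the search.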
Fig. 1 RNA Secondary Structure and Substructures
Fig. 2 One of the mapping strings Y(X) for sequence X
Fig. 3 An arc representation of a pseudoknot structure
Fig. 4 Consecutive paired MinStem and unpaired MinLoop
Fig. 5 K consecutive base pairs set of TMEV molecules
Fig. 6 m1 base number exchange process
Fig. 7 New neighboring state generation process
Fig. 8 Remove base pairing conflicts
Fig. 9 Check the rationality of remaining mi
Fig. 10 m2 base number exchange
Fig. 11 Two different secondary structures of BCRV1
Evaluation results
| Structure | MG | PG | Group | TP | AP | F( |
|---|---|---|---|---|---|---|
| 1 | 2 | 5 | 33 | 6.6 | 862.49 | |
| 1 | 1 | 4 | 31 | 7.75 | 1861.94 |
Benchmark Instances from RNA PseudoBase
| ID | RNA Abbreviation | PKB Number | RNA Type | Length (nt.) | Known bps |
|---|---|---|---|---|---|
| 1 | Mengo_PKB | PKB295 | Viral 5' UTR | 24 | 7 |
| 2 | T4_gene32 | PKB74 | mRNA | 28 | 11 |
| 3 | HAV_PK1 | PKB297 | Viral 5' UTR | 33 | 12 |
| 4 | TEV_PK1 | PKB277 | Viral 5' UTR | 35 | 11 |
| 5 | IPCV1 | PKB35 | Viral tRNA-like | 40 | 8 |
| 6 | ScYLV | PKB281 | Viral Frameshift | 42 | 8 |
| 7 | Ec_PK3 | PKB51 | tmRNA | 46 | 14 |
| 8 | Ec_PK4 | PKB52 | tmRNA | 52 | 19 |
| 9 | BEV | PKB128 | Viral Frameshift | 59 | 16 |
| 10 | BaEV | PKB98 | Viral Readthrough | 62 | 15 |
| 11 | VMV | PKB280 | Viral Frameshift | 68 | 14 |
| 12 | ALFV | PKB350 | Viral Frameshift | 77 | 17 |
| 13 | MVEV | PKB349 | Viral Frameshift | 80 | 18 |
| 14 | SARS-CoV | PKB254 | Viral Frameshift | 82 | 26 |
| 15 | FCiLV3 | PKB395 | Viral tRNA-like | 109 | 37 |
| 16 | BBMV3 | PKB135 | Viral tRNA-like | 116 | 39 |
| 17 | CVV3 | PKB389 | Viral tRNA-like | 129 | 37 |
| 18 | CCMV3 | PKB136 | Viral tRNA-like | 134 | 45 |
State-of-the-art RNA structure prediction algorithms
| ID | Method | Website link |
|---|---|---|
| 1 | RNAstructure | |
| 2 | CyloFold | |
| 3 | IPknot | |
| 4 | RNAfold | |
| 5 | CombFold | |
| 6 | HotKnots | |
| 7 | TT2NE | |
Sensitivity Comparison Results (%)
| ID | #BP | 1 | 2 | 3 | 4 | 5 | 6 | 7 | PRSA |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 7 | 28.6 | 100.0 | 42.9 | 42.9 | 42.9 | 42.9 | # | |
| 2 | 11 | 63.6 | 63.6 | 63.6 | 63.6 | 81.8 | |||
| 3 | 12 | 58.3 | 58.3 | 58.3 | 58.3 | 91.7 | |||
| 4 | 11 | 45.5 | 45.5 | 18.2 | 45.5 | 45.5 | 45.5 | # | |
| 5 | 8 | 62.5 | 62.5 | 62.5 | 62.5 | 62.5 | 62.5 | 87.5 | |
| 6 | 8 | 62.5 | 87.5 | 62.5 | 62.5 | # | |||
| 7 | 14 | 50.0 | 85.7 | 71.4 | 64.3 | 64.3 | 64.3 | 92.9 | |
| 8 | 19 | 57.9 | 42.1 | 68.4 | 68.4 | 68.4 | 68.4 | 63.2 | |
| 9 | 16 | 68.8 | 93.8 | 81.3 | 68.8 | 68.8 | 68.8 | 87.5 | |
| 10 | 15 | 0.0 | 86.7 | 0.0 | 0.0 | 0.0 | 40.0 | 100.0 | 93.3 |
| 11 | 14 | 50.0 | 50.0 | 50.0 | 50.0 | 92.9 | |||
| 12 | 17 | 64.7 | 64.7 | 64.7 | 64.7 | 100.0 | |||
| 13 | 18 | 61.1 | 61.1 | 61.1 | 61.1 | 100.0 | |||
| 14 | 26 | 65.4 | 69.2 | 69.2 | 69.2 | 69.2 | 73.1 | 51.7 | |
| 15 | 37 | 81.1 | 97.3 | 67.6 | 81.1 | 67.6 | # | 91.9 | |
| 16 | 39 | 79.5 | 84.6 | 69.2 | 82.1 | 64.1 | # | 71.8 | 82.1 |
| 17 | 37 | 89.2 | 81.1 | 89.2 | 89.2 | 89.2 | # | 73.0 | 73.0 |
| 18 | 45 | 80.0 | 66.7 | 68.9 | # | 71.1 | 73.3 | ||
| Average | 59.4 | 84.1 | 61.6 | 62.1 | 59.5 | 78.8 | 86.7 | ||
The best Sensitivity values for each algorithm are shown in boldface
Specificity Comparison Results (%)
| ID | #BP | 1 | 2 | 3 | 4 | 5 | 6 | 7 | PRSA |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 7 | 50.0 | 60.0 | 60.0 | 60.0 | 60.0 | # | ||
| 2 | 11 | 87.5 | 87.5 | ||||||
| 3 | 12 | 85.7 | 85.7 | 91.7 | 85.7 | ||||
| 4 | 11 | 62.5 | 28.6 | 62.5 | 62.5 | 62.5 | # | ||
| 5 | 8 | 55.6 | 55.6 | 55.6 | 55.6 | 55.6 | 80.0 | 55.6 | |
| 6 | 8 | 71.4 | 77.8 | 62.5 | 71.4 | 72.7 | # | ||
| 7 | 14 | 87.5 | 76.9 | 90.0 | 90.0 | 90.0 | 92.9 | ||
| 8 | 19 | 66.7 | |||||||
| 9 | 16 | 68.8 | 81.3 | 64.7 | 64.7 | 64.7 | 66.7 | 76.2 | |
| 10 | 15 | 0.0 | 0.0 | 0.0 | 0.0 | 31.6 | 65.2 | 70.0 | |
| 11 | 14 | 43.8 | 38.9 | 41.2 | 41.2 | 70.0 | 65.0 | 70.0 | |
| 12 | 17 | 47.8 | 45.8 | 45.8 | 44.0 | 70.8 | 70.8 | 70.8 | |
| 13 | 18 | 50.0 | 72.0 | 44.0 | 47.8 | 47.8 | 72.0 | 72.0 | |
| 14 | 26 | 89.5 | 72.0 | 78.3 | 85.7 | 78.3 | 73.1 | 46.9 | |
| 15 | 37 | 85.7 | 94.7 | 73.5 | 90.9 | 54.5 | # | 82.9 | |
| 16 | 39 | 81.6 | 86.8 | 75.0 | 82.1 | 73.5 | # | 73.7 | 82.1 |
| 17 | 37 | 82.5 | 88.2 | 100.0 | 86.8 | 89.2 | # | 61.4 | 81.8 |
| 18 | 45 | 83.7 | 66.7 | 86.4 | 75.6 | # | 71.1 | 76.7 | |
| Average | 69.3 | 83.7 | 68.0 | 70.1 | 66.4 | 73.8 | 73.2 | ||
The best Specificity values for each algorithm are shown in boldface
F-measure Comparison Results (%)
| ID | #BP | 1 | 2 | 3 | 4 | 5 | 6 | 7 | PRSA |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 7 | 36.4 | 50.0 | 50.0 | 50.0 | 50.0 | # | ||
| 2 | 11 | 73.7 | 77.8 | 77.8 | 73.7 | 90.0 | |||
| 3 | 12 | 73.7 | 73.7 | 73.7 | 73.7 | 91.7 | |||
| 4 | 11 | 52.6 | 62.5 | 22.2 | 52.6 | 52.6 | 52.6 | # | |
| 5 | 8 | 58.8 | 58.8 | 58.8 | 58.8 | 58.8 | 88.9 | 58.8 | |
| 6 | 8 | 66.7 | 82.4 | 62.5 | 66.7 | 84.2 | # | ||
| 7 | 14 | 63.6 | 92.3 | 74.1 | 75.0 | 75.0 | 75.0 | 92.9 | |
| 8 | 19 | 73.3 | 51.6 | 81.3 | 81.3 | 81.3 | 81.3 | 77.4 | |
| 9 | 16 | 68.8 | 81.3 | 66.7 | 66.7 | 66.7 | 75.7 | 86.5 | |
| 10 | 15 | # | # | # | # | 35.3 | 78.9 | 80.0 | |
| 11 | 14 | 46.7 | 43.8 | 45.2 | 45.2 | 82.4 | 76.5 | 82.4 | |
| 12 | 17 | 55.0 | 53.7 | 53.7 | 52.4 | 82.9 | 82.9 | 82.9 | |
| 13 | 18 | 55.0 | 83.7 | 51.2 | 53.7 | 53.7 | 83.7 | 83.7 | |
| 14 | 26 | 75.6 | 70.6 | 73.5 | 76.6 | 73.5 | 73.1 | 51.7 | |
| 15 | 37 | 83.3 | 70.4 | 85.7 | 70.4 | # | 87.2 | ||
| 16 | 39 | 80.5 | 72.0 | 82.1 | 68.5 | # | 72.7 | 82.1 | |
| 17 | 37 | 85.7 | 84.5 | 88.0 | 89.2 | # | 66.7 | 77.1 | |
| 18 | 45 | 81.8 | 66.7 | 85.4 | 72.1 | # | 71.1 | 75.0 | |
| Average | 66.5 | 82.7 | 67.1 | 68.8 | 66.0 | 74.9 | 79.1 | ||
The best F-measure values for each algorithm are shown in boldface
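The three comparison tables are linked by simple arithmetic, which the sketch below makes explicit. It assumes the definitions commonly used in RNA structure prediction: sensitivity is the share of known base pairs that were predicted, specificity is the share of predicted base pairs that are correct, and the F-measure is their harmonic mean. The function name `pair_metrics` and the example counts (5 correct pairs out of 8 known and 9 predicted) are assumptions; the counts were chosen because they reproduce the 62.5 / 55.6 / 58.8 values that appear for instance 5 in the three tables.

```python
from typing import Optional, Tuple


def pair_metrics(tp: int, known: int, predicted: int) -> Tuple[float, float, Optional[float]]:
    """Return (sensitivity, specificity, F-measure) in percent.

    tp        -- correctly predicted base pairs
    known     -- base pairs in the reference structure (the #BP column)
    predicted -- base pairs in the predicted structure
    The F-measure is None when sensitivity and specificity are both zero,
    because the harmonic mean is undefined in that case.
    """
    sensitivity = 100.0 * tp / known if known else 0.0
    specificity = 100.0 * tp / predicted if predicted else 0.0
    total = sensitivity + specificity
    f_measure = 2.0 * sensitivity * specificity / total if total else None
    return sensitivity, specificity, f_measure


# Assumed counts for an 8-pair reference structure (instance 5).
sens, spec, f1 = pair_metrics(tp=5, known=8, predicted=9)
print(round(sens, 1), round(spec, 1), round(f1, 1))  # 62.5 55.6 58.8
```

Because the F-measure balances the two rates, a method that predicts many extra pairs can score high on sensitivity and still rank low overall.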
Fig. 12 Comparison of the secondary structures predicted by the PRSA and CyloFold algorithms