Sayed Mohammad Ebrahim Sahraeian, Byung-Jun Yoon.
Abstract
BACKGROUND: Accurate and efficient structural alignment of non-coding RNAs (ncRNAs) has attracted increasing attention as recent studies have revealed the significance of ncRNAs in living organisms. Because Sankoff-style structural alignment algorithms do not scale efficiently to multiple sequences, progressive schemes are commonly used to reduce the complexity. However, progressive alignment tends to propagate early-stage errors throughout the entire process, thereby degrading the quality of the final alignment. For multiple protein sequence alignment, we recently proposed PicXAA, which constructs an accurate alignment in a non-progressive fashion.
Year: 2011 PMID: 21342569 PMCID: PMC3044294 DOI: 10.1186/1471-2105-12-S1-S38
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Graph construction process. (A) Step 1 (structural skeleton construction): adding a new base-pair (xi, xj) and aligning it with its best match (yi', yj'). (B-D) Step 2 (inserting highly probable local alignments): (B) Adding a new column (node) c*. (C) Extending an existing column (node) c. (D) Merging two columns (nodes) c1 and c2 into a single column (node) c*.
Figure 2. An illustrative example of the graph construction process in PicXAA-R. (A) The set of RNA sequences to be aligned. (B) The base-pairs are sorted according to their base-pairing probabilities. (C) The base alignments are sorted according to their transformed alignment probabilities. (D-K) Step 1 (structural skeleton construction): (D, E) Adding a new base-pair (x2, x5) and aligning it with its best match (y2, y4). (F, G) Adding a new base-pair (y1, y5) and aligning it with its best match (v1, v6). (H, I) Extending nodes c3 and c4 by adding the base-pair (z1, z5) to its best match (y1, y5). (J, K) Adding a new base-pair (z2, z4) and aligning it with its best match (v2, v5). (L-R) Step 2 (inserting highly probable local alignments): (L) Extending node c3 by adding the base alignment (x1, y1). (M) Merging node c1 with c5 to include the base alignment (y2, z2), and merging node c2 with c6 to include the base alignment (x5, z4). (N) Adding a new node for the alignment (x3, y3). (O) Adding a new node for the alignment (z3, v3). (P) Merging nodes c9 and c10 to include the alignment (x3, v3). (Q) Adding a new node for the alignment (x4, v4). (R) Extending node c3 by adding the base alignment (x6, z6). (S) The final alignment graph, which gives the set of columns (nodes) in a legitimate topological ordering. (T) The alignment obtained from the final graph.
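The three node operations named in the captions above (adding a column, extending a column, merging two columns) can be pictured with a small sketch. This is an illustration under assumptions, not the PicXAA-R implementation: the class name `AlignmentGraph`, the `(sequence, position)` residue encoding, and the rule of rejecting any pair that would make the columns impossible to order are all hypothetical choices made for the example.

```python
from graphlib import CycleError, TopologicalSorter


class AlignmentGraph:
    """Hypothetical sketch: columns are sets of (sequence, position) residues."""

    def __init__(self):
        self.columns = {}    # column id -> set of (seq, pos)
        self.column_of = {}  # (seq, pos) -> column id
        self._next_id = 0

    @staticmethod
    def _order(columns, column_of):
        """Return column ids in a topological order, or raise CycleError."""
        preds = {cid: set() for cid in columns}
        by_seq = {}
        for (seq, pos), cid in column_of.items():
            by_seq.setdefault(seq, []).append((pos, cid))
        for placed in by_seq.values():
            placed.sort()
            # Within one sequence, an earlier residue's column must come first.
            for (_, left), (_, right) in zip(placed, placed[1:]):
                preds[right].add(left)
        return list(TopologicalSorter(preds).static_order())

    def add_pair(self, r1, r2):
        """Try to align residues r1 and r2; return True if the pair is accepted."""
        columns = {cid: set(members) for cid, members in self.columns.items()}
        column_of = dict(self.column_of)
        c1, c2 = column_of.get(r1), column_of.get(r2)
        if c1 is not None and c1 == c2:
            return True                      # already in the same column
        if c1 is None and c2 is None:        # add a new column
            target, incoming = self._next_id, {r1, r2}
            columns[target] = set()
        elif c1 is None or c2 is None:       # extend an existing column
            target, incoming = (c1 if c1 is not None else c2), {r1, r2}
        else:                                # merge column c2 into column c1
            target, incoming = c1, columns.pop(c2)
        merged = columns[target] | incoming
        seqs = [seq for seq, _ in merged]
        if len(seqs) != len(set(seqs)):
            return False                     # two residues of one sequence in a column
        columns[target] = merged
        for residue in merged:
            column_of[residue] = target
        try:
            self._order(columns, column_of)  # reject if the columns cannot be ordered
        except CycleError:
            return False
        self.columns, self.column_of = columns, column_of
        self._next_id += 1
        return True

    def ordered_columns(self):
        """Columns as sorted residue lists, in a valid left-to-right order."""
        order = self._order(self.columns, self.column_of)
        return [sorted(self.columns[cid]) for cid in order]
```

Feeding pairs in decreasing order of probability and reading `ordered_columns()` at the end mirrors the greedy flow the captions describe; the copy-then-commit style keeps a rejected pair from leaving the graph half-updated, at the cost of efficiency.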
Performance evaluation on BRAliBase 2.1
| Method | k2 (SPS / SCI) | k3 (SPS / SCI) | k5 (SPS / SCI) | k7 (SPS / SCI) | k10 (SPS / SCI) | k15 (SPS / SCI) | TIME |
|---|---|---|---|---|---|---|---|
| PicXAA-R | 84.27 / 85.86 | 86.59 / 83.35 | 88.78 / 83.20 | 90.04 / 81.72 | 90.97 / 79.95 | 92.17 / 79.73 | 6502 |
| ProbConsRNA | 83.58 / 82.46 | 85.46 / 76.54 | 87.90 / 75.85 | 88.99 / 74.91 | 89.90 / 73.25 | 90.76 / 71.92 | 1444 |
| MXSCARNA | 85.02 / 90.67 | 86.57 / 85.56 | 88.43 / 83.44 | 89.40 / 80.89 | 90.17 / 78.34 | 91.26 / 77.18 | 6024 |
| CentroidAlign | 85.55 / 88.64 | 87.06 / 83.77 | 88.93 / 82.40 | 89.99 / 81.23 | 90.96 / 80.22 | 91.65 / 79.34 | 6443 |
| MAFFT-xinsi | 85.66 / 90.77 | 87.76 / 87.11 | 90.27 / 86.70 | 91.36 / 85.70 | 92.26 / 84.73 | 93.22 / 85.38 | 12386 |
Figure 3. Accuracy of alignment as a function of the average percent identity. SPS and SCI scores are plotted against the average percent identity of the alignments in the k5, k7, k10, and k15 reference sets of BRAliBase 2.1.
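For readers unfamiliar with the SPS column, the sketch below computes a sum-of-pairs score in its commonly used form: the fraction of residue pairs aligned in the reference alignment that are also aligned in the test alignment. The record does not state the exact scoring procedure, so this definition, the function names, and the toy sequences are assumptions for illustration only. SCI is not sketched because it depends on external folding software.

```python
from itertools import combinations


def aligned_pairs(alignment):
    """Residue pairs sharing a column; alignment is a list of equal-length gapped strings."""
    pairs = set()
    counters = [0] * len(alignment)  # next ungapped index for each sequence
    for column in zip(*alignment):
        residues = []
        for seq_index, symbol in enumerate(column):
            if symbol != "-":
                residues.append((seq_index, counters[seq_index]))
                counters[seq_index] += 1
        pairs.update(combinations(residues, 2))
    return pairs


def sum_of_pairs_score(predicted, reference):
    """Fraction of reference residue pairs that the predicted alignment recovers."""
    ref_pairs = aligned_pairs(reference)
    if not ref_pairs:
        return 0.0
    return len(ref_pairs & aligned_pairs(predicted)) / len(ref_pairs)


reference = ["AC-GU", "ACAGU"]
predicted = ["ACG-U", "ACAGU"]
print(sum_of_pairs_score(predicted, reference))  # 0.75
```

Multiplying the result by 100 gives a percentage on the same scale as the table entries.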
Performance evaluation on BraliSub
| Method | k5 (SPS / SCI) | k7 (SPS / SCI) | k10 (SPS / SCI) | k15 (SPS / SCI) | TIME |
|---|---|---|---|---|---|
| PicXAA-R | 73.90 / 51.39 | 75.06 / 42.37 | 74.02 / 35.75 | 75.43 / 31.29 | 101 |
| ProbConsRNA | 70.59 / 34.94 | 70.18 / 28.45 | 68.73 / 24.03 | 66.53 / 18.29 | 35 |
| MXSCARNA | 70.77 / 46.30 | 69.93 / 35.95 | 68.58 / 27.91 | 69.75 / 17.79 | 84 |
| CentroidAlign | 74.23 / 47.26 | 74.39 / 39.13 | 74.51 / 35.59 | 72.92 / 29.14 | 106 |
| MAFFT-xinsi | 78.28 / 57.60 | 78.56 / 52.10 | 78.48 / 44.75 | 79.23 / 38.79 | 261 |
Performance evaluation on LocExtR
| Method | k20 (SPS / SCI) | k40 (SPS / SCI) | k60 (SPS / SCI) | k80 (SPS / SCI) | TIME |
|---|---|---|---|---|---|
| PicXAA-R | 71.46 / 17.43 | 77.52 / 16.08 | 80.19 / 11.00 | 82.51 / 10.73 | 999 |
| ProbConsRNA | 64.97 / 10.13 | 69.08 / 8.12 | 72.11 / 5.80 | 74.46 / 6.87 | 676 |
| MXSCARNA | 65.52 / 9.67 | 68.30 / 8.44 | 69.45 / 9.15 | 71.16 / 8.93 | 662 |
| CentroidAlign | 71.68 / 18.63 | 74.48 / 15.56 | 77.55 / 11.90 | 79.32 / 10.07 | 1359 |
| MAFFT-xinsi | 77.02 / 26.30 | 80.48 / 20.84 | 81.96 / 16.70 | 83.52 / 14.00 | 3791 |
Performance evaluation on Murlet dataset
| Method | SPS | SCI | SEN | PPV | MCC | TIME |
|---|---|---|---|---|---|---|
| PicXAA-R | 77.90 | 48.15 | 66.08 | 72.71 | 68.29 | 139 |
| ProbConsRNA | 76.26 | 37.47 | 56.79 | 78.12 | 65.10 | 40 |
| MXSCARNA | 74.67 | 44.28 | 64.06 | 74.58 | 68.37 | 120 |
| CentroidAlign | 77.99 | 47.80 | 63.08 | 74.88 | 67.48 | 146 |
| MAFFT-xinsi | 78.72 | 52.94 | 67.04 | 74.56 | 69.64 | 307 |
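The SEN, PPV, and MCC columns are structure-prediction measures. The record does not say how they were computed, so the sketch below uses the standard textbook definitions over sets of base pairs, with the number of possible pairs taken as n(n-1)/2 for a sequence of length n. That choice, the function name `structure_scores`, and the toy base pairs are assumptions.

```python
from math import sqrt


def structure_scores(predicted, reference, length):
    """Return (SEN, PPV, MCC) for predicted vs. reference base pairs (i, j) with i < j."""
    predicted, reference = set(predicted), set(reference)
    tp = len(predicted & reference)   # correctly predicted pairs
    fp = len(predicted - reference)   # predicted pairs absent from the reference
    fn = len(reference - predicted)   # reference pairs that were missed
    tn = length * (length - 1) // 2 - tp - fp - fn  # assumed universe of pairs
    sen = tp / (tp + fn) if tp + fn else 0.0
    ppv = tp / (tp + fp) if tp + fp else 0.0
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sen, ppv, mcc


reference_pairs = {(0, 9), (1, 8), (2, 7)}
predicted_pairs = {(0, 9), (1, 8), (3, 6)}
sen, ppv, mcc = structure_scores(predicted_pairs, reference_pairs, length=10)
```

In this toy case two of the three reference pairs are recovered, so SEN and PPV are both 2/3; the table values are percentages, which may reflect averaging over alignments that this sketch does not attempt.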
Figure 4. Complexity analysis. Dependence of the running time of the different algorithms on the number of sequences in the alignment. Average running times are shown for sequences in the BraliSub and LocExtR datasets.