Qi Song, Tonghua Li, Peisheng Cong, Jiangming Sun, Dapeng Li, Shengnan Tang.
Abstract
MOTIVATION: Turns are a critical element of protein structure and play a crucial role in loops, folds, and interactions. Current methods are well developed for predicting individual turn types, such as α-turns, β-turns, and γ-turns. However, further protein structure and function prediction requires a uniform model that can accurately predict all types of turns simultaneously.
Year: 2012 PMID: 23144872 PMCID: PMC3492357 DOI: 10.1371/journal.pone.0048389
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Statistics of the datasets and database.
| Dataset | Sequence count | Residue count | Turn count | Turn percentage |
|---|---|---|---|---|
| | 4,107 | 929,035 | 228,062 | 24.5% |
| | 736 | 176,674 | 42,740 | 24.2% |
| | 48,428 | 11,293,117 | 2,759,778 | 24.4% |
| | 77 | 7,718 | 1,973 | 25.6% |
| | 248 | 53,144 | 12,336 | 23.2% |
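The turn percentages in the table can be checked directly from the counts. The following is a minimal sketch, assuming that the percentage is the number of turn residues divided by the total number of residues; the function and variable names are illustrative and do not come from the paper.

```python
# (residue count, turn count, reported turn percentage), one tuple per table row.
rows = [
    (929_035, 228_062, 24.5),
    (176_674, 42_740, 24.2),
    (11_293_117, 2_759_778, 24.4),
    (7_718, 1_973, 25.6),
    (53_144, 12_336, 23.2),
]


def turn_percentage(residues: int, turns: int) -> float:
    """Share of residues labelled as turn, in percent, rounded to one decimal."""
    return round(100.0 * turns / residues, 1)


# Recompute each percentage and print it next to the reported value.
computed = [turn_percentage(residues, turns) for residues, turns, _ in rows]
for (residues, turns, reported), value in zip(rows, computed):
    print(f"{turns:>9,} / {residues:>10,} = {value}% (reported {reported}%)")
```

Under this assumption, every recomputed value agrees with the reported percentage, which suggests that the turn count is expressed in residues.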
Figure 1. The flowchart of TurnP.
Figure 4. An example of a TurnP prediction, 1B9W_A. In Figure 4(a), the red curve represents residues that have been marked as turn, while the grey curve represents random coil. The marine spiral represents helix, and the green arrow represents sheet. In Figure 4(b), all turns in 1B9W_A are shown as blocks to make them easier to see. The line represents the sequence; the red blocks represent turn residues that were predicted correctly, while the grey blocks represent turn residues that TurnP did not predict. Position numbers are marked every 10 residues for convenience, and the positions of the relevant turns are labeled in (a). The illustration of the 3D structure was drawn with PyMOL [34].
Figure 2. Diagram of the profiles.
We used three kinds of profiles in this study: PSSM, SPSSM, and Shape String profile. They have 20, 3, and 8 elements for each amino acid, respectively, and are obtained by sequence alignment and sequence-structure alignment. Each square represents one element, which is a normalized frequency. Red squares represent large values near '1' and blue squares represent small values near '0'; the deeper the color of a square, the closer its value is to the extreme.
5-fold cross validation results of Train_0925.
| | Ac (%) | Qpred (%) | Sn (%) | Sp (%) | MCC |
|---|---|---|---|---|---|
| | 87.2 | 76.7 | 67.7 | 93.4 | 0.64 |
| | 88.8 | 79.9 | 71.8 | 94.2 | 0.69 |
| | 90.3 | 85.2 | 72.4 | 96.0 | 0.72 |
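The column headings are not defined in this excerpt. The sketch below assumes the definitions commonly used for binary per-residue classifiers: Ac is overall accuracy, Qpred is the fraction of predicted turn residues that are correct, Sn is sensitivity, Sp is specificity, and MCC is the Matthews correlation coefficient. The confusion-matrix counts are invented for illustration and are not taken from the paper.

```python
import math
from typing import Dict


def turn_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Compute the five scores from a 2x2 confusion matrix (turn = positive class)."""
    total = tp + fp + fn + tn
    # Denominator of MCC: product of the four marginal totals, square-rooted.
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "Ac": (tp + tn) / total,      # overall accuracy
        "Qpred": tp / (tp + fp),      # correct share of predicted turns
        "Sn": tp / (tp + fn),         # share of observed turns that were found
        "Sp": tn / (tn + fp),         # share of non-turns correctly rejected
        "MCC": (tp * tn - fp * fn) / mcc_den if mcc_den else 0.0,
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
scores = turn_metrics(tp=60, fp=20, fn=40, tn=280)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

Because turn residues make up roughly a quarter of each dataset, accuracy alone can look high even when sensitivity is modest, which is one reason MCC is usually reported alongside it.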
Validation results with Test_1025 as test set.
| | Ac (%) | Qpred (%) | Sn (%) | Sp (%) | MCC |
|---|---|---|---|---|---|
| | 82.3 | 64.7 | 57.6 | 90.1 | 0.50 |
| | 82.6 | 64.4 | 60.8 | 89.5 | 0.51 |
The first row shows the validation result using PSSM, Predicted Secondary Structure, and Predicted Shape String as features; the second row shows the validation result after adding two further sources of structural evolution information: SPSSM and Shape String Profile.
The prediction performance of our method on EVAset1 and CASP9.
| | Ac (%) | Qpred (%) | Sn (%) | Sp (%) | MCC |
|---|---|---|---|---|---|
| | 79.6 | 62.5 | 49.0 | 90.0 | 0.43 |
| | 73.1 | 48.0 | 75.7 | 72.2 | 0.43 |
| | 79.3 | 55.5 | 54.1 | 86.9 | 0.41 |
| | 73.6 | 45.9 | 77.2 | 72.5 | 0.43 |
'a' represents the original prediction results. 'b' represents the prediction results with a decision threshold of 0.2.
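The effect of changing the decision threshold can be illustrated with a short sketch. It assumes that the predictor emits a per-residue score between 0 and 1 and that the original results use a threshold of 0.5; neither detail is stated in this excerpt, and the scores below are hypothetical.

```python
from typing import List


def call_turns(scores: List[float], threshold: float) -> List[bool]:
    """Label a residue as turn when its score reaches the threshold."""
    return [score >= threshold for score in scores]


# Hypothetical per-residue turn scores for a six-residue stretch.
scores = [0.05, 0.15, 0.25, 0.45, 0.55, 0.90]

default_calls = call_turns(scores, threshold=0.5)  # assumed original setting
relaxed_calls = call_turns(scores, threshold=0.2)  # threshold named in the note

print("threshold 0.5:", sum(default_calls), "turn residues")
print("threshold 0.2:", sum(relaxed_calls), "turn residues")
```

Lowering the threshold can only add positive calls, so sensitivity cannot fall and specificity cannot rise; how the two trade off on real data depends on the score distribution.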
Figure 3. Comparison chart between the Train_0925 5-fold validation and the evaluation results on Test_1025, EVAset1, and CASP9.
Performance comparison of the present method and other methods.
| | Ac (%) | Qpred (%) | Sn (%) | MCC |
|---|---|---|---|---|
| | 88.8 | 79.9 | 71.8 | 0.69 |
| SOM of Meissn | 76.0 | 81.1 | 67.8 | 0.53 |
| | 61.9 | 45.0 | 68.1 | 0.25 |