Darby Tien-Hao Chang, Yu-Yen Ou, Hao-Geng Hung, Meng-Han Yang, Chien-Yu Chen, Yen-Jen Oyang.
Abstract
BACKGROUND: Prediction of protein secondary structure has been an active research topic in bioinformatics for many years, and many approaches have been proposed. A new challenge has nevertheless emerged as contemporary protein structure databases continue to grow rapidly: how to effectively exploit all the information implicitly deposited in these databases and deliver ever-improving prediction accuracy as they expand.
Year: 2008 PMID: 18710504 PMCID: PMC2527571 DOI: 10.1186/1756-0500-1-51
Source DB: PubMed Journal: BMC Res Notes ISSN: 1756-0500
Prediction accuracies delivered by alternative predictors with the 27 protein chains longer than 100 residues extracted from the EVA server.
| Predictor | Q3 | Q3H_O | Q3H_P | Q3E_O | Q3E_P | Q3C_O | Q3C_P | SOV | SOVH | SOVE | SOVC |
| Prote2S | 80.3% | 76.4% | 78.3% | 60.5% | 75.8% | 84.1% | 76.3% | 76.9% | 77.7% | 64.9% | 75.2% |
| Errsig | 2.0% | 3.8% | 3.4% | 9.3% | 7.8% | 2.0% | 2.4% | 2.2% | 3.2% | 9.4% | 2.4% |
| PSIPRED | 78.2% | 78.0% | 76.4% | 60.6% | 67.3% | 77.0% | 75.3% | 75.0% | 76.2% | 62.7% | 72.0% |
| Errsig | 1.2% | 4.1% | 3.8% | 9.0% | 9.4% | 1.8% | 1.9% | 1.4% | 3.7% | 9.0% | 1.8% |
| PROFsec | 77.9% | 71.6% | 81.6% | 61.0% | 63.4% | 80.2% | 72.7% | 76.1% | 75.4% | 64.1% | 73.0% |
| Errsig | 1.2% | 3.7% | 3.8% | 9.2% | 9.2% | 2.0% | 1.6% | 1.4% | 3.8% | 9.2% | 1.9% |
| PHDpsi | 75.2% | 76.4% | 77.3% | 55.5% | 61.9% | 74.1% | 72.5% | 72.5% | 75.6% | 56.3% | 70.1% |
| Errsig | 1.3% | 3.5% | 3.7% | 8.8% | 9.3% | 2.6% | 2.1% | 1.7% | 3.4% | 8.9% | 2.4% |
| SABLE2 | 77.0% | 74.0% | 79.3% | 55.2% | 75.0% | 80.2% | 71.4% | 72.6% | 74.5% | 59.9% | 70.1% |
| Errsig | 1.3% | 3.5% | 3.1% | 8.9% | 4.8% | 2.4% | 1.7% | 2.0% | 3.1% | 9.1% | 2.6% |
| PROF_king | 70.7% | 56.6% | 72.7% | 55.8% | 57.8% | 77.6% | 67.1% | 67.5% | 60.9% | 58.6% | 68.2% |
| Errsig | 1.5% | 4.6% | 7.8% | 9.1% | 7.2% | 1.8% | 2.1% | 1.6% | 4.6% | 9.1% | 2.2% |
Errsig is the significant difference margin for each score, defined as the standard deviation divided by the square root of the number of proteins. Q3H/E/C and SOVH/E/C are the Q3 and SOV scores specific to the helix, strand and coil regions, respectively. Q3H_O (Q3E_O and Q3C_O, respectively) is the number of correctly predicted helix (strand and coil, respectively) residues as a percentage of the observed helix (strand and coil) residues, and Q3H_P (Q3E_P and Q3C_P, respectively) is the same count as a percentage of the predicted helix (strand and coil) residues.
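The definitions in this note can be illustrated with a minimal Python sketch. It is not the evaluation code used by the authors or by the EVA server. The function names `errsig` and `per_state_q3` are hypothetical, the use of the sample standard deviation (n - 1 denominator) is an assumption because the note does not say which estimator is meant, and the example strings are made up for illustration.

```python
import math
from statistics import stdev


def errsig(per_protein_scores):
    """Standard deviation of per-protein scores over sqrt(number of proteins).

    Assumption: the sample standard deviation (n - 1 denominator) is used;
    the table note does not specify the estimator.
    """
    n = len(per_protein_scores)
    if n < 2:
        raise ValueError("at least two per-protein scores are required")
    return stdev(per_protein_scores) / math.sqrt(n)


def per_state_q3(observed, predicted, state):
    """Return (Q3x_O, Q3x_P) in percent for one state ('H', 'E' or 'C').

    Q3x_O: correctly predicted residues / residues observed in that state.
    Q3x_P: correctly predicted residues / residues predicted in that state.
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted strings must have equal length")
    correct = sum(1 for o, p in zip(observed, predicted) if o == p == state)
    n_observed = observed.count(state)
    n_predicted = predicted.count(state)
    q_observed = 100.0 * correct / n_observed if n_observed else float("nan")
    q_predicted = 100.0 * correct / n_predicted if n_predicted else float("nan")
    return q_observed, q_predicted


# Made-up example: 8 residues, observed vs. predicted three-state strings.
observed = "HHHHEECC"
predicted = "HHHCEEEC"
helix_o, helix_p = per_state_q3(observed, predicted, "H")
strand_o, strand_p = per_state_q3(observed, predicted, "E")
margin = errsig([70.0, 80.0, 90.0])
```

In this toy example three of the four observed helix residues are predicted as helix and every predicted helix residue is correct, so the pair for helix is (75.0, 100.0), which shows how the "observed" and "predicted" percentages can differ for the same state.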
Prediction accuracies delivered by alternative predictors with the 62 protein chains shorter than 100 residues extracted from the EVA server.
| Predictor | Q3 | Q3H_O | Q3H_P | Q3E_O | Q3E_P | Q3C_O | Q3C_P | SOV | SOVH | SOVE | SOVC |
| Prote2S | 75.1% | 73.1% | 79.4% | 69.7% | 73.7% | 85.3% | 70.6% | 69.4% | 74.7% | 71.8% | 72.4% |
| Errsig | 1.5% | 3.5% | 3.6% | 4.4% | 4.7% | 1.6% | 2.2% | 2.5% | 3.5% | 4.3% | 2.1% |
| PSIPRED | 77.0% | 78.4% | 80.3% | 69.8% | 76.9% | 77.5% | 77.7% | 73.2% | 75.4% | 72.1% | 72.6% |
| Errsig | 1.6% | 3.9% | 3.2% | 4.3% | 3.9% | 1.8% | 2.0% | 2.2% | 3.9% | 4.3% | 2.2% |
| PROFsec | 76.4% | 78.0% | 82.4% | 75.8% | 69.7% | 79.6% | 74.0% | 72.9% | 79.7% | 77.7% | 71.0% |
| Errsig | 1.5% | 3.1% | 3.2% | 3.5% | 4.4% | 1.6% | 1.9% | 2.2% | 3.1% | 3.5% | 2.3% |
| PHDpsi | 75.6% | 82.7% | 76.1% | 70.4% | 67.5% | 75.4% | 77.2% | 70.2% | 79.4% | 72.0% | 69.1% |
| Errsig | 1.7% | 3.1% | 3.6% | 4.1% | 4.7% | 1.9% | 1.9% | 2.4% | 3.3% | 4.1% | 2.5% |
| SABLE2 | 76.3% | 76.1% | 76.4% | 71.3% | 61.2% | 80.7% | 74.8% | 71.5% | 77.1% | 72.1% | 71.0% |
| Errsig | 1.6% | 3.6% | 4.0% | 4.1% | 5.0% | 1.4% | 2.0% | 2.3% | 3.7% | 4.2% | 2.2% |
| PROF_king | 72.5% | 67.4% | 83.5% | 72.6% | 66.6% | 79.9% | 70.1% | 65.8% | 67.2% | 72.8% | 68.5% |
| Errsig | 1.7% | 4.1% | 3.3% | 4.2% | 4.7% | 1.6% | 2.3% | 2.5% | 4.2% | 4.4% | 2.4% |
Prediction accuracies delivered by alternative predictors with the 89 benchmark protein chains extracted from the EVA server.
| Predictor | Q3 | Q3H_O | Q3H_P | Q3E_O | Q3E_P | Q3C_O | Q3C_P | SOV | SOVH | SOVE | SOVC |
| Prote2S | 76.7% | 74.1% | 79.1% | 71.4% | 76.6% | 84.9% | 72.3% | 71.7% | 75.6% | 74.2% | 73.3% |
| Errsig | 1.3% | 2.7% | 2.7% | 3.2% | 3.5% | 1.3% | 1.7% | 1.9% | 2.6% | 3.2% | 1.6% |
| PSIPRED | 77.4% | 78.3% | 79.1% | 71.5% | 78.5% | 77.3% | 77.0% | 73.7% | 75.7% | 73.8% | 72.4% |
| Errsig | 1.2% | 3.0% | 2.5% | 3.1% | 2.9% | 1.4% | 1.5% | 1.6% | 2.9% | 3.1% | 1.6% |
| PROFsec | 76.9% | 76.0% | 82.1% | 75.8% | 72.3% | 79.7% | 73.6% | 73.9% | 78.4% | 78.0% | 71.6% |
| Errsig | 1.1% | 2.5% | 2.5% | 2.6% | 3.2% | 1.3% | 1.4% | 1.6% | 2.5% | 2.6% | 1.7% |
| PHDpsi | 75.5% | 80.8% | 76.5% | 70.4% | 70.3% | 75.0% | 75.8% | 70.9% | 78.2% | 71.7% | 69.4% |
| Errsig | 1.3% | 2.4% | 2.7% | 3.0% | 3.4% | 1.5% | 1.5% | 1.7% | 2.5% | 3.0% | 1.9% |
| SABLE2 | 76.5% | 75.5% | 77.3% | 70.9% | 65.4% | 80.6% | 73.7% | 71.8% | 76.3% | 72.9% | 70.7% |
| Errsig | 1.2% | 2.7% | 2.9% | 3.0% | 3.8% | 1.2% | 1.5% | 1.7% | 2.7% | 3.1% | 1.7% |
| PROF_king | 72.0% | 64.1% | 82.5% | 72.0% | 66.2% | 79.2% | 69.1% | 66.3% | 65.3% | 73.0% | 68.4% |
| Errsig | 1.2% | 3.2% | 2.6% | 3.1% | 3.5% | 1.2% | 1.7% | 1.8% | 3.3% | 3.2% | 1.8% |
Size of the training dataset vs. execution times taken by Prote2S and the SVM during the training process.
| Number of protein chains used to generate the training dataset | Prote2S training time (s) | Prote2S Q3 | Prote2S SOV | SVM training time (s) | SVM Q3 | SVM SOV |
| 50 | 29.6 | 64.0% | 52.9% | 138.08 | 71.3% | 64.3% |
| 100 | 91.7 | 69.0% | 64.1% | 527.02 | 74.0% | 68.3% |
| 250 | 486.4 | 71.4% | 67.2% | 5105.63 | 75.5% | 71.0% |
| 500 | 1377.4 | 71.9% | 67.9% | 21040.0 | 76.8% | 72.3% |
| 1000 | 3887.8 | 73.9% | 71.1% | 78795.25 | 77.4% | 73.3% |
Size of the training dataset vs. execution times taken by Prote2S and the SVM for making predictions.
| Number of protein chains used to generate the training dataset | Prote2S testing time (s) | SVM testing time (s) |
| 50 | 54.5 | 146.7 |
| 100 | 87.6 | 301.0 |
| 250 | 153.3 | 758.5 |
| 500 | 220.7 | 990.7 |
| 1000 | 333.2 | 2532.8 |