Abdollah Dehzangi, Kuldip Paliwal, James Lyons, Alok Sharma, Abdul Sattar.
Abstract
BACKGROUND: Predicting the structural class of a protein can provide important information about its function as well as its major tertiary structure. It is also considered an important step towards solving the protein structure prediction problem. Despite all the efforts made so far, finding a fast and accurate computational approach to protein structural class prediction remains a challenging problem in bioinformatics and computational biology.
Year: 2014 PMID: 24564476 PMCID: PMC4046757 DOI: 10.1186/1471-2164-15-S1-S2
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
The properties of the 1189 and 25PDB benchmarks.
| Benchmarks | All-α | All-β | α/β | α+β | Total |
|---|---|---|---|---|---|
| 1189 | 223 | 294 | 334 | 241 | 1092 |
| 25PDB | 443 | 443 | 346 | 441 | 1673 |
Figure 1. Feature extraction scheme using the segmented distribution method.
Figure 2. The overall accuracies of PSSM-S compared to AAC-PSSM-AC for the 25PDB benchmark.
Figure 3. The overall accuracies of PSSM-S compared to AAC-PSSM-AC for the 1189 benchmark.
The impact of the feature extraction groups proposed in this study (using PSSM for feature extraction) on protein structural class prediction accuracy (in %).
| Combination of features | 25PDB | 1189 |
|---|---|---|
| PSSM-AAC | 64.3 | 61.2 |
| PSSM-AAC + PSSM-SAC | 69.4 | 68.0 |
| PSSM-AAC + PSSM-SAC + PSSM-SD | 88.6 | 77.9 |
| PSSM-AAC + PSSM-SAC + PSSM-SD + AAO | 89.6 | 79.7 |
Figure 4. The overall accuracies of SPINE-S with respect to different values of for the 25PDB and 1189 benchmarks (where = 25).
The impact of the feature extraction groups proposed in this study (using SPINE-M for feature extraction) on protein structural class prediction accuracy (in %).
| Combination of features | 25PDB | 1189 |
|---|---|---|
| SPINE-AAC | 78.2 | 75.1 |
| SPINE-AAC + SPINE-SAC | 79.2 | 78.2 |
| SPINE-AAC + SPINE-SAC + SPINE-SD | 81.6 | 79.0 |
| SPINE-AAC + SPINE-SAC + SPINE-SD + SSEO | 82.3 | 80.3 |
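
The contribution of each feature group in the two tables above can be read off as the difference between consecutive rows. The following is a minimal sketch of that arithmetic; the accuracies are copied from the tables, while the names `PSSM_STEPS`, `SPINE_STEPS` and `incremental_gains` are illustrative and not part of the original study.

```python
# Overall accuracies (%) copied from the two feature-combination tables:
# (feature group added at this step, accuracy on 25PDB, accuracy on 1189).
PSSM_STEPS = [
    ("PSSM-AAC", 64.3, 61.2),
    ("+ PSSM-SAC", 69.4, 68.0),
    ("+ PSSM-SD", 88.6, 77.9),
    ("+ AAO", 89.6, 79.7),
]
SPINE_STEPS = [
    ("SPINE-AAC", 78.2, 75.1),
    ("+ SPINE-SAC", 79.2, 78.2),
    ("+ SPINE-SD", 81.6, 79.0),
    ("+ SSEO", 82.3, 80.3),
]


def incremental_gains(steps):
    """Return (label, gain on 25PDB, gain on 1189) for each added feature group."""
    gains = []
    for (_, prev_25, prev_1189), (label, cur_25, cur_1189) in zip(steps, steps[1:]):
        # Gains are in percentage points, rounded to the precision of the tables.
        gains.append((label, round(cur_25 - prev_25, 1), round(cur_1189 - prev_1189, 1)))
    return gains


for name, steps in (("PSSM", PSSM_STEPS), ("SPINE", SPINE_STEPS)):
    for label, gain_25, gain_1189 in incremental_gains(steps):
        print(f"{name} {label}: 25PDB {gain_25:+.1f}, 1189 {gain_1189:+.1f}")
```

Read this way, the largest single step in the PSSM-based table is the addition of PSSM-SD, while the SPINE-based groups each add a smaller amount.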
Figure 5. The general architecture of our proposed feature extraction model. The number of features extracted in each feature group is shown in the brackets below the feature groups' names.
Figure 6. The overall accuracies of PSSM-SPINE-S with respect to different values of for the 25PDB and 1189 benchmarks (where = 4 and = 25%).
Comparison of the results reported for the 25PDB benchmark (in %)
| References | Method | All-α | All-β | α/β | α+β | Overall |
|---|---|---|---|---|---|---|
| [ | Logistic Regression | 69.1 | 61.6 | 60.1 | 38.3 | 57.1 |
| [ | Specific Tri-peptides | 60.6 | 60.7 | 67.9 | 44.3 | 58.6 |
| [ | LLSC-PRED | 75.2 | 67.5 | 62.1 | 44.0 | 62.2 |
| [ | SVM | 77.4 | 66.4 | 61.3 | 45.4 | 62.7 |
| [ | AAD-CGR | 64.3 | 65.0 | 65.0 | 61.7 | 64.0 |
| [ | CWT-PCA-SVM | 76.5 | 67.3 | 66.8 | 45.8 | 64.0 |
| [ | AATP | 81.9 | 74.7 | 75.1 | 55.8 | 71.7 |
| [ | AADP-PSSM | 83.3 | 78.1 | 76.3 | 54.4 | 72.9 |
| [ | SCPRED | 92.6 | 80.1 | 74.0 | 71.0 | 79.7 |
| [ | SSA | 92.6 | 83.7 | 80.5 | 65.9 | 81.5 |
| [ | PSSA | 94.6 | 76.3 | 73.1 | 74.4 | 80.0 |
| [ | RKS-PPSC | 92.8 | 83.3 | 80.8 | 70.1 | 82.9 |
| [ | SVM | 92.6 | 81.3 | 81.5 | 76.0 | 82.9 |
| [ | MODAS | 92.3 | 83.7 | 81.2 | 68.3 | 81.4 |
| [ | AAC-PSSM-AC | 85.3 | 81.7 | 73.7 | 55.3 | 74.1 |
| [ | Physicochemical-based features | 86.1 | 80.8 | 80.6 | 60.1 | 76.7 |
| [ | Structural-based features | 95.0 | 85.6 | 81.5 | 73.2 | 83.9 |
| [ | Structural-based features | 95.0 | 81.3 | 83.2 | 77.6 | 84.3 |
| This Study | PSSM-S | 93.5 | 90.3 | 92.1 | 81.4 | |
| This Study | SPINE-S | 93.8 | 83.1 | 78.4 | 73.9 | |
| This Study | PSSM-SPINE-S | | | | | |
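
The "Overall" column can be related to the per-class columns through the class sizes of the benchmark. The sketch below assumes that the per-class values are per-class accuracies and that overall accuracy is the fraction of all proteins classified correctly; neither definition is stated in this record, and `overall_accuracy` is an illustrative helper. Because the per-class values are rounded, the result only approximates the reported figure.

```python
# Class sizes of the 25PDB benchmark, in the column order of the benchmark table.
CLASS_SIZES_25PDB = [443, 443, 346, 441]


def overall_accuracy(per_class_acc, class_sizes):
    """Size-weighted mean of per-class accuracies (both given in table order)."""
    # Estimated number of correctly classified proteins in each class.
    correct = sum(acc / 100.0 * n for acc, n in zip(per_class_acc, class_sizes))
    return 100.0 * correct / sum(class_sizes)


# Per-class accuracies (%) of the SCPRED row in the comparison table.
scpred = [92.6, 80.1, 74.0, 71.0]
print(round(overall_accuracy(scpred, CLASS_SIZES_25PDB), 1))
```

For the SCPRED row this weighted mean agrees with the reported overall accuracy of 79.7% to one decimal place.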
Comparison of the results reported for the 1189 benchmark (in %)
| References | Method | All-α | All-β | α/β | α+β | Overall |
|---|---|---|---|---|---|---|
| [ | Bayes Classifier | 54.8 | 57.1 | 75.2 | 22.2 | 53.8 |
| [ | Logistic Regression | 57.0 | 62.9 | 64.7 | 25.3 | 53.9 |
| [ | FKNN | 48.9 | 59.5 | 81.7 | 26.6 | 56.9 |
| [ | WSVM | - | - | - | - | 59.2 |
| [ | Specific Tri-peptides | - | - | - | - | 59.9 |
| [ | IB1 | 65.3 | 67.7 | 79.9 | 40.7 | 64.7 |
| [ | AAD-CGR | 62.3 | 67.7 | 66.5 | 63.1 | 65.2 |
| [ | SVM | 75.8 | 75.2 | 82.6 | 31.8 | 67.6 |
| [ | AATP | 72.7 | 85.4 | 82.9 | 42.7 | 72.6 |
| [ | AADP-PSSM | 69.1 | 83.7 | 85.6 | 35.7 | 70.7 |
| [ | SCPRED | 89.1 | 86.7 | 89.6 | 53.8 | 80.6 |
| [ | RKS-PPSC | 89.2 | 86.7 | 82.6 | 65.6 | 81.3 |
| [ | MODAS | 92.3 | 87.1 | 87.9 | 65.4 | 83.5 |
| [ | AAC-PSSM-AC | 80.7 | 86.4 | 81.4 | 45.2 | 74.6 |
| [ | Physicochemical-based features | 80.2 | 83.6 | 85.4 | 44.6 | 74.8 |
| [ | Structural-based features | 92.4 | 87.4 | 82.0 | 71.0 | 83.2 |
| [ | Structural-based features | 93.7 | 84.0 | 83.5 | 66.4 | 82.0 |
| This Study | PSSM-S | 92.6 | 86.0 | 76.7 | 64.3 | |
| This Study | SPINE-S | 91.9 | 88.3 | 78.9 | 61.7 | |
| This Study | PSSM-SPINE-S | | | | | |
The specificity (in percentage) and MCC measurements for the best results: (a) for the 25PDB benchmark; (b) for the 1189 benchmark
| Feature Vector | Specificity (%), All-α | Specificity (%), All-β | Specificity (%), α/β | Specificity (%), α+β | MCC, All-α | MCC, All-β | MCC, α/β | MCC, α+β |
|---|---|---|---|---|---|---|---|---|
| (a) PSSM-S | 97.7 | 96.3 | 95.2 | 91.9 | 0.93 | 0.80 | 0.78 | 0.91 |
| SPINE-S | 97.8 | 94.0 | 94.4 | 90.5 | 0.89 | 0.80 | 0.75 | 0.61 |
| PSSM-SPINE-S | 98.9 | 97.7 | 96.7 | 96.4 | 0.94 | 0.89 | 0.86 | 0.87 |
| (b) PSSM-S | 98.2 | 94.8 | 89.8 | 90.0 | 0.91 | 0.78 | 0.67 | 0.56 |
| SPINE-S | 97.9 | 95.8 | 90.7 | 89.2 | 0.86 | 0.85 | 0.70 | 0.51 |
| PSSM-SPINE-S | 99.5 | 96.8 | 92.9 | 92.2 | 0.95 | 0.88 | 0.77 | 0.66 |
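
For readers who want to reproduce measures of this kind, the sketch below uses the standard one-vs-rest definitions of specificity and the Matthews correlation coefficient (MCC) for a single class. It assumes the study follows these standard definitions, which this record does not state; the confusion counts are hypothetical and are not taken from the paper.

```python
import math


def specificity(tp, fp, tn, fn):
    """Specificity in percent: true negatives over all actual negatives."""
    return 100.0 * tn / (tn + fp)


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient for one class treated one-vs-rest."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical counts for one class (NOT from the paper):
# 100 proteins in the class, 200 proteins in the other three classes.
tp, fp, tn, fn = 90, 5, 195, 10
print(f"specificity = {specificity(tp, fp, tn, fn):.1f}%")
print(f"MCC = {mcc(tp, fp, tn, fn):.2f}")
```

Specificity depends only on proteins outside the class, so it can stay high even when the class itself is predicted poorly; MCC combines all four counts, which is consistent with the larger spread of MCC values across classes in the table.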