Bharat Panwar, Sudheer Gupta, Gajendra P S Raghava.
Abstract
BACKGROUND: Vitamins are important cofactors in various enzymatic reactions. In the past, many inhibitors have been designed against vitamin-binding pockets in order to inhibit vitamin-protein interactions. Thus, it is important to identify vitamin-interacting residues in a protein. Vitamin-binding pockets can be detected on a protein if its tertiary structure is known. Unfortunately, tertiary structures are available for only a limited number of proteins. Therefore, it is important to develop in-silico models that predict vitamin-interacting residues in a protein from its primary structure.
Year: 2013 PMID: 23387468 PMCID: PMC3577447 DOI: 10.1186/1471-2105-14-44
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Comparative average percent amino acid composition of VIRs, non-VIRs, VAIRs, VBIRs and PLPIRs.
Prediction performance of different classifiers for vitamin-interacting residues (VIRs)
| Approach | Classifier | Sensitivity (%) | Specificity (%) | Accuracy (%) | MCC |
| --- | --- | --- | --- | --- | --- |
| Binary | SVM (Threshold = −0.8) | 68.57 ± 0.60 | 64.88 ± 0.18 | 65.22 ± 0.21 | 0.20 ± 0.00 |
| Binary | SVM (Threshold = −0.5) | 29.53 ± 0.83 | 94.71 ± 0.16 | 88.78 ± 0.15 | 0.27 ± 0.01 |
| Binary | BayesNet | 54.76 ± 1.44 | 69.64 ± 0.99 | 68.29 ± 0.85 | 0.15 ± 0.01 |
| Binary | ComplementNaiveBayes | 67.57 ± 0.90 | 65.16 ± 0.29 | 65.38 ± 0.33 | 0.19 ± 0.01 |
| Binary | NaiveBayes | 35.65 ± 0.85 | 89.52 ± 0.22 | 84.62 ± 0.18 | 0.22 ± 0.01 |
| Binary | NaiveBayesMultinomial | 40.08 ± 1.04 | 87.67 ± 0.24 | 83.35 ± 0.24 | 0.22 ± 0.01 |
| Binary | IBk | 26.67 ± 0.76 | 93.83 ± 0.11 | 87.73 ± 0.15 | 0.22 ± 0.01 |
| Binary | RandomForest | 35.48 ± 0.78 | 79.13 ± 0.36 | 75.17 ± 0.31 | 0.10 ± 0.01 |
| PSSM | BayesNet | 67.41 ± 0.24 | 64.20 ± 0.06 | 64.49 ± 0.05 | 0.19 ± 0.00 |
| PSSM | ComplementNaiveBayes | 61.21 ± 0.58 | 78.06 ± 0.23 | 76.53 ± 0.19 | 0.26 ± 0.00 |
| PSSM | NaiveBayes | 67.64 ± 0.37 | 65.48 ± 0.11 | 65.68 ± 0.09 | 0.20 ± 0.00 |
| PSSM | NaiveBayesMultinomial | 54.91 ± 0.94 | 83.52 ± 0.21 | 80.92 ± 0.16 | 0.28 ± 0.01 |
| PSSM | IBk | 50.70 ± 0.90 | 96.91 ± 0.06 | 92.71 ± 0.08 | 0.52 ± 0.01 |
| PSSM | RandomForest | 61.54 ± 0.64 | 81.52 ± 0.12 | 79.70 ± 0.11 | 0.30 ± 0.01 |
*Bold value indicates highest performance with balanced sensitivity and specificity.
**Italic value indicates performance with highest MCC.
Performance values are reported with their standard errors.
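The four measures reported in these tables (sensitivity, specificity, accuracy and MCC) follow from the counts of a binary confusion matrix. The snippet below is a minimal sketch using the standard definitions; the function name and the counts are hypothetical and are not taken from the study.

```python
import math

def classification_metrics(tp, tn, fp, fn):
    """Return (sensitivity %, specificity %, accuracy %, MCC) from confusion counts."""
    sensitivity = 100.0 * tp / (tp + fn) if (tp + fn) else 0.0
    specificity = 100.0 * tn / (tn + fp) if (tn + fp) else 0.0
    accuracy = 100.0 * (tp + tn) / (tp + tn + fp + fn)
    # Matthews correlation coefficient; defined as 0 when any marginal is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, accuracy, mcc

# Hypothetical, imbalanced counts (100 interacting vs 1000 non-interacting residues).
sens, spec, acc, mcc = classification_metrics(tp=70, tn=650, fp=350, fn=30)
print(f"Sens={sens:.2f}%  Spec={spec:.2f}%  Acc={acc:.2f}%  MCC={mcc:.2f}")
```

With many more negatives than positives, accuracy stays close to specificity and MCC remains low even when sensitivity and specificity are balanced, which is the pattern visible in the rows above.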
Figure 2. The ROC plot of the performance of different approaches for prediction of VIRs.
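A ROC plot summarizes how sensitivity and specificity trade off as the decision threshold on a classifier score moves, and the tables report SVM results at two such thresholds (one balancing sensitivity and specificity, one maximizing MCC). The following is a minimal sketch of that threshold sweep; the scores, labels, thresholds and function names are toy assumptions for illustration, not data or code from the study.

```python
import math

def confusion_at(scores, labels, threshold):
    """Count TP, TN, FP, FN when a score >= threshold is predicted positive."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    tn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 0)
    return tp, tn, fp, fn

def sweep(scores, labels, thresholds):
    """Return one (threshold, sensitivity, specificity, mcc) tuple per threshold."""
    rows = []
    for t in thresholds:
        tp, tn, fp, fn = confusion_at(scores, labels, t)
        sens = tp / (tp + fn) if (tp + fn) else 0.0
        spec = tn / (tn + fp) if (tn + fp) else 0.0
        denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
        mcc = (tp * tn - fp * fn) / denom if denom else 0.0
        rows.append((t, sens, spec, mcc))
    return rows

# Toy decision scores: 4 interacting (label 1) and 12 non-interacting (label 0) residues.
scores = [-0.9, -0.45, -0.1, 0.3,
          -1.8, -1.6, -1.5, -1.4, -1.3, -1.2, -1.1, -1.0, -0.95, -0.7, -0.6, -0.3]
labels = [1, 1, 1, 1] + [0] * 12

rows = sweep(scores, labels, thresholds=[-0.8, -0.5, -0.2])
balanced = min(rows, key=lambda r: abs(r[1] - r[2]))  # sensitivity closest to specificity
best_mcc = max(rows, key=lambda r: r[3])              # highest MCC
roc_points = [(1.0 - spec, sens) for _, sens, spec, _ in rows]  # (FPR, TPR) pairs

print("balanced threshold:", balanced[0])
print("highest-MCC threshold:", best_mcc[0])
```

In this toy example the two selection criteria pick different thresholds, which is why each table can list two SVM rows for the same approach.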
Prediction performance of different classifiers for vitamin A-interacting residues (VAIRs)
| Approach | Classifier | Sensitivity (%) | Specificity (%) | Accuracy (%) | MCC |
| --- | --- | --- | --- | --- | --- |
| Binary | SVM (Threshold = −0.8) | 61.92 ± 2.63 | 65.09 ± 0.43 | 64.80 ± 0.35 | 0.16 ± 0.02 |
| Binary | SVM (Threshold = −0.1) | 7.43 ± 1.18 | 99.66 ± 0.10 | 91.28 ± 0.08 | 0.21 ± 0.02 |
| Binary | BayesNet | 14.50 ± 2.11 | 94.30 ± 0.20 | 87.04 ± 0.22 | 0.10 ± 0.02 |
| Binary | ComplementNaiveBayes | 62.09 ± 0.50 | 65.97 ± 0.22 | 65.61 ± 0.20 | 0.17 ± 0.00 |
| Binary | NaiveBayes | 32.53 ± 0.99 | 86.43 ± 0.22 | 81.53 ± 0.27 | 0.15 ± 0.01 |
| Binary | NaiveBayesMultinomial | 60.23 ± 0.82 | 67.94 ± 0.16 | 67.24 ± 0.15 | 0.17 ± 0.01 |
| Binary | IBk | 31.41 ± 2.27 | 89.80 ± 0.20 | 84.49 ± 0.19 | 0.19 ± 0.02 |
| Binary | RandomForest | 36.07 ± 2.03 | 78.38 ± 0.16 | 74.54 ± 0.30 | 0.10 ± 0.01 |
| PSSM | BayesNet | 57.25 ± 1.21 | 69.54 ± 0.52 | 68.42 ± 0.48 | 0.16 ± 0.01 |
| PSSM | ComplementNaiveBayes | 59.30 ± 1.23 | 66.96 ± 0.33 | 66.26 ± 0.26 | 0.16 ± 0.01 |
| PSSM | NaiveBayes | 63.03 ± 1.65 | 69.09 ± 0.46 | 68.54 ± 0.56 | 0.19 ± 0.01 |
| PSSM | NaiveBayesMultinomial | 55.77 ± 1.32 | 70.95 ± 0.21 | 69.57 ± 0.26 | 0.17 ± 0.01 |
| PSSM | IBk | 44.05 ± 0.49 | 94.65 ± 0.34 | 90.05 ± 0.27 | 0.39 ± 0.01 |
| PSSM | RandomForest | 24.17 ± 0.80 | 99.31 ± 0.08 | 92.49 ± 0.06 | 0.41 ± 0.01 |
*Bold value indicates highest performance with balanced sensitivity and specificity.
**Italic value indicates performance with highest MCC.
Performance values are reported with their standard errors.
Figure 3. The ROC plot of the performance of different approaches for prediction of VAIRs.
Prediction performance of different classifiers for vitamin B-interacting residues (VBIRs)
| Approach | Classifier | Sensitivity (%) | Specificity (%) | Accuracy (%) | MCC |
| --- | --- | --- | --- | --- | --- |
| Binary | SVM (Threshold = −0.8) | 73.22 ± 0.36 | 67.00 ± 0.49 | 67.57 ± 0.47 | 0.24 ± 0.00 |
| Binary | SVM (Threshold = −0.6) | 30.36 ± 0.62 | 96.69 ± 0.12 | 90.66 ± 0.11 | 0.33 ± 0.01 |
| Binary | BayesNet | 63.25 ± 0.56 | 66.23 ± 0.73 | 65.96 ± 0.62 | 0.18 ± 0.00 |
| Binary | ComplementNaiveBayes | 68.69 ± 0.52 | 68.51 ± 0.23 | 68.52 ± 0.18 | 0.23 ± 0.00 |
| Binary | NaiveBayes | 37.74 ± 0.90 | 90.45 ± 0.23 | 85.66 ± 0.14 | 0.25 ± 0.01 |
| Binary | NaiveBayesMultinomial | 44.22 ± 0.43 | 87.54 ± 0.24 | 83.60 ± 0.19 | 0.25 ± 0.00 |
| Binary | IBk | 30.81 ± 0.71 | 93.33 ± 0.17 | 87.65 ± 0.14 | 0.24 ± 0.01 |
| Binary | RandomForest | 39.33 ± 1.08 | 79.36 ± 0.37 | 75.72 ± 0.36 | 0.13 ± 0.01 |
| PSSM | BayesNet | 71.65 ± 1.13 | 66.14 ± 0.08 | 66.64 ± 0.10 | 0.23 ± 0.01 |
| PSSM | ComplementNaiveBayes | 63.90 ± 1.26 | 81.73 ± 0.28 | 80.11 ± 0.22 | 0.32 ± 0.01 |
| PSSM | NaiveBayes | 72.28 ± 1.22 | 66.44 ± 0.09 | 66.97 ± 0.12 | 0.23 ± 0.01 |
| PSSM | NaiveBayesMultinomial | 21.22 ± 0.69 | 98.88 ± 0.03 | 91.82 ± 0.06 | 0.34 ± 0.01 |
| PSSM | RandomForest | 39.16 ± 0.56 | 97.74 ± 0.09 | 92.41 ± 0.10 | 0.46 ± 0.01 |
*Bold value indicates highest SVM performance with balanced sensitivity and specificity.
**Italic value indicates SVM/IBk performance with highest MCC.
Performance values are reported with their standard errors.
Figure 4. The ROC plot of the performance of different approaches for prediction of VBIRs.
Figure 5. The ROC plot of the performance of different approaches for prediction of PLPIRs.
Prediction performance of different classifiers for PLP-interacting residues (PLPIRs)
| Approach | Classifier | Sensitivity (%) | Specificity (%) | Accuracy (%) | MCC |
| --- | --- | --- | --- | --- | --- |
| Binary | SVM (Threshold = −0.7) | 77.02 ± 0.72 | 83.17 ± 0.27 | 82.62 ± 0.28 | 0.42 ± 0.01 |
| Binary | SVM (Threshold = −0.5) | 54.76 ± 1.34 | 95.81 ± 0.14 | 92.08 ± 0.18 | 0.51 ± 0.01 |
| Binary | BayesNet | 41.76 ± 0.81 | 88.94 ± 0.49 | 84.65 ± 0.40 | 0.26 ± 0.01 |
| Binary | ComplementNaiveBayes | 75.82 ± 1.74 | 77.14 ± 0.35 | 77.01 ± 0.23 | 0.34 ± 0.01 |
| Binary | NaiveBayes | 52.20 ± 1.50 | 91.18 ± 0.17 | 87.64 ± 0.20 | 0.37 ± 0.01 |
| Binary | NaiveBayesMultinomial | 59.25 ± 1.06 | 88.51 ± 0.19 | 85.85 ± 0.19 | 0.38 ± 0.01 |
| Binary | IBk | 40.02 ± 1.24 | 96.31 ± 0.20 | 91.19 ± 0.21 | 0.41 ± 0.01 |
| Binary | RandomForest | 52.93 ± 1.09 | 80.03 ± 0.71 | 77.56 ± 0.65 | 0.23 ± 0.01 |
| PSSM | BayesNet | 77.66 ± 0.83 | 77.71 ± 0.35 | 77.70 ± 0.30 | 0.36 ± 0.01 |
| PSSM | ComplementNaiveBayes | 76.28 ± 1.46 | 89.09 ± 0.54 | 87.93 ± 0.45 | 0.50 ± 0.01 |
| PSSM | NaiveBayes | 79.40 ± 0.76 | 80.36 ± 0.35 | 80.28 ± 0.27 | 0.40 ± 0.00 |
| PSSM | NaiveBayesMultinomial | 43.96 ± 0.67 | 98.16 ± 0.08 | 93.25 ± 0.07 | 0.52 ± 0.01 |
| PSSM | IBk | 76.10 ± 0.82 | 98.80 ± 0.06 | 96.74 ± 0.08 | 0.79 ± 0.01 |
| PSSM | RandomForest | 62.27 ± 1.76 | 98.02 ± 0.12 | 94.78 ± 0.20 | 0.66 ± 0.01 |
*Bold value indicates highest performance with balanced sensitivity and specificity.
**Italic value indicates performance with highest MCC.
Performance values are reported with their standard errors.
SVM-based prediction performances for four different types of prediction methods using equal positive and negative instances
| Prediction type | Binary: Sensitivity (%) | Binary: Specificity (%) | Binary: Accuracy (%) | Binary: MCC | PSSM: Sensitivity (%) | PSSM: Specificity (%) | PSSM: Accuracy (%) | PSSM: MCC |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| VIRs | 65.98 ± 0.85 | 65.85 ± 0.52 | 65.91 ± 0.60 | 0.32 ± 0.01 | 75.80 ± 0.35 | 77.07 ± 0.69 | 76.43 ± 0.47 | 0.53 ± 0.01 |
| VAIRs | 62.09 ± 2.01 | 61.87 ± 2.92 | 61.99 ± 1.30 | 0.24 ± 0.03 | 73.25 ± 2.43 | 73.83 ± 0.95 | 73.54 ± 1.47 | 0.47 ± 0.03 |
| VBIRs | 68.55 ± 0.75 | 68.37 ± 0.83 | 68.47 ± 0.44 | 0.37 ± 0.01 | 80.08 ± 0.61 | 82.49 ± 0.79 | 81.29 ± 0.23 | 0.63 ± 0.01 |
| PLPIRs | 76.74 ± 1.73 | 74.91 ± 1.42 | 75.82 ± 1.32 | 0.52 ± 0.03 | 89.85 ± 0.87 | 89.85 ± 1.16 | 89.84 ± 0.70 | 0.80 ± 0.01 |
Performance values are reported with their standard errors.
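The table above uses equal numbers of positive and negative instances. How that balance was obtained is not stated here; one common way is to randomly undersample the majority (non-interacting) class, sketched below. The function name, the seed and the toy data are assumptions made for illustration only.

```python
import random

def balance_by_undersampling(instances, labels, seed=0):
    """Keep all positives and an equally sized random subset of negatives.

    Assumption: random undersampling; the source does not say which
    balancing procedure was used.
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    keep_neg = rng.sample(neg, k=min(len(pos), len(neg)))
    keep = sorted(pos + keep_neg)  # preserve the original instance order
    return [instances[i] for i in keep], [labels[i] for i in keep]

# Toy residue windows: 3 interacting (1) and 7 non-interacting (0).
windows = [f"w{i}" for i in range(10)]
labels = [1, 0, 0, 0, 1, 0, 0, 0, 0, 1]

bal_windows, bal_labels = balance_by_undersampling(windows, labels)
print(len(bal_windows), sum(bal_labels))
```

On a balanced set, accuracy is no longer dominated by the negative class, so it sits midway between sensitivity and specificity, as in the rows above.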
SVM-based prediction performances (at the default threshold) of PSSM approach on the different independent datasets
| 1 | VIRs | V-IND-46 | − | ||||
| − | |||||||
| 2 | VAIRs | VA-IND-15 | − | ||||
| 3 | VBIRs | VB-IND-27 | − | ||||
| 4 | PLPIRs | PLP-IND-16 | − | ||||
| − |
*Bold value indicates performance at the optimized threshold level of balanced sensitivity and specificity.
**Italic value indicates performance at the optimized threshold level of highest MCC.