Shandar Ahmad, Yumlembam Hemajit Singh, Yogesh Paudel, Takaharu Mori, Yuji Sugita, Kenji Mizuguchi.
Abstract
BACKGROUND: Many structural properties such as solvent accessibility, dihedral angles and helix-helix contacts can be assigned to each residue in a membrane protein. Independent studies exist on the analysis and sequence-based prediction of some of these so-called one-dimensional features. However, there is little explanation of why certain residues are predicted in a wrong structural class or with large errors in the absolute values of these features. On the other hand, membrane proteins undergo conformational changes to allow transport as well as ligand binding. These conformational changes often occur via residues that are inherently flexible and hence, predicting fluctuations in residue positions is of great significance.
Year: 2010 PMID: 20977780 PMCID: PMC3247134 DOI: 10.1186/1471-2105-11-533
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Comparison of the performance between the models trained individually and in an integrated manner.
| Feature | AMAE (Individual) | SD (Individual) | AMAE (Integrated) | SD (Integrated) | p-value |
|---|---|---|---|---|---|
| tASA | 18.73 | 0.92 | 17.43 | 0.75 | < 2.2e-16 |
| scASA | 24.32 | 0.79 | 22.32 | 0.96 | < 2.2e-16 |
| npASA | 22.44 | 1.2 | 22.00 | 1.01 | 0.02 |
| Phi | 12.16 | 0.77 | 12.13 | 0.74 | 0.82 |
| Psi | 12.55 | 0.61 | 12.53 | 0.6 | 0.82 |
| Kappa | 8.51 | 0.57 | 8.20 | 0.51 | < 2.2e-16 |
| Alpha | 10.21 | 0.64 | 10.23 | 0.63 | 0.84 |
| HHC (AUC) | 67.04 | 1.58 | 67.29 | 1.19 | 0.26 |
AMAE is the average of the protein-wise mean absolute error (for HHC, the reported value is instead the AUC, the area under the ROC curve), and the standard deviation (SD) indicates the protein-wise variation of these values. The p-values were obtained with Welch's t-test as implemented in the R programming language. The integrated model outperformed the individually trained ones for the ASA and bend angle (kappa) predictions, but showed no statistically significant difference for the other conformational angles.
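The comparison described above can be sketched as follows. This is a minimal illustration of Welch's unequal-variance t-test, not the authors' script: the function name `welch_t` and the sample values are hypothetical, and the source does not state how many proteins entered each sample. Only the t statistic and the Welch-Satterthwaite degrees of freedom are computed, because the p-value needs a t-distribution CDF that the Python standard library does not provide.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t-test on two samples."""
    na, nb = len(a), len(b)
    # Squared standard errors of the two sample means (sample variance / n)
    va, vb = variance(a) / na, variance(b) / nb
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df

# Hypothetical protein-wise MAE values, NOT data from the paper
individual = [18.1, 19.4, 17.9, 20.2, 18.6]
integrated = [17.0, 18.2, 16.9, 18.8, 17.3]
t, df = welch_t(individual, integrated)
print(f"t = {t:.3f}, df = {df:.2f}")
```

In R, `t.test(x, y)` performs this test by default (`var.equal = FALSE`); in Python, `scipy.stats.ttest_ind(a, b, equal_var=False)` returns the corresponding p-value.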
Summary of the correlation and mean absolute error (MAE) between the predicted and observed values of the residue-wise structural features.
| | tASA (%) | scASA (%) | npASA (%) | Phi (deg) | Psi (deg) | Kappa (deg) | Alpha (deg) | HHC (%) |
|---|---|---|---|---|---|---|---|---|
| Mean (Observed) | 24 | 30 | 29 | -66 | -40 | 110 | 51 | 64 |
| SD (Observed) | 24 | 31 | 30 | 18 | 17 | 15 | 20 | 48 |
| Correlation | 0.5 | 0.5 | 0.49 | 0.01 | 0.06 | 0.11 | 0 | 0.26 |
| MAE | 17.3 | 21.9 | 21.6 | 11.1 | 11.7 | 8.06 | 10.1 | 43 |
The mean and standard deviation of the actually observed values are provided as a reference. The correlation is computed between the predicted and observed values for each protein and then averaged.
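As a rough sketch of the protein-wise averaging described here, the code below computes the MAE and the Pearson correlation separately for each protein and then averages both over proteins. The function names and the toy values are hypothetical. It also assumes a plain absolute difference, which ignores the periodicity of angles; the source does not say how angular wrap-around was handled.

```python
import math

def mae(pred, obs):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(p - o) for p, o in zip(pred, obs)) / len(obs)

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def protein_wise_summary(proteins):
    """proteins maps a name to (predicted, observed) residue-wise values.

    Returns (average of per-protein MAE, average of per-protein correlation).
    """
    maes = [mae(p, o) for p, o in proteins.values()]
    rs = [pearson(p, o) for p, o in proteins.values()]
    return sum(maes) / len(maes), sum(rs) / len(rs)

# Hypothetical residue-wise tASA values (%), NOT data from the paper
toy = {
    "protA": ([10.0, 20.0, 30.0], [12.0, 18.0, 33.0]),
    "protB": ([40.0, 50.0, 60.0, 70.0], [35.0, 55.0, 58.0, 75.0]),
}
amae, mean_r = protein_wise_summary(toy)
print(f"AMAE = {amae:.2f}, mean correlation = {mean_r:.2f}")
```

Averaging per protein gives every protein equal weight regardless of its length, which differs from pooling all residues into one sample.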
Figure 1. Performance of the integrated prediction model for various equilibrium structural features as a function of the membrane-spanning length of the protein (AE: mean absolute error in percentage points; TM res counts: the number of residues of a protein in the membrane-spanning region). Each point in the plot represents one protein.
Coefficients of correlation showing the interdependence between the prediction performances for various structural features.
| | MAE (tASA) | MAE (scASA) | MAE (npASA) | MAE (Phi) | MAE (Psi) | MAE (Kappa) | MAE (Alpha) | AUC (HHC) |
|---|---|---|---|---|---|---|---|---|
| MAE (tASA) | 1 | 0.93 | 0.88 | 0.09 | 0.1 | 0.1 | 0.11 | 0.19 |
| MAE (scASA) | 0.93 | 1 | 0.93 | 0.07 | 0.08 | 0.07 | 0.06 | 0.21 |
| MAE (npASA) | 0.88 | 0.93 | 1 | 0.07 | 0.08 | 0.07 | 0.07 | 0.2 |
| MAE (Phi) | 0.09 | 0.07 | 0.07 | 1 | 0.46 | 0.35 | 0.42 | -0.03 |
| MAE (Psi) | 0.1 | 0.08 | 0.08 | 0.46 | 1 | 0.41 | 0.47 | -0.05 |
| MAE (Kappa) | 0.1 | 0.07 | 0.07 | 0.35 | 0.41 | 1 | 0.76 | -0.05 |
| MAE (Alpha) | 0.11 | 0.06 | 0.07 | 0.42 | 0.47 | 0.76 | 1 | -0.04 |
| AUC (HHC) | 0.19 | 0.21 | 0.2 | -0.03 | -0.05 | -0.05 | -0.04 | 1 |
The prediction performances for the ASA features are highly correlated with one another (coefficients of 0.88 to 0.93). The prediction performances for some conformational angles (e.g., alpha and kappa, 0.76) are well correlated, whereas those for others (e.g., phi and kappa, 0.35) are only weakly correlated.
MAE: Mean absolute error (% ASA or degrees), AUC: Area under the ROC curve; HHC: helix-helix contact.
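For readers unfamiliar with the AUC used for HHC, a minimal sketch follows. It uses the rank-based definition (the probability that a randomly chosen positive residue scores higher than a randomly chosen negative one, with ties counted as one half). The function name, scores, and labels are hypothetical, and the source does not state which implementation the authors used.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels are 1 (contact) or 0 (no contact); ties contribute 0.5.
    """
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

# Hypothetical per-residue contact scores and true labels, NOT data from the paper
scores = [0.9, 0.8, 0.4, 0.35, 0.1]
labels = [1, 0, 1, 0, 0]
auc = roc_auc(scores, labels)
print(f"AUC = {100 * auc:.1f}%")
```

The value is multiplied by 100 when printed because the tables report AUC on a percentage scale, where 50 corresponds to random ranking.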
Figure 2. Correlation between the equilibrium structural features (ESFs) and the NMA-derived B-factors. Correlation coefficients were calculated for each protein and averaged for the plot. Error bars represent the standard deviation of the protein-wise correlation coefficients. Predicted B-factors from the El Nemo web server were used for this analysis, which was based on the displacements observed in the 100 lowest-frequency modes.
Performance of predicting NMA-derived B-factors (BNMA) from the PSSM and from the equilibrium structural features (ESFs), using observed ESF values, predicted ESF values, and their combination.
| Input features | Prediction performance (Correlation with BNMA) | Prediction performance (Correlation with randomized BNMA) | P-value |
|---|---|---|---|
| PSSM (9 residue window) and amino acid composition | 0.21 | -0.02 | 1.3e-7 |
| ESFs (Observed) | 0.52 | 0.00 | < 2.2e-16 |
| ESFs (Predicted) | 0.23 | -0.01 | 4.1e-10 |
| ESFs (Predicted + Observed) | 0.46 | -0.01 | < 2.2e-16 |
Performance on randomized BNMA is shown for reference, and the p-value is calculated using Welch's t-test on the two sets of protein-wise performance scores (correlation coefficients). The predicted ESFs alone predict BNMA with a correlation of 0.23, compared with 0.52 for the observed ESFs and about zero for the randomized reference.
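The randomized reference can be illustrated with the sketch below, which compares the correlation against the real targets with the average correlation against shuffled targets. The source does not describe how BNMA was randomized, so shuffling the target values within one protein is an assumption here, and the function names and values are hypothetical.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def shuffled_baseline(pred, target, n_shuffles=200, seed=0):
    """Average correlation of pred with randomly permuted copies of target."""
    rng = random.Random(seed)   # fixed seed so the sketch is reproducible
    shuffled = list(target)     # copy, so the caller's list is not modified
    rs = []
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        rs.append(pearson(pred, shuffled))
    return sum(rs) / len(rs)

# Hypothetical predicted and NMA-derived B-factors for one protein, NOT data from the paper
pred = [1.0, 2.1, 2.9, 4.2, 5.1, 5.8, 7.2, 8.1, 8.8, 10.3]
target = [1.2, 1.9, 3.3, 3.9, 5.4, 6.1, 6.8, 8.4, 9.1, 9.9]
real_r = pearson(pred, target)
base_r = shuffled_baseline(pred, target)
print(f"real r = {real_r:.2f}, shuffled baseline r = {base_r:.2f}")
```

A baseline near zero, as in the table's randomized column, indicates that the reported correlations are not an artifact of the value distributions alone.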