Kuan Pern Tan, Thanh Binh Nguyen, Siddharth Patel, Raghavan Varadarajan, M S Madhusudhan.
Abstract
Residue depth accurately measures burial and parameterizes the local protein environment. Depth is the distance of any atom/residue to the closest bulk water. We consider the non-bulk waters to occupy cavities, whose volumes are determined using a Voronoi procedure. Our estimation of cavity sizes is statistically superior to estimates made by CASTp and VOIDOO, and on par with McVol over a data set of 40 cavities. Our calculated cavity volumes correlated best with the experimentally determined destabilization of 34 mutants from five proteins. Some of the cavities identified are capable of binding small molecule ligands. In this study, we have enhanced our depth-based predictions of binding sites by including evolutionary information. We have demonstrated that on a database (LigASite) of ∼200 proteins, we perform on par with ConCavity and better than MetaPocket 2.0. Our predictions, while less sensitive, are more specific and precise. Finally, we use depth (and other features) to predict pKas of GLU, ASP, LYS and HIS residues. Our results produce an average error of less than 1 pH unit over 60 predictions. Our simple empirical method is statistically on par with two and superior to three other methods while inferior to only one. The DEPTH server (http://mspc.bii.a-star.edu.sg/depth/) is an ideal tool for rapid yet accurate structural analyses of protein structures.
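The definition of depth given in the abstract (distance to the closest bulk water) can be illustrated with a minimal sketch. This is not the DEPTH server's implementation: the function names, the toy coordinates, and the choice to average atom depths into a residue depth are all assumptions made for illustration, and the step that decides which waters count as bulk is not shown.

```python
import math

def atom_depth(atom, bulk_waters):
    """Distance (in the coordinates' units) from one atom to the nearest bulk water."""
    return min(math.dist(atom, water) for water in bulk_waters)

def residue_depth(atoms, bulk_waters):
    """Mean atom depth over a residue's atoms (an assumed aggregation rule)."""
    return sum(atom_depth(a, bulk_waters) for a in atoms) / len(atoms)

# Hypothetical coordinates, not taken from any structure.
bulk_waters = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
residue_atoms = [(3.0, 4.0, 0.0), (5.0, 0.0, 0.0)]

first_atom_depth = atom_depth(residue_atoms[0], bulk_waters)
mean_depth = residue_depth(residue_atoms, bulk_waters)
```

With these toy inputs both atoms lie 5 units from their nearest water, so the atom and residue depths coincide.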
Year: 2013 PMID: 23766289 PMCID: PMC3692129 DOI: 10.1093/nar/gkt503
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Cavity size estimations by DEPTH, CASTp, VOIDOO and McVol
| | DEPTH | CASTp | McVol | VOIDOO |
|---|---|---|---|---|
| Average error in cavity size estimation | 0.9 Å³ | 23.2 Å³ | 1.3 Å³ | −96.1 Å³ |
| P-value | | 0.0002 | 0.4317 | <0.0001 |
The P-values reported are from a Wilcoxon paired signed-rank test applied to compare DEPTH with each of the other methods.
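For readers unfamiliar with the Wilcoxon signed-rank test named in the note above, the following sketch computes the rank sums and an exact two-sided P-value by enumerating all sign assignments. It is only an illustration under stated assumptions: the record does not say whether exact or approximate P-values were used, the function names are hypothetical, and the paired differences are made-up numbers rather than the cavity data. The enumeration is practical only for small samples.

```python
from itertools import product

def signed_rank_sums(diffs):
    """Return (W_plus, W_minus, ranks) for paired differences; zero differences are dropped."""
    nonzero = [d for d in diffs if d != 0]
    order = sorted(range(len(nonzero)), key=lambda i: abs(nonzero[i]))
    ranks = [0.0] * len(nonzero)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied absolute differences.
        while j + 1 < len(order) and abs(nonzero[order[j + 1]]) == abs(nonzero[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, nonzero) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, nonzero) if d < 0)
    return w_plus, w_minus, ranks

def exact_two_sided_p(diffs):
    """Exact two-sided P-value: fraction of sign assignments at least as extreme as observed."""
    w_plus, w_minus, ranks = signed_rank_sums(diffs)
    observed = min(w_plus, w_minus)
    total = sum(ranks)
    extreme = 0
    assignments = 0
    for signs in product((0, 1), repeat=len(ranks)):
        plus = sum(r for r, s in zip(ranks, signs) if s)
        if min(plus, total - plus) <= observed:
            extreme += 1
        assignments += 1
    return extreme / assignments

# Hypothetical paired differences (method A error minus method B error).
example_diffs = [1.0, -2.0, 3.0, -4.0, 5.0]
example_sums = signed_rank_sums(example_diffs)[:2]
example_p = exact_two_sided_p(example_diffs)
```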
Figure 1. Correlation between experimentally measured free energy change on mutation (ΔΔG°) and cavity sizes computed by DEPTH (a), CASTp (b) and McVol (c).
The MCC values for DEPTH, ConCavity and MetaPocket 2.0 binding site residue predictions over the testing set
| | DEPTH | ConCavity | MetaPocket 2.0 |
|---|---|---|---|
| Single-chain | | | |
| N | 80 | 80 | 70 |
| MCC | 0.55 | 0.53 | 0.55 |
| Difference | | 0.02 | 0.00 |
| P-value | | 0.39 | 0.47 |
| Multi-chain | | | |
| N | 120 | 120 | 40 |
| MCC | 0.47 | 0.50 | 0.33 |
| Difference | | −0.02 | 0.15 |
| P-value | | 0.34 | 0.04 |
| All | | | |
| N | 200 | 200 | 110 |
| MCC | 0.50 | 0.51 | 0.47 |
| Difference | | 0.00 | 0.03 |
| P-value | | 0.78 | 0.04 |
Each data set was divided into single-chain and multi-chain categories. For each category, a two-tailed paired t-test was performed to test the statistical significance of the difference between the DEPTH MCC values and those of ConCavity and MetaPocket 2.0; the resulting P-values are reported. N denotes the size of the data set over which the comparisons were made.
Statistical analysis of binding-residue predictions of DEPTH, ConCavity and MetaPocket 2.0
| Methods | N | TP | FP | TN | FN | Sensitivity | Specificity | Accuracy | Precision |
|---|---|---|---|---|---|---|---|---|---|
| DEPTH | 200 | 0.07 | 0.07 | 0.82 | 0.04 | 0.63 | 0.92 | 0.89 | 0.49 |
| ConCavity | 200 | 0.08 | 0.11 | 0.78 | 0.02 | 0.80 | 0.87 | 0.87 | 0.43 |
| MetaPocket 2.0 | 110 | 0.08 | 0.10 | 0.79 | 0.03 | 0.71 | 0.89 | 0.87 | 0.43 |
TP, FP, TN, FN represent the mean values of true positive, false positive, true negative and false negative rates over the testing set, respectively.
The testing set of 200 protein structures (for MetaPocket 2.0 comparisons, the size of the data set was 110) consists of 12 020 binding site and 112 035 non-binding site residues. The average chain lengths of single- and multi-chain proteins are 308 and 277, respectively. Of the 120 multi-chain proteins, 77 are dimers, 3 are trimers, 22 are tetramers and the remaining 13 consist of five or more chains.
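The columns of the table above are related by the standard confusion-matrix formulas, sketched here together with the Matthews correlation coefficient (MCC) used in the preceding table. The function name is hypothetical. Because the reported values are means over proteins, recomputing them from the mean TP/FP/TN/FN rates is only an approximation and is not necessarily how the authors derived the table.

```python
import math

def binding_site_metrics(tp, fp, tn, fn):
    """Standard classification metrics from confusion-matrix counts or rates."""
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": tp / (tp + fp),
        "mcc": (tp * tn - fp * fn) / mcc_denominator,
    }

# Mean rates from the DEPTH row of the table (TP, FP, TN, FN).
depth_row = binding_site_metrics(tp=0.07, fp=0.07, tn=0.82, fn=0.04)
```

For the DEPTH row, the recomputed values land within roughly 0.01 of the tabulated sensitivity, specificity, accuracy and precision, and of the MCC of 0.50 reported for the full testing set.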
RMSD of predicted pKa from experimentally determined values, in pH units
| Residue type | Model pKa (pH units) | c0 | c1 | c2 | c3 | c4 | c5 | RMSD, training set (pH units; set size) | RMSD, testing set (pH units; set size) |
|---|---|---|---|---|---|---|---|---|---|
| ASP | 3.8 | −2.18 | 0.29 | 0.47 | −0.61 | 0.16 | −0.15 | 1.02 (112) | 0.71 (15) |
| GLU | 4.5 | −1.91 | −0.1 | 0.79 | −0.19 | 0.26 | −0.09 | 0.83 (125) | 1.07 (15) |
| HIS | 6.5 | 3.13 | −0.04 | −0.54 | 0.28 | −1.12 | −0.83 | 1.14 (60) | 1.26 (15) |
| LYS | 10.5 | 4.22 | −0.21 | −0.19 | −0.01 | −7.65 | −1.81 | 0.86 (70) | 0.80 (15) |
| Total | | | | | | | | 0.94 (367) | 0.96 (60) |
c0–c5 are the coefficients of the linear combination [Equation (7)].
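The RMSD columns above measure how far predicted pKa values fall from experimental ones. A minimal sketch of that calculation follows, using the usual root-mean-square definition; the function name and the pKa values are hypothetical and are not taken from the training or testing sets.

```python
import math

def rmsd(predicted, experimental):
    """Root-mean-square deviation between paired predicted and experimental values."""
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - e) ** 2 for p, e in zip(predicted, experimental)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical predicted and measured pKa values for two residues.
example_rmsd = rmsd(predicted=[4.5, 3.5], experimental=[3.5, 4.5])
```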
RMSDs of pKa prediction of DEPTH and other methods to experimentally determined values
| | MD/GB/TI with waters | MD/GB/TI without waters | PROPKA3.0 | Geom dep dielectric | Microenv SCP | EGAD | MCCE | QM/MM | DEPTH |
|---|---|---|---|---|---|---|---|---|---|
| ASP | 1.9 (4) | 1.3 (15) | 0.7 (15) | 0.8 (14) | 0.8 (12) | 0.8 (10) | 1.4 (12) | 0.3 (1) | 0.7 (15) |
| GLU | 1.9 (3) | 1.1 (15) | 1.0 (15) | 0.9 (14) | 0.7 (13) | 1.2 (8) | 0.9 (14) | 0.3 (4) | 1.1 (15) |
| HIS | 1.7 (7) | 1.9 (15) | 1.6 (15) | 1.3 (15) | 0.5 (9) | 1.4 (7) | 1.6 (9) | | 1.3 (15) |
| LYS | 2.5 (1) | 0.9 (15) | 0.7 (15) | 0.8 (9) | 0.6 (9) | | 1.1 (11) | | 0.8 (15) |
| Total | 1.9 (15) | 1.4 (60) | 1.1 (60) | 1.0 (52) | 0.7 (43) | 1.2 (25) | 1.3 (46) | 0.3 (5) | 1.0 (60) |
| P-value | <0.001 | <0.0001 | 0.48 | 0.01 | 0.02 | 0.04 | 0.45 | | |
The number of predictions is given in parentheses. The P-values listed are from a Wilcoxon paired signed-rank test comparing DEPTH with each of the other methods.
*Indicates a statistically significant difference.