Ning Zhang, Yuting Chen, Haoyu Lu, Feiyang Zhao, Roberto Vera Alvarez, Alexander Goncearenco, Anna R Panchenko, Minghui Li.
Abstract
Missense mutations may affect proteostasis by destabilizing or over-stabilizing protein complexes and changing the pathway flux. Predicting the effects of stabilizing mutations on protein-protein interactions is notoriously difficult because existing experimental sets are skewed toward mutations that reduce protein-protein binding affinity, and many computational methods fail to evaluate their effects correctly. To address this issue, we developed a method, MutaBind2, which estimates the impacts of single as well as multiple mutations on protein-protein interactions. MutaBind2 employs only seven features, and the most important of them describe interactions of proteins with the solvent, evolutionary conservation of the site, and thermodynamic stability of the complex and each monomer. This approach shows a distinct improvement, especially in evaluating the effects of mutations that increase binding affinity. MutaBind2 can be used for finding disease driver mutations, designing stable protein complexes, and discovering new protein-protein interaction inhibitors.
Keywords: 3D Reconstruction of Protein; Bioinformatics; Protein Folding
Year: 2020 PMID: 32169820 PMCID: PMC7068639 DOI: 10.1016/j.isci.2020.100939
Source DB: PubMed Journal: iScience ISSN: 2589-0042
Figure 1. Pearson Correlation Coefficients between Experimental and Calculated ΔΔG Values for Three Types of Cross-Validation Tests on the S4191 (Single Mutations) and M1707 (Multiple Mutations) Sets
See also Table S1.
Comparison of Methods' Performance for Single and Multiple Mutations
| Training/Test Set | Model | R (All Mutations) | RMSE (All Mutations) | Slope (All Mutations) | R (Decreasing) | RMSE (Decreasing) | R (Increasing) | RMSE (Increasing) |
|---|---|---|---|---|---|---|---|---|
| Skempi + Reverse/S1748 | MutaBind2ˆ | 0.63 | 1.25 | 0.83 | 0.45 | 1.17 | 0.77 | 1.52 |
| Skempi/S1748 | MutaBind | 0.38∗ | 1.51 | 0.72 | 0.44 | 1.11 | – | 2.43 |
| | BeAtMuSiC | 0.30∗ | 1.58 | 0.55 | 0.43 | 1.14 | −0.25∗ | 2.57 |
| Test: S1748 | FoldX | 0.42∗ | 1.57 | 0.52 | 0.41 | 1.37 | 0.26∗ | 2.12 |
| Test: S4191 | MutaBind2 CV4 | 0.76 | 1.34 | 1.11 | 0.61 | 1.31 | 0.67 | 1.39 |
| | MutaBind2 CV5 | 0.69 | 1.50 | 1.18 | 0.54 | 1.41 | 0.47 | 1.65 |
| Test: M1707 | MutaBind2 CV4 | 0.74 | 2.13 | 1.09 | 0.51 | 2.04 | 0.60 | 2.26 |
| | MutaBind2 CV5 | 0.71 | 2.24 | 1.00 | 0.47 | 2.18 | 0.56 | 2.33 |
| Test: M1337 | FoldX | 0.49 | 2.43 | 0.52 | 0.37 | 2.49 | 0.24 | 2.21 |
MutaBind2ˆ: MutaBind2 was retrained on “Skempi + Reverse” set.
∗Significant difference between MutaBind2 and other methods with p value < 0.01 calculated on a test set S1748 (implemented in R package cocor).
R, Pearson correlation coefficient between experimental and predicted ΔΔG values; RMSE (kcal mol−1), root-mean-square error, the standard deviation of the residuals (prediction errors); Slope, the slope of the regression line between experimental and predicted ΔΔG values. All presented values of correlation coefficients are statistically significantly different from zero (p value << 0.01). The details about datasets are shown in Table S1.
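The three statistics defined in this footnote can be sketched in a few lines of Python. This is a minimal illustration, not the authors' evaluation code: the function name `regression_metrics` is hypothetical, RMSE is taken as the square root of the mean squared prediction error, and the slope assumes that experimental values are regressed on predicted values, which the footnote does not specify.

```python
import math


def regression_metrics(experimental, predicted):
    """Return (R, RMSE, slope) for paired ddG values in kcal/mol."""
    n = len(experimental)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")

    mean_e = sum(experimental) / n
    mean_p = sum(predicted) / n

    # Centered sums of squares and cross-products.
    s_ee = sum((e - mean_e) ** 2 for e in experimental)
    s_pp = sum((p - mean_p) ** 2 for p in predicted)
    s_ep = sum((e - mean_e) * (p - mean_p)
               for e, p in zip(experimental, predicted))

    # Pearson correlation coefficient.
    r = s_ep / math.sqrt(s_ee * s_pp)

    # Root-mean-square error of the predictions.
    rmse = math.sqrt(
        sum((e - p) ** 2 for e, p in zip(experimental, predicted)) / n
    )

    # Assumption: experimental (y) regressed on predicted (x).
    slope = s_ep / s_pp
    return r, rmse, slope
```

Under this assumed convention, a slope below 1 means the predicted values are spread more widely than the experimental ones, and a slope above 1 means they are compressed.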
Figure 2. Experimental and Predicted Values of Changes in Binding Affinity for All Mutations in the S4191 (Single Mutations) and M1707 (Multiple Mutations) Sets Using “Leave-One-Complex-Out” (CV4) Cross-Validation
See also Table S1.
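A "leave-one-complex-out" split can be sketched as follows, assuming it means that all mutations belonging to one protein complex are held out together while the model is trained on the remaining complexes. The function name and the complex identifiers are hypothetical, and the exact fold construction used for CV4 is not described in this record.

```python
from collections import defaultdict


def leave_one_complex_out(complex_ids):
    """Yield (held_out_complex, train_indices, test_indices) per complex.

    complex_ids[i] is the (hypothetical) complex label of mutation i.
    """
    # Group mutation indices by the complex they belong to.
    groups = defaultdict(list)
    for index, complex_id in enumerate(complex_ids):
        groups[complex_id].append(index)

    # Each complex is held out once; every other complex is used for training.
    for held_out, test_indices in groups.items():
        train_indices = [
            i
            for complex_id, indices in groups.items()
            if complex_id != held_out
            for i in indices
        ]
        yield held_out, train_indices, test_indices
```

Holding out whole complexes keeps mutations from the same complex out of both the training and the test fold at once, which a random per-mutation split would not guarantee.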
Comparison of Methods' Performance on Different Datasets
| Test Set | Method | R | RMSE |
|---|---|---|---|
| S487 | MutaBind2 | 0.41 | 1.25 |
| | MutaBind | 0.29∗∗ | 1.63 |
| | BeAtMuSiC | 0.35 | 1.28 |
| | FoldX | 0.34∗ | 1.53 |
| | iSEE | 0.25∗∗ | 1.32 |
| S8338 | MutaBind2 CV4 | 0.74 | 1.37 |
| | MutaBind2 CV5 | 0.66 | 1.53 |
| | mCSM-PPI2 CV4 | 0.75 | 1.30 |
| | mCSM-PPI2 CV5 | 0.67 | 1.39 |
∗ and ∗∗ indicate statistically significant difference between MutaBind2 and other methods in terms of R with p value < 0.05 and p value < 0.01, respectively, calculated on test set S487 (implemented in R package cocor).
R, Pearson correlation coefficient; RMSE, root-mean-square error.
R and RMSE values were taken from the mCSM-PPI2 article (Rodrigues et al., 2019). For testing on S487 set, MutaBind2 was retrained after removing S487 from the training dataset. For testing on S8338 set, MutaBind2 was retrained on S8338. See also Table S6.
Figure 3. Receiver Operating Characteristic Curves for Predicting Mutations Disrupting Protein-Protein Interactions Using Different Methods
As one mutation/interaction could be mapped to several Protein DataBank structures, the maximum predicted value of each method was used for each interaction-disruptive mutation and the minimum predicted values were used for those mutations that do not disrupt interactions. More details are shown in Figure S5.
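The aggregation rule described in this caption can be illustrated with a short sketch: when one mutation maps to several structures, the maximum predicted value is kept for interaction-disruptive mutations and the minimum for non-disruptive ones. The function name, the tuple layout, and the mutation identifiers are hypothetical.

```python
def aggregate_scores(predictions):
    """Collapse per-structure predictions into one score per mutation.

    predictions: iterable of (mutation_id, is_disruptive, predicted_value),
    with one entry per structure the mutation was mapped to.
    """
    grouped = {}
    for mutation_id, is_disruptive, value in predictions:
        grouped.setdefault((mutation_id, is_disruptive), []).append(value)

    # Maximum for disruptive mutations, minimum for non-disruptive ones.
    return {
        mutation_id: max(values) if is_disruptive else min(values)
        for (mutation_id, is_disruptive), values in grouped.items()
    }
```

For example, a disruptive mutation scored 0.4 and 1.9 on two structures would be represented by 1.9, while a non-disruptive mutation scored 0.2 and 1.1 would be represented by 0.2.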
Comparative Performance of MutaBind2 and Three Methods for Predicting Mutations Highly Decreasing and Increasing Binding Affinity on the Independent Test Set of S1748
| Mutation Class | Metric | MutaBind2 | MutaBind | BeAtMuSiC | FoldX |
|---|---|---|---|---|---|
| Highly decreasing | Sensitivity | 0.75 | 0.86 | 0.73 | 0.58 |
| | Specificity | 0.89 | 0.82 | 0.87 | 0.94 |
| | MCC | 0.63 | 0.63 | 0.58 | 0.57 |
| | AUC | 0.82 | 0.82 | 0.79 | 0.79 |
| Highly increasing | Sensitivity | 0.55 | 0 | 0 | 0.44 |
| | Specificity | 0.99 | 1.00 | 1.00 | 0.99 |
| | MCC | 0.64 | −0.01 | 0 | 0.51 |
| | AUC | 0.86 | 0.65∗ | 0.56∗ | 0.74∗ |
MutaBind2 was retrained on the dataset “Skempi + Reverse.”
MCC, Matthews correlation coefficient.
∗p value < 0.01 calculated by Delong test (DeLong et al., 1988) comparing AUC (area under the ROC curve) produced by a given method and AUC produced by MutaBind2; points to significant differences in performance. See also Figure S6.
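Sensitivity, specificity, and MCC follow from the four confusion-matrix counts using their standard definitions. The sketch below is illustrative only: the function name is hypothetical, the ΔΔG thresholds that define "highly decreasing" or "highly increasing" mutations are not given in this record and are therefore not modeled, and returning 0 when a ratio is undefined is a common convention adopted here as an assumption.

```python
import math


def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, mcc) from confusion-matrix counts."""
    # Sensitivity: fraction of true positives that are recovered.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of true negatives that are recovered.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0

    # Matthews correlation coefficient.
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: MCC is 0 when the denominator vanishes,
    # e.g. when a method never predicts the positive class.
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return sensitivity, specificity, mcc
```

A method that never predicts the positive class has sensitivity 0 and specificity 1, and under the assumed convention its MCC is 0.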