Swagata Pahari1, Gen Li1, Adithya Krishna Murthy1, Siqi Liang2, Robert Fragoza2, Haiyuan Yu2, Emil Alexov1.
Abstract
Maintaining wild-type protein-protein interactions is essential for the normal function of the cell, and any mutation that alters their characteristics can cause disease. Therefore, the ability to correctly and quickly predict the effect of amino acid mutations is crucial for understanding disease effects and for carrying out genome-wide studies. Here, we report a new development of the SAAMBE method, SAAMBE-3D, a machine learning-based approach that is both accurate and extremely fast. It achieves a Pearson correlation coefficient ranging from 0.78 to 0.82, depending on the training protocol, in five-fold cross-validation benchmarking against the SKEMPI v2.0 database, and it outperforms currently existing algorithms on various blind tests. Furthermore, optimized and tested via five-fold cross-validation on the Cornell University dataset, SAAMBE-3D achieves an AUC of 1.0 and 0.96 on the homo- and hetero-dimer test datasets, respectively. Another important feature of SAAMBE-3D is its speed: a single prediction takes a fraction of a second. SAAMBE-3D is available as a web server and as stand-alone code; the latter allows other researchers to download the code and run it on their local computers. Taken together, SAAMBE-3D is accurate and fast software applicable to genome-wide studies assessing the effect of amino acid mutations on protein-protein interactions. The web server and the stand-alone codes (SAAMBE-3D for predicting the change of binding free energy and SAAMBE-3D-DN for predicting whether a mutation is disruptive or non-disruptive) are available.
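The headline figure in the abstract is a Pearson correlation between predicted and experimental ΔΔG values under five-fold cross-validation. The sketch below illustrates only how such a coefficient and a five-way split can be computed; it is not the authors' pipeline. The helper names (`pearson_r`, `five_fold_indices`) and the ΔΔG values are hypothetical, and the striding split is an assumption (the source does not describe how folds were drawn).

```python
import math

def pearson_r(predicted, experimental):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(predicted) != len(experimental) or len(predicted) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_e = sum(experimental) / n
    cov = sum((p - mean_p) * (e - mean_e) for p, e in zip(predicted, experimental))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_e = sum((e - mean_e) ** 2 for e in experimental)
    if var_p == 0 or var_e == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_p * var_e)

def five_fold_indices(n_samples):
    """Assign sample indices to five folds by striding (assumed, unshuffled)."""
    return [list(range(k, n_samples, 5)) for k in range(5)]

# Hypothetical ddG values in kcal/mol, invented for illustration only.
experimental_ddg = [0.3, 1.2, -0.5, 2.1, 0.0, 0.9, 1.7, -0.2, 0.6, 1.1]
predicted_ddg = [0.5, 1.0, -0.1, 1.6, 0.2, 0.7, 1.9, 0.1, 0.4, 1.3]

# Each fold serves once as the held-out test set.
for k, test_idx in enumerate(five_fold_indices(len(experimental_ddg))):
    print(f"fold {k}: held-out indices {test_idx}")

print(f"Pearson r over all samples: {pearson_r(predicted_ddg, experimental_ddg):.2f}")
```

In an actual cross-validation run, the model would be retrained on the four remaining folds before each held-out fold is predicted, and the correlation would be computed on the held-out predictions.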
Keywords: disruptive and non-disruptive mutation; machine learning; protein–protein binding; stabilizing and destabilizing mutation
Year: 2020 PMID: 32272725 PMCID: PMC7177817 DOI: 10.3390/ijms21072563
Source DB: PubMed Journal: Int J Mol Sci ISSN: 1422-0067 Impact factor: 5.923
Figure 1. Prediction performance of SAAMBE-3D on dataset-1 for (a) 20% mutations as a test set and (b) 10% mutations as a test set, and on dataset-2 for (c) 20% mutations as a test set and (d) 10% mutations as a test set.
Performance of SAAMBE-3D in predicting stabilizing, destabilizing, highly stabilizing and highly destabilizing mutations.
| Metric | Stabilizing and Destabilizing | Highly Stabilizing and Destabilizing |
|---|---|---|
| AUC | 0.75 | 0.99 |
| Sensitivity | 0.82 | 0.95 |
| Specificity | 0.53 | 1 |
| Precision | 0.86 | 1 |
| Accuracy | 0.76 | 0.96 |
| MCC | 0.34 | 0.82 |
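All metrics in this table except AUC follow from the four confusion-matrix counts. The following is a minimal sketch of those standard definitions, assuming a binary labelling with one class treated as "positive"; the source does not state which class that is, and the counts used here are hypothetical, not taken from the paper. AUC is omitted because it requires the ranked prediction scores rather than thresholded labels.

```python
import math

def classification_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts.

    Assumes both classes are present and at least one positive prediction
    was made, so the ratio denominators are non-zero.
    """
    sensitivity = tp / (tp + fn)            # true positive rate (recall)
    specificity = tn / (tn + fp)            # true negative rate
    precision = tp / (tp + fp)              # positive predictive value
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Matthews correlation coefficient; defined as 0 when the denominator is 0.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "mcc": mcc,
    }

# Hypothetical counts, chosen only to illustrate the arithmetic.
metrics = classification_metrics(tp=8, fp=4, tn=6, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Because MCC uses all four counts, it can stay modest when specificity is low even if sensitivity and accuracy look reasonable, which is one possible reading of the gap between the two columns above.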
Figure 2. Receiver operating characteristic (ROC) curves for predicting stabilizing/destabilizing and highly stabilizing/highly destabilizing mutations.
Figure 3. Correlation between predicted and experimental ΔΔG over single mutations on (a) both dataset A and B using SAAMBE-3D, (b) dataset A using mCSM-PPI2, and (c) dataset B using MutaBind2.
Figure 4. Performance of SAAMBE-3D, mCSM-PPI2 and MutaBind2 on different classes of mutations.
Figure 5. Complex-specific performance of SAAMBE-3D, mCSM-PPI2 and MutaBind2.
Figure 6. Performance of SAAMBE-3D on three blind test sets (MDM2-p53, NM dataset and s487 dataset).
Figure 7. Prediction of disruptive and non-disruptive mutations for homodimers and heterodimers.
Performance of SAAMBE-3D-DN in predicting disruptive and non-disruptive mutations for both homo- and hetero-dimers.
| Metric | Homo-Dimer | Hetero-Dimer |
|---|---|---|
| AUC | 1 | 0.96 |
| Sensitivity | 0.99 | 1 |
| Specificity | 0.99 | 0.95 |
| Precision | 1 | 0.8 |
| Accuracy | 1 | 0.96 |
Comparison of time of calculation for a single prediction between different ΔΔG predictors.
| Method | Time of Calculation |
|---|---|
| mCSM-PPI2 | 42 seconds |
| SAAMBE-3D | 0.21 seconds |
| MutaBind2 | 10 minutes |
| BindProfX | 50 minutes |
| BeAtMuSiC | 2 seconds |
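To put these per-prediction times in the context of genome-wide use, the sketch below extrapolates them to a batch of mutations. The per-prediction times come from the table above; the batch size of 100,000 mutations is a hypothetical figure chosen for illustration, and the estimate assumes strictly serial execution with no queueing, upload, or parallelism effects.

```python
# Per-prediction times from the table above, converted to seconds.
SECONDS_PER_PREDICTION = {
    "mCSM-PPI2": 42.0,
    "SAAMBE-3D": 0.21,
    "MutaBind2": 10 * 60.0,
    "BindProfX": 50 * 60.0,
    "BeAtMuSiC": 2.0,
}

def estimated_hours(method, n_mutations):
    """Serial wall-clock estimate in hours for n_mutations predictions."""
    return SECONDS_PER_PREDICTION[method] * n_mutations / 3600.0

N_MUTATIONS = 100_000  # hypothetical batch size, not a figure from the paper

# List methods from fastest to slowest per prediction.
for method in sorted(SECONDS_PER_PREDICTION, key=SECONDS_PER_PREDICTION.get):
    hours = estimated_hours(method, N_MUTATIONS)
    print(f"{method:10s} {hours:12.1f} h")
```

Under these assumptions the ordering of methods is the same as in the table, and the ratio between any two total run times equals the ratio of their per-prediction times.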
Figure 8. Importance level of each feature selected for SAAMBE-3D.