M Walder, E Edelstein, M Carroll, S Lazarev, J E Fajardo, A Fiser, R Viswanathan.
Abstract
BACKGROUND: Identifying protein interfaces can inform how proteins interact with their binding partners, uncover the regulatory mechanisms that control biological functions, and guide the development of novel therapeutic agents. A variety of computational approaches have been developed for predicting a protein's interfacial residues from its known sequence and structure. Methods using the known three-dimensional structures of proteins can be template-based or template-free. Template-based methods have limited success in predicting interfaces when homologues with known complex structures are not available to use as templates. The prediction performance of template-free methods, which rely only upon proteins' intrinsic properties, is limited by the number of biologically relevant features that can be included in an interface prediction model.
Keywords: Interface prediction; Protein–protein interaction; Structure-based method
Year: 2022 PMID: 35879651 PMCID: PMC9316365 DOI: 10.1186/s12859-022-04852-2
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.307
Fig. 1 Flowchart of ISPIP Methodology: ISPIP’s classification models are generated through training on the interface likelihoods of the three input predictors. (Created with BioRender.com)
ISPIP predictive enhancement by single-threshold metrics
| Classifier | Average F-score | Average MCC |
|---|---|---|
| PredUs 2.0 | 0.400 | 0.351 |
| ISPRED4 | 0.405 | 0.355 |
| DockPred | 0.380 | 0.324 |
| Linear regression | 0.470 | 0.433 |
| Logistic regression | 0.469 | 0.433 |
Fig. 2 Enhanced prediction as ISPIP model evolves: (A) The PR curves of the 3 input methods indicate that PredUs 2.0 and ISPRED4 perform slightly better than DockPred. (B) All the ISPIP models significantly outperform the input predictors, and PR-AUC is boosted as the model evolves from simple linear regression to more complex ensemble decision tree algorithms.
Increased performance with ISPIP model evolution
| ISPIP model | Average F-score | Average MCC |
|---|---|---|
| Random forest | 0.490 | 0.458 |
| XGBoost | 0.516 | 0.487 |
Fig. 3 ISPIP consensus prediction of interface residues: On the left, the structure (1YPI.A) is shown. In the middle, the interface predictions of the 3 input classifiers are displayed. On the right, the ISPIP consensus prediction includes overlapping and unique TP residues of the input classifiers to yield an improved interface prediction of 19 TP out of the 23 annotated residues.
Set A vs Set B model performance
| Classifier | Set A: Average MCC | Set B: Average MCC |
|---|---|---|
| DockPred | 0.324 | 0.336 |
| XGBoost | 0.487 | 0.495 |
Fig. 4 ISPIP outperforms other structure-based classifiers and meta-predictors: The PR curves highlight ISPIP’s improved performance over a complex structure-based classifier (VORFFIP) and a previous meta-predictor (meta-PPISP).
Fig. 5 ISPIP is robust to poor performance of an input classifier: On the left, the structure (1CP2.A) is shown. In the middle, the interface predictions of the 3 input classifiers are displayed. PredUs 2.0 has an especially poor prediction relative to the other 2 input classifiers. On the right, ISPIP has a robust consensus prediction with 10 TP out of the 13 annotated residues, despite the poor performance of the PredUs 2.0 input classifier.
Optimized regression parameters for Set A proteins
| Regression model | PredUs 2.0 | ISPRED4 | DockPred |
|---|---|---|---|
| Linear | 0.196 | 0.313 | 0.313 |
| Logistic | 1.28 | 2.821 | 1.424 |
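As a rough illustration of how per-predictor regression parameters like those in the table could combine three interface likelihoods into one score, here is a minimal sketch. It is not the authors' implementation. The function names, the dictionary keys, and the `intercept` argument are hypothetical; the table lists no intercept, so the default of `0.0` is only a placeholder assumption.

```python
import math

# Coefficients copied from the "Optimized regression parameters" table (Set A).
LINEAR_WEIGHTS = {"predus": 0.196, "ispred4": 0.313, "dockpred": 0.313}
LOGISTIC_WEIGHTS = {"predus": 1.28, "ispred4": 2.821, "dockpred": 1.424}


def weighted_sum(scores, weights, intercept=0.0):
    """Weighted sum of per-residue interface likelihoods.

    `intercept` is an assumption: the table does not report one.
    """
    return intercept + sum(weights[name] * scores[name] for name in weights)


def linear_score(scores, intercept=0.0):
    """Linear-regression-style combined score (hypothetical helper)."""
    return weighted_sum(scores, LINEAR_WEIGHTS, intercept)


def logistic_score(scores, intercept=0.0):
    """Logistic-regression-style combined score in (0, 1) (hypothetical helper)."""
    z = weighted_sum(scores, LOGISTIC_WEIGHTS, intercept)
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    # Made-up likelihoods for a single residue, for illustration only.
    residue = {"predus": 0.2, "ispred4": 0.7, "dockpred": 0.5}
    print(linear_score(residue), logistic_score(residue))
```

A residue would then be labeled interface or non-interface by comparing the combined score against a threshold, which this sketch leaves to the caller.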
Binary classifier evaluation metrics
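| Precision = $\frac{TP}{TP + FP}$ |
| Recall = True Positive Rate (TPR) = $\frac{TP}{TP + FN}$ |
| False Positive Rate (FPR) = $\frac{FP}{FP + TN}$ |
| F-Score = $\frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$ |
| MCC = $\frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}$ |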
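The metrics named in this table can be computed from confusion-matrix counts (true positives TP, false positives FP, true negatives TN, false negatives FN) using their standard definitions. The sketch below assumes those standard definitions; the function name `binary_metrics` and the choice to return `0.0` when a denominator is zero are illustrative conventions, not taken from the paper.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Standard single-threshold metrics from confusion-matrix counts.

    Any ratio with a zero denominator is reported as 0.0 (a convention
    chosen for this sketch).
    """
    def ratio(num, den):
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)  # also the true positive rate (TPR)
    fpr = ratio(fp, fp + tn)     # false positive rate
    f_score = ratio(2 * precision * recall, precision + recall)
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = ratio(tp * tn - fp * fn, mcc_den)
    return {
        "precision": precision,
        "recall": recall,
        "fpr": fpr,
        "f_score": f_score,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Toy counts chosen for easy arithmetic, not data from the study.
    print(binary_metrics(tp=8, fp=2, tn=8, fn=2))
```

For example, a prediction that recovers 19 of 23 annotated interface residues has TP = 19 and FN = 4, so its recall is 19/23, or about 0.83; the other metrics also need the FP and TN counts.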