Douglas E V Pires, Carlos H M Rodrigues, David B Ascher.
Abstract
Significant efforts have been invested into understanding and predicting the molecular consequences of mutations in protein coding regions; however, nearly all approaches have been developed using globular, soluble proteins. These methods have been shown to translate poorly to membrane proteins. To fill this gap, we report mCSM-membrane, a user-friendly web server for analysing the impact of mutations on membrane protein stability and the likelihood that they are disease associated. mCSM-membrane derives from our well-established mutation modelling approach, which uses graph-based signatures to model protein geometry and physicochemical properties for supervised learning. Our stability predictor achieved Pearson's correlations of up to 0.72 and 0.67 (on cross validation and blind tests, respectively), while our pathogenicity predictor achieved a Matthews correlation coefficient (MCC) of up to 0.77 and 0.73, outperforming previously described methods both in predicting changes in stability and in identifying pathogenic variants. mCSM-membrane will be an invaluable and dedicated resource for investigating the effects of single-point mutations on membrane proteins, freely available as a user-friendly web server at http://biosig.unimelb.edu.au/mcsm_membrane.
Year: 2020 PMID: 32469063 PMCID: PMC7319563 DOI: 10.1093/nar/gkaa416
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. mCSM-membrane workflow. The first methodological step in mCSM-membrane was data collection. Experimentally validated effects of mutations on protein stability and pathogenicity were obtained for transmembrane proteins with available structures. During feature engineering, three main classes of features were generated: (i) graph-based signatures of the wild-type residue environment, (ii) pharmacophore modelling of mutation effects (together with sequence-based properties) and (iii) the inter-residue interactions established. These were then used as evidence to train and test supervised learning algorithms. Random Forest for classification and Extra Trees for regression performed best and were therefore selected.
Figure 2. Performance evaluation of mCSM-membrane on cross validation and blind tests. (A) shows the performance of mCSM-membrane in predicting the effects of mutations on stability for transmembrane proteins during 10-fold cross validation, achieving a Pearson's correlation of 0.72 (0.83 on 90% of the data). On the blind test (B), mCSM-membrane achieved a correlation of 0.67 with experimental data. For the pathogenicity predictor, (C) and (D) show the performance of mCSM-membrane in comparison with well-established methods as ROC plots on cross validation and blind test, respectively. Our method achieved AUCs of 0.89 and 0.95 on cross validation and blind test, respectively.
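The two regression metrics reported for the stability predictor, Pearson's correlation and root-mean-square error (RMSE), can be computed as in the minimal sketch below. The function names and the example values are illustrative assumptions, not data or code from mCSM-membrane.

```python
import math


def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


if __name__ == "__main__":
    # Made-up stability changes, for illustration only.
    experimental = [-1.2, 0.3, -0.8, -2.1, 0.5]
    predicted = [-0.9, 0.1, -1.1, -1.6, 0.2]
    print(f"Pearson r = {pearson_r(experimental, predicted):.2f}")
    print(f"RMSE      = {rmse(experimental, predicted):.2f}")
```

A higher correlation and a lower RMSE both indicate closer agreement with experiment, which is why the comparison tables report the two side by side.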
Comparative performance of mCSM-membrane across training and test data sets with alternative stability predictors
| Method | Pearson's correlation (training) | RMSE (training) | Pearson's correlation (test) | RMSE (test) |
|---|---|---|---|---|
| FoldX | 0.48* | 1.18 | 0.57 | 1.25 |
| iMutant | 0.27* | 1.29 | 0.37* | 1.41 |
| CUPSAT | 0.01* | 1.34 | 0.15* | 1.50 |
| AUTOMUTE (RepTree) | 0.17* | 1.32 | 0.05* | 1.52 |
| AUTOMUTE (SVM) | 0.14* | 1.33 | 0.04* | 1.52 |
| MAESTRO | 0.20* | 1.16 | 0.17* | 1.09 |
| SDM | 0.01* | 1.34 | −0.14* | 1.51 |
| mCSM | 0.21* | 1.31 | 0.59 | 1.23 |
| DUET | 0.18* | 1.32 | 0.47* | 1.34 |
| Dynamut | 0.31* | 1.27 | 0.62 | 1.19 |
| mCSM-membrane | 0.72 | 0.93 | 0.67 | 1.13 |
*P-value < 0.05 by Fisher r-to-z transformation test compared to mCSM-membrane
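The Fisher r-to-z comparison named in the footnote can be sketched as follows. This is the textbook form for two correlations from independent samples; the sample sizes used in the example are hypothetical, since the data set sizes are not stated here, and the function name is an assumption rather than the authors' implementation.

```python
import math


def fisher_r_to_z_test(r1, n1, r2, n2):
    """Two-sided test for a difference between two correlation coefficients.

    Uses the Fisher r-to-z transformation and assumes independent samples.
    Returns the z statistic and the two-sided p-value.
    """
    if n1 <= 3 or n2 <= 3:
        raise ValueError("each sample needs more than 3 observations")
    z1 = math.atanh(r1)  # Fisher transformation of r1
    z2 = math.atanh(r2)  # Fisher transformation of r2
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / standard_error
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


if __name__ == "__main__":
    # Correlations taken from the training columns of the table above;
    # n = 200 is a hypothetical sample size chosen only for illustration.
    z, p = fisher_r_to_z_test(0.72, 200, 0.48, 200)
    print(f"z = {z:.2f}, p = {p:.4g}")
```

Because the p-value depends on the sample size, the same pair of correlations can be significant on a large data set and not on a small one.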
Performance assessment of mCSM-membrane in predicting pathogenic mutations across training and test data sets, in comparison with alternative methods.
| Method | AUC (training) | F1 (training) | MCC (training) | AUC (test) | F1 (test) | MCC (test) |
|---|---|---|---|---|---|---|
| PolyPhen2 | 0.79 | 0.79 | 0.47 | 0.73 | 0.75 | 0.40 |
| SIFT | 0.80 | 0.77 | 0.43 | 0.82 | 0.84 | 0.63 |
| PROVEAN | 0.80 | 0.79 | 0.48 | 0.79 | 0.75 | 0.40 |
| SNAP2 | 0.67 | 0.70 | 0.26 | 0.73 | 0.66 | 0.21 |
| MutPred2 | 0.75 | 0.79 | 0.48 | 0.75 | 0.82 | 0.57 |
| PON-P2 | 0.83 | 0.71 | 0.38 | 0.88 | 0.78 | 0.53 |
| BORODA-TM* | - - - | 0.96 | 0.87 | - - - | 0.78 | 0.46 |
| mCSM-membrane | 0.89 | 0.91 | 0.77 | 0.95 | 0.89 | 0.73 |
*AUC values were not calculated for BORODA-TM as no scores, rankings or class probabilities were available.
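F1 and MCC only need the predicted class labels, whereas AUC needs a score or ranking per variant, which is why it could not be computed for BORODA-TM. The sketch below computes both from confusion-matrix counts using their standard definitions; the function names and the example counts are hypothetical and are not taken from the study.

```python
import math


def f1_score(tp, fp, fn):
    """F1 score: harmonic mean of precision and recall."""
    denominator = 2 * tp + fp + fn
    # Conventionally 0.0 when there are no positives at all.
    return 2 * tp / denominator if denominator else 0.0


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Conventionally 0.0 when any marginal total is zero.
    return numerator / denominator if denominator else 0.0


if __name__ == "__main__":
    # Hypothetical counts: pathogenic variants are the positive class.
    tp, tn, fp, fn = 90, 80, 20, 10
    print(f"F1  = {f1_score(tp, fp, fn):.2f}")
    print(f"MCC = {matthews_corrcoef(tp, tn, fp, fn):.2f}")
```

MCC uses all four cells of the confusion matrix, so it is less easily inflated by class imbalance than F1, which ignores true negatives.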