Douglas E V Pires, Tom L Blundell, David B Ascher.
Abstract
The ability to predict how a mutation affects ligand binding is an essential step in understanding, anticipating and improving the design of new treatments for drug resistance, and in understanding genetic diseases. Here we present mCSM-lig, a structure-guided computational approach for quantifying the effects of single-point missense mutations on affinities of small molecules for proteins. mCSM-lig uses graph-based signatures to represent the wild-type environment of mutations, together with small-molecule chemical features and changes in protein stability, as evidence to train a predictive model on a representative set of protein-ligand complexes from the Platinum database. We show that our method correlates well with experimental data (up to ρ = 0.67) and is effective in predicting a range of chemotherapeutic, antiviral and antibiotic resistance mutations, providing useful insights for genotypic screening and for guiding drug development. mCSM-lig also offers insight into Mendelian disease mutations and can serve as a tool for guiding protein design. mCSM-lig is freely available as a web server at http://structure.bioc.cam.ac.uk/mcsm_lig.
Year: 2016 PMID: 27384129 PMCID: PMC4935856 DOI: 10.1038/srep29575
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Predicting the impacts of mutations on protein-ligand affinities using mCSM-lig.
This workflow highlights the main steps of the methodology and how the components of the signature are computed. The example is the engineered lipocalin FluA binding to fluorescein (PDB ID: 1N0S, ligand ID: FLU), considering the mutation W129H. Given a mutation site in a wild-type protein, its structural environment is extracted and the distance patterns among the atoms are summarized in the mCSM-lig signature. To account for the change in atom types due to the mutation, a pharmacophore count is performed for the wild-type and mutant residues. The changes in pharmacophore count, the ligand physicochemical properties and the estimated changes in protein stability are then appended to the signature, which is used to train and test predictive models. This figure was created using yEd 3.14.3 (https://www.yworks.com/products/yed).
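The signature assembly described in the caption can be sketched roughly as follows. This is a minimal illustration, not the mCSM-lig implementation: the three atom classes, the distance cutoffs, the coordinates, the pharmacophore counts, the ligand properties and the stability value are all hypothetical placeholders, and the function names are invented for this example.

```python
from itertools import combinations_with_replacement
from math import dist

# Hypothetical atom classes, chosen only for this illustration.
CLASSES = ("hydrophobic", "donor", "acceptor")


def distance_signature(atoms, cutoffs):
    """Count atom pairs, per class pair, lying within each distance cutoff.

    atoms: list of (class_label, (x, y, z)) tuples for the residue environment.
    cutoffs: increasing distances in angstroms.
    Returns cumulative counts ordered by class pair, then by cutoff.
    """
    pairs = list(combinations_with_replacement(CLASSES, 2))
    counts = {(pair, c): 0 for pair in pairs for c in cutoffs}
    for i in range(len(atoms)):
        for j in range(i + 1, len(atoms)):
            (class_i, pos_i), (class_j, pos_j) = atoms[i], atoms[j]
            # Order the two labels so (donor, hydrophobic) maps to the same key
            # as (hydrophobic, donor).
            key = tuple(sorted((class_i, class_j), key=CLASSES.index))
            d = dist(pos_i, pos_j)
            for c in cutoffs:
                if d <= c:
                    counts[(key, c)] += 1
    return [counts[(pair, c)] for pair in pairs for c in cutoffs]


def pharmacophore_delta(wild_counts, mutant_counts):
    """Per-class change in pharmacophore count (mutant minus wild type)."""
    return [mutant_counts.get(c, 0) - wild_counts.get(c, 0) for c in CLASSES]


def build_feature_vector(signature, delta, ligand_properties, stability_change):
    """Append the extra evidence to the distance-pattern signature."""
    return list(signature) + list(delta) + list(ligand_properties) + [stability_change]


# Toy environment: three atoms with pairwise distances 3, 4 and 5 angstroms.
atoms = [
    ("hydrophobic", (0, 0, 0)),
    ("donor", (3, 0, 0)),
    ("acceptor", (0, 4, 0)),
]
signature = distance_signature(atoms, cutoffs=(3.5, 6.0))

# Placeholder counts, not real residue pharmacophore values.
delta = pharmacophore_delta(
    {"hydrophobic": 5, "donor": 1},
    {"hydrophobic": 2, "donor": 1, "acceptor": 1},
)

# Placeholder ligand properties and stability change.
features = build_feature_vector(signature, delta, [1.0, 2.0], -0.5)
```

The counts are cumulative, so each pair is counted at every cutoff it falls within; this keeps the vector length fixed regardless of how many atoms the environment contains.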
Performance of the computational approach on regression tasks.
| Feature | ρ | Std. Error |
|---|---|---|
| Pharmacophore difference | 0.181 | 2.604 |
| Stability prediction | 0.188 | 2.584 |
| Ligand properties | 0.255 | 2.545 |
| Graph-based signatures | 0.569 | 2.167 |
| mCSM-lig | 0.628 | 2.059 |
mCSM-lig denotes the combination of the features listed above and shows a significant improvement in performance over any individual feature. Performance was assessed by 10-fold cross-validation using a Gaussian Process.
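The evaluation protocol named in the note (10-fold cross-validation with a Gaussian Process regressor, scored by pooled correlation) could look roughly like the sketch below. It is written with the standard library only and makes several assumptions that the source does not state: a squared-exponential kernel with fixed length scale, a fixed noise term, the training mean as prior mean, and folds assigned by index modulo k. The data are synthetic.

```python
import math


def rbf(a, b, length_scale=1.0):
    """Squared-exponential kernel between two feature vectors (assumed kernel)."""
    sq = sum((u - v) ** 2 for u, v in zip(a, b))
    return math.exp(-sq / (2.0 * length_scale ** 2))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def gp_predict(X_train, y_train, X_test, noise=1e-2):
    """GP posterior mean, using the training mean as the prior mean."""
    mean = sum(y_train) / len(y_train)
    K = [[rbf(a, b) + (noise if i == j else 0.0)
          for j, b in enumerate(X_train)]
         for i, a in enumerate(X_train)]
    alpha = solve(K, [y - mean for y in y_train])
    return [mean + sum(rbf(x, a) * w for a, w in zip(X_train, alpha))
            for x in X_test]


def pearson(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)


def cross_validate(X, y, k=10):
    """Predict every sample once from a model trained on the other folds."""
    preds = [None] * len(y)
    for fold in range(k):
        test = [i for i in range(len(y)) if i % k == fold]
        train = [i for i in range(len(y)) if i % k != fold]
        out = gp_predict([X[i] for i in train], [y[i] for i in train],
                         [X[i] for i in test])
        for i, p in zip(test, out):
            preds[i] = p
    return preds, pearson(y, preds)


# Synthetic one-dimensional data standing in for signature vectors.
X = [[0.25 * i] for i in range(40)]
y = [math.sin(x[0]) for x in X]
preds, r = cross_validate(X, y)
```

Pooling the held-out predictions before computing the correlation gives a single coefficient over the whole data set, which is one common way to report cross-validated performance; whether the authors pooled or averaged per fold is not stated here.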
Performance of mCSM-lig after outlier removal for the complete set of mutations and for those in contact with the ligand.
| % of the data set used | Full data set (n = 763) | Distance to ligand ≤5 Å (n = 545) |
|---|---|---|
| 100% | ρ = 0.627 | ρ = 0.674 |
| 95% | ρ = 0.699 | ρ = 0.729 |
| 90% | ρ = 0.737 | ρ = 0.769 |
| 80% | ρ = 0.801 | ρ = 0.824 |
Figure 2. Regression plot between experimental and predicted effects of mutation on ligand affinity on the full data set (763 mutations, left graph) and on mutated residues close to the ligand (545 mutations ≤5 Å, right graph).
mCSM-lig achieved a Pearson correlation coefficient of ρ = 0.627 over the entire data set and ρ = 0.674 for those mutations close to the ligand, with this correlation improving to ρ = 0.737 and ρ = 0.769 respectively after 10% outlier removal. Outliers are depicted in red.
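The effect of outlier removal on the correlation can be illustrated with a short sketch. The source does not say how outliers were identified; the code below assumes they are the points with the largest absolute difference between experimental and predicted values. The data are made up for the example.

```python
import math


def pearson(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)


def drop_outliers(experimental, predicted, fraction):
    """Remove the given fraction of points with the largest absolute error.

    Assumption: an outlier is a point with a large |experimental - predicted|.
    """
    n_drop = int(round(fraction * len(experimental)))
    order = sorted(range(len(experimental)),
                   key=lambda i: abs(experimental[i] - predicted[i]))
    keep = sorted(order[:len(order) - n_drop])
    return [experimental[i] for i in keep], [predicted[i] for i in keep]


# Made-up data: perfect predictions except for one badly wrong point.
experimental = [float(i) for i in range(10)]
predicted = experimental[:]
predicted[9] = -5.0

r_full = pearson(experimental, predicted)
kept_exp, kept_pred = drop_outliers(experimental, predicted, fraction=0.10)
r_trimmed = pearson(kept_exp, kept_pred)
```

Because the removed points are chosen by their error, the trimmed correlation is an optimistic figure and is best read alongside the value for the full data set, as the caption does.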