Pier Luigi Martelli, Piero Fariselli, Castrense Savojardo, Giulia Babbi, Francesco Aggazio, Rita Casadio.
Abstract
BACKGROUND: Modern genomic techniques make it possible to associate several Mendelian human diseases with single-residue variations in different proteins. The molecular mechanisms that explain the relationship between genotype and phenotype are still under debate. The change in protein stability upon variation appears to be particularly relevant when annotating whether a single-residue substitution can be associated with a given disease. Thermodynamic data on human proteins and their disease-related variants are lacking. In the present work, we take advantage of the available three-dimensional structures of human proteins to predict the effect of disease-related variations on protein stability.
Keywords: Disease related-variations; Interactomics networks; Protein stability; Residue solvent accessibility
Year: 2016 PMID: 27356511 PMCID: PMC4928156 DOI: 10.1186/s12864-016-2726-y
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Performance of INPS3D and other state-of-the-art predictors
| Method | Cross-validation (2648 variations on 132 proteins) | Blind test set (351 variations on 60 proteins) | Blind test set (42 variations on P53 protein) |
|---|---|---|---|
| INPSb | 0.53/1.29a | 0.68/1.26a | 0.71/1.49a |
| INPS3D | 0.58/1.20a | 0.72/1.15a | 0.76/1.35a |
| MAESTROc | 0.63/1.17a | 0.71/1.16a | 0.44/1.71a,e |
| mCSMd | 0.51/1.26a | 0.67/1.19a | 0.68/1.40a |
a Pearson’s correlation coefficient/standard error (kcal/mol)
Data are from b [15]; c [13]; d [12]; e this work
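Footnote a defines each table entry as a Pearson's correlation coefficient paired with a standard error in kcal/mol. The following is a minimal sketch of how such a pair could be computed from predicted and experimental ΔΔG values. It assumes the standard error is taken as the root-mean-square error between the two series, which the source does not spell out; the function names `pearson_r` and `rmse` are hypothetical and not part of INPS3D.

```python
import math


def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def rmse(predicted, observed):
    """Root-mean-square error, used here as a stand-in for the standard error (assumption)."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("need two equal-length, non-empty sequences")
    squared = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return math.sqrt(squared / len(predicted))
```

Calling both functions on the same pair of ΔΔG lists would give one `r/error` entry in the style of the table above.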
Fig. 1 Distribution of the absolute value of the ΔΔG predicted with INPS3D, MAESTRO and INPS. The set includes 4717 disease-related variations and 687 polymorphisms in 368 OMIM proteins
Fig. 2 Relative Solvent Accessibility of the variations as a function of ΔΔG predicted for the variants of the OMIM set. The box-plot reports the median and the lower and upper quartiles of the distribution of relative solvent accessibility for each interval of ΔΔG
Fig. 3 Frequency of the solvent accessible variations as a function of ΔΔG predicted for the protein variants of the OMIM set
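The box-plot summary described for Fig. 2 (median and quartiles of relative solvent accessibility per ΔΔG interval) can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the interval width of 1 kcal/mol, the "inclusive" quantile method, and the function name `rsa_quartiles_by_ddg` are all hypothetical choices, since the caption does not specify them.

```python
import math
import statistics
from collections import defaultdict


def rsa_quartiles_by_ddg(variants, width=1.0):
    """Group (ddg, rsa) pairs into ddg intervals and summarize RSA per interval.

    Returns {(lower, upper): (q1, median, q3)}; intervals with fewer than
    two values are skipped because quartiles are not meaningful for them.
    """
    bins = defaultdict(list)
    for ddg, rsa in variants:
        # Interval index: [k * width, (k + 1) * width)
        bins[math.floor(ddg / width)].append(rsa)

    summary = {}
    for index in sorted(bins):
        values = bins[index]
        if len(values) < 2:
            continue
        q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
        summary[(index * width, (index + 1) * width)] = (q1, median, q3)
    return summary
```

The input would be one `(predicted ΔΔG, RSA)` pair per variant; the output gives the three values that a box-plot draws for each interval.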
Relation between thermodynamic properties and structural properties in proteins with biologically functional monomeric assembly
| Disease-related variant | RSA ≥ 0.20 | RSA < 0.20 |
|---|---|---|
| \|ΔΔG\| ≤ 1 kcal/mol | 562 (23.4 %)a | 756 (31.4 %)a |
| \|ΔΔG\| > 1 kcal/mol | 176 (7.3 %)a | 907 (37.8 %)a |
| Polymorphic variant | | |
| \|ΔΔG\| ≤ 1 kcal/mol | 194 (59.0 %)a | 72 (21.9 %)a |
| \|ΔΔG\| > 1 kcal/mol | 22 (6.7 %)a | 41 (12.5 %)a |
a Number of residues predicted to be part of a protein-protein interaction patch (for details on the prediction method, see [30]). Predicted set: 2401 disease-related variations and 329 polymorphic variations in 177 proteins
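The table partitions variants by two thresholds: |ΔΔG| of 1 kcal/mol and RSA of 0.20. A minimal sketch of that two-way classification is given below, assuming each variant is available as a `(ddg, rsa)` pair. The cell labels ("perturbing", "exposed", and so on) and the function names are hypothetical shorthand introduced here for illustration.

```python
from collections import Counter

DDG_CUTOFF = 1.0   # kcal/mol, threshold on |ddG| used in the table
RSA_CUTOFF = 0.20  # relative solvent accessibility threshold used in the table


def classify(ddg, rsa):
    """Assign one variant to a cell of the 2x2 table."""
    stability = "perturbing" if abs(ddg) > DDG_CUTOFF else "non-perturbing"
    exposure = "exposed" if rsa >= RSA_CUTOFF else "buried"
    return stability, exposure


def contingency(variants):
    """Return {cell: (count, percentage of all variants)} for (ddg, rsa) pairs."""
    counts = Counter(classify(ddg, rsa) for ddg, rsa in variants)
    total = sum(counts.values())
    return {cell: (n, 100.0 * n / total) for cell, n in counts.items()}
```

The percentages in the table are consistent with this normalization over the whole set: for example, 562 of the 2401 disease-related variations is 23.4 %.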
Relation between thermodynamic properties and structural properties in proteins with biologically functional multimeric assembly
| Disease-related variations | RSA ≥ 0.20 | RSA < 0.20 |
|---|---|---|
| \|ΔΔG\| ≤ 1 kcal/mol | Monomer: 660 (28.5 %)a | Monomer: 650 (28.0 %)a |
| | Complex: 550 (25.0 %)a | Complex: 760 (31.5 %)a |
| \|ΔΔG\| > 1 kcal/mol | Monomer: 213 (9.2 %)a | Monomer: 793 (34.2 %)a |
| | Complex: 196 (8.5 %)a | Complex: 810 (35.0 %)a |
| Polymorphic variations | | |
| \|ΔΔG\| ≤ 1 kcal/mol | Monomer: 198 (55.6 %)a | Monomer: 84 (23.6 %)a |
| | Complex: 186 (52.2 %)a | Complex: 96 (27.0 %)a |
| \|ΔΔG\| > 1 kcal/mol | Monomer: 29 (8.1 %)a | Monomer: 45 (12.6 %)a |
| | Complex: 29 (8.1 %)a | Complex: 45 (12.6 %)a |
a Number of residues predicted to be part of a protein-protein interaction patch. Data set: 2316 disease-related variations and 356 polymorphic variations in 191 proteins. Predictions of INPS3D and PRED-PPI are independent of the assembly state. RSA values were estimated independently on the monomeric and the complex structures
Fig. 4 Relation between the per-protein fraction of non-perturbing, solvent accessible variations and the corresponding number of interaction partners of the wild-type protein in the human interactome. The box-plot reports the median and the lower and upper quartiles of the number of interactions present in IntAct as a function of the fraction of solvent accessible, non-perturbing variations. The dashed blue line connects the average values. Non-perturbing variations are those predicted with INPS3D to produce a |ΔΔG| ≤ 1 kcal/mol and found in protein sites that are solvent accessible. Data refer to 170 proteins with 4037 variations of our data set. Proteins with fewer than 5 disease-related variations or without interactomic data reported in IntAct are excluded
Fig. 5 Relation between the per-protein fraction of solvent accessible variations and the corresponding number of interaction partners of the wild-type protein in the human interactome. The box-plot reports the median and the lower and upper quartiles of the number of interactions present in IntAct as a function of the fraction of solvent accessible variations. The dashed blue line connects the average values. Data refer to 170 proteins with 4037 variations of our data set. Proteins with fewer than 5 disease-related variations or without interactomic data reported in IntAct are excluded
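The per-protein fraction plotted in Fig. 4 can be illustrated with a short sketch. It assumes each variation is available as a `(protein_id, ddg, rsa)` record and that "solvent accessible" means RSA ≥ 0.20, the threshold used in the tables; the captions do not restate the threshold, so this is an assumption. Only the minimum-count filter is applied here; the exclusion of proteins without IntAct data is left out, and the function name is hypothetical.

```python
from collections import defaultdict


def non_perturbing_exposed_fraction(variants, min_variations=5,
                                    ddg_cutoff=1.0, rsa_cutoff=0.20):
    """Per-protein fraction of variations with |ddG| <= cutoff and RSA >= cutoff.

    variants: iterable of (protein_id, ddg, rsa) records.
    Proteins with fewer than min_variations records are excluded.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for protein, ddg, rsa in variants:
        totals[protein] += 1
        if abs(ddg) <= ddg_cutoff and rsa >= rsa_cutoff:
            hits[protein] += 1
    return {
        protein: hits[protein] / count
        for protein, count in totals.items()
        if count >= min_variations
    }
```

Dropping the `abs(ddg) <= ddg_cutoff` condition would give the plain fraction of solvent accessible variations described for Fig. 5.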