Joke Reumers, Joost Schymkowitz, Fréderic Rousseau.
Abstract
BACKGROUND: Linking structural effects of mutations to functional outcomes is a major issue in structural bioinformatics, and many tools and studies have shown that specific structural properties such as stability and residue burial can be used to distinguish neutral variations from disease-associated mutations.
Year: 2009 PMID: 19758473 PMCID: PMC2745591 DOI: 10.1186/1471-2105-10-S8-S9
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Summary of structural coverage of SNP data.
| Criteria | Number of SNPs | % of SNPs |
| --- | --- | --- |
| No additional criteria | 9877 | 7.4 |
| Sequence coverage > 80 or alignment length > 100 | 8238 | 6.2 |
| Sequence identity > 80 | 5416 | 4.1 |
| Sequence coverage > 80 or alignment length > 100, and sequence identity > 80 | 5318 | 4.0 |
| Doublehit validation status, MAF > 0.01 | 680 | 0.51 |
| Doublehit validation status, MAF > 0.01, sequence identity > 80 | 229 | 0.17 |
| Doublehit validation status, MAF > 0.01, sequence coverage > 80 or alignment length > 100 | 446 | 0.33 |
| Doublehit validation status, MAF > 0.01, sequence coverage > 80 or alignment length > 100, and sequence identity > 80 | 209 | 0.16 |
Several criteria derived from the above analyses are applied to assess the structural coverage of human SNPs in the Ensembl database and the reliability of that coverage, as well as its overlap with quality parameters for the validation and frequency status of the polymorphism data.
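The filtering behind the coverage table can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the `SnpModel` record, its field names, and the `total_snps` denominator are hypothetical, and the cutoffs of 80 are assumed to be percentages.

```python
from dataclasses import dataclass


@dataclass
class SnpModel:
    """Hypothetical record for one SNP mapped onto a structure or model."""
    sequence_coverage: float  # assumed: percent of the sequence covered
    alignment_length: int     # number of aligned residues
    sequence_identity: float  # assumed: percent identity to the template
    doublehit: bool           # True if validation status is "doublehit"
    maf: float                # minor allele frequency


def passes(snp, require_alignment=False, require_identity=False,
           require_validated=False):
    """Apply the optional criteria listed in the table rows."""
    if require_alignment and not (snp.sequence_coverage > 80
                                  or snp.alignment_length > 100):
        return False
    if require_identity and not snp.sequence_identity > 80:
        return False
    if require_validated and not (snp.doublehit and snp.maf > 0.01):
        return False
    return True


def coverage(snps, total_snps, **criteria):
    """Return (count, percent of total_snps) for SNPs meeting the criteria."""
    n = sum(passes(s, **criteria) for s in snps)
    return n, 100.0 * n / total_snps
```

Calling `coverage(snps, total_snps, require_alignment=True, require_identity=True)` would correspond to the fourth row of the table, under the assumptions above.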
Figure 1. Distributions for the major structural criteria in the disease and polymorphism datasets. White = disease mutations, grey = polymorphisms. A. Stability difference as calculated by the FoldX force field (in kcal·mol⁻¹). B. Difference in aggregation propensity as calculated by the Tango algorithm. Values close to neutral changes (in the range [-50, 50]) are left out for display purposes. C. Distribution of degree of burial of the amino acid substitution site.
Predictive power of structural properties of the modeled variant proteins.
| Property | FPR (%) | TPR (%) | MCC | Threshold | MCC90 |
| --- | --- | --- | --- | --- | --- |
| Overall stability of residue | 14 | 33 | 0.22 | 1.61 | 0.19 |
| Backbone H bond | 32 | 72 | 0.40 | -1.05 | 0.22 |
| Sidechain H bond | 99 | 100 | 0.07 | -1.76 | |
| Electrostatics | 86 | 93 | 0.11 | -0.10 | -0.01 |
| Entropy side chain | 59 | 80 | 0.22 | 0.32 | 0.05 |
| Entropy main chain | 13 | 27 | 0.18 | 1.96 | 0.10 |
| Van der Waals contribution | 25 | 47 | 0.23 | -0.98 | 0.15 |
| Solvation hydrophobic | 10 | 22 | 0.16 | -0.6 | 0.16 |
| Solvation polar | 42 | 70 | 0.28 | 1.5 | 0.06 |
| Van der Waals clash | 18 | 33 | 0.17 | 0.22 | 0.15 |
| Side chain burial | 51 | 67 | 0.16 | 0.43 | -0.1 |
| Main chain burial | 59 | 83 | 0.26 | 0.73 | 0.05 |
| Entropy side chain | 72 | 84 | 0.15 | 0.93 | 0 |
The false positive rate (FPR = 1 - specificity) and the true positive rate (TPR = sensitivity) are shown for the threshold on each property that gave the best Matthews correlation coefficient (MCC). MCC90 is the Matthews correlation coefficient at a specificity of 90% (i.e. a 10% false positive rate). The ROC curves for all properties can be found in Supplementary Figure S2 in Additional file 1. FoldX was used to evaluate both the overall stability contribution of the amino acid substitution site in the modeled structure and the individual factors contributing to this stability. The entropy of the variant amino acid was calculated using a sampling strategy to assess the possible side chain conformations allowed at the substitution site. Both stability and entropy were calculated for all mutations and for subsets of buried mutations (side chain burial < 0.5) and surface mutations (side chain burial ≥ 0.5). Corresponding ROC curves are shown in Supplementary Figure S3 in Additional file
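The quantities named in this legend (FPR, TPR, best-MCC threshold, MCC90) can be computed as in the sketch below. It is an illustration under stated assumptions, not the authors' code: a mutation is assumed to be predicted as disease-associated when its score is at or above the threshold, candidate thresholds are taken from the observed scores, and all function names are hypothetical.

```python
import math


def confusion(scores, labels, threshold):
    """Count TP, FP, TN, FN; assumes score >= threshold predicts 'disease'."""
    tp = fp = tn = fn = 0
    for score, is_disease in zip(scores, labels):
        predicted = score >= threshold
        if predicted and is_disease:
            tp += 1
        elif predicted:
            fp += 1
        elif is_disease:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def rates_and_mcc(tp, fp, tn, fn):
    """Return (FPR, TPR, MCC); MCC is set to 0 when its denominator is 0."""
    tpr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return fpr, tpr, mcc


def best_mcc_threshold(scores, labels):
    """Scan observed scores and return (threshold, FPR, TPR, MCC) at best MCC."""
    best = None
    for t in sorted(set(scores)):
        fpr, tpr, mcc = rates_and_mcc(*confusion(scores, labels, t))
        if best is None or mcc > best[3]:
            best = (t, fpr, tpr, mcc)
    return best


def mcc_at_specificity(scores, labels, specificity=0.90):
    """MCC at the lowest threshold whose FPR does not exceed 1 - specificity."""
    for t in sorted(set(scores)):  # FPR is non-increasing as t increases
        fpr, _, mcc = rates_and_mcc(*confusion(scores, labels, t))
        if fpr <= 1 - specificity + 1e-12:
            return mcc
    return 0.0
```

For properties where lower values indicate disease, the scores could be negated before calling these functions. The direction of each threshold is not stated in the table, so it remains an assumption here.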
Predictive power of the differences between wild type and variant proteins for different structural properties.
| Property | FPR (%) | TPR (%) | MCC | Threshold | MCC90 |
| --- | --- | --- | --- | --- | --- |
| Overall stability difference | 73 | 85 | 0.15 | -0.45 | 0.14 |
| Overall stability diff. (surface) | 0 | 8 | 0.2 | 3.1 | 0.13 |
| Overall stability diff. (buried) | 21 | 44 | 0.25 | 2.64 | 0.12 |
| Backbone clash | 91 | 99 | 0.18 | -1.00 | -0.02 |
| Backbone H bond | 59 | 83 | 0.26 | -0.025 | 0.06 |
| Sidechain H bond | 79 | 92 | 0.18 | -0.13 | -0.14 |
| Electrostatics | 6 | 18 | 0.18 | 0.15 | 0.16 |
| Entropy main chain | 6 | 18 | 0.18 | 0.15 | 0.04 |
| Entropy side chain | 64 | 74 | 0.11 | -0.125 | -0.05 |
| Solvation hydrophobic | 57 | 75 | 0.19 | -0.15 | -0.03 |
| Solvation polar | 22 | 36 | 0.15 | 0.20 | -0.05 |
| Torsion clash | 1 | 3 | 0.07 | 1.00 | -0.05 |
| Van der Waals contribution | 7 | 14 | 0.11 | 0.89 | 0.10 |
| Van der Waals clash | 98 | 100 | 0.10 | -1.60 | 0.02 |
| FoldX entropy difference | 85 | 92 | 0.11 | -1.85 | -0.02 |
| FoldX entropy diff. (buried) | 96 | 100 | 0.14 | -2.70 | -0.05 |
| FoldX entropy diff. (surface) | 37 | 57 | 0.20 | -0.10 | 0.02 |
| Tango | 1 | 3 | 0.07 | 39.9 | 0 |
| Tango (positive, more aggr.) | 14 | 22 | 0.10 | 16.37 | 0 |
| Tango (negative, less aggr.) | 69 | 78 | 0.10 | -8.00 | 0 |
| Waltz | 0 | 1 | 0.07 | 748.97 | 0 |
| Waltz (positive, more aggr.) | 16 | 21 | 0.06 | 677.15 | 0 |
| Waltz (negative, less aggr.) | 99 | 100 | 0.07 | -2412.78 | 0 |
| Limbo | 17 | 33 | 0.18 | 5.45 | 0 |
FoldX was used to evaluate both the overall stability difference between the wild type and variant structures and the constituent contributions to this stability difference. The entropy difference caused by the amino acid substitution was calculated using a sampling strategy to assess the possible side chain conformations allowed at the substitution site. Both the stability and the entropy difference were calculated for all mutations and for subsets of buried mutations (side chain burial < 0.5) and surface mutations (side chain burial ≥ 0.5). Corresponding ROC curves are shown in Supplementary Figure S2 in Additional file 1.
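The buried/surface split and the wild type versus variant difference described in this legend can be sketched as below. The dictionary key `side_chain_burial`, the function names, and the sign convention (variant minus wild type) are assumptions made for illustration. The 0.5 cutoff and its direction are taken from the legend as written.

```python
def split_by_burial(mutations, cutoff=0.5):
    """Partition mutations into (buried, surface) using the legend's cutoff.

    Each mutation is a dict with a hypothetical 'side_chain_burial' key.
    """
    buried = [m for m in mutations if m["side_chain_burial"] < cutoff]
    surface = [m for m in mutations if m["side_chain_burial"] >= cutoff]
    return buried, surface


def property_difference(wild_type_value, variant_value):
    """Difference for one property; assumed convention: variant - wild type."""
    return variant_value - wild_type_value
```

Each subset could then be evaluated separately, which is how the "(buried)" and "(surface)" rows of the table would be obtained under these assumptions.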