SDM--a server for predicting effects of mutations on protein stability and malfunction.

Catherine L Worth, Robert Preissner, Tom L Blundell.

Abstract

The sheer volume of non-synonymous single nucleotide polymorphisms generated in recent years by projects such as the Human Genome Project, the HapMap Project and Genome-Wide Association Studies means that it is not possible to characterize experimentally the effects of all mutations on the gene products, i.e. on protein structure and function. However, automatic methods that can predict the effects of mutations will allow a reduced set of mutations to be studied. Site Directed Mutator (SDM) is a statistical potential energy function that uses environment-specific amino-acid substitution frequencies within homologous protein families to calculate a stability score, which is analogous to the free energy difference between the wild-type and mutant protein. Here, we present a web server for SDM (http://www-cryst.bioc.cam.ac.uk/~sdm/sdm.php), which has received more than 10,000 submissions since going online in April 2008. To run SDM, users must upload a wild-type structure and specify the position and amino acid type of the mutation. The results returned include information about the local structural environment of the wild-type and mutant residues, a stability score prediction and a prediction of disease association. Additionally, the wild-type and mutant structures are displayed in a Jmol applet with the relevant residues highlighted.


Year: 2011 PMID: 21593128 PMCID: PMC3125769 DOI: 10.1093/nar/gkr363

Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971

INTRODUCTION

The folded state of a protein is stabilized primarily by hydrophobic interactions and a network of hydrogen bonds. However, a correctly folded protein is only marginally more stable than its unfolded state, and mutations that affect a stabilizing interaction within a folded protein may lead to protein instability and malfunction. Where protein malfunction occurs and cannot be remediated by an alternative molecular pathway, this may result in disease. For example, destabilizing mutations in phenylalanine hydroxylase lead to the metabolic disease phenylketonuria (1). In fact, up to 80% of Mendelian disease-associated single mutations in protein coding regions are estimated to be caused by protein destabilization effects (2).

However, a huge volume of single nucleotide polymorphisms (SNPs) has been generated in recent years from projects such as the Human Genome Project (3) and the HapMap Project (4), largely due to the availability of high-throughput array-based genotyping methods (5) and next generation sequencing platforms (6,7). Automatic methods that can predict the effect of mutations accurately will allow a reduced set of mutations to be characterized experimentally, saving time and money.

Various methods of predicting protein stability changes caused by mutation have been described and can be grouped into four main categories based on the strategy used in the calculation: (i) physical effective energy functions; (ii) empirical potential energy functions; (iii) machine learning methods; and (iv) statistical potential energy functions. Physical potential energy functions (such as molecular mechanics approaches or Monte Carlo simulations) are probably the most accurate methods for predicting the effects of mutations on protein stability; however, they are currently only useful for testing small sets of mutants due to the large amount of time required to compute ΔΔG values (8–12). The reliability of predictions is also complicated by the difficulties in sampling the folded and unfolded states (12). Empirical potential approaches are fitted to experimental data using a set of weighted terms incorporating physical and statistical energy terms and structural descriptors (13,14). Machine learning methods include neural networks and support vector machines (SVMs) and use information about mutations, protein sequence and structure to fit a non-linear function to experimental data (15–17). They are similar to empirical potential approaches in their use of experimental data to fit their function, and in both cases care must be taken that the function is not over-fitted to the training data set. Statistical potential energy approaches are derived from the statistical analysis of protein data such as substitution frequencies, distance potentials and amino acid environmental propensities (18–21). Other methods use a combination of the above strategies (22–24).

Site Directed Mutator (SDM) is a statistical potential energy function developed by Topham et al. (20) to predict the effect that SNPs will have on the stability of proteins. SDM uses environment-specific amino acid substitution frequencies within homologous protein families to calculate a stability score, which is analogous to the free energy difference between a wild-type and mutant protein. Blind testing on a set of 83 staphylococcal nuclease and 63 barnase mutants showed a correlation of 0.80 between the predicted stability changes and experimental data (20). The method performs comparably to or better than other published methods in the task of classifying mutations as stabilizing or destabilizing (25). Additionally, SDM has much improved sensitivity in predicting stabilizing mutations compared to other published methods (five of the seven methods tested incorrectly classify >68% of the stabilizing mutations). When applied to the task of predicting disease-associated mutations, SDM had an accuracy of 61% (26). Therefore, SDM is a useful tool for guiding the design of site-directed mutagenesis experiments or for predicting whether a mutation will impact protein structure and have a role in disease. Here, we present a web server for SDM (http://www-cryst.bioc.cam.ac.uk/~sdm/sdm.php), which has not previously been published.

MATERIALS AND METHODS

Environment-specific substitution tables

SDM uses a set of conformationally constrained environment-specific substitution tables (ESSTs), the general methodology of which is described in (27,28). The tables were derived from 371 protein family sequence alignments from the HOMSTRAD database (29), consisting of 1357 structures, and were built using a modified version of the program Makesub, which is able to handle sidechain hydrogen bond satisfaction (C. Topham, unpublished data). When the local structural environment of amino acid residues is defined (secondary structure, solvent accessibility and formation of hydrogen bonds), distinct patterns of substitutions are observed (30,31). ESSTs store these substitution data quantitatively in the form of probabilities and therefore provide information about the existence of each amino acid in a particular environment and the probability of it being substituted by any other amino acid. Functional residues [as defined by Uniprot (32), the Catalytic Site Atlas (33) and Interpare (34)] were masked from substitution counts.

Definition of structural environment

The structural parameters that were used to define the local environment of amino acid residues are mainchain conformation, solvent accessibility and hydrogen-bonding class.

i. Mainchain conformation and secondary structure: Nine classes of mainchain conformation were defined: residues were identified as belonging to either α-helix or β-sheet first and the remaining residues were classified as being a, b, p, t, l, g or e according to their mainchain φ-ψ torsion angles. The torsion angles and secondary structure assignments were calculated using the sstruc program (D. Smith, unpublished data).

ii. Relative sidechain solvent accessibility: Three classes of relative sidechain solvent accessibility were defined based on the method of Lee and Richards (35). Residues with sidechain relative accessibilities of <17% were defined as inaccessible, those of 17–43% as partially accessible and those of >43% as accessible. These cut-offs were chosen based on an assessment of relative sidechain solvent accessibility values (36). The accessibility of each residue in a structure was calculated using the program psa (A. Sali, unpublished data).

iii. Hydrogen bonding: Two classes of hydrogen bonding were defined: residues were classed as either being satisfied in terms of their sidechain hydrogen bonding or not, based on the criteria described by Worth and Blundell (37). Proteins were first protonated and the charge state of ionisable residues determined using the program PROPKA (38). The program hbond (J. Overington, unpublished data) was used to identify hydrogen bonds, defined by the criterion that the distance between donor and acceptor was <3.5 Å, except for interactions involving sulphur atoms where 4.0 Å was used. Hydrogen bonds were then further filtered using the methodology described by Worth and Blundell (37).

These structural parameters gave a total of 54 local environments (nine mainchain × three solvent accessibility × two hydrogen bonding terms).
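The environment definitions above can be illustrated with a minimal Python sketch. The class labels, the function name `accessibility_class` and the treatment of values falling exactly on 17% or 43% (assigned here to the partially accessible class) are assumptions made for illustration; this is not the code used by SDM.

```python
from itertools import product

# Nine mainchain classes: helix and sheet first, then seven phi-psi classes.
MAINCHAIN_CLASSES = ["helix", "sheet", "a", "b", "p", "t", "l", "g", "e"]
ACCESSIBILITY_CLASSES = ["inaccessible", "partially accessible", "accessible"]
HBOND_CLASSES = ["satisfied", "unsatisfied"]


def accessibility_class(relative_accessibility_percent: float) -> str:
    """Map a relative sidechain accessibility (%) to one of three classes.

    Boundary handling (17 and 43 counted as partially accessible) is an
    assumption; the text only gives <17%, 17-43% and >43%.
    """
    if relative_accessibility_percent < 17.0:
        return "inaccessible"
    if relative_accessibility_percent <= 43.0:
        return "partially accessible"
    return "accessible"


# Every combination of the three parameters is one local environment.
ENVIRONMENTS = list(product(MAINCHAIN_CLASSES, ACCESSIBILITY_CLASSES, HBOND_CLASSES))

if __name__ == "__main__":
    print(len(ENVIRONMENTS))          # 9 x 3 x 2 combinations
    print(accessibility_class(30.0))  # a residue with 30% relative accessibility
```

Enumerating the product makes the count of 54 environments explicit and gives each substitution table a natural key of the form (mainchain, accessibility, hydrogen bonding).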

Prediction of protein stability changes caused by mutation

The algorithm underlying SDM was first described by Topham et al. (20). In this original work, two stability difference scores were calculated using either amino acid environmental substitution data (method I) or amino acid propensities (method II). Our subsequent analysis showed that updating the substitution and propensity data using additional protein families resulted in a better performance when the environment substitution data were used (data not shown). Therefore, SDM uses only method I to calculate protein stability changes caused by mutation. In addition, SDM now uses a far more comprehensive set of substitution data (ESSTs) compared to the original publication (371 families compared to 131) and known functional sites are excluded from the substitution counts. Furthermore, the local structural environment parameter ‘sidechain hydrogen bond (yes/no)’ was modified to ‘sidechain hydrogen-bonding satisfaction (satisfied/unsatisfied)’ and this was shown to improve the stability score calculations (36).

By analogy to the folding-unfolding cycle in Figure 1, the algorithm uses ESSTs to calculate the difference in the stability scores of the folded and unfolded state for the wild-type and mutant protein structures [Equation (1)]. The substitution data used for calculating the stability score are from families of homologous proteins, which have accepted multiple mutations during the course of their evolution. However, the effects of single substitutions, e.g. cavity mutants, are not often observed over the timescale of evolution. In order to compensate for this, a disruption term is introduced for buried mutated residues. It is defined as the logarithmic function of the absolute value of the net change over the mutated position in the sidechain surface accessible area in an extended peptide Gly-X-Gly, relative to that for glycine. With this term included, Equation (1) becomes Equation (2).

ESSTs take into account the environment of only one of the two residues (wild-type or mutant); therefore, it is necessary to consider not only the probability of replacement of the wild-type residue (Rj) in the wild-type environment (εwt) by a mutant residue type (rk) in an undefined environment [P(rk/Rj, εwt)] but also the probability of replacement of the mutant residue type (Rk) in the mutant environment (εmut) by the wild-type residue (rj) in an undefined environment [P(rj/Rk, εmut)].

Figure 1.

The thermodynamic cycle can be used to calculate protein stability changes between wild-type and mutant proteins.

In order to normalise the probabilities that are combined from different substitution tables, it is necessary to introduce a reference state. For the wild-type residue (Rj) in the wild-type environment, a suitable reference state is the probability of it being conserved in that environment [P(rj/Rj, εwt)]. In an analogous way, for the mutant residue type (Rk) in the mutant environment, a suitable reference state is the probability of it being conserved in that environment [P(rk/Rk, εmut)]. The difference in stability scores for a mutation in the folded state is therefore calculated by Equation (3). The difference in stability scores in the unfolded state is also calculated using Equation (3) but uses an environmental substitution table derived from non-hydrogen bonded, surface exposed amino acid residues falling outside regions of regular secondary structure. The stability difference scores for the folded and unfolded state for the wild-type and mutant protein structures are then calculated using Equation (1).
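To make the role of the reference states concrete, the following Python sketch normalises each substitution probability by the corresponding conservation probability and then takes the folded-minus-unfolded difference. The exact published form of Equations (1) to (3) is not assumed here: the log-ratio combination, the sign convention, the function names and the example probabilities are all illustrative assumptions, and the disruption term for buried residues is omitted.

```python
import math


def state_score(p_mut_given_wt: float, p_wt_conserved: float,
                p_wt_given_mut: float, p_mut_conserved: float) -> float:
    """Hypothetical score for one state (folded or unfolded).

    p_mut_given_wt : P(rk | Rj, env_wt), wild-type replaced by mutant type
    p_wt_conserved : P(rj | Rj, env_wt), reference state for the wild-type
    p_wt_given_mut : P(rj | Rk, env_mut), mutant type replaced by wild-type
    p_mut_conserved: P(rk | Rk, env_mut), reference state for the mutant
    The log-ratio form and its sign are assumptions for illustration only.
    """
    forward = math.log(p_mut_given_wt / p_wt_conserved)
    reverse = math.log(p_wt_given_mut / p_mut_conserved)
    return forward - reverse


def stability_difference(folded_score: float, unfolded_score: float) -> float:
    """Difference between folded-state and unfolded-state scores."""
    return folded_score - unfolded_score


if __name__ == "__main__":
    # Made-up probabilities, not values taken from any real ESST.
    folded = state_score(0.05, 0.50, 0.20, 0.50)
    unfolded = state_score(0.10, 0.40, 0.10, 0.40)
    print(stability_difference(folded, unfolded))
```

In this sketch the unfolded-state score would be evaluated with probabilities taken from the single table built from non-hydrogen bonded, surface exposed residues outside regular secondary structure, as described above.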

Prediction of disease-association

From studying missense mutations for which the phenotypes are known, it is estimated that the stability margin that can be accommodated without any immediate effect on protein fitness is 1–3 kcal mol−1 (39–41). Studies of Ig-like proteins have shown that mutations that decrease the stability of these proteins by >2 kcal mol−1 result in severe disease phenotypes (42,43).

It may appear counter-intuitive that increased protein stability can lead to protein malfunction; however, protein flexibility is essential for enzyme catalysis. For instance, the increased stability of many thermophilic proteins is accompanied by loss of protein flexibility and reduced enzymatic activity at low temperatures (44–48). Furthermore, stabilizing mutations at catalytic site residues typically decrease activity, which suggests that function often comes with a substantial penalty to stability (44,49–52). In addition, highly stable proteins are protease-resistant and therefore difficult to regulate; this is important to consider in systems such as cell signalling, where removing a signal is as important as its activation (53). A recent study showed that β-catenin accumulation is the most common aberration in parathyroid tumours of primary origin and that the S37A stabilizing mutation of CTNNB1 was found in 5.8% of the tumours (54). Another example of a stabilizing and damaging mutation is the Parkinson disease-associated A30P mutation, which stabilizes α-synuclein against proteasomal degradation triggered by haem oxygenase-1 over-expression in human neuroblastoma cells (55). Hence, there is biological evidence that increased protein stability can lead to protein malfunction and hence disease.

In light of the studies mentioned in the previous two paragraphs, we have used a cut-off of 2 kcal mol−1 (stabilizing or destabilizing) for classifying mutations as leading to protein malfunction and possibly disease.
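The symmetric cut-off can be expressed as a short Python sketch. The function name and the decision to treat a change of exactly 2 kcal mol−1 as reaching the cut-off are assumptions; the text does not state whether the boundary is inclusive.

```python
CUTOFF_KCAL_PER_MOL = 2.0  # cut-off stated in the text


def predicted_to_cause_malfunction(stability_change: float) -> bool:
    """Return True when the predicted change reaches the cut-off.

    The absolute value is used because both strongly stabilizing and
    strongly destabilizing mutations are flagged. Inclusive comparison
    at exactly 2.0 is an assumption.
    """
    return abs(stability_change) >= CUTOFF_KCAL_PER_MOL


if __name__ == "__main__":
    for change in (-2.5, -1.0, 0.3, 2.4):
        print(change, predicted_to_cause_malfunction(change))
```

Because the magnitude alone is tested, the sketch does not depend on whether negative values denote stabilizing or destabilizing changes.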

Mutant thermodynamic data sets

A subset of the data set used by Capriotti et al. (16) was used for initial benchmarking. This mutant data set was taken from the ProTherm database, which stores thermodynamic data for proteins and mutants (56). Our method requires knowledge of the local structural environment of wild-type and mutant residues in order to predict the effect of mutation on the stability of a protein. If the local environment is incorrectly defined e.g. the protein functions as a trimer but is defined in the crystallographic asymmetric unit as the protomer, this may affect our calculation. To remove the effect of such errors we used the Protein Interfaces, Surface and Assemblies (PISA) service to predict the oligomeric state of each of the proteins in the data set (57). Only those proteins predicted to be monomers were used. This data set is hereafter referred to as the monomeric set. The validation data set used by Dehouck et al. (22) for benchmarking their method PoPMuSiC-2.0 was used for comparison of SDM’s performance to other published stability change prediction algorithms. This data set comprises 350 mutations, none of which was included in any of the databases used to devise or test the seven methods tested by Dehouck et al. (22). A set of 388 mutants (S388) with thermodynamic measurements conducted under physiological conditions was also used to test our method. The S388 data set has been used to test other published methods and therefore allows us to perform a direct comparison of our method to them.

WEBSERVER

Input

SDM requires the 3D co-ordinates of the wild-type protein (in PDB format), the PDB chain identifier, the mutation position and the amino acid type of the mutation in one-letter code in order to calculate a stability score for mutant proteins. Users who have not already obtained a structure of their protein of interest may use the search boxes on the home page to do so. These search boxes allow a user to query the RCSB Protein Data Bank (www.pdb.org) (58) for their protein of interest, using protein name, description or amino acid sequence. The wild-type structure may be submitted using one of two methods: the user can either upload the PDB file or enter the four-letter PDB code. NMR structures are accepted by SDM for input; however, users should note that only the first model in the PDB file is used for subsequent analysis. SDM also requires a 3D structure of the mutant protein to perform its calculations. In this case, the user has the option of either uploading a mutant structure or using the program ANDANTE to build a model structure of the mutant (59). A requirement of SDM is that the wild-type and mutant structures span the same part of the polypeptide chain; users who upload a mutant PDB structure must therefore ensure that it fulfils this requirement. The home page also provides a link to example output so that users may view the type of output produced before running their job. Additionally, tutorials on usage are available via the link provided on the navigator bar.
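The inputs listed above can be checked before submission with a small Python sketch. The class name `MutationRequest`, its field names and its validation rules are hypothetical and are not part of the SDM server; the example values (PDB code 1DXX, chain A, position 231, mutant residue N) are those of the Figure 2 example.

```python
from dataclasses import dataclass

# The twenty standard amino acids in one-letter code.
ONE_LETTER_CODES = frozenset("ACDEFGHIKLMNPQRSTVWY")


@dataclass(frozen=True)
class MutationRequest:
    """Hypothetical container for the inputs described in the text."""
    pdb_code: str        # four-character PDB code of the wild-type structure
    chain_id: str        # PDB chain identifier (one character)
    position: int        # residue position of the mutation
    mutant_residue: str  # amino acid type of the mutation, one-letter code

    def __post_init__(self) -> None:
        if len(self.pdb_code) != 4 or not self.pdb_code.isalnum():
            raise ValueError("PDB code must be four alphanumeric characters")
        if len(self.chain_id) != 1:
            raise ValueError("chain identifier must be a single character")
        if self.mutant_residue not in ONE_LETTER_CODES:
            raise ValueError("mutant residue must be a standard one-letter code")


if __name__ == "__main__":
    print(MutationRequest("1DXX", "A", 231, "N"))
```

Validating locally in this way only catches malformed input; whether the residue position actually exists in the chosen chain can be confirmed only against the structure file itself.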

Output

The results page is split into three sections. On the left-hand side the mutant information is displayed (wild-type and mutant amino acid types plus the position). Where ANDANTE was used to build a mutant structure, the PDB file is made available for download. The results returned include information about the local structural environment of the wild-type and mutant residues (the secondary structure, solvent accessibility and sidechain hydrogen bond satisfaction), a stability score prediction and prediction of disease association. As mentioned in the methods section, a cut-off of 2 kcal mol−1 is used to indicate whether a mutation is likely to be disease-associated or not. However, mutations that do not reach this cut-off may still lead to protein malfunction and disease if they affect binding sites. A statement indicating this issue is therefore displayed and the links page lists resources that can be used to assess whether a residue is involved in binding. In the middle portion of the results page, the wild-type and mutant structures are displayed using the Jmol structure viewer (Jmol: an open-source Java viewer for chemical structures in 3D http://www.jmol.org/) with the relevant residues highlighted. The user may control the display of these structures using the menu buttons on the right-hand side. An example of the type of output produced by SDM is shown in Figure 2. A particular advantage of the predictions provided by SDM over other published methods is the indication of the local structural environment of wild-type and mutant residues and the fact that the user may view the 3D structural context of the residues. This allows users to identify possible molecular mechanisms that underlie predicted stability changes for example, loss of hydrogen bonds to the protein backbone.

Figure 2.

Screenshot of SDM analysis results for the example of mutation Y231N in Dystrophin (PDB code 1DXX, chain A). On the left-hand side, information about the wild-type and mutant residue is displayed, such as the secondary structure, solvent accessibility and hydrogen bonds formed by the sidechain. Underneath this information is the predicted effect on protein stability. In this case, SDM predicts that the mutation is highly destabilizing and disease-associated. In fact, this mutation is associated with muscular dystrophy and has been shown to decrease protein stability (73). In the middle, the structural context of the wild-type and mutant amino acids is shown in the Jmol applet with the residues coloured according to their chemical properties (key displayed on the right-hand side). Using the menus on the right-hand side, the user can manipulate the Jmol applet and control what is shown.

VALIDATION

SDM has previously been validated using a set of ∼230 mutants and was shown to have an accuracy of 74% in predicting the sign of stability change and a linear correlation coefficient of 0.60 between predicted and observed ΔΔG values (25). Removal of one outlying data point increased the linear correlation coefficient to 0.66. Analysis of the performance of SDM in predicting the sign of stability change in comparison to eight other published methods demonstrated that SDM performs comparably or better than the other methods. Since the benchmarking detailed above was carried out, SDM has been modified so that the definition of sidechain hydrogen bonding has been changed from yes or no to satisfied or unsatisfied. Furthermore, functional residues have been masked from the substitution counts used to generate the ESSTs. We tested the improvement that these changes made to SDM’s predictions using the 855 mutants in the monomeric data set. The additional families used to generate the ESSTs, masking functional residues and incorporation of the hydrogen bond satisfaction term improved the correlation coefficient between predicted stability changes and experimental measurements from 0.51 to 0.58 (Table 1).
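The two measures quoted in this paragraph, the linear (Pearson) correlation coefficient and the accuracy in predicting the sign of the stability change, can be computed as in the following Python sketch. The function names and the example values are illustrative assumptions, and treating a value of exactly zero as non-negative is a convention chosen here, not one taken from the text.

```python
import math
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson product-moment correlation coefficient of two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(var_x * var_y)


def sign_accuracy(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Fraction of mutants whose predicted and observed changes share a sign."""
    agreeing = sum(1 for p, o in zip(predicted, observed) if (p < 0) == (o < 0))
    return agreeing / len(predicted)


if __name__ == "__main__":
    # Made-up values for illustration, not data from the benchmark sets.
    predicted = [-1.0, 1.0, -2.0, 0.5]
    observed = [-0.5, -1.0, -3.0, 2.0]
    print(pearson_r(predicted, observed))
    print(sign_accuracy(predicted, observed))
```

The two measures answer different questions: a method can rank mutants well (high correlation) while misplacing the zero point, or classify signs well while correlating weakly in magnitude.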

Table 1.

Comparison of the performance of SDM using different sets of ESSTs and the monomeric data set

Parameters used to generate ESSTs			Accuracy (%)	R^a	σ (kcal/mol)
Protein families	Hydrogen bonding term	Masking of functional residues	Accuracy (%)	R^a	σ (kcal/mol)
113	Original	No	73	0.51	1.82
371	Original	Yes	73	0.56	1.61
371	Satisfied	No	73	0.56	1.73
371	Satisfied	Yes	71	0.58	1.74

^a Pearson product-moment correlation coefficient.

The statistical potential-based method PoPMuSiC-2.0 was recently reported and achieved a correlation of 0.63 between measured and predicted stability changes (22). The predictive power of the method was shown to be significantly higher than that of other programs described in the literature. In order to compare the predictive power of SDM to PoPMuSiC-2.0 and the other tested methods, we used the same data set of 350 mutants. After the PoPMuSiC algorithms, SDM has the highest linear correlation between predicted and measured ΔΔG values (Table 2). It also has the benefit of making predictions for the entire data set of 350 mutants. It is encouraging that the performance of SDM is improved when considering only highly stabilizing or destabilizing mutations: the correlation coefficient increases from 0.52 to 0.63 (Table 2).

Table 2.

Comparison of the performance of different prediction methods

Method	No. of predictions^b	R (350 / 309 / 87 mutants)^a	σ (kcal/mol) (350 / 309 / 87 mutants)^a
Automute^c	315	0.46 / 0.45 / 0.45	1.43 / 1.46 / 1.99
CUPSAT^c	346	0.37 / 0.35 / 0.50	1.91 / 1.96 / 2.14
Dmutant^c	350	0.48 / 0.47 / 0.57	1.81 / 1.87 / 2.31
Eris^c	334	0.35 / 0.34 / 0.49	4.12 / 4.28 / 3.91
I-mutant-2.0^c	346	0.29 / 0.27 / 0.27	1.65 / 1.69 / 2.39
PoPMuSiC-1.0^c	350	0.62 / 0.63 / 0.70	1.24 / 1.25 / 1.66
PoPMuSiC-2.0^c	350	0.67 / 0.67 / 0.71	1.16 / 1.19 / 1.67
SDM	350	0.52 / 0.53 / 0.63	1.80 / 1.81 / 2.11

^a Three values are given per column. The first corresponds to the whole validation set of 350 mutants with the unavailable ΔΔG predictions set to 0.0 kcal/mol. The second corresponds to the 309 mutants for which a ΔΔG prediction is available for all predictors. The third corresponds to 87 mutants for which the experimental ΔΔG value causes >2 kcal mol−1 change and for which a ΔΔG prediction is available for all predictors.

^b 350 mutations were tested with each method. However, some servers failed to compute the ΔΔG prediction for all mutants, resulting in predictions for fewer than the full number.

^c Data taken from (22).

The vast majority of published methods for predicting the effects of mutations on protein stability are based on machine learning (ML). These are first trained on a data set of mutations. Many of these ML methods report high correlations with experimental data sets [e.g. CUPSAT R = 0.87 (21) and IMutant2.0 R = 0.71 (60)]. However, when tested later in blind tests, these correlations drop drastically [e.g. CUPSAT R = 0.37 and IMutant-2.0 R = 0.29 (22)]. This reduction in prediction performance may be due to over-fitting to available data sets. The problem of decreasing performance of ML methods using blind-data sets was also observed by two independent assessments of the performance of protein stability predictors (61,62). SDM is not a ML method, but rather a statistical method based on observed amino acid substitutions that have occurred during divergent protein evolution. Therefore, it does not suffer from the problem of over-fitting, as demonstrated by the similar correlation coefficients obtained using the monomeric data set and the PoPMuSiC-2.0 validation data set. The problem of over-fitting is an important point to consider if methods are to be used to help successfully design mutagenesis experiments. Table 3 shows the results of testing the S388 data set. These results show the performance of methods in predicting the sign of stability change i.e. whether a mutation is stabilizing or destabilizing. Many of the methods have accuracies of over 80%, which is impressive. However, if we examine the ability of the methods to predict stabilizing and destabilizing mutations another picture emerges; they tend to be very good at predicting destabilizing mutations but much worse at predicting stabilizing mutations. SDM however has a more balanced sensitivity in predicting both types of mutations, although the specificity of predicting destabilizing mutations is far better than that of predicting stabilizing mutations. 
Most mutations are destabilizing and this is reflected in the mutant thermodynamic data sets used for developing and testing such methods. Methods that assign all of the samples to the majority class (destabilizing mutations) will have high accuracy even though the performance is poor for the minority class (stabilizing mutations). This trend is observed for most of the methods reported in Table 3. It is possible that some of the results in Table 3 are biased by some over-fitting to the training data sets used in developing the methods.
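The effect of class imbalance on accuracy can be seen in a short Python sketch that computes accuracy, per-class sensitivity and the Matthews correlation coefficient (MCC) from a confusion matrix. Stabilizing mutations are taken as the positive class; the counts (10 stabilizing, 90 destabilizing) are invented for illustration and are not from any data set in this paper, and returning an MCC of 0 when the denominator is zero is a convention assumed here.

```python
import math
from typing import Dict


def binary_metrics(tp: int, fn: int, tn: int, fp: int) -> Dict[str, float]:
    """Accuracy, sensitivity for each class and MCC from confusion-matrix counts.

    tp/fn: stabilizing mutations predicted correctly/incorrectly.
    tn/fp: destabilizing mutations predicted correctly/incorrectly.
    """
    total = tp + fn + tn + fp
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity_stabilizing": tp / (tp + fn) if (tp + fn) else 0.0,
        "sensitivity_destabilizing": tn / (tn + fp) if (tn + fp) else 0.0,
        # MCC is undefined when a row or column is empty; 0.0 is used by convention.
        "mcc": (tp * tn - fp * fn) / denominator if denominator else 0.0,
    }


if __name__ == "__main__":
    # A predictor that calls every mutation destabilizing.
    majority_only = binary_metrics(tp=0, fn=10, tn=90, fp=0)
    for name, value in majority_only.items():
        print(name, value)
```

For this majority-class predictor the accuracy is 0.9 even though no stabilizing mutation is identified and the MCC is 0, which is why Table 3 reports MCC and per-class measures alongside accuracy.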

Table 3.

Comparison of the performance of different prediction methods

Method	MCC	Accuracy	Sens. (+)	Spec. (+)	Sens. (−)	Spec. (−)
Automute S1227^a	0.31	0.87	0.36	0.42	0.94	0.92
FOLDX^b	0.25	0.75	0.56	0.26	0.78	0.93
DFIRE^b	0.11	0.68	0.44	0.18	0.71	0.90
PoPMuSiC-1.0^b	0.20	0.85	0.25	0.33	0.93	0.90
PoPMuSiC-2.0	0.32	0.86	0.35	0.44	0.94	0.91
NeuralNet^b	0.25	0.87	0.21	0.44	0.96	0.90
MuPro SO^c	0.26	0.86	0.30	0.40	0.94	0.90
MuPro TO^c	0.28	0.86	0.31	0.42	0.94	0.91
MuPro ST^c	0.27	0.86	0.31	0.40	0.93	0.91
MuX-S^d	0.39	0.88	0.29	0.67	0.94	0.91
MuX-48^d	0.39	0.89	0.29	0.67	0.98	0.91
SDM	0.28	0.71	0.70	0.24	0.71	0.94

^a Data taken from Masso and Vaisman (24).

^b Data taken from Capriotti et al. (16).

^c Data taken from Cheng et al. (17).

^d Data taken from Kang et al. (74).

When applied to the task of predicting disease-associated mutations, SDM had an accuracy of 61% (26), only 3% less than the accuracy achieved by the program Sorting Intolerant from Tolerant (SIFT) (63). It is unsurprising that SIFT obtains a higher accuracy than SDM, as SDM is able to distinguish disease associations only for those mutations that perturb protein structure and not those that directly affect catalytic residues, binding sites etc. Mutations that cause protein malfunction by affecting the functional residues of a protein (active sites or protein–protein interaction sites) or by altering post-translational modifications will not be identified as damaging by SDM. Therefore, to obtain a more accurate prediction of whether an nsSNP is associated with disease, these other effects should also be taken into account. We previously demonstrated that when SDM’s predictions were combined with predictions of functional sites using Crescendo (64) and known functional sites, the combined approach had an accuracy comparable to the other methods tested but with a much lower false-positive rate, therefore providing a high-quality set of predictions (26).

SUMMARY

The SDM server provides users with a fast and accurate means of assessing the impact that a mutation will have on protein structure and stability. It provides a 3D view of the wild-type and mutant residues, allowing users to inspect the structural context of the sidechains. SDM is a useful tool for identifying possible disease associations and has been applied to the task of predicting deleterious nsSNPs at the genome scale (25,26,65) and also for generating new hypotheses regarding: (i) the molecular aetiology of renal cell carcinoma and pheochromocytoma in the cancer syndrome, von Hippel-Lindau disease (66); (ii) the structural effects of mutations in thyroid stimulating hormone receptor that are associated with congenital non-goitrous hypothyroidism (67); and (iii) tumour risk associated with mutations in succinate dehydrogenase D (68). It has also been used in the analysis of mutations in the autoimmune regulator protein (69), mixed lineage kinase 3 (70), the adaptor protein MyD88 adaptor-like (71) and breast cancer susceptibility gene 1 (72).

FUNDING

This work was supported by the Biotechnology and Biological Sciences Research Council (research studentship to C.L.W.) and a Wellcome Trust Programme Grant (to T.L.B.). Funding for open access charge: Wellcome Trust Programme Grant. Conflict of interest statement. None declared.

71 in total

1. Performance of protein stability predictors.

Authors: Sofia Khan; Mauno Vihinen
Journal: Hum Mutat Date: 2010-06 Impact factor: 4.878

2. Prediction of the mutation-induced change in thermodynamic stabilities of membrane proteins from free energy simulations.

Authors: Hwangseo Park; Sangyoub Lee
Journal: Biophys Chem Date: 2004-12-22 Impact factor: 2.352

3. Investigating the effects of mutations on protein aggregation in the cell.

Authors: Giulia Calloni; Sara Zoffoli; Massimo Stefani; Christopher M Dobson; Fabrizio Chiti
Journal: J Biol Chem Date: 2004-12-16 Impact factor: 5.157

4. HOMSTRAD: a database of protein structure alignments for homologous families.

Authors: K Mizuguchi; C M Deane; T L Blundell; J P Overington
Journal: Protein Sci Date: 1998-11 Impact factor: 6.725

5. Thermodynamic database for proteins: features and applications.

Authors: M Michael Gromiha; Akinori Sarai
Journal: Methods Mol Biol Date: 2010

6. Missense mutations in dystrophin that trigger muscular dystrophy decrease protein stability and lead to cross-beta aggregates.

Authors: Surinder M Singh; Narsimulu Kongari; Javier Cabello-Villegas; Krishna M G Mallela
Journal: Proc Natl Acad Sci U S A Date: 2010-08-09 Impact factor: 11.205

7. Novel TSHR mutations in consanguineous families with congenital nongoitrous hypothyroidism.

Authors: Hakan Cangul; Neil V Morgan; Julia R Forman; Halil Saglam; Zehra Aycan; Tahsin Yakut; Tuna Gulten; Omer Tarim; Ece Bober; Yasar Cesur; Gail A Kirby; Shanaz Pasha; Mutlu Karkucak; Erdal Eren; Semra Cetinkaya; Veysel Bas; Korcan Demir; Sevil A Yuca; Esther Meyer; Michaela Kendall; Wolfgang Hogler; Timothy G Barrett; Eamonn R Maher
Journal: Clin Endocrinol (Oxf) Date: 2010-11 Impact factor: 3.478

8. Toward classification of BRCA1 missense variants using a biophysical approach.

Authors: Pamela J E Rowling; Rebecca Cook; Laura S Itzhaki
Journal: J Biol Chem Date: 2010-04-08 Impact factor: 5.157

9. Ongoing and future developments at the Universal Protein Resource.

Authors:
Journal: Nucleic Acids Res Date: 2010-11-04 Impact factor: 16.971

10. Mixed lineage kinase 3 gene mutations in mismatch repair deficient gastrointestinal tumours.

Authors: Sérgia Velho; Carla Oliveira; Joana Paredes; Sónia Sousa; Marina Leite; Paulo Matos; Fernanda Milanezi; Ana Sofia Ribeiro; Nuno Mendes; Danilo Licastro; Auli Karhu; Maria José Oliveira; Marjolijn Ligtenberg; Richard Hamelin; Fátima Carneiro; Annika Lindblom; Paivi Peltomaki; Sérgio Castedo; Simó Schwartz; Peter Jordan; Lauri A Aaltonen; Robert M W Hofstra; Gianpaolo Suriano; Elia Stupka; Arsenio M Fialho; Raquel Seruca
Journal: Hum Mol Genet Date: 2009-12-02 Impact factor: 6.150
