Fabrizio Pucci, Marianne Rooman.
Abstract
Understanding and controlling protein stability at different temperatures is a fundamental problem in biophysics that is still far from being solved quantitatively and accurately, as it requires precise knowledge of the temperature dependence of amino acid interactions. In this paper we attempt to gain insight into the thermal stability of proteins by designing a tool that predicts the full stability curve as a function of temperature for a set of 45 proteins belonging to 11 homologous families, given their sequence and structure, as well as the melting temperature (Tm) and the change in heat capacity (ΔCP) of proteins belonging to the same family. Stability curves are a fundamental instrument for analyzing thermal stability in detail, for relating it to thermodynamic stability, and for estimating the enthalpic and entropic contributions to the folding free energy. Our approach to predicting protein stability curves relies on temperature-dependent statistical potentials derived from three datasets of protein structures with targeted thermal stability properties. Using these potentials, the folding free energies (ΔG) at three different temperatures were computed for each protein. The Gibbs-Helmholtz equation was then used to predict the protein's stability curve as the curve that best fits these three points. The results are quite encouraging: in cross-validation, the standard deviations between the experimental and predicted values of Tm, ΔCP and the folding free energy at room temperature (ΔG25) are 13 °C, 1.3 kcal/(mol °C) and 4.1 kcal/mol, respectively. The main sources of error, along with possible improvements and perspectives, are briefly discussed.
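The last step of the approach, passing a Gibbs-Helmholtz curve through three computed ΔG values, can be illustrated with a minimal Python sketch. It assumes the common form ΔG(T) = ΔHm (1 − T/Tm) − ΔCP [(Tm − T) + T ln(T/Tm)] with T in kelvin and a temperature-independent ΔCP; the sign convention, the function names and the root-search bound `t_max` are assumptions made for illustration and are not taken from the paper. Because this expression is linear in the basis {1, T, T ln T}, three points determine the curve exactly, and Tm is then recovered as the zero of ΔG above the temperature of maximum stability.

```python
import math


def gibbs_helmholtz(T, dHm, Tm, dCp):
    """Assumed form: dG(T) = dHm*(1 - T/Tm) - dCp*((Tm - T) + T*ln(T/Tm)), T in kelvin."""
    return dHm * (1.0 - T / Tm) - dCp * ((Tm - T) + T * math.log(T / Tm))


def _det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def fit_stability_curve(points, t_max=500.0):
    """Fit (dHm, Tm, dCp) to exactly three (T_kelvin, dG) points.

    The curve is rewritten as dG = c0 + c1*T + c2*T*ln(T), which is linear
    in (c0, c1, c2), so the three points give a 3x3 linear system.
    `t_max` is a hypothetical upper bound for the Tm search.
    """
    rows = [[1.0, T, T * math.log(T)] for T, _ in points]
    y = [dG for _, dG in points]
    det = _det3(rows)

    # Cramer's rule: replace column k by the dG values.
    coeffs = []
    for k in range(3):
        m = [row[:] for row in rows]
        for r in range(3):
            m[r][k] = y[r]
        coeffs.append(_det3(m) / det)
    c0, c1, c2 = coeffs
    if c2 == 0.0:
        raise ValueError("points are collinear: dCp is zero, no stability extremum")

    dCp = -c2

    def dG(T):
        return c0 + c1 * T + c2 * T * math.log(T)

    # Extremum of the curve (d dG / dT = 0); Tm is the zero above it,
    # the zero below it being the cold-denaturation temperature.
    t_star = math.exp(-c1 / c2 - 1.0)
    lo, hi = t_star, t_max
    if dG(lo) * dG(hi) > 0.0:
        raise ValueError("no sign change between the extremum and t_max")

    # Plain bisection on the bracket [t_star, t_max].
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if dG(lo) * dG(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    Tm = 0.5 * (lo + hi)

    dHm = c0 + dCp * Tm  # from c0 = dHm - dCp*Tm
    return dHm, Tm, dCp
```

Calling `fit_stability_curve` on three `(T, ΔG)` pairs returns `(ΔHm, Tm, ΔCP)` in the units of the input (for example kcal/mol and kelvin). With noisy or more than three points a least-squares fit would be needed instead of this exact solve.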
Year: 2014 PMID: 25032839 PMCID: PMC4102405 DOI: 10.1371/journal.pcbi.1003689
Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.475
Figure 1. Stability curves of thermostable and mesostable proteins.
(a,b,c) Different strategies of thermal adaptation of hypothetical proteins. (d) Comparison between the stability curve of Tk-MGMT (PDB [47] code 1GMT) and that of its mesophilic counterpart Ec-AdaC (PDB code 1SFE) [32].
Figure 2. Flowchart of the protein stability curve prediction method.
Figure 3. Predicted stability curves of the 45 proteins considered, which belong to 11 homologous families.
The PDB codes, host organisms and environmental temperatures of all the proteins are as follows: (a) 2vh7 (Homo sapiens, 37 °C), 2bjd (Sulfolobus solfataricus, 80 °C), 1v3z (Pyrococcus horikoshii, 98 °C). (b) 1am7 (Bacteriophage lambda, 37 °C), 2lzm (Escherichia coli, 37 °C), 1lz1 (Homo sapiens, 37 °C), 1am7 (Gallus gallus, 41 °C). (c) 1aqh (Alteromonas haloplanktis, 26 °C), 1ppi (Sus scrofa, 39 °C), 1jae (Tenebrio molitor, 28 °C), 1smd (Homo sapiens, 37 °C). (d) 2fal (Aplysia limacina, 17 °C), 1ymb (Equus caballus, 38 °C), 1bvc (Physeter catodon, 35 °C). (e) 1blc (Staphylococcus aureus, 34 °C), 1ke4 (Escherichia coli, 37 °C), 4blm (Bacillus licheniformis, 43 °C), 1bmc (Bacillus cereus, 30 °C). (f) 1hml (Homo sapiens, 37 °C), 1hfz (Bos taurus, 38 °C), 1hmk (Capra hircus, 39 °C). (g) 1p3j (Bacillus subtilis, 37 °C), 3fb4 (Jeotgalibacillus marinus, 18 °C), 1s3g (Bacillus globisporus, 15 °C), 1aky (Saccharomyces cerevisiae, 28 °C), 1ank (Escherichia coli, 37 °C), 1zip (Bacillus stearothermophilus, 51 °C). (h) 1oa3 (Hypocrea schweinitzii, 40 °C), 1h8v (Trichoderma reesei, 35 °C), 1oa4 (Streptomyces sp. 11ag8, 30 °C), 1olr (Humicola grisea, 50 °C), 1cec (Clostridium thermocellum, 60 °C). (i) 1csp (Bacillus subtilis), 1mjc (Escherichia coli, 37 °C), 1c9o (Bacillus caldolyticus, 70 °C). (j) 1bu7 (Bacillus megaterium, 30 °C), 1oxa (Saccharopolyspora erythraea, 31 °C), 1akd (Pseudomonas putida, 30 °C), 1n97 (Thermus thermophilus, 68 °C), 1f4t (Sulfolobus solfataricus, 78 °C). (k) 1rgg (Streptomyces aureofaciens, 28 °C), 9rnt (Aspergillus oryzae, 49 °C), 1rnh (Escherichia coli, 37 °C), 1rbn (Bos taurus, 38 °C), 2ehg (Sulfolobus tokodaii, 80 °C).
Standard deviation (σ) and linear correlation coefficient (r) between the experimental and predicted thermal and thermodynamic parameters.
| Parameter | σ | σ (worst 10% excluded) | r | r (worst 10% excluded) | N (N after exclusion) | P-value |
|---|---|---|---|---|---|---|
| Tm | 13.4 °C | 10.2 °C | 0.69 | 0.76 | 45 (40) | |
| ΔCP | 1.3 kcal/(mol °C) | 0.7 kcal/(mol °C) | 0.92 | 0.41 | 17 (15) | |
| ΔG25 | 4.1 kcal/mol | 2.6 kcal/mol | 0.42 | 0.69 | 16 (14) | 0.05 |
In the columns marked "worst 10% excluded", the 10% worst-predicted proteins are left out of the computation. N is the number of proteins for which experimental data are available and for which the results are computed.
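As an illustration of how such summary statistics can be obtained, the following Python sketch computes a deviation σ and a Pearson correlation coefficient r between experimental and predicted values, optionally after dropping the worst-predicted proteins. It rests on several assumptions not stated in the source: σ is taken to be the root-mean-square deviation between experiment and prediction, "worst predicted" is taken to mean largest absolute error, and the number of dropped proteins is rounded up. The function and parameter names are hypothetical.

```python
import math


def sigma_and_r(experimental, predicted, exclude_percent=0):
    """Return (sigma, r, n_kept) for paired experimental/predicted values.

    sigma: root-mean-square deviation (assumed definition).
    r: Pearson linear correlation coefficient.
    exclude_percent: integer percentage of worst-predicted pairs to drop,
    rounded up to a whole number of proteins (assumed rounding rule).
    """
    # Sort pairs from best to worst prediction by absolute error.
    pairs = sorted(zip(experimental, predicted), key=lambda p: abs(p[0] - p[1]))

    # Integer ceiling of len * percent / 100 avoids floating-point surprises.
    n_drop = (len(pairs) * exclude_percent + 99) // 100
    kept = pairs[: len(pairs) - n_drop]
    n = len(kept)

    sigma = math.sqrt(sum((e - p) ** 2 for e, p in kept) / n)

    mean_e = sum(e for e, _ in kept) / n
    mean_p = sum(p for _, p in kept) / n
    cov = sum((e - mean_e) * (p - mean_p) for e, p in kept)
    var_e = sum((e - mean_e) ** 2 for e, _ in kept)
    var_p = sum((p - mean_p) ** 2 for _, p in kept)
    # Raises ZeroDivisionError if either series is constant.
    r = cov / math.sqrt(var_e * var_p)

    return sigma, r, n
```

With `exclude_percent=10`, rounding up leaves 40, 15 and 14 proteins out of 45, 17 and 16, which is consistent with the N values in the table, although the rounding rule used in the paper is not stated here.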
Figure 4. Comparison between (a) the experimental and predicted melting temperatures Tm (in °C), (b) the experimental and computed ΔCP (in kcal/(mol °C)) and (c) the experimental and predicted ΔG25 (in kcal/mol), for the set of 45 proteins belonging to the 11 homologous families.
The straight lines correspond to the bisector of the first quadrant (y = x).