Peishan Huang1, Simon K S Chu1, Henrique N Frizzo2, Morgan P Connolly3, Ryan W Caster2, Justin B Siegel2,4,5. 1. Biophysics Graduate Group, University of California, Davis 95616, California, United States. 2. Genome Center, University of California, Davis 95616, California, United States. 3. Microbiology Graduate Group, University of California, Davis 95616, California, United States. 4. Department of Biochemistry & Molecular Medicine, University of California, Davis 95616, California, United States. 5. Department of Chemistry, University of California, Davis 95616, California, United States.
Abstract
Engineering proteins to enhance thermal stability is a widely utilized approach for creating industrially relevant biocatalysts. The development of new experimental datasets and computational tools to guide these engineering efforts remains an active area of research. Thus, to complement the previously reported measures of T 50 and kinetic constants, we are reporting an expansion of our previously published dataset of mutants for β-glucosidase to include both measures of T M and ΔΔG. For a set of 51 mutants, we found that T 50 and T M are moderately correlated, with a Pearson correlation coefficient and Spearman's rank coefficient of 0.58 and 0.47, respectively, indicating that the two methods capture different physical features. The performance of predicted stability using nine computational tools was also evaluated on the dataset of 51 mutants, none of which are found to be strong predictors of the observed changes in T 50, T M, or ΔΔG. Furthermore, the ability of the nine algorithms to predict the production of isolatable soluble protein was examined, which revealed that Rosetta ΔΔG, FoldX, DeepDDG, PoPMuSiC, and SDM were capable of predicting if a mutant could be produced and isolated as a soluble protein. These results further highlight the need for new algorithms for predicting modest, yet important, changes in thermal stability as well as a new utility for current algorithms for prescreening designs for the production of mutants that maintain fold and soluble production properties.
Engineering proteins to enhance thermal stability is a widely utilized approach for creating industrially relevant biocatalysts. The development of new experimental datasets and computational tools to guide these engineering efforts remains an active area of research. Thus, to complement the previously reported measures of T 50 and kinetic constants, we are reporting an expansion of our previously published dataset of mutants for β-glucosidase to include both measures of T M and ΔΔG. For a set of 51 mutants, we found that T 50 and T M are moderately correlated, with a Pearson correlation coefficient and Spearman's rank coefficient of 0.58 and 0.47, respectively, indicating that the two methods capture different physical features. The performance of predicted stability using nine computational tools was also evaluated on the dataset of 51 mutants, none of which are found to be strong predictors of the observed changes in T 50, T M, or ΔΔG. Furthermore, the ability of the nine algorithms to predict the production of isolatable soluble protein was examined, which revealed that Rosetta ΔΔG, FoldX, DeepDDG, PoPMuSiC, and SDM were capable of predicting if a mutant could be produced and isolated as a soluble protein. These results further highlight the need for new algorithms for predicting modest, yet important, changes in thermal stability as well as a new utility for current algorithms for prescreening designs for the production of mutants that maintain fold and soluble production properties.
A common goal of enzyme
engineering is the enhancement of thermal
stability.[1] For industrial applications,
improving a protein’s robustness to thermal challenges or half-life
at elevated temperatures can often be the deciding factor for the
commercialization of a biocatalyst.[2−5] Currently, the most common approach for
improving thermal stability is through directed evolution methodologies,[6,7] which can be time-consuming, costly, and limited in the ability
to search sequence space. Computational design tools to predictably
identify single and combinatorial mutations that enhance thermal stability
are rapidly developing and growing in popularity.[8−14] However, accurate predictions using computational tools to guide
protein stability design remain an active area of research and is
not always successful.The use of large datasets on the mutational
effect on protein stability,
such as ProTherm[15] now maintained by ProtaBank,[16] is often used to train computational methods
for predicting thermal stability. The datasets utilized generally
consist of the equilibrium constant of unfolding (Ku) or the melting temperature of an enzyme (TM).[17] In our previous study,
we determined the thermal stability of 79 β-glucosidase B (BglB)
variants by finding T50, a type of kinetic
stability that is determined by the temperature at which a mutant’s
residual activity is reduced by 50% after a heat challenge over a
defined time.[4,17,18] When analyzing this set of mutants using two established computational
programs (Rosetta ΔΔG and FoldX PSSM)
for predicting thermal stability, we found that there was no significant
correlation between the predictions and the observed T50.[19]One hypothesis
explaining the poor predictive performance of the
algorithms with the BglB dataset is that the algorithms are evaluated
on TM, a direct measure of structural
thermal stability. However, the algorithms were being used to predict T50, which is an indirect measure of the protein’s
thermal stability.[17] Alternatively, the
poor performance could have come from the narrow T50 range (extreme variants are +6.06 and −5.02
°C from the wild type (WT)) as the algorithms are generally benchmarked
on larger changes in thermal stability and ±3 °C may be
within the error of the currently developed algorithms. In this study,
we evaluated both hypotheses.To assess if there was a significant
difference in TM and T50, we developed a
dataset of 51 BglB mutants (Figure ) in which both thermal stability measurements, T50 and TM, were
measured. Interestingly, for the set of 51 measurements, there was
only a modest correlation between T50 and TM, with a Pearson coefficient correlation (PCC)
and Spearman’s rank correlation (SRC) of 0.58 and 0.47, respectively.
This highlights the difference in the physical properties being measured
using these two techniques, TM being the
thermal stability of the protein’s structural elements and T50 reporting on the thermal stability to irreversible
denaturation. However, similar to the previous study,[19] the relationship between the predicted stability with the
experimental TM only results in a weak
correlation not only with the previous algorithms evaluated (Rosetta
ΔΔG and FoldX PSSM) but also with five
other commonly used methods: ELASPIC, DeepDDG, PoPMuSiC, SDM, and
AUTO-MUTE. This result suggests that while the two measurements are
reporting on different physical properties, this is not the key factor
that led to the low predictive accuracy of established algorithms
on this dataset.
Figure 1
Structure of BglB (PDB ID: 2JIE) from the bacterium Paenibacillus
polymyxa. PyMOL rendering[20] of BglB showing the 28 sequence-positions (teal
spheres) of the 51 mutants chosen out of the original 92 previously
expressed proteins for the TM analysis.[19] The reaction scheme of the hydrolysis of 4-nitrophenyl
β-d-glucopyranoside by BglB used in the T50 study.[19]
Structure of BglB (PDB ID: 2JIE) from the bacterium Paenibacillus
polymyxa. PyMOL rendering[20] of BglB showing the 28 sequence-positions (teal
spheres) of the 51 mutants chosen out of the original 92 previously
expressed proteins for the TM analysis.[19] The reaction scheme of the hydrolysis of 4-nitrophenyl
β-d-glucopyranoside by BglB used in the T50 study.[19]To evaluate the second hypothesis, that the changes in thermal
stability of the BglB dataset are too small for current algorithms,
we investigated the ability of the algorithms to predict if a mutation
reduced thermal stability to the point that the protein could no longer
be produced and isolated in a soluble form. Analysis of the computational
algorithms to predict destabilization to the point where no soluble
protein could be isolated showed a significant enrichment based on
the calculated energetics of the mutants for several algorithms, the
most significant of which is for Rosetta ΔΔG. This supports the hypothesis that the lack of performance on the
BglB dataset is due to the narrow range in changes observed for thermal
stability. These slight molecular changes, especially interactions
that are less than 1 kcal/mol, are challenging to accurately model.
This highlights the need for new algorithms for predicting modest,
yet important, changes in thermal stability as well as a new utility
for current algorithms for prescreening designs for the production
of mutants likely to maintain protein structure and be produced as
a soluble protein.
Methods
Mutant Selection, Protein
Expression, and Purification
Out of 79 mutants of BglB that
were previously characterized with T50 data,[19] 51 variants
with plasmid readily available were transformed into chemically competent Escherichia coli BLR (DE3) cells. The variants were
produced and purified, as previously described.[14] Expression was carried out by growing a 5 mL overnight
culture in a 50 mL falcon tube with a breathable seal in Terrific
broth (TB) medium with kanamycin while shaking at 250 rpm at 37 °C.
After the initial overnight culture, cells were spun down and resuspended
in fresh TB with kanamycin with 1 mM isopropyl β-d-1-thiogalactopyranoside
in a 50 mL falcon tube with a breathable seal and incubated while
shaking at 250 rpm at 18 °C for 24 h. Then, the cells were spun
down, lysed, and purified using immobilized metal ion affinity chromatography,
as previously described.[19] The purity of
the protein samples was analyzed using 12–14% SDS-PAGE (Figure SI 1-1), and the yield was assessed based
on the A280 for proteins that appeared >75% pure in the SDS-PAGE
analysis.
Protein samples were considered expressed if they were detectable
in the SDS-PAGE analysis and greater than 0.10 mg/mL using A280, as
previously described.[19]
Melting Temperature
Assay
The melting temperature (TM) of BglB was determined using the Protein
Thermal Shift (PTS) kit (Applied Biosystems, from Thermo Fisher Scientific).
Standard protocols provided by the manufacturer were used. Protein
concentrations ranged from 0.1–0.5 mg/mL, and fluorescence
reading was monitored with a QuantaStudio 3 system from 20 to 90 °C.
The temperature melting curve was first smoothed with a 20 step sliding
window average (Script SI 2). TM was determined from the average of three to four replicates
at which the derivative was largest, and all melting curves can be
found in Figure SI 3.
ΔG Calculations from TM
Calculations were conducted, as previously
described.[21] First, we assumed that the
protein follows the two-state folding mechanism, a binary conversion
of native state to full denaturation. Second, to derive ΔG°unfolding, the fluorescence intensity
was first translated into fractions of folded (Pf) and unfolded (Pu) proteins of
the linear portion of the graph at different temperatures starting
from the minimum fluorescence (Fmin) to
the maximum fluorescence (Fmax) shown
in eq .By taking a two-state
folding–unfolding model, the equilibrium constant of unfolding
(Ku) at different temperatures is then
given byWe plotted ln Ku against 1/T using the van’t
Hoff method shown in eq (Script SI 2), the is defined by the slope, is the y-intercept, T is temperature, and R is the ideal gas
constant.The Gibbs free energy
of protein-unfolding can then be determined
using eq , where ΔG°unfolding is the unfolding energy at a TRT of 298 K. All calculations can be found in Script SI 2.
Molecular Modeling
Seven popular, readily accessible,
and recently developed molecular modeling methods, many of them force-field
and machine-learning-based, were evaluated for their ability to recapitulate
the experimental data: Rosetta ΔΔG,[22] FoldX,[23] ELASPIC,[24] DeepDDG,[25] PoPMuSiC,[26] SDM,[27] and AUTO-MUTE
(DDG).[28] The crystal structure of BglB
(PDB ID: 2JIE) was used across seven different algorithms. First, using a previously
described method,[19] the 2JIE structure was used
as input to the Rosetta ΔΔG application
and run, as previously described (Script SI 5). Briefly, 50 poses of the WT and the mutant were generated for
which 15 energy terms were reported from the score function used.[22] The three lowest system energy scores out of
50 from WT and the mutant were averaged to give the final Rosetta
ΔΔG score. Second, for the FoldX position-specific
scoring metric (PSSM) protocol, the 2JIE structure was first minimized for any
potential inaccurate rotamer assignment using the RepairPDB application.[23] The repaired PDB structure was mutated with
single-point mutants and then modeled using FoldX PSSM. The model
was scored based on 17 terms within the FoldX force-field.[23] Third, the ELASPIC protocol first constructed
a homology model of the WT using the crystal structure, sequence,
molecular, and energetics information. Using the standard procedure
described, the FoldX algorithm was used to construct the mutant model.
The final mutational change is predicted using Stochastic Gradient
Boosted Decision Trees based on the energetic, chemical, and structural
features from FoldX.[24,29] Fourth, using a curated dataset
derived from the Protherm database,[15] DeepDDG
used their previously described shared residue pair neural network
structure to make a prediction of stability.[25] The DeepDDG output indicated that >0 kcal/mol could be considered
stable, whereas <0 kcal/mol could be considered unstable. Fifth,
PoPMuSiC estimated the stability of the WT structure and mutants using
13 statistically potential terms, and an additional two terms that
account for the volume differences of the residues between WT and
the mutant.[26] Sixth, the SDM method evaluated
mutational changes using a statistical potential energy function based
on environment-specific substitution tables. These tables consisted
of data such as structural information, solvent accessibility of the
sidechain, and hydrogen bonding.[27] Lastly,
similar to SDM, the seventh method, AUTO-MUTE, which predicts for
ΔTM and ΔΔG, utilized a statistical potential to calculate the environmental
changes of the residue compared to the WT.[28] The protocol was performed using tree regression at 23 °C and
pH 7.5.Apart from predicting ΔΔG, two additional methods were used to evaluate the algorithms’
ability to predict ΔTM changes.
As mentioned above, the AUTO-MUTE prediction using the Stability Changes
(ΔTM) protocol was performed with
tree regression. Also, HotMuSiC was used to evaluate the mutational
effect with the temperature-dependent potential and other statistical
potential terms such as solvent accessibility, structural, and sequence-based
information.[30]Pearson correlation
coefficient (PCC) and Spearman’s rank
correlation (SRC) analyses were performed between their respective
ΔΔG (ΔΔG = ΔGmutants – ΔGWT) or the change in total system energy (ΔTSE)
of the nine computational methods. Additionally, the available individual
features within the Rosetta ΔΔG and FoldX
PSSM force field were further evaluated against the TM dataset for correlation.Finally, ΔTSE was
evaluated against mutants that could be
isolated as a soluble protein and those that lost structural integrity
and either precipitated or were degraded and therefore could no longer
be isolated as a soluble protein (nonisolated). A Student’s t-test was used to obtain p-values for
the nine computational methods. The two categories between isolated
and nonisolated protein were treated as an independent sample using
an unequal variance.
Results
Evaluating the Relationship
between TM and T50
To the best of our
knowledge, there has not been a large dataset (>50 data points)
directly
comparing the TM and T50 relationship for a single set of protein mutants uniformly
produced and characterized. It is important to distinguish both TM and T50 methods
since the measurements are quantifying and reporting different structural
and functional properties. TM is the temperature
at which half the enzyme is found in the unfolded state over the folded
state.[17,31] This is often evaluated through denaturation
assays, where the thermodynamic measurements (ΔGunfolding) can be obtained.[31] This method is generally a lower throughput method as purified protein
is required to obtain an accurate measurement for the structural properties
for the mutant being evaluated. T50 measures
the temperature of half-inactivation that leads to irreversible unfolding,[11,32] and it is determined by the reduction of half of the enzymatic activity
due heat challenges.[17] This is a very common
assay for protein engineering due to its compatibility with high-throughput
assays and the ability to use cell lysates to evaluate function.To complement our previously measured dataset of T50, 51 of the 92 expressed proteins with available plasmids[19] were selected and evaluated for TM using the Protein Thermal Shift assay to compare T50 and TM. The WT
BglB TM was determined to be 45.97 ±
1.03 °C, while the previously determined T50 was 39.9 ± 0.1 °C.[19] When evaluating the entire dataset, the TM ranged between 37.1 and 54.3 °C, slightly larger than what
was observed for T50, which was between
34.9 and 46.0 °C (Figure SI 1-2).
The variant that had the highest TM in
this dataset was E167A, with a ΔTM of 54.3 °C (+8.33 °C), which was also observed to have
a similar increase in T50 compared with
the WT (+6.06 °C).[19] The variant that
had the lowest TM in this dataset was
found to be E225A, with a ΔTM of
−8.9 °C, which had a corresponding T50 of −3.1 °C.The relationship between T50 and TM is plotted
in Figure A. The PCC
and SRC of 0.58 and 0.47, respectively,
indicate that the two methods are moderately positively correlated.
Correlation between methods increased in cases where mutations resulted
in extremely stable and unstable products, for example, E167A and
E225A, respectively. This is an expected result for small changes
(<3 °C) in thermal stability; the differences in measurement
methods would be expected to play a more significant role than for
larger changes (>5 °C). The evaluation of ΔTM and ΔT50 with experimentally
derived ΔΔG is also plotted in Figure B,C, respectively.
The PCC and SRC show that the TM method
and experimentally derived ΔΔG are strongly
correlated (PCC and SRC of −0.76), compared to those between
ΔΔG and T50 (PCC of −0.35 and SRC of −0.24).
Figure 2
Comparison of two different
experimental thermal stability datasets
and experimentally derived ΔΔG. (A) Relationship
for each mutant between T50 and TM. The PCC of 0.58 illustrates that the two
methods are modestly positively correlated with mutations that are
in the extreme ends of the temperature range (±5 °C). (B)
Evaluation of ΔTM with experimentally
derived ΔΔG shows the two qualities are
highly correlated (PCC = −0.76), unlike (C) where the relationship
between ΔT50 and experimentally
derived ΔΔG has a PCC of −0.35.
Comparison of two different
experimental thermal stability datasets
and experimentally derived ΔΔG. (A) Relationship
for each mutant between T50 and TM. The PCC of 0.58 illustrates that the two
methods are modestly positively correlated with mutations that are
in the extreme ends of the temperature range (±5 °C). (B)
Evaluation of ΔTM with experimentally
derived ΔΔG shows the two qualities are
highly correlated (PCC = −0.76), unlike (C) where the relationship
between ΔT50 and experimentally
derived ΔΔG has a PCC of −0.35.
Evaluating Computational Stability Tools
Using the BglB TM Dataset
The
computational evaluation
of protein stability of the current experimental TM dataset was analyzed in the same manner as our previous
study on T50.[19] An energetically evaluated model for each mutant was generated using
established computational methods and subsequently plotted as a function
of TM to evaluate the calculated energies
related to the observed TM. The PCC and
SRC for the most commonly assessed term, the ΔTSE, was found
to be highest for FoldX PSSM (PCC of −0.34 and SRC of −0.35)
with ΔTM (Figure ). Similarly, the FoldX PSSM correlations
with experimentally derived ΔT50 data were found to be −0.21 and −0.16 for PCC and
SRC, respectively. The overall relationship between the ΔTSE
and the ΔTM thermal stability dataset
slightly improved for FoldX, DeepDDG, PoPMuSiC, and AUTO-MUTE (Figure SI 1-3), while Rosetta ΔΔG and ELASPIC remained relatively unchanged with no significant
correlation. Interestingly, SDM was the only method where the correlation
with ΔT50 is stronger than that
of ΔTM (Figure SI 1-3).
Figure 3
Evaluation of the algorithms ΔTSE versus the experimentally
derived ΔΔG and the TM and T50 datasets. The Pearson
correlation coefficient and Spearman’s rank correlation for
each performance against three types of experimental data were determined.
Five representative comparisons are illustrated above, with four additional
algorithms, SDM, AUTO-MUTE (DDG), AUTO-MUTE (ΔTM), and HoTMuSiC provided in Figure SI 1-3. No algorithm resulted in a significant correlation
between the calculated energies and the observed TM, T50, or ΔΔG for this dataset.
Evaluation of the algorithms ΔTSE versus the experimentally
derived ΔΔG and the TM and T50 datasets. The Pearson
correlation coefficient and Spearman’s rank correlation for
each performance against three types of experimental data were determined.
Five representative comparisons are illustrated above, with four additional
algorithms, SDM, AUTO-MUTE (DDG), AUTO-MUTE (ΔTM), and HoTMuSiC provided in Figure SI 1-3. No algorithm resulted in a significant correlation
between the calculated energies and the observed TM, T50, or ΔΔG for this dataset.An analysis of individual energetic term from Rosetta ΔΔG and FoldX PSSM did not uncover any specific feature in
either method’s energetic evaluation that was strongly correlated
with the TM dataset, as was previously
observed for the T50 dataset[19] (Figure SI 4). The
strongest PCC for TM against any of the
available energetic terms was 0.39 for the Δbackbone clash term
from FoldX PSSM and −0.31 for the Omega energy term from Rosetta
ΔΔG. To be consistent with the previous
performance assessment, we also evaluated the algorithms on experimentally
derived ΔΔG in this dataset (Figure ). The PCC and SRC
of 0.39 and 0.36, respectively, between experimental ΔΔG and ΔTSE for FoldX PSSM outperformed six other algorithms
that were compared. The correlation between experimental ΔΔG with ΔTSE was not unexpected as TM showed a correlation with ΔΔG with a PCC and SRC of −0.76 (Figure B). Analysis of AUTO-MUTE and HotMuSiC to
predict for ΔTM revealed no significant
correlation with the experimental ΔTM (Figure SI 1-3).Based on this
analysis, it is apparent that the general performance
of all given methods at best only weakly correlates with the experimentally
determined effects of the mutations. This data fails to support the
hypothesis that the lack of a previously observed correlation of these
established computational tools with observed changes in thermal stability
in the BglB dataset is due to the difference in the physical property
being measured.
Prediction of Mutant Soluble Expression
The current
dataset consists primarily of modest changes in thermal stability
of <5 °C, calculated to be ±4 kcal/mol of the WT, and
therefore may be challenging for current computational methods to
predict. However, this change has only been analyzed in a fraction
of the 129 mutants tested in the overall BglB dataset. Of the 129
mutants, only 92 were found to be produced and isolated in a soluble
form. All purification procedures are conducted at ∼20 °C.
Since the WT has a T50 of 39.9 °C,
any reduction in T50 of >18 °C
would
result in a loss of structural stability from which insoluble aggregates
or proteolytic degradation would readily occur during production and
purification. In this case, the proteins would no longer be able to
be isolated in a soluble form similar to the WT protein. Therefore,
it seemed pertinent to evaluate if any of the nine algorithms could
differentiate variants in this dataset that could be isolated as a
soluble protein versus those that were not able to be separated as
a soluble protein.For this evaluation, all of the previously
reported 129 mutants were assessed using the nine algorithms following
the same methods used for T50 and TM. A mutant was generally considered soluble
if it was observed on an SDS-PAGE analysis and had an A280 >0.1
mg/mL.
The WT protein produced using the methods described generally resulted
in an average A280 of 1.5 mg/mL, which would provide a >10-fold
change
in yield for mutants having an A280 less than 0.1 mg/mL. While factors
other than thermal stability can affect production and isolation of
soluble protein, in this case, it is assumed that the primary factor
that decreases soluble protein yield is from denaturation of the mutant
protein either during expression or purification. The results of this
analysis are presented in Figure .
Figure 4
Computational prediction for the effect on mutant soluble
protein
production using nine different algorithms. From left to right: Rosetta
ΔΔG, FoldX PSSM, ELASPIC, DeepDDG, PoPMuSiC,
SDM, AUTO-MUTE (DDG), AUTO-MUTE (ΔTM), and HoTMuSiC of soluble (green) and nonisolated protein (blue).
In this case, mutants that resulted in a significant (>10-fold)
decrease
in yield of purified soluble protein are considered nonisolatable.
Significance in population differences was determined using a Student’s t-test. The units (ΔTSE and ΔTM) of all algorithms are individually normalized between
1 to −1. For visualization purposes, outliers were omitted
after normalization. Each graph without normalization and with outliers
can be found in Figure SI 1-4 and all raw
values in Figure SI 4.
Computational prediction for the effect on mutant soluble
protein
production using nine different algorithms. From left to right: Rosetta
ΔΔG, FoldX PSSM, ELASPIC, DeepDDG, PoPMuSiC,
SDM, AUTO-MUTE (DDG), AUTO-MUTE (ΔTM), and HoTMuSiC of soluble (green) and nonisolated protein (blue).
In this case, mutants that resulted in a significant (>10-fold)
decrease
in yield of purified soluble protein are considered nonisolatable.
Significance in population differences was determined using a Student’s t-test. The units (ΔTSE and ΔTM) of all algorithms are individually normalized between
1 to −1. For visualization purposes, outliers were omitted
after normalization. Each graph without normalization and with outliers
can be found in Figure SI 1-4 and all raw
values in Figure SI 4.Of the nine algorithms evaluated, Rosetta ΔΔG, FoldX, DeepDDG, PoPMuSiC, and SDM can capture the enrichment
of mutants isolated as a soluble protein. The differences were evaluated
for statistical significance using the Student’s t-test, and the highest among the top five methods was shown for Rosetta
ΔΔG with a p-value of
1.0 × 10–5. In contrast, enrichment was lower
for ELASPIC, AUTO-MUTE (DDG), AUTO-MUTE (ΔTM), and HotMuSiC with p-values of 0.06,
0.38, 0.65, and 0.07, respectively.A few outliers were observed
in all methods, except for ELASPIC
(Figure SI 1-4). For example, the mutant
G15N for both Rosetta ΔΔG and FoldX PSSM
was identified as severely energetically unfavorable, which is consistent
with the observation that this variant was not able to be isolated
as a soluble protein.
Discussion
Both TM and T50 are methods commonly
used to quantify different physical
aspects of protein thermal stability; however, to date, there has
been relatively little experimental data collected to empirically
evaluate the relationship of these two measurements. Using a dataset
of 51 protein mutants, we observed that there is a moderate positive
correlation (PCC of 0.58 and SRC of 0.47) between these two properties.
The theory comparing two methods has been extensively described in
the work of Hei and Clark.[33] Briefly, T50 can only be used to assess the temperature
at which half of the protein is irreversibly unfolded. Meanwhile, TM provides information on the folded state of
the protein regardless of whether or not the unfolding events are
irreversible. Therefore, it is not surprising that there is only a
moderate correlation between the relationship of T50 and TM.Mutants with
extreme stability changes, such as E164A (>6 °C),
usually exhibit a similar magnitude of change in TM and T50 results. However,
the majority of the mutants show a change of ∼3 °C or
less in this TM and T50 dataset being analyzed, a range in which the relationship
between TM and T50 appears to be weaker. Therefore, analysis with larger datasets
with more extreme stability changes may reveal an even stronger correlation
between these two properties.The relationship between ΔTM and
the experimentally derived ΔΔG of this
dataset (PCC and SRC of −0.76) is not expected to reach a perfect
correlation since it is dependent on the temperature at which ΔΔG was evaluated, as described by Pucci et al.[34] For example, the ΔΔG evaluated at TM of the WT will yield
a correlation closer to −1 and ΔΔG(25°C) will lead to a lower correlation (−0.68).[34]Consistent with our previous analysis,
we found a lack of performance
using established computational tools when predicting TM and T50 from the WT for
this dataset. According to Jia et al., stability prediction using
the experimentally derived free energy change of unfolding ΔΔG (kcal/mol) outperforms the prediction using ΔTM (°C).[35] However,
in this case, we saw no significant change in the predictive performance
for all seven computational tools compared to the experimentally derived
free energy change. In addition, we found TM and ΔΔG to be strongly correlated with
this dataset, which may suggest that the improved performance is only
relevant for more diverse datasets composed of different proteins
as opposed to mutants of a single protein.While none of the
computational methods demonstrated a strong predictive
power for the mutants in this study, Rosetta ΔΔG, FoldX PSSM, ELASPIC, and PoPMuSiC all have previously
been shown to have high correlations with experimental data (PCC between
0.69 to 0.83).[22,26,29,36] This dataset with an experimental ΔΔG range of ±∼4 kcal/mol is within the majority
of the mutants observed in the algorithms that were typically evaluated
on (+8 to −5 kcal/mol).[17,25,26,29] One potential reason for the
lack of performance could be that the structure used in this dataset
has a ligand bound structure, and often, the structures used in the
development of the methods were apoprotein structures. However, using
the PDBFlex database,[37] a clustering of
five available PDBs of BglB from the bacterium P. polymyxa showed an average RMSD of 0.234 and a maximum RMSD of 0.274, thus
making BglB a rigid structure. As there are no significant structural
changes between the apo-form and holo-form of the protein, it is unlikely
that the exact structure used for this study resulted in the low level
of performance by the algorithms. Another possibility is that the
protein evaluated here (BglB) is an outlier when compared to the proteins
used to develop the algorithms in terms of its structure–function
relationship. However, a related study to our analysis has been conducted
for humansuperoxide dismutase 1 in which a low correlation is observed
between experimental and predicted stability.[38] This further validates that current algorithms have limited utility
for proteins outside of those they were benchmarked on. This limitation
hindered by an over-representation of protein families such as lysozyme,
tryptophan synthase, and ribonuclease in curated datasets is often
utilized in benchmarking.[39] Thus, this
highlights the importance of generating high-quality and diverse datasets
of more proteins for evaluating and training new computational tools.This study underlines the need for new computational tools that
can more accurately predict modest changes, rather than major changes,
in thermal stability. This becomes particularly important because
single-point mutants often increase thermal stability by a few degrees
at a time, while major changes are more often produced from the synergistic
effect of combining multiple mutations.[11,40−42] Furthermore, as larger datasets of protein mutants with explicitly
measured biophysical properties are generated, opportunities to explore
combinations of molecular modeling and machine learning methods will
become practical. These algorithms and datasets will enable the development
of robust predictors of thermal stability.
Authors: Luis Cruz; Brigita Urbanc; Jose M Borreguero; Noel D Lazo; David B Teplow; H Eugene Stanley Journal: Proc Natl Acad Sci U S A Date: 2005-12-09 Impact factor: 11.205
Authors: Jesse D Bloom; Sy T Labthavikul; Christopher R Otey; Frances H Arnold Journal: Proc Natl Acad Sci U S A Date: 2006-03-31 Impact factor: 11.205
Authors: Fangqiang Zhu; Feliza A Bourguet; William F D Bennett; Edmond Y Lau; Kathryn T Arrildt; Brent W Segelke; Adam T Zemla; Thomas A Desautels; Daniel M Faissol Journal: Sci Rep Date: 2022-07-21 Impact factor: 4.996
Authors: Guangyue Li; Youcai Qin; Nicolas T Fontaine; Matthieu Ng Fuk Chong; Miguel A Maria-Solano; Ferran Feixas; Xavier F Cadet; Rudy Pandjaitan; Marc Garcia-Borràs; Frederic Cadet; Manfred T Reetz Journal: Chembiochem Date: 2020-11-17 Impact factor: 3.164