Supriyo Bhattacharya1, Sangbae Lee1, Reinhard Grisshammer2, Christopher G Tate3, Nagarajan Vaidehi1. 1. Division of Immunology, Beckman Research Institute of the City of Hope , 1500 East Duarte Rd, Duarte, California 91010, United States. 2. Membrane Protein Structure Function Unit, National Institute of Neurological Disorders and Stroke, National Institutes of Health , Department of Health and Human Services, Rockville, Maryland 20852, United States. 3. MRC Laboratory of Molecular Biology, Cambridge Biomedical Campus , Francis Crick Avenue, Cambridge CB2 0QH, United Kingdom.
Abstract
G protein-coupled receptors (GPCRs) are highly dynamic and often denature when extracted in detergents. Deriving thermostable mutants has been a successful strategy to stabilize GPCRs in detergents, but this process is experimentally tedious. We have developed a computational method to predict the position of the thermostabilizing mutations for a given GPCR sequence. We have validated the method against experimentally measured thermostability data for single mutants of the β1-adrenergic receptor (β1AR), adenosine A2A receptor (A2AR) and neurotensin receptor 1 (NTSR1). To make these predictions we started from homology models of these receptors of varying accuracies and generated an ensemble of conformations by sampling the rigid body degrees of freedom of transmembrane helices. Then, an all-atom force field function was used to calculate the enthalpy gain, known as the "stability score" upon mutation of every residue, in these receptor structures, to alanine. For all three receptors, β1AR, A2AR, and NTSR1, we observed that mutations of hydrophobic residues in the transmembrane domain to alanine that have high stability scores correlate with high experimental thermostability. The prediction using the stability score improves when using an ensemble of receptor conformations compared to a single structure, showing that receptor flexibility is important. We also find that our previously developed LITiCon method for generating conformation ensembles is similar in performance to predictions using ensembles obtained from microseconds of molecular dynamics simulations (which is computationally hundred times slower than LITiCon). We improved the thermostability prediction by including other properties such as residue-based stress and the extent of allosteric communication by each residue in the stability score. Our method is the first step toward a computational method for rapid prediction of thermostable mutants of GPCRs.
G protein-coupled receptors (GPCRs) are highly dynamic and often denature when extracted in detergents. Deriving thermostable mutants has been a successful strategy to stabilize GPCRs in detergents, but this process is experimentally tedious. We have developed a computational method to predict the position of the thermostabilizing mutations for a given GPCR sequence. We have validated the method against experimentally measured thermostability data for single mutants of the β1-adrenergic receptor (β1AR), adenosine A2A receptor (A2AR) and neurotensin receptor 1 (NTSR1). To make these predictions we started from homology models of these receptors of varying accuracies and generated an ensemble of conformations by sampling the rigid body degrees of freedom of transmembrane helices. Then, an all-atom force field function was used to calculate the enthalpy gain, known as the "stability score" upon mutation of every residue, in these receptor structures, to alanine. For all three receptors, β1AR, A2AR, and NTSR1, we observed that mutations of hydrophobic residues in the transmembrane domain to alanine that have high stability scores correlate with high experimental thermostability. The prediction using the stability score improves when using an ensemble of receptor conformations compared to a single structure, showing that receptor flexibility is important. We also find that our previously developed LITiCon method for generating conformation ensembles is similar in performance to predictions using ensembles obtained from microseconds of molecular dynamics simulations (which is computationally hundred times slower than LITiCon). We improved the thermostability prediction by including other properties such as residue-based stress and the extent of allosteric communication by each residue in the stability score. Our method is the first step toward a computational method for rapid prediction of thermostable mutants of GPCRs.
G protein-coupled receptors
(GPCRs) are seven helical transmembrane
(TM) proteins that form the largest class of drug targets. Therefore,
knowledge of the three-dimensional structure and dynamics of GPCRs
will greatly aid drug discovery. However, crystallization of GPCRs
has been a challenging process, since these receptors are dynamic
and exist in multiple conformations and are often not stable in detergents
that are needed for solubilization.[1]One of the two main strategies to crystallize GPCRs is to increase
the hydrophilic area of the receptor by either making fusions in cytoplasmic
loop 3 to stable soluble proteins such as T4 lysozyme (T4L)[2] or apo-cytochrome b (BRIL),[3] or by binding a receptor-specific heavy chain
antibody (nanobody) to the intracellular surface of the receptor.[4] Receptor-T4L and receptor-BRIL fusions have similar
stabilities to the wild type receptor and therefore require the binding
of a high affinity ligand to both stabilize the receptor and lock
it in a single conformational state. In contrast, nanobodies that
bind specifically to the activated state of the receptor impart considerable
thermostability and also lock the receptor in a single conformation.
The alternative strategy to the use of receptor-fusion protein technology
to crystallize GPCRs, is to derive thermostabilized mutant receptors
by introducing specific point mutations into the wild type receptor
sequence.[5] The thermostabilized receptor
shows increased stability and is preferentially in a single conformational
state, and it can therefore be crystallized bound to ligands that
bind even with low affinity. So far several GPCRs have been crystallized
using this strategy, including receptors in class A,[6,5b,7,8] class
B,[9] and class C.[10] Although deriving thermostable mutants is a proven approach for
the structure determination of GPCRs, it remains a challenging task
since the mutations are not transferable between receptors unless
there is considerable homology between the receptors.[11]The original experimental procedure for thermostabilization
was
based on Ala/Leu scanning mutagenesis where every residue was mutated
to Ala (except for Ala residues that were mutated to Leu), each mutant
was expressed and its thermostability was determined relative to the
wild type receptor.[5] Once thermostabilizing
mutations were identified, they were mutated to other amino acid residues
to check if further improvements in thermostability were obtained.
The best thermostabilizing mutations were then combined to give the
optimally stable mutant. This is a time-consuming process, so if we
can understand the structural and kinetic basis for why the mutations
were thermostabilizing,[12] then it might
be possible to predict which residues in a model of a GPCR structure
are thermostabilizing, thus facilitating an accelerated route to thermostabilization
and structure determination.[13]Computational
approaches in designing thermostable mutants for
several proteins are based on bioinformatics techniques to structure
based methods involving statistical[14] and
semiempirical potentials.[15] Knowledge based
potential functions were used to predict thermostable mutations for
enzymes.[14b,14a,14c] In these works, the energy functions used in determining mutant
stabilities were originally developed for globular protein design.
Alchemical free energy calculations in conjunction with molecular
dynamics (MD) simulations were used to design thermostable mutants
for the bacterial blue light photoreceptor.[15] These methods do not take into account the effects of the membrane
environment and are computationally slow. GPCRs are dynamic and they
exist in multiple conformations in the absence of agonists and antagonists.
Thus, computational methods to predict thermostability need to incorporate
protein flexibility in the algorithm. Also, due to the relatively
sparse number of GPCR crystal structures, knowledge based potentials
are not robust for predicting thermostable GPCR mutants. Recently,
Chen et al. used bioinformatics and three-dimensional structural information
to design thermostable mutants of the β1AR.[16] They targeted nonconserved polar residues that
lack hydrogen bond contacts, and hydrophobic residues with packing
defects, and mutated these residues for increased stability. Their
method works well for mutations that convert a small amino acid to
a larger one. However, given that most experimental strategies for
GPCR thermostabilization rely on Ala/Leu scanning mutagenesis, it
is unclear whether the above method could predict such mutations.We have developed a rapid screening computational method, LITiConDesign (by extending the computational method LITiCon
for GPCR conformational sampling[17]), for
predicting single point alanine mutations to thermostabilize GPCRs.
Systematic computational alanine scanning on the TM residues was performed
on an ensemble of receptor conformations generated using a structural
model of the receptor. The stability of the mutant was scored using
the sum of torsional energy and van der Waals (vdW) packing energy
calculated using an all-atom force field energy function. The preliminary
version of this method and validation was published previously (Balaraman
et al. 2010). The computations can be distributed over a computing
cluster and scanning the entire sequence of a given receptor takes
a few hours. We have tested and validated our method using homology
models of varying accuracy for about 400 mutations in three GPCRs:
β1AR, A2AR, and NTSR1. Our method shows
the best predictions for homology models of the target GPCR sequence
derived using high homologous template structures. The method identified
the most stable thermostable mutants in the top 35% of the predicted
mutants.
Methods
Thermostability Score Calculation for Single
Point Mutants
The thermostability scores for the single point
mutants in the LITiConDesign were calculated as the
difference in the total
potential energy of the mutant and the wild type receptor. The thermostability
score for each mutant was calculated as (EWT–Emut), where EWT is the sum of vdW energy and torsional energy
components of the potential energy of the wild type receptor and Emut is the same for the mutant. The potential
energy was calculated either for a single structure of the wild type
and the mutant receptors or for an ensemble of conformations generated
independently for both the wild type and the mutant receptors. The
ensemble of receptor conformations was generated using the computational
method LITiCon (Ligand Induced Transmembrane Conformational Change).
The details of the LITiCon method have been published elsewhere.[17,18] We describe the method briefly as applied here. Starting from an
initial receptor structure, all the seven TM helices were simultaneously
rotated about the helical axis between ±5° in 10° increment,
thus generating 27 = 128 conformations. Next, in each of
the 128 conformations thus obtained, every hydrophobic amino acid
in the transmembrane (TM) region was mutated to alanine (except alanine,
which was mutated to leucine), thus generating 128N conformations (N = number of mutations tested).
Side chains of each conformation of each mutant were then optimized
using the SCWRL4 side chain optimization program,[19] followed by potential energy minimization in the CHARMM27
all-atom force field[20] for 2000 steps.
The minimization and the energy calculation were performed using the
program NAMD.[21] The thermostability score
used just the vdW energy and the torsional energy components of the
total energy. We calculated the thermostability scores for residues
in the human adenosine A2A receptor (A2AR), avian β1 adrenergic
receptor (β1AR), and rat neurotensin
receptor 1 (NTSR1) for which there are experimental stability scores
available.[5c,22] We calculated the thermostability
scores for the single point mutants of the hydrophobic residues to
alanine in the TM regions for these three receptors starting from
their respective crystal structures. We also performed the thermostability
score calculations starting from the homology models of these receptors,
as in the real case scenario one would need to make these predictions
from homology models prior to experiments.
Molecular Dynamics Simulations
of GPCR Crystal Structures
To understand the effect of lipid
packing on the thermostability
predictions, we performed molecular dynamics (MD) simulations on the
crystal structures and homology models of the wild type of each of
β1AR, A2AR, and NTSR1. The thermostability
calculations were performed using the last conformation of the MD
trajectory as well as from the most populated cluster obtained from
clustering analysis of the MD trajectories as detailed below.The starting conformations of β1AR, A2AR, and NTSR1 were obtained from the crystal structures of thermostabilized
β1AR (PDB ID: 2VT4),[6b] A2AR (PDB ID: 3PWH),[8] and NTSR1 (PDB ID: 4GRV).[7] Since the crystal structures are those of thermostabilized
mutants, the receptor structures were mutated to the corresponding
wild type sequences and solvated in explicit palmitoyl-oleoyl-phosphatidylcholine
(POPC) lipid, water, and ions. The packing of the lipid molecules
was performed using the inflategro package in GROMACS.[23] Each system was equilibrated by performing 200
ps of MD at 310 K using a NVT ensemble followed by 5 ns of MD under
NPT conditions at a pressure of 1 bar. The protein and ligand were
kept in place during these equilibration steps using position restraints.
After equilibration to the expected temperature and pressure, a total
of 10 production simulations of up to 100 ns were performed for each
initial conformation with different initial velocities using the NVT
ensemble. The MD simulations were performed with the GROMOS96 force
field[24] using the software package GROMACS,
using a 2 fs time step. A cutoff distance of 12 Å for nonbond
interactions was introduced, and PME (particle mesh Ewald) method[25] was used for long-range vdW interactions. Principal
coordinate analysis (PCA) was performed on each MD trajectory using
only the backbone atoms of TM helix 2 through 7. The loops as well
as TM1 were found to be highly flexible in the MD simulations and
were thus excluded from the PCA. We used the g_covar module of GROMACS to perform the PCA. The receptor conformations
from the MD trajectories were then clustered in the principal coordinate
space using k-means clustering. The conformation corresponding to
the cluster center of the most populated cluster from each trajectory
was used to perform thermostability calculations.The GPCR homology
models were equilibrated in an explicit lipid
environment using the software package NAMD.[21] The receptor structures were solvated in explicit POPC lipid, water,
and ions using the VMD software. Each system was equilibrated for
10 ns at 310 K in a NVT ensemble. The thermostability calculations
were performed starting from the last frame of the MD trajectories
and using LITiCon subsequently to generate the conformational ensemble.
Calculation of Residue Based Stress
To improve the
thermostabile mutant predictions, we further analyzed factors other
than the enthalpy gain that could contribute to the thermostability
of the mutants. From our extensive MD simulation studies to understand
the structural basis of thermostability,[12] we identified that residue based stress and the involvement of each
residue in allosteric communication pathways of the ligand binding
to G protein coupling site to be important. The residue-based stress
(forces) is the net force (from both the bonded and nonbonded forces)
exerted on each residue from the neighboring residues that are within
3 Å except those residues that are directly bonded. The force
computation was performed using the GROMOS96 force field following
the procedure described by Stacklies et al.[26] The average stress is the average residue-based stress over the
entire MD trajectory for the wild type receptor of β1AR, A2AR, and NTSR1. The procedure is discussed in detail
in Niesen et al. 2013.[27]
Calculation
of Allosteric Hub Score
Previously, we
have developed computational method to calculate the allosteric pathways
of communication from the extracellular loops to the ligand binding
site to the intracellular G protein coupling regions of the receptor.[28] Using the MD simulation trajectories of the
wild type β1AR, A2AR, and NTSR1, we calculated
the correlated movement in torsion angles between pairs of distant
residues within each receptor. The metric that we used to calculate
the dynamic correlation is mutual information (MI) in torsion angles,
using 35 bins for the torsion angle distributions.[27] The choice of the number of bins was based on convergence
of the entropy values as discussed in Niesen et al.[27]Using graph theoretic methods (Bhattacharya et al.
2014), we calculated allosteric pathways between each pair of residues
that showed an above average MI (MI > MIavg) and were
farther
than 10 Å apart in the receptor structure. We first constructed
an undirected graph using inter-residue contacts, where the residues
formed nodes and the inter-residue contacts formed the edges of the
network. An inter-residue contact was identified if the Cα atoms of the residue pair were within 10 Å of one another.
The edge weights were calculated as MImax – MIab, where MImax is the maximum MI among all residue
pairs in the receptor and MIab is the MI between the terminal
residues of the edge. Moreover, to avoid selecting pathways with weak
MI, weights of all edges with MI < MIavg were set to
zero. For a given residue pair, the allosteric pathway is defined
to be the connecting route between the two residues, which minimizes
the number of intermediate nodes and maximizes the sum of edge MIs
of the connecting route. The allosteric pathways were calculated using
the shortest path algorithm by Dijkstra,[29] as implemented in the Bioinformatics ToolBox in MATLAB. Residues
that mediate multiple allosteric pathways were identified as allosteric
hubs, where the number of mediated pathways is defined as the “allosteric
hub score”. The greater the allosteric hub score of a given
residue, the higher is the involvement of that residue in the activity
of the receptor, and therefore, mutation of this residue to Ala could
be detrimental to receptor activity. Based on this hypothesis, we
have analyzed the effect of mutating an allosteric hub on the thermostability.
Receptor Structures Tested for Thermostability Prediction
For β1AR and A2AR, we tested the following
GPCR structures for thermostability prediction. (A) Single conformation
from the crystal structure, (B) conformational ensemble generated
from crystal structure using LITiCon, (C) conformational ensemble
generated from crystal structure using MD, (D) single homology model,
(E) conformational ensemble generated from homology model using LITiCon.
For NTSR1, we tested (A) single conformation from the crystal structure,
(B) conformational ensemble generated from crystal structure using
LITiCon, (C) single homology model, (D) conformational ensemble generated
from homology model using LITiCon.
Results
We have
compared the stability score calculated for each alanine
mutant to the experimental thermostability measured for single point
alanine mutants for β1AR,[5c] A2AR,[22a] and NTSR1.[22b] There are a total of 434 single point mutants
for the hydrophobic residues in the TM regions for the three receptors
for which we have performed the calculations. The list of the residues
tested is given in the Supporting Information Table S1. The experimental thermostability was measured by heating
the detergent-solubilized mutant to an elevated temperature (∼28–32
°C), cooling to 0 °C, and the amount of correctly folded
receptor was determined by a radioligand (either an agonist or antagonist
or both) binding assay. In this work, mutants that showed 130% or
higher ligand binding compared to the wild type receptor are defined
as thermostable. For the β1AR and A2AR
receptors, the mutants were heated without any ligand present. In
the case of NTSR1, two types of experiments were performed, one where
the mutants were heated in the presence of the agonist neurotensin,
and the other experiment heated the receptor in the absence of neurotensin.
The former assay is termed as +NT and the latter −NT. Under
the +NT conditions, NTSR1 is assumed to be in a conformation similar
to the structure determined by X-ray crystallography, that is, an
active-like state. The −NT thermostable mutants are probably
stabilized in an inactive state as wild-type NTSR1 shows no appreciable
constitutive activity.[7,30]
Thermostabilities of Single
Alanine Mutants Correlate with Enthalpy
Figure 1 shows the performance of the calculated
thermostability scores starting from the crystal structures of β1AR, A2AR, and NTSR1 (PDB ID: 2VT4, 3EML, 4GRV, respectively).
We have compared the predictability of thermostable mutants starting
from the crystal structure (single conformation) to the predictability
calculated from an ensemble of conformations generated using the crystal
structure in Figure 1. As a measure of predictability
of thermostability using the calculated stability score, we constructed
the Receiver Operational Characteristic (ROC) curves by plotting the
true positive rate against the false positive rate for different cutoffs
of the calculated stability score. Parts a, c, and e of Figure 1 show the ROC curves for β1AR,
A2AR, and NTSR1, respectively, starting from crystal structures
(single conformation) and from ensemble generated starting from the
crystal structure using LITiCon. The straight lines in these figures,
also termed the “random line”, represent the ROC curve
for zero predictability. For all the receptors, the ROC curves are
well above the random line, and the predictability improves when using
an ensemble of conformations generated by LITiCon compared to using
single conformation from crystal structure. This indicates that small
variations in conformation upon single point mutations are important
for more accurate thermostability prediction compared to using the
crystal structures alone. To assess the number of single point mutation
experiments that can be reduced by using these predictions, we plotted
the enrichment factor as a function of cutoff in the number of mutations
for β1AR, A2AR, and NTSR1 in Figure 1b, d, and f, respectively. We have also highlighted
the individual thermostable mutations at the cut-offs where they were
identified. The range of thermostability values used for defining
strong, medium, and weak thermostable mutants for the three receptors
are shown in Table S2 of the Supporting Information. The cutoff values are different for each receptor, due to the difference
in thermostability of the wild type receptor under the experimental
assay conditions. For all three receptors, the left side of the enrichment
plots (Figure 1b, d, and f) shows a higher
concentration of red and yellow mutations. Six out of nine experimentally
known medium and strong thermostable mutants will be recovered within
top 50 predicted thermostable mutants for β1AR, five
out of seven for A2A receptor and three out of four for
NTSR1. This suggests that in the LITiConDesign method of thermostability
prediction, the strong thermostable mutations are selectively enriched
over the weaker mutations. However, the method also missed several
strong thermostabilizing mutations for each receptor (the red and
yellow dots on the right side of Figures 1b,
d, and f). These mutations were missed in all receptor models including
the crystal structures and homology models. We have analyzed the reasons
for missing these mutations in the discussion section.
Figure 1
ROC curves for thermostability prediction using
an ensemble of
receptor structures compared to single receptor structure. (a) β1AR (Ensemble AUC: 0.67. Single structure AUC: 0.56); (c) A2AR (Ensemble AUC: 0.64. Single structure AUC: 0.62); (e) NTSR1
(Ensemble AUC: 0.64. Single structure AUC: 0.59). Enrichment as a
function of cutoff using (b) β1AR crystal structure;
(d) A2AR crystal structure; (f) NTSR1 crystal structure.
The mutants in the order of their measured experimental stability
are shown in colored dots. Red, high thermostability; yellow, medium
thermostability; and empty circles, weak thermostability.
The LITiCon method used for generating conformation
ensemble takes one-hundredth of the computational time required for
MD simulations. To evaluate the effectiveness of the conformation
ensemble generated from LITiCon in predicting thermostable mutations,
in comparison to the ensemble obtained from long time scale MD simulations,
we compared the predictions using the stability scores calculated
using two different schemes; (1) the ensemble generated by LITiCon
starting from the crystal structures and (2) structural ensemble generated
using MD simulations starting from the crystal structure. For details
on selecting the MD conformations, please refer to the Methods section. Parts a and b of Figure 2 show a comparison of the ROC curves from the LITiCon ensemble
to that from structural ensembles from MD simulations for β1AR and A2AR. For β1AR, the LITiCon
using the most populated MD conformation performed the best, whereas
both the crystal structure LITiCon and structural ensembles from MD
simulation showed similar performance. In the case of A2AR, LITiCon using the crystal structure performed the best, whereas
the MD ensemble showed the worst predictability. Overall, these results
indicate that the structural ensembles generated by LITiCon perform
equally or better than conformational ensembles generated using computationally
expensive MD simulations.
Figure 2
Comparison of ROC curves for thermostability prediction
using protein
structural ensemble generated from (1) LITiCon ensemble generated
starting from crystal structure and (2) representative conformations
from MD of crystal structure; (a) β1AR (LITiCon AUC:
0.67. MD ensemble AUC: 0.64); (b) A2AR (LITiCon AUC: 0.64.
MD ensemble AUC: 0.62).
ROC curves for thermostability prediction using
an ensemble of
receptor structures compared to single receptor structure. (a) β1AR (Ensemble AUC: 0.67. Single structure AUC: 0.56); (c) A2AR (Ensemble AUC: 0.64. Single structure AUC: 0.62); (e) NTSR1
(Ensemble AUC: 0.64. Single structure AUC: 0.59). Enrichment as a
function of cutoff using (b) β1AR crystal structure;
(d) A2AR crystal structure; (f) NTSR1 crystal structure.
The mutants in the order of their measured experimental stability
are shown in colored dots. Red, high thermostability; yellow, medium
thermostability; and empty circles, weak thermostability.Comparison of ROC curves for thermostability prediction
using protein
structural ensemble generated from (1) LITiCon ensemble generated
starting from crystal structure and (2) representative conformations
from MD of crystal structure; (a) β1AR (LITiCon AUC:
0.67. MD ensemble AUC: 0.64); (b) A2AR (LITiCon AUC: 0.64.
MD ensemble AUC: 0.62).
Performance of GPCR Homology Models in Predicting Thermostable
Alanine Mutations
In the last section, we showed that the
LITiConDesign method showed good predictability for thermostable mutations
starting from GPCR crystal structures. However, the real case scenario
is that the crystal structure of the GPCR to be thermostabilized will
not be known. In this section, we discuss the thermostability prediction
results derived using homology models for β1AR, A2AR, and NTSR1 without using any crystal structure information
on these receptors. Figure 3 shows a comparison
of the enrichment as a function of number of mutants for several homology
models of β1AR, A2AR, and NTSR1. To observe
how the accuracy of the homology model affects the predictability
of thermostability, we selected several template structures of varying
sequence similarity to β1AR, namely β2AR, dopamine receptor D3DR (67% and 40% sequence similarity),
A2AR (33% sequence similarity), and CXCR4 (24% sequence
similarity). For A2AR and NTSR1, we evaluated one homology
model each, using β1AR and β2AR
as templates, respectively. Unlike β1AR, A2AR and NTSR1 have no close template crystal structures. Therefore,
we wanted to test how generic templates such as β1AR or β2AR perform in thermostability prediction.
Since the NTSR1 homology model was based on the inactive state of
β2AR, we evaluated its predictability using the −NT
data. In contrast, for evaluating thermostability prediction using
the NTSR1 crystal structure, we had used the +NT data, since the crystal
structure shows active state characteristics. We also plotted the
enrichments corresponding to 50% cutoff for each model against the
Cα RMSD in the coordinates of the homology models
from the respective crystal structures in Figure 4. While the β1AR models based on β2AR (RMSD of the β1AR to its crystal structure
is 1.6 Å) and D3DR (RMSD 2.5 Å) performed the
best among homology models, the CXCR4 based model (RMSD of the β1AR model to its crystal structure is 3.1 Å) performed
the worst because of its distant sequence and structural homology
with β1AR. For A2AR, the performance of
the β1AR based homology model is worse compared to
the A2AR crystal structure (65% vs 70%). Unlike β1AR, there are no other templates that are closer in sequence
similarity to A2AR, for which crystal structures are available
(e.g., adenosine receptors other than A2AR). Therefore,
we could not test the effect of template closeness on prediction performance
for A2AR. The RMSD in coordinates of the various homology
models from their respective crystal structures and their percentage
recovery of the thermostable mutants are tabulated in Table S3 of
the Supporting Information. Overall, we
find that for homology models that are within 2.5 Å (RMSD of
the Cα atoms in the TM region) to their respective
crystal structures, there is little change in performance of the thermostability
predictions. The performance of the thermostability prediction is
worse if the homology model has RMSD above 3 Å to its crystal
structure.
Figure 3
Enrichment
as a function of cutoff using (a) homology models of
β1AR based on β2AR crystal structure
as template; (b) β1AR model based on D3DR crystal structure as template; (c) β1AR model
based on A2AR crystal structure as template; (d) β1AR model based on CXCR4 crystal structure as template; (e)
A2AR model based on β1AR; (f) NTSR1 model
based on β2AR structure as template. The experimentally
determined thermostable mutants are highlighted as colored dots: red,
high thermostability; yellow, medium thermostability; blue open circles,
weak thermostability.
Figure 4
Comparison of enrichments for different receptor models. The nomenclatures
are as follows: β1AR homo-D3DR implies
β1AR homology model using D3DR as template;
A2AR homo-β1AR implies A2AR homology model
based on β1AR as template; the other models are named
accordingly. The RMSD of the homology models from their respective
crystal structures are tabulated in Supporting
Information Table S3.
For each homology model, we have compared the performance
of using LITiCon ensembles of receptor conformations to that using
single receptor structures. For each receptor model, we have calculated
the area under the ROC curve (AUC) for the thermostability predictions
using LITiCon as well as single receptor structure. AUC is directly
proportional to the predictability of the computation scheme, and
for computation schemes with zero predictability, the AUC is close
to 0.5. Figure 5 shows the comparison of AUC
for LITiCon generated ensembles versus single receptor structure for
each homology model of β1AR, A2AR, and
NTSR1. Similar to the crystal structures, most of the homology models
show moderate to large improvement in predictability by using ensembles
of receptor conformations compared to single receptor structures.
The improvement in performance is more prominent in the high accuracy
β1AR models that were derived using close homologue
template structures such as β2AR and D3DR. For homology models of β1AR derived using distant
templates such as A2AR and CXCR4, the performance advantage
of LITiCon over single receptor structure is negligible.
Figure 5
Comparison of thermostability prediction using different receptor
structures. For performance comparison, the metric AUC (area under
ROC curve) is used. For positive predictability, the AUC varies between
0.5 and 1. The higher the AUC, the greater is the predictability.
The darker color bars are the AUC calculated using the LITiCon ensemble
of structures, and the lighter colored bars have been calculated using
single conformation either from crystal structures or homology models
as indicated on the x-axis below each set of bars.
In the x-axis, “crystal structure”
refers to the crystal structure of the corresponding receptor. The
blue bars that are labeled “homology β2AR”
refers to the homology model of β1AR derived from
the β2AR inactive state crystal structure as template.
For
A2AR, the performance gain by using LITiCon ensembles
as opposed to single crystal structure is subtle when comparing the
area under the curve, although the enrichment of true positives is
higher for the ensembles as seen in Figure 1c. The performance gain is modest in using the LITiCon ensemble for
A2AR homology model based on β1AR as template.For NTSR1, the homology model based on β2AR showed
a large improvement in predictability using the LITiCon ensemble method,
as compared to using a single homology model structure. The AUC values
for each structural model of the three receptors are given in Supporting Information Table S4. We have also
tested the predictions for NTSR1 against both sets of experimental
thermostability, the −NT and the +NT data. While using the
crystal structure of NTSR1 showed better predictability for the +NT
data, the homology model of NTSR1 based on the inactive state β2AR structure as template showed better predictability for
the −NT data than the +NT data. This can be explained by the
fact that the crystal structure of NTSR1 with neurotensin bound is
in an active-like state, and hence agrees better with the +NT data.
The inactive homology model agrees better with the −NT data,
since the mutants that were thermostabilized under the −NT
conditions were more likely to stabilize the inactive state.Enrichment
as a function of cutoff using (a) homology models of
β1AR based on β2AR crystal structure
as template; (b) β1AR model based on D3DR crystal structure as template; (c) β1AR model
based on A2AR crystal structure as template; (d) β1AR model based on CXCR4 crystal structure as template; (e)
A2AR model based on β1AR; (f) NTSR1 model
based on β2AR structure as template. The experimentally
determined thermostable mutants are highlighted as colored dots: red,
high thermostability; yellow, medium thermostability; blue open circles,
weak thermostability.Comparison of enrichments for different receptor models. The nomenclatures
are as follows: β1AR homo-D3DR implies
β1AR homology model using D3DR as template;
A2AR homo-β1AR implies A2AR homology model
based on β1AR as template; the other models are named
accordingly. The RMSD of the homology models from their respective
crystal structures are tabulated in Supporting
Information Table S3.Comparison of thermostability prediction using different receptor
structures. For performance comparison, the metric AUC (area under
ROC curve) is used. For positive predictability, the AUC varies between
0.5 and 1. The higher the AUC, the greater is the predictability.
The darker color bars are the AUC calculated using the LITiCon ensemble
of structures, and the lighter colored bars have been calculated using
single conformation either from crystal structures or homology models
as indicated on the x-axis below each set of bars.
In the x-axis, “crystal structure”
refers to the crystal structure of the corresponding receptor. The
blue bars that are labeled “homology β2AR”
refers to the homology model of β1AR derived from
the β2AR inactive state crystal structure as template.
Other Receptor Properties
That Improve the Thermostability Scores
We examined the properties
other than enthalpy that can be used
to improve the prediction of thermostability scores. Our previous
results from extensive MD simulations on crystal structures of thermostable
mutants and their respective wild type receptor structures of β1AR, and A2AR[12] have
shown two important properties that are different in the wild type
receptor compared to the thermostable mutants.In our earlier
studies, we observed that the net stress (or force) on each residue
played an important role in the function of the receptor as well as
its thermostability. We have also calculated the allosteric communication
pipelines in GPCRs that communicate the ligand binding to the G protein
coupling region using mutual information on torsion angles.[28] We observed that many residues in the TM regions
of the receptor mediate multiple allosteric communication pathways,
and we define the number of allosteric communication pathways going
through a given residue as its allosteric hub score.In this
paper, besides the enthalpy, we have investigated the effect
of using the net stress or force on each residue and allosteric hub
score in thermostability predictions. Parts a and c of Figure 6 show the correlation between the allosteric hub
scores for each residue in β1AR and A2AR crystal structures and the experimental thermostability score of
the corresponding alanine mutant. We observe an inverse correlation
between the allosteric hub score and the thermostability. Residues
that are the strongest allosteric hubs (allosteric hub score >40,
highlighted in red) show poor thermostability scores upon mutation.
Figure 6
Allosteric hub score and local stress are plotted against
thermostability
for each residue position in (a, b) β1AR and (c,
d) A2AR. Residues on the right side of the dotted vertical
line are considered thermostable. Residues that have high allosteric
hub score or stress and poor thermostability are highlighted in red.
The overall inverse correlation between stress and thermostability
is shown by the regression lines in b and d.
Parts b and d of Figure 6 show plots of
the stress on each residue versus the measured thermostability score
for β1AR and A2AR, respectively. Residues
with strong internal stress are highlighted in red. All of these residues
show weak thermostability (less than that of wild type) when mutated
to alanine. Mutating a residue with high stress to alanine reduces
the thermostability of the mutant receptor. Our earlier MD simulation
studies on multiple mutant thermostable receptors of β1AR, A2AR, and NTSR1 showed that the stress on each residue
is reduced in the thermostable mutant compared to their respective
wild type receptors. This, however, resulted from multiple mutations
made on low stress residue positions in the wild type receptor. Parts
b and d of Figure 6 show the Pearson’s
correlation coefficient for β1AR and A2AR for stress vs thermostability score that showed a weak inverse
correlation between residue stress and thermostability. For both cases,
the correlation coefficient is negative, indicating the inverse relationship.
The trend line obtained using least-squares fitting is also shown
in the same figures.Allosteric hub score and local stress are plotted against
thermostability
for each residue position in (a, b) β1AR and (c,
d) A2AR. Residues on the right side of the dotted vertical
line are considered thermostable. Residues that have high allosteric
hub score or stress and poor thermostability are highlighted in red.
The overall inverse correlation between stress and thermostability
is shown by the regression lines in b and d.
Combining Allosteric Communication and Stress Information with
Enthalpy Score Improves Thermostability Prediction
To study
the feasibility of using stress or allosteric communication in predicting
thermostability, we plotted the normal distance from the random line
(ndis) as a function of allosteric hub score and stress cut-offs,
as shown in Supporting Information Figure S2. The normal distance, ndis, is a measure of predictability of thermostable
mutants and is explained in Text S1 and Figure S1 of Supporting Information. Parts a and b
of Figure S2 show the predictability when no enthalpy score
cutoff is used. Parts c and d of Figure S2 show the predictability among mutants that have enthalpy scores
above a cutoff of −1 kcal/mol (optimal cutoff for both receptors
for maximizing TPR and minimizing FPR). The black and dark blue regions
in Supporting Information Figure S2 show
no predictability, whereas the yellow and red areas show the highest
predictability. For both β1AR and A2AR,
an allosteric hub score of 40 and internal stress of 7000 pN were
found to be the optimal cut-offs for predicting thermostability, as
indicated by the maxima in Figure S2 (red
circle). Thus, combining stress and allosteric hub score with the
stability score from enthalpy improves thermostability prediction
as shown by the increased red region in Figure
S2b and d. For A2AR, the improvement observed by
adding allosteric hub and stress information over enthalpy score was
more significant than for β1AR. We calculated the
enrichments for different percent cutoffs using enthalpy score alone
and including allosteric hub and stress information for both β1AR and A2AR, as shown in Figure 7. For all three metrics, the optimal score cut-offs were used
as mentioned before. For both β1AR and A2AR, we find modest improvement in enrichment for lower cut-offs, while
including allosteric hub and stress information. For β1AR, the improvement was found to be the highest at a cutoff of 35%,
while for A2AR, the maximum enrichment was obtained at
20% cutoff.
Figure 7
Comparison of thermostability prediction using only enthalpy based
score and by combining with allosteric hub score and residue based
stress for (a) β1AR and (b) A2AR. The
percent recovery of thermostable positives by screening different
cut-offs of residue mutations are compared.
Comparison of thermostability prediction using only enthalpy based
score and by combining with allosteric hub score and residue based
stress for (a) β1AR and (b) A2AR. The
percent recovery of thermostable positives by screening different
cut-offs of residue mutations are compared.
Discussion
The stability score reflects the average
enthalpy of an ensemble
of conformations of a mutant near the starting structure. Mutants
that have a high thermostability score show favorable enthalpy for
most of the conformations in the ensemble, whereas the mutants with
low thermostability scores have conformations with unfavorable enthalpy
in the ensemble. This means that thermostable mutants can maintain
favorable enthalpy despite small structural perturbations, unlike
thermally unstable mutants. Thus, thermostable mutants are more resistant
to thermal fluctuations, and this could explain their stability.We have tested the performance and sensitivity of the LITiConDesign
method by using homology models of varying resolution (closeness to
the crystal structure). We find that the homology models derived using
close homologues as templates (templates with low RMSD to the target
crystal structure) perform better than the ones based on distant templates.
Parts a and bo of Figure 8 show the performance
of the different homology models of β1AR and A2AR, respectively. Parts e and f of Figure 8 show the performance of the NTSR1 models. For the +NT data,
we used the crystal structure of active NTSR1, while for the −NT
data, we used the homology model of inactive NSTR1 based on the β2AR crystal structure. We calculated the number of true thermostable
mutants that are recovered in the top 50% when sorted by calculated
thermostability scores. In Figure 8a and b,
for each model, the residues that are correctly identified are highlighted
in orange. As expected, for β1AR, the crystal structure,
and the homology model of β1AR based on β2AR and D3DR performed better than the distant A2AR and CXCR4 based models. Residues that are correctly identified
by most of the structures are colored green, the ones that are identified
by only the close template models and crystal structures are colored
yellow, while the ones that are not identified by any of the models
are colored red. For the NTSR1 models (Figure 8e and f), residues that are correctly identified by the crystal structure
or homology model are colored green, and the ones that are not identified
are colored red. These residues are also highlighted in Figure 8c, d, g, and h. For β1AR, the three
mutants (I551.46A, V902.47A, I1293.40A) that are not identified by any of the models are all located in
tightly packed regions of the receptor, facing the core of the TM
domain. In contrast, the residues that are successfully predicted
by our method are located in loosely packed regions of the receptor
(near the extracellular or intracellular termini of the TM helices)
or facing the lipid bilayer. For the residues located in tightly packed
regions of the receptor, the side chain optimization methods are inadequate
to account for the repacking of the side chains upon mutation. This
could be a possible reason for the failure of our method in not identifying
certain residues. Also, the lack of electrostatic component in our
energy function could contribute to the failure of this method in
identifying certain mutations. The electrostatic energy could be important
for mutations in tightly packed regions of the receptor that are in
the neighborhood of polar residues. It is difficult to accurately
estimate the electrostatic energy, especially in a hybrid environment
such as the lipid bilayer. We have excluded the electrostatic component
from our scoring function, since adding the electrostatic energy worsened
the predictability of the thermostabilizing mutations. However, specialized
energy functions for hydrogen bonds/salt bridges could be developed
to model the effect of the polar residues in modulating receptor stability.
Such energy functions are currently under development in our lab.
Figure 8
(A, B; E, F) Thermostable
residues that were identified using different
GPCR structures: (A) β1AR; (B) A2AR; (E)
active-like state crystal structure of NTSR1; (F) homology model of
inactive NTSR1 based on β2AR. Experimental thermostability scores
have been normalized so that the experimental value for the wild type
receptor is 50%. If a residue was correctly identified by a particular
model, the cell corresponding to that model is colored orange; the
residues that were identified by all models are colored green, ones
that were identified only by the crystal structures or close template
homology models are colored yellow; those that were not identified
by any of the models are colored red. (C, D; G, H) crystal structures
of (C) β1AR, (D) A2AR, (G) NTSR1 +NT,
and (H) NTSR1 −NT showing the thermostable mutation positions.
+NT and −NT refer to NTSR1 mutants that are thermostable in
presence and absence of neurotensin, respectively.
In the case of A2AR (Figure 8d),
the residues that were correctly predicted were all located in
less tightly packed regions of the receptor, either in the extracellular
or intracellular ends of the TM helices, or facing the lipid bilayer.
It is not clear, however, why the three residues C773.25A, L853.33A, and L1925.53A could not be identified.In the case of NTSR1 bound to neurotensin (+NT mutants; Figure 8g), our method failed to identify several residues
that are located mainly at the intracellular part of the receptor
on TM5 and TM6. The residues that were correctly identified are on
TM1 and TM7. Although it is not clear why the residues on TM5 and
TM6 were not identified, it is possible that the limited conformational
sampling performed by LITiCon failed to take into account the flexibility
of TM5 and TM6 in the active state of NTSR1. These two helices are
thought to undergo the largest conformational change during activation
and are thus expected to be dynamic in the active state of the receptor.
In contrast to the +NT mutants, the −NT mutants showed improved
predictability using LITiConDesign, as shown in Figure 8h. Out of the two mutations on TM2, A1202.57L was successfully identified, while I1162.53A, located
one helical turn below A1202.57, was not. Being in the
middle of the TM2 region, I1162.53 is more tightly packed
compared to A1202.57 and hence could not be identified
due to difficulty in predicting side-chain packing. However, it is
not clear why the other two mutations, L721.40A and L2054.61A (both of them face the lipid bilayer) could not be identified.We also tested whether the mutation positions that were combined
to produce the thermostable mutants that were crystallized (with multiple
mutations) for both receptors (i.e., m23 mutant of β1AR and StaR2 mutant of A2AR) could be identified by our
method. Out of the four hydrophobic mutations that are part of the
m23 mutant, only V902.53 was not identified. Among the
StaR2 mutations, all of them were successfully identified. The crystallizing
mutation positions along with their TM scores and thermostability
(single point alanine mutation) are highlighted in Figure S3 of the Supporting Information.(A, B; E, F) Thermostable
residues that were identified using different
GPCR structures: (A) β1AR; (B) A2AR; (E)
active-like state crystal structure of NTSR1; (F) homology model of
inactive NTSR1 based on β2AR. Experimental thermostability scores
have been normalized so that the experimental value for the wild type
receptor is 50%. If a residue was correctly identified by a particular
model, the cell corresponding to that model is colored orange; the
residues that were identified by all models are colored green, ones
that were identified only by the crystal structures or close template
homology models are colored yellow; those that were not identified
by any of the models are colored red. (C, D; G, H) crystal structures
of (C) β1AR, (D) A2AR, (G) NTSR1 +NT,
and (H) NTSR1 −NT showing the thermostable mutation positions.
+NT and −NT refer to NTSR1 mutants that are thermostable in
presence and absence of neurotensin, respectively.In both β1AR and A2AR, mutating allosteric
hubs lead to poor thermostability, and therefore, these positions
can be eliminated from potential positions to mutate. Using MD simulations,
we mapped the major allosteric communication pathways in the inactive
states of β1AR and A2AR. These pathways
connect distant receptor domains that show correlated motion. In GPCRs,
concerted motion of distant domains help to preserve the structural
stability of specific functional states such as the inactive state.
When the correlated movement between distant domains in the inactive
state is disrupted, by mutating allosteric hubs, the inactive state
could be destabilized. We recently showed that in β2AR, mutating allosteric hubs that mediate allosteric communication
in the inactive state led to increased constitutive or agonist induced
activity. On the other hand, mutating hub residues that mediate allosteric
communication in the active state lead to reduced activity.[28] Thus, mutating the allosteric hubs in β1AR and A2AR could destabilize the inactive state
leading to decrease in thermostability.
Conclusions
The LITiConDesign method is a computational method
for rapid prediction of thermostable mutation positions in a given
GPCR or any helical membrane protein. It is based on generating an
ensemble of conformations generated using the conformational sampling
method called LITiCon for GPCRs. We have analyzed the performance
of the computational method LITiConDesign in predicting
thermostabilizing mutations for GPCRs. We have compared the calculated
thermostability scores to the experimental values for 434 single mutants
from three class A GPCRs, β1AR, A2AR,
and NTSR1. We found that mutations that have a high thermostability
score are more probable to be thermostable experimentally. We also
showed that using an ensemble of conformations was a better choice
for predicting thermostability compared to a single structure. We
observed that the ensemble of conformations from LITiCon performs
similarly in predicting the thermostability as the ensembles from
MD simulations. However, the LITiCon method takes 1/10th the computational
time as the MD method. Typically, scanning a GPCR with 140 mutants
takes 10 h using 32 Intel Xeon CPUs. Therefore, it can be very useful
for fast predictions of thermostable single point mutants.We
identified two important properties obtained from MD simulations
of GPCR structures, which can be used to improve thermostability prediction.
Using microseconds of MD simulations of the GPCR crystal structures,
we calculated the net force (stress) on each residue in the receptor
structures. We also identified the residues that communicate multiple
allosteric pipelines (allosteric hubs). In both β1AR and A2AR, mutating allosteric hubs lead to poor thermostability,
and therefore, these positions can be eliminated from potential positions
to mutate. We observed that mutating residues with high level of stress
in the wild type receptors to alanine does not lead to thermostability
of these single point mutants, showing a weak inverse correlation
between thermostability and stress. These observations hold true for
single point mutations to alanine and not to other amino acids. Thus,
including the residue based stress and allosteric hub information
improved the prediction of the thermostable mutation positions in
GPCRs. Our method for predicting thermostability lays the groundwork
for computational design of thermostabilizing mutants of GPCRs in
future.
Authors: B R Brooks; C L Brooks; A D Mackerell; L Nilsson; R J Petrella; B Roux; Y Won; G Archontis; C Bartels; S Boresch; A Caflisch; L Caves; Q Cui; A R Dinner; M Feig; S Fischer; J Gao; M Hodoscek; W Im; K Kuczera; T Lazaridis; J Ma; V Ovchinnikov; E Paci; R W Pastor; C B Post; J Z Pu; M Schaefer; B Tidor; R M Venable; H L Woodcock; X Wu; W Yang; D M York; M Karplus Journal: J Comput Chem Date: 2009-07-30 Impact factor: 3.376
Authors: Sander Pronk; Szilárd Páll; Roland Schulz; Per Larsson; Pär Bjelkmar; Rossen Apostolov; Michael R Shirts; Jeremy C Smith; Peter M Kasson; David van der Spoel; Berk Hess; Erik Lindahl Journal: Bioinformatics Date: 2013-02-13 Impact factor: 6.937