Literature DB >> 25400524

Rapid Computational Prediction of Thermostabilizing Mutations for G Protein-Coupled Receptors.

Supriyo Bhattacharya¹, Sangbae Lee¹, Reinhard Grisshammer², Christopher G Tate³, Nagarajan Vaidehi¹.

Abstract

G protein-coupled receptors (GPCRs) are highly dynamic and often denature when extracted in detergents. Deriving thermostable mutants has been a successful strategy to stabilize GPCRs in detergents, but this process is experimentally tedious. We have developed a computational method to predict the position of the thermostabilizing mutations for a given GPCR sequence. We have validated the method against experimentally measured thermostability data for single mutants of the β1-adrenergic receptor (β1AR), adenosine A2A receptor (A2AR) and neurotensin receptor 1 (NTSR1). To make these predictions we started from homology models of these receptors of varying accuracies and generated an ensemble of conformations by sampling the rigid body degrees of freedom of transmembrane helices. Then, an all-atom force field function was used to calculate the enthalpy gain, known as the "stability score" upon mutation of every residue, in these receptor structures, to alanine. For all three receptors, β1AR, A2AR, and NTSR1, we observed that mutations of hydrophobic residues in the transmembrane domain to alanine that have high stability scores correlate with high experimental thermostability. The prediction using the stability score improves when using an ensemble of receptor conformations compared to a single structure, showing that receptor flexibility is important. We also find that our previously developed LITiCon method for generating conformation ensembles is similar in performance to predictions using ensembles obtained from microseconds of molecular dynamics simulations (which is computationally hundred times slower than LITiCon). We improved the thermostability prediction by including other properties such as residue-based stress and the extent of allosteric communication by each residue in the stability score. Our method is the first step toward a computational method for rapid prediction of thermostable mutants of GPCRs.

Entities: Chemical

Year: 2014 PMID： 25400524 PMCID： PMC4230369 DOI： 10.1021/ct500616v

Source DB: PubMed Journal: J Chem Theory Comput ISSN： 1549-9618 Impact factor: 6.006

Introduction

G protein-coupled receptors (GPCRs) are seven helical transmembrane (TM) proteins that form the largest class of drug targets. Therefore, knowledge of the three-dimensional structure and dynamics of GPCRs will greatly aid drug discovery. However, crystallization of GPCRs has been a challenging process, since these receptors are dynamic and exist in multiple conformations and are often not stable in detergents that are needed for solubilization.[1] One of the two main strategies to crystallize GPCRs is to increase the hydrophilic area of the receptor by either making fusions in cytoplasmic loop 3 to stable soluble proteins such as T4 lysozyme (T4L)[2] or apo-cytochrome b (BRIL),[3] or by binding a receptor-specific heavy chain antibody (nanobody) to the intracellular surface of the receptor.[4] Receptor-T4L and receptor-BRIL fusions have similar stabilities to the wild type receptor and therefore require the binding of a high affinity ligand to both stabilize the receptor and lock it in a single conformational state. In contrast, nanobodies that bind specifically to the activated state of the receptor impart considerable thermostability and also lock the receptor in a single conformation. The alternative strategy to the use of receptor-fusion protein technology to crystallize GPCRs, is to derive thermostabilized mutant receptors by introducing specific point mutations into the wild type receptor sequence.[5] The thermostabilized receptor shows increased stability and is preferentially in a single conformational state, and it can therefore be crystallized bound to ligands that bind even with low affinity. So far several GPCRs have been crystallized using this strategy, including receptors in class A,[6,5b,7,8] class B,[9] and class C.[10] Although deriving thermostable mutants is a proven approach for the structure determination of GPCRs, it remains a challenging task since the mutations are not transferable between receptors unless there is considerable homology between the receptors.[11] The original experimental procedure for thermostabilization was based on Ala/Leu scanning mutagenesis where every residue was mutated to Ala (except for Ala residues that were mutated to Leu), each mutant was expressed and its thermostability was determined relative to the wild type receptor.[5] Once thermostabilizing mutations were identified, they were mutated to other amino acid residues to check if further improvements in thermostability were obtained. The best thermostabilizing mutations were then combined to give the optimally stable mutant. This is a time-consuming process, so if we can understand the structural and kinetic basis for why the mutations were thermostabilizing,[12] then it might be possible to predict which residues in a model of a GPCR structure are thermostabilizing, thus facilitating an accelerated route to thermostabilization and structure determination.[13] Computational approaches in designing thermostable mutants for several proteins are based on bioinformatics techniques to structure based methods involving statistical[14] and semiempirical potentials.[15] Knowledge based potential functions were used to predict thermostable mutations for enzymes.[14b,14a,14c] In these works, the energy functions used in determining mutant stabilities were originally developed for globular protein design. Alchemical free energy calculations in conjunction with molecular dynamics (MD) simulations were used to design thermostable mutants for the bacterial blue light photoreceptor.[15] These methods do not take into account the effects of the membrane environment and are computationally slow. GPCRs are dynamic and they exist in multiple conformations in the absence of agonists and antagonists. Thus, computational methods to predict thermostability need to incorporate protein flexibility in the algorithm. Also, due to the relatively sparse number of GPCR crystal structures, knowledge based potentials are not robust for predicting thermostable GPCR mutants. Recently, Chen et al. used bioinformatics and three-dimensional structural information to design thermostable mutants of the β1AR.[16] They targeted nonconserved polar residues that lack hydrogen bond contacts, and hydrophobic residues with packing defects, and mutated these residues for increased stability. Their method works well for mutations that convert a small amino acid to a larger one. However, given that most experimental strategies for GPCR thermostabilization rely on Ala/Leu scanning mutagenesis, it is unclear whether the above method could predict such mutations. We have developed a rapid screening computational method, LITiConDesign (by extending the computational method LITiCon for GPCR conformational sampling[17]), for predicting single point alanine mutations to thermostabilize GPCRs. Systematic computational alanine scanning on the TM residues was performed on an ensemble of receptor conformations generated using a structural model of the receptor. The stability of the mutant was scored using the sum of torsional energy and van der Waals (vdW) packing energy calculated using an all-atom force field energy function. The preliminary version of this method and validation was published previously (Balaraman et al. 2010). The computations can be distributed over a computing cluster and scanning the entire sequence of a given receptor takes a few hours. We have tested and validated our method using homology models of varying accuracy for about 400 mutations in three GPCRs: β1AR, A2AR, and NTSR1. Our method shows the best predictions for homology models of the target GPCR sequence derived using high homologous template structures. The method identified the most stable thermostable mutants in the top 35% of the predicted mutants.

Methods

Thermostability Score Calculation for Single Point Mutants

The thermostability scores for the single point mutants in the LITiConDesign were calculated as the difference in the total potential energy of the mutant and the wild type receptor. The thermostability score for each mutant was calculated as (EWT–Emut), where EWT is the sum of vdW energy and torsional energy components of the potential energy of the wild type receptor and Emut is the same for the mutant. The potential energy was calculated either for a single structure of the wild type and the mutant receptors or for an ensemble of conformations generated independently for both the wild type and the mutant receptors. The ensemble of receptor conformations was generated using the computational method LITiCon (Ligand Induced Transmembrane Conformational Change). The details of the LITiCon method have been published elsewhere.[17,18] We describe the method briefly as applied here. Starting from an initial receptor structure, all the seven TM helices were simultaneously rotated about the helical axis between ±5° in 10° increment, thus generating 27 = 128 conformations. Next, in each of the 128 conformations thus obtained, every hydrophobic amino acid in the transmembrane (TM) region was mutated to alanine (except alanine, which was mutated to leucine), thus generating 128N conformations (N = number of mutations tested). Side chains of each conformation of each mutant were then optimized using the SCWRL4 side chain optimization program,[19] followed by potential energy minimization in the CHARMM27 all-atom force field[20] for 2000 steps. The minimization and the energy calculation were performed using the program NAMD.[21] The thermostability score used just the vdW energy and the torsional energy components of the total energy. We calculated the thermostability scores for residues in the human adenosine A2A receptor (A2AR), avian β1 adrenergic receptor (β1AR), and rat neurotensin receptor 1 (NTSR1) for which there are experimental stability scores available.[5c,22] We calculated the thermostability scores for the single point mutants of the hydrophobic residues to alanine in the TM regions for these three receptors starting from their respective crystal structures. We also performed the thermostability score calculations starting from the homology models of these receptors, as in the real case scenario one would need to make these predictions from homology models prior to experiments.

Molecular Dynamics Simulations of GPCR Crystal Structures

To understand the effect of lipid packing on the thermostability predictions, we performed molecular dynamics (MD) simulations on the crystal structures and homology models of the wild type of each of β1AR, A2AR, and NTSR1. The thermostability calculations were performed using the last conformation of the MD trajectory as well as from the most populated cluster obtained from clustering analysis of the MD trajectories as detailed below. The starting conformations of β1AR, A2AR, and NTSR1 were obtained from the crystal structures of thermostabilized β1AR (PDB ID: 2VT4),[6b] A2AR (PDB ID: 3PWH),[8] and NTSR1 (PDB ID: 4GRV).[7] Since the crystal structures are those of thermostabilized mutants, the receptor structures were mutated to the corresponding wild type sequences and solvated in explicit palmitoyl-oleoyl-phosphatidylcholine (POPC) lipid, water, and ions. The packing of the lipid molecules was performed using the inflategro package in GROMACS.[23] Each system was equilibrated by performing 200 ps of MD at 310 K using a NVT ensemble followed by 5 ns of MD under NPT conditions at a pressure of 1 bar. The protein and ligand were kept in place during these equilibration steps using position restraints. After equilibration to the expected temperature and pressure, a total of 10 production simulations of up to 100 ns were performed for each initial conformation with different initial velocities using the NVT ensemble. The MD simulations were performed with the GROMOS96 force field[24] using the software package GROMACS, using a 2 fs time step. A cutoff distance of 12 Å for nonbond interactions was introduced, and PME (particle mesh Ewald) method[25] was used for long-range vdW interactions. Principal coordinate analysis (PCA) was performed on each MD trajectory using only the backbone atoms of TM helix 2 through 7. The loops as well as TM1 were found to be highly flexible in the MD simulations and were thus excluded from the PCA. We used the g_covar module of GROMACS to perform the PCA. The receptor conformations from the MD trajectories were then clustered in the principal coordinate space using k-means clustering. The conformation corresponding to the cluster center of the most populated cluster from each trajectory was used to perform thermostability calculations. The GPCR homology models were equilibrated in an explicit lipid environment using the software package NAMD.[21] The receptor structures were solvated in explicit POPC lipid, water, and ions using the VMD software. Each system was equilibrated for 10 ns at 310 K in a NVT ensemble. The thermostability calculations were performed starting from the last frame of the MD trajectories and using LITiCon subsequently to generate the conformational ensemble.

Calculation of Residue Based Stress

To improve the thermostabile mutant predictions, we further analyzed factors other than the enthalpy gain that could contribute to the thermostability of the mutants. From our extensive MD simulation studies to understand the structural basis of thermostability,[12] we identified that residue based stress and the involvement of each residue in allosteric communication pathways of the ligand binding to G protein coupling site to be important. The residue-based stress (forces) is the net force (from both the bonded and nonbonded forces) exerted on each residue from the neighboring residues that are within 3 Å except those residues that are directly bonded. The force computation was performed using the GROMOS96 force field following the procedure described by Stacklies et al.[26] The average stress is the average residue-based stress over the entire MD trajectory for the wild type receptor of β1AR, A2AR, and NTSR1. The procedure is discussed in detail in Niesen et al. 2013.[27]

Calculation of Allosteric Hub Score

Previously, we have developed computational method to calculate the allosteric pathways of communication from the extracellular loops to the ligand binding site to the intracellular G protein coupling regions of the receptor.[28] Using the MD simulation trajectories of the wild type β1AR, A2AR, and NTSR1, we calculated the correlated movement in torsion angles between pairs of distant residues within each receptor. The metric that we used to calculate the dynamic correlation is mutual information (MI) in torsion angles, using 35 bins for the torsion angle distributions.[27] The choice of the number of bins was based on convergence of the entropy values as discussed in Niesen et al.[27] Using graph theoretic methods (Bhattacharya et al. 2014), we calculated allosteric pathways between each pair of residues that showed an above average MI (MI > MIavg) and were farther than 10 Å apart in the receptor structure. We first constructed an undirected graph using inter-residue contacts, where the residues formed nodes and the inter-residue contacts formed the edges of the network. An inter-residue contact was identified if the Cα atoms of the residue pair were within 10 Å of one another. The edge weights were calculated as MImax – MIab, where MImax is the maximum MI among all residue pairs in the receptor and MIab is the MI between the terminal residues of the edge. Moreover, to avoid selecting pathways with weak MI, weights of all edges with MI < MIavg were set to zero. For a given residue pair, the allosteric pathway is defined to be the connecting route between the two residues, which minimizes the number of intermediate nodes and maximizes the sum of edge MIs of the connecting route. The allosteric pathways were calculated using the shortest path algorithm by Dijkstra,[29] as implemented in the Bioinformatics ToolBox in MATLAB. Residues that mediate multiple allosteric pathways were identified as allosteric hubs, where the number of mediated pathways is defined as the “allosteric hub score”. The greater the allosteric hub score of a given residue, the higher is the involvement of that residue in the activity of the receptor, and therefore, mutation of this residue to Ala could be detrimental to receptor activity. Based on this hypothesis, we have analyzed the effect of mutating an allosteric hub on the thermostability.

Receptor Structures Tested for Thermostability Prediction

For β1AR and A2AR, we tested the following GPCR structures for thermostability prediction. (A) Single conformation from the crystal structure, (B) conformational ensemble generated from crystal structure using LITiCon, (C) conformational ensemble generated from crystal structure using MD, (D) single homology model, (E) conformational ensemble generated from homology model using LITiCon. For NTSR1, we tested (A) single conformation from the crystal structure, (B) conformational ensemble generated from crystal structure using LITiCon, (C) single homology model, (D) conformational ensemble generated from homology model using LITiCon.

Results

We have compared the stability score calculated for each alanine mutant to the experimental thermostability measured for single point alanine mutants for β1AR,[5c] A2AR,[22a] and NTSR1.[22b] There are a total of 434 single point mutants for the hydrophobic residues in the TM regions for the three receptors for which we have performed the calculations. The list of the residues tested is given in the Supporting Information Table S1. The experimental thermostability was measured by heating the detergent-solubilized mutant to an elevated temperature (∼28–32 °C), cooling to 0 °C, and the amount of correctly folded receptor was determined by a radioligand (either an agonist or antagonist or both) binding assay. In this work, mutants that showed 130% or higher ligand binding compared to the wild type receptor are defined as thermostable. For the β1AR and A2AR receptors, the mutants were heated without any ligand present. In the case of NTSR1, two types of experiments were performed, one where the mutants were heated in the presence of the agonist neurotensin, and the other experiment heated the receptor in the absence of neurotensin. The former assay is termed as +NT and the latter −NT. Under the +NT conditions, NTSR1 is assumed to be in a conformation similar to the structure determined by X-ray crystallography, that is, an active-like state. The −NT thermostable mutants are probably stabilized in an inactive state as wild-type NTSR1 shows no appreciable constitutive activity.[7,30]

Thermostabilities of Single Alanine Mutants Correlate with Enthalpy

Figure 1 shows the performance of the calculated thermostability scores starting from the crystal structures of β1AR, A2AR, and NTSR1 (PDB ID: 2VT4, 3EML, 4GRV, respectively). We have compared the predictability of thermostable mutants starting from the crystal structure (single conformation) to the predictability calculated from an ensemble of conformations generated using the crystal structure in Figure 1. As a measure of predictability of thermostability using the calculated stability score, we constructed the Receiver Operational Characteristic (ROC) curves by plotting the true positive rate against the false positive rate for different cutoffs of the calculated stability score. Parts a, c, and e of Figure 1 show the ROC curves for β1AR, A2AR, and NTSR1, respectively, starting from crystal structures (single conformation) and from ensemble generated starting from the crystal structure using LITiCon. The straight lines in these figures, also termed the “random line”, represent the ROC curve for zero predictability. For all the receptors, the ROC curves are well above the random line, and the predictability improves when using an ensemble of conformations generated by LITiCon compared to using single conformation from crystal structure. This indicates that small variations in conformation upon single point mutations are important for more accurate thermostability prediction compared to using the crystal structures alone. To assess the number of single point mutation experiments that can be reduced by using these predictions, we plotted the enrichment factor as a function of cutoff in the number of mutations for β1AR, A2AR, and NTSR1 in Figure 1b, d, and f, respectively. We have also highlighted the individual thermostable mutations at the cut-offs where they were identified. The range of thermostability values used for defining strong, medium, and weak thermostable mutants for the three receptors are shown in Table S2 of the Supporting Information. The cutoff values are different for each receptor, due to the difference in thermostability of the wild type receptor under the experimental assay conditions. For all three receptors, the left side of the enrichment plots (Figure 1b, d, and f) shows a higher concentration of red and yellow mutations. Six out of nine experimentally known medium and strong thermostable mutants will be recovered within top 50 predicted thermostable mutants for β1AR, five out of seven for A2A receptor and three out of four for NTSR1. This suggests that in the LITiConDesign method of thermostability prediction, the strong thermostable mutations are selectively enriched over the weaker mutations. However, the method also missed several strong thermostabilizing mutations for each receptor (the red and yellow dots on the right side of Figures 1b, d, and f). These mutations were missed in all receptor models including the crystal structures and homology models. We have analyzed the reasons for missing these mutations in the discussion section.

Figure 1

The LITiCon method used for generating conformation ensemble takes one-hundredth of the computational time required for MD simulations. To evaluate the effectiveness of the conformation ensemble generated from LITiCon in predicting thermostable mutations, in comparison to the ensemble obtained from long time scale MD simulations, we compared the predictions using the stability scores calculated using two different schemes; (1) the ensemble generated by LITiCon starting from the crystal structures and (2) structural ensemble generated using MD simulations starting from the crystal structure. For details on selecting the MD conformations, please refer to the Methods section. Parts a and b of Figure 2 show a comparison of the ROC curves from the LITiCon ensemble to that from structural ensembles from MD simulations for β1AR and A2AR. For β1AR, the LITiCon using the most populated MD conformation performed the best, whereas both the crystal structure LITiCon and structural ensembles from MD simulation showed similar performance. In the case of A2AR, LITiCon using the crystal structure performed the best, whereas the MD ensemble showed the worst predictability. Overall, these results indicate that the structural ensembles generated by LITiCon perform equally or better than conformational ensembles generated using computationally expensive MD simulations.

Figure 2

Comparison of ROC curves for thermostability prediction using protein structural ensemble generated from (1) LITiCon ensemble generated starting from crystal structure and (2) representative conformations from MD of crystal structure; (a) β1AR (LITiCon AUC: 0.67. MD ensemble AUC: 0.64); (b) A2AR (LITiCon AUC: 0.64. MD ensemble AUC: 0.62).

ROC curves for thermostability prediction using an ensemble of receptor structures compared to single receptor structure. (a) β1AR (Ensemble AUC: 0.67. Single structure AUC: 0.56); (c) A2AR (Ensemble AUC: 0.64. Single structure AUC: 0.62); (e) NTSR1 (Ensemble AUC: 0.64. Single structure AUC: 0.59). Enrichment as a function of cutoff using (b) β1AR crystal structure; (d) A2AR crystal structure; (f) NTSR1 crystal structure. The mutants in the order of their measured experimental stability are shown in colored dots. Red, high thermostability; yellow, medium thermostability; and empty circles, weak thermostability. Comparison of ROC curves for thermostability prediction using protein structural ensemble generated from (1) LITiCon ensemble generated starting from crystal structure and (2) representative conformations from MD of crystal structure; (a) β1AR (LITiCon AUC: 0.67. MD ensemble AUC: 0.64); (b) A2AR (LITiCon AUC: 0.64. MD ensemble AUC: 0.62).

Performance of GPCR Homology Models in Predicting Thermostable Alanine Mutations

In the last section, we showed that the LITiConDesign method showed good predictability for thermostable mutations starting from GPCR crystal structures. However, the real case scenario is that the crystal structure of the GPCR to be thermostabilized will not be known. In this section, we discuss the thermostability prediction results derived using homology models for β1AR, A2AR, and NTSR1 without using any crystal structure information on these receptors. Figure 3 shows a comparison of the enrichment as a function of number of mutants for several homology models of β1AR, A2AR, and NTSR1. To observe how the accuracy of the homology model affects the predictability of thermostability, we selected several template structures of varying sequence similarity to β1AR, namely β2AR, dopamine receptor D3DR (67% and 40% sequence similarity), A2AR (33% sequence similarity), and CXCR4 (24% sequence similarity). For A2AR and NTSR1, we evaluated one homology model each, using β1AR and β2AR as templates, respectively. Unlike β1AR, A2AR and NTSR1 have no close template crystal structures. Therefore, we wanted to test how generic templates such as β1AR or β2AR perform in thermostability prediction. Since the NTSR1 homology model was based on the inactive state of β2AR, we evaluated its predictability using the −NT data. In contrast, for evaluating thermostability prediction using the NTSR1 crystal structure, we had used the +NT data, since the crystal structure shows active state characteristics. We also plotted the enrichments corresponding to 50% cutoff for each model against the Cα RMSD in the coordinates of the homology models from the respective crystal structures in Figure 4. While the β1AR models based on β2AR (RMSD of the β1AR to its crystal structure is 1.6 Å) and D3DR (RMSD 2.5 Å) performed the best among homology models, the CXCR4 based model (RMSD of the β1AR model to its crystal structure is 3.1 Å) performed the worst because of its distant sequence and structural homology with β1AR. For A2AR, the performance of the β1AR based homology model is worse compared to the A2AR crystal structure (65% vs 70%). Unlike β1AR, there are no other templates that are closer in sequence similarity to A2AR, for which crystal structures are available (e.g., adenosine receptors other than A2AR). Therefore, we could not test the effect of template closeness on prediction performance for A2AR. The RMSD in coordinates of the various homology models from their respective crystal structures and their percentage recovery of the thermostable mutants are tabulated in Table S3 of the Supporting Information. Overall, we find that for homology models that are within 2.5 Å (RMSD of the Cα atoms in the TM region) to their respective crystal structures, there is little change in performance of the thermostability predictions. The performance of the thermostability prediction is worse if the homology model has RMSD above 3 Å to its crystal structure.

Figure 3

Enrichment as a function of cutoff using (a) homology models of β1AR based on β2AR crystal structure as template; (b) β1AR model based on D3DR crystal structure as template; (c) β1AR model based on A2AR crystal structure as template; (d) β1AR model based on CXCR4 crystal structure as template; (e) A2AR model based on β1AR; (f) NTSR1 model based on β2AR structure as template. The experimentally determined thermostable mutants are highlighted as colored dots: red, high thermostability; yellow, medium thermostability; blue open circles, weak thermostability.

Figure 4

Comparison of enrichments for different receptor models. The nomenclatures are as follows: β1AR homo-D3DR implies β1AR homology model using D3DR as template; A2AR homo-β1AR implies A2AR homology model based on β1AR as template; the other models are named accordingly. The RMSD of the homology models from their respective crystal structures are tabulated in Supporting Information Table S3.

For each homology model, we have compared the performance of using LITiCon ensembles of receptor conformations to that using single receptor structures. For each receptor model, we have calculated the area under the ROC curve (AUC) for the thermostability predictions using LITiCon as well as single receptor structure. AUC is directly proportional to the predictability of the computation scheme, and for computation schemes with zero predictability, the AUC is close to 0.5. Figure 5 shows the comparison of AUC for LITiCon generated ensembles versus single receptor structure for each homology model of β1AR, A2AR, and NTSR1. Similar to the crystal structures, most of the homology models show moderate to large improvement in predictability by using ensembles of receptor conformations compared to single receptor structures. The improvement in performance is more prominent in the high accuracy β1AR models that were derived using close homologue template structures such as β2AR and D3DR. For homology models of β1AR derived using distant templates such as A2AR and CXCR4, the performance advantage of LITiCon over single receptor structure is negligible.

Figure 5

Comparison of thermostability prediction using different receptor structures. For performance comparison, the metric AUC (area under ROC curve) is used. For positive predictability, the AUC varies between 0.5 and 1. The higher the AUC, the greater is the predictability. The darker color bars are the AUC calculated using the LITiCon ensemble of structures, and the lighter colored bars have been calculated using single conformation either from crystal structures or homology models as indicated on the x-axis below each set of bars. In the x-axis, “crystal structure” refers to the crystal structure of the corresponding receptor. The blue bars that are labeled “homology β2AR” refers to the homology model of β1AR derived from the β2AR inactive state crystal structure as template.

For A2AR, the performance gain by using LITiCon ensembles as opposed to single crystal structure is subtle when comparing the area under the curve, although the enrichment of true positives is higher for the ensembles as seen in Figure 1c. The performance gain is modest in using the LITiCon ensemble for A2AR homology model based on β1AR as template. For NTSR1, the homology model based on β2AR showed a large improvement in predictability using the LITiCon ensemble method, as compared to using a single homology model structure. The AUC values for each structural model of the three receptors are given in Supporting Information Table S4. We have also tested the predictions for NTSR1 against both sets of experimental thermostability, the −NT and the +NT data. While using the crystal structure of NTSR1 showed better predictability for the +NT data, the homology model of NTSR1 based on the inactive state β2AR structure as template showed better predictability for the −NT data than the +NT data. This can be explained by the fact that the crystal structure of NTSR1 with neurotensin bound is in an active-like state, and hence agrees better with the +NT data. The inactive homology model agrees better with the −NT data, since the mutants that were thermostabilized under the −NT conditions were more likely to stabilize the inactive state. Enrichment as a function of cutoff using (a) homology models of β1AR based on β2AR crystal structure as template; (b) β1AR model based on D3DR crystal structure as template; (c) β1AR model based on A2AR crystal structure as template; (d) β1AR model based on CXCR4 crystal structure as template; (e) A2AR model based on β1AR; (f) NTSR1 model based on β2AR structure as template. The experimentally determined thermostable mutants are highlighted as colored dots: red, high thermostability; yellow, medium thermostability; blue open circles, weak thermostability. Comparison of enrichments for different receptor models. The nomenclatures are as follows: β1AR homo-D3DR implies β1AR homology model using D3DR as template; A2AR homo-β1AR implies A2AR homology model based on β1AR as template; the other models are named accordingly. The RMSD of the homology models from their respective crystal structures are tabulated in Supporting Information Table S3. Comparison of thermostability prediction using different receptor structures. For performance comparison, the metric AUC (area under ROC curve) is used. For positive predictability, the AUC varies between 0.5 and 1. The higher the AUC, the greater is the predictability. The darker color bars are the AUC calculated using the LITiCon ensemble of structures, and the lighter colored bars have been calculated using single conformation either from crystal structures or homology models as indicated on the x-axis below each set of bars. In the x-axis, “crystal structure” refers to the crystal structure of the corresponding receptor. The blue bars that are labeled “homology β2AR” refers to the homology model of β1AR derived from the β2AR inactive state crystal structure as template.

Other Receptor Properties That Improve the Thermostability Scores

We examined the properties other than enthalpy that can be used to improve the prediction of thermostability scores. Our previous results from extensive MD simulations on crystal structures of thermostable mutants and their respective wild type receptor structures of β1AR, and A2AR[12] have shown two important properties that are different in the wild type receptor compared to the thermostable mutants. In our earlier studies, we observed that the net stress (or force) on each residue played an important role in the function of the receptor as well as its thermostability. We have also calculated the allosteric communication pipelines in GPCRs that communicate the ligand binding to the G protein coupling region using mutual information on torsion angles.[28] We observed that many residues in the TM regions of the receptor mediate multiple allosteric communication pathways, and we define the number of allosteric communication pathways going through a given residue as its allosteric hub score. In this paper, besides the enthalpy, we have investigated the effect of using the net stress or force on each residue and allosteric hub score in thermostability predictions. Parts a and c of Figure 6 show the correlation between the allosteric hub scores for each residue in β1AR and A2AR crystal structures and the experimental thermostability score of the corresponding alanine mutant. We observe an inverse correlation between the allosteric hub score and the thermostability. Residues that are the strongest allosteric hubs (allosteric hub score >40, highlighted in red) show poor thermostability scores upon mutation.

Figure 6

Allosteric hub score and local stress are plotted against thermostability for each residue position in (a, b) β1AR and (c, d) A2AR. Residues on the right side of the dotted vertical line are considered thermostable. Residues that have high allosteric hub score or stress and poor thermostability are highlighted in red. The overall inverse correlation between stress and thermostability is shown by the regression lines in b and d.

Parts b and d of Figure 6 show plots of the stress on each residue versus the measured thermostability score for β1AR and A2AR, respectively. Residues with strong internal stress are highlighted in red. All of these residues show weak thermostability (less than that of wild type) when mutated to alanine. Mutating a residue with high stress to alanine reduces the thermostability of the mutant receptor. Our earlier MD simulation studies on multiple mutant thermostable receptors of β1AR, A2AR, and NTSR1 showed that the stress on each residue is reduced in the thermostable mutant compared to their respective wild type receptors. This, however, resulted from multiple mutations made on low stress residue positions in the wild type receptor. Parts b and d of Figure 6 show the Pearson’s correlation coefficient for β1AR and A2AR for stress vs thermostability score that showed a weak inverse correlation between residue stress and thermostability. For both cases, the correlation coefficient is negative, indicating the inverse relationship. The trend line obtained using least-squares fitting is also shown in the same figures. Allosteric hub score and local stress are plotted against thermostability for each residue position in (a, b) β1AR and (c, d) A2AR. Residues on the right side of the dotted vertical line are considered thermostable. Residues that have high allosteric hub score or stress and poor thermostability are highlighted in red. The overall inverse correlation between stress and thermostability is shown by the regression lines in b and d.

Combining Allosteric Communication and Stress Information with Enthalpy Score Improves Thermostability Prediction

To study the feasibility of using stress or allosteric communication in predicting thermostability, we plotted the normal distance from the random line (ndis) as a function of allosteric hub score and stress cut-offs, as shown in Supporting Information Figure S2. The normal distance, ndis, is a measure of predictability of thermostable mutants and is explained in Text S1 and Figure S1 of Supporting Information. Parts a and b of Figure S2 show the predictability when no enthalpy score cutoff is used. Parts c and d of Figure S2 show the predictability among mutants that have enthalpy scores above a cutoff of −1 kcal/mol (optimal cutoff for both receptors for maximizing TPR and minimizing FPR). The black and dark blue regions in Supporting Information Figure S2 show no predictability, whereas the yellow and red areas show the highest predictability. For both β1AR and A2AR, an allosteric hub score of 40 and internal stress of 7000 pN were found to be the optimal cut-offs for predicting thermostability, as indicated by the maxima in Figure S2 (red circle). Thus, combining stress and allosteric hub score with the stability score from enthalpy improves thermostability prediction as shown by the increased red region in Figure S2b and d. For A2AR, the improvement observed by adding allosteric hub and stress information over enthalpy score was more significant than for β1AR. We calculated the enrichments for different percent cutoffs using enthalpy score alone and including allosteric hub and stress information for both β1AR and A2AR, as shown in Figure 7. For all three metrics, the optimal score cut-offs were used as mentioned before. For both β1AR and A2AR, we find modest improvement in enrichment for lower cut-offs, while including allosteric hub and stress information. For β1AR, the improvement was found to be the highest at a cutoff of 35%, while for A2AR, the maximum enrichment was obtained at 20% cutoff.

Figure 7

Comparison of thermostability prediction using only enthalpy based score and by combining with allosteric hub score and residue based stress for (a) β1AR and (b) A2AR. The percent recovery of thermostable positives by screening different cut-offs of residue mutations are compared.

Discussion

The stability score reflects the average enthalpy of an ensemble of conformations of a mutant near the starting structure. Mutants that have a high thermostability score show favorable enthalpy for most of the conformations in the ensemble, whereas the mutants with low thermostability scores have conformations with unfavorable enthalpy in the ensemble. This means that thermostable mutants can maintain favorable enthalpy despite small structural perturbations, unlike thermally unstable mutants. Thus, thermostable mutants are more resistant to thermal fluctuations, and this could explain their stability. We have tested the performance and sensitivity of the LITiConDesign method by using homology models of varying resolution (closeness to the crystal structure). We find that the homology models derived using close homologues as templates (templates with low RMSD to the target crystal structure) perform better than the ones based on distant templates. Parts a and bo of Figure 8 show the performance of the different homology models of β1AR and A2AR, respectively. Parts e and f of Figure 8 show the performance of the NTSR1 models. For the +NT data, we used the crystal structure of active NTSR1, while for the −NT data, we used the homology model of inactive NSTR1 based on the β2AR crystal structure. We calculated the number of true thermostable mutants that are recovered in the top 50% when sorted by calculated thermostability scores. In Figure 8a and b, for each model, the residues that are correctly identified are highlighted in orange. As expected, for β1AR, the crystal structure, and the homology model of β1AR based on β2AR and D3DR performed better than the distant A2AR and CXCR4 based models. Residues that are correctly identified by most of the structures are colored green, the ones that are identified by only the close template models and crystal structures are colored yellow, while the ones that are not identified by any of the models are colored red. For the NTSR1 models (Figure 8e and f), residues that are correctly identified by the crystal structure or homology model are colored green, and the ones that are not identified are colored red. These residues are also highlighted in Figure 8c, d, g, and h. For β1AR, the three mutants (I551.46A, V902.47A, I1293.40A) that are not identified by any of the models are all located in tightly packed regions of the receptor, facing the core of the TM domain. In contrast, the residues that are successfully predicted by our method are located in loosely packed regions of the receptor (near the extracellular or intracellular termini of the TM helices) or facing the lipid bilayer. For the residues located in tightly packed regions of the receptor, the side chain optimization methods are inadequate to account for the repacking of the side chains upon mutation. This could be a possible reason for the failure of our method in not identifying certain residues. Also, the lack of electrostatic component in our energy function could contribute to the failure of this method in identifying certain mutations. The electrostatic energy could be important for mutations in tightly packed regions of the receptor that are in the neighborhood of polar residues. It is difficult to accurately estimate the electrostatic energy, especially in a hybrid environment such as the lipid bilayer. We have excluded the electrostatic component from our scoring function, since adding the electrostatic energy worsened the predictability of the thermostabilizing mutations. However, specialized energy functions for hydrogen bonds/salt bridges could be developed to model the effect of the polar residues in modulating receptor stability. Such energy functions are currently under development in our lab.

Figure 8

(A, B; E, F) Thermostable residues that were identified using different GPCR structures: (A) β1AR; (B) A2AR; (E) active-like state crystal structure of NTSR1; (F) homology model of inactive NTSR1 based on β2AR. Experimental thermostability scores have been normalized so that the experimental value for the wild type receptor is 50%. If a residue was correctly identified by a particular model, the cell corresponding to that model is colored orange; the residues that were identified by all models are colored green, ones that were identified only by the crystal structures or close template homology models are colored yellow; those that were not identified by any of the models are colored red. (C, D; G, H) crystal structures of (C) β1AR, (D) A2AR, (G) NTSR1 +NT, and (H) NTSR1 −NT showing the thermostable mutation positions. +NT and −NT refer to NTSR1 mutants that are thermostable in presence and absence of neurotensin, respectively.

In the case of A2AR (Figure 8d), the residues that were correctly predicted were all located in less tightly packed regions of the receptor, either in the extracellular or intracellular ends of the TM helices, or facing the lipid bilayer. It is not clear, however, why the three residues C773.25A, L853.33A, and L1925.53A could not be identified. In the case of NTSR1 bound to neurotensin (+NT mutants; Figure 8g), our method failed to identify several residues that are located mainly at the intracellular part of the receptor on TM5 and TM6. The residues that were correctly identified are on TM1 and TM7. Although it is not clear why the residues on TM5 and TM6 were not identified, it is possible that the limited conformational sampling performed by LITiCon failed to take into account the flexibility of TM5 and TM6 in the active state of NTSR1. These two helices are thought to undergo the largest conformational change during activation and are thus expected to be dynamic in the active state of the receptor. In contrast to the +NT mutants, the −NT mutants showed improved predictability using LITiConDesign, as shown in Figure 8h. Out of the two mutations on TM2, A1202.57L was successfully identified, while I1162.53A, located one helical turn below A1202.57, was not. Being in the middle of the TM2 region, I1162.53 is more tightly packed compared to A1202.57 and hence could not be identified due to difficulty in predicting side-chain packing. However, it is not clear why the other two mutations, L721.40A and L2054.61A (both of them face the lipid bilayer) could not be identified. We also tested whether the mutation positions that were combined to produce the thermostable mutants that were crystallized (with multiple mutations) for both receptors (i.e., m23 mutant of β1AR and StaR2 mutant of A2AR) could be identified by our method. Out of the four hydrophobic mutations that are part of the m23 mutant, only V902.53 was not identified. Among the StaR2 mutations, all of them were successfully identified. The crystallizing mutation positions along with their TM scores and thermostability (single point alanine mutation) are highlighted in Figure S3 of the Supporting Information. (A, B; E, F) Thermostable residues that were identified using different GPCR structures: (A) β1AR; (B) A2AR; (E) active-like state crystal structure of NTSR1; (F) homology model of inactive NTSR1 based on β2AR. Experimental thermostability scores have been normalized so that the experimental value for the wild type receptor is 50%. If a residue was correctly identified by a particular model, the cell corresponding to that model is colored orange; the residues that were identified by all models are colored green, ones that were identified only by the crystal structures or close template homology models are colored yellow; those that were not identified by any of the models are colored red. (C, D; G, H) crystal structures of (C) β1AR, (D) A2AR, (G) NTSR1 +NT, and (H) NTSR1 −NT showing the thermostable mutation positions. +NT and −NT refer to NTSR1 mutants that are thermostable in presence and absence of neurotensin, respectively. In both β1AR and A2AR, mutating allosteric hubs lead to poor thermostability, and therefore, these positions can be eliminated from potential positions to mutate. Using MD simulations, we mapped the major allosteric communication pathways in the inactive states of β1AR and A2AR. These pathways connect distant receptor domains that show correlated motion. In GPCRs, concerted motion of distant domains help to preserve the structural stability of specific functional states such as the inactive state. When the correlated movement between distant domains in the inactive state is disrupted, by mutating allosteric hubs, the inactive state could be destabilized. We recently showed that in β2AR, mutating allosteric hubs that mediate allosteric communication in the inactive state led to increased constitutive or agonist induced activity. On the other hand, mutating hub residues that mediate allosteric communication in the active state lead to reduced activity.[28] Thus, mutating the allosteric hubs in β1AR and A2AR could destabilize the inactive state leading to decrease in thermostability.

Conclusions

The LITiConDesign method is a computational method for rapid prediction of thermostable mutation positions in a given GPCR or any helical membrane protein. It is based on generating an ensemble of conformations generated using the conformational sampling method called LITiCon for GPCRs. We have analyzed the performance of the computational method LITiConDesign in predicting thermostabilizing mutations for GPCRs. We have compared the calculated thermostability scores to the experimental values for 434 single mutants from three class A GPCRs, β1AR, A2AR, and NTSR1. We found that mutations that have a high thermostability score are more probable to be thermostable experimentally. We also showed that using an ensemble of conformations was a better choice for predicting thermostability compared to a single structure. We observed that the ensemble of conformations from LITiCon performs similarly in predicting the thermostability as the ensembles from MD simulations. However, the LITiCon method takes 1/10th the computational time as the MD method. Typically, scanning a GPCR with 140 mutants takes 10 h using 32 Intel Xeon CPUs. Therefore, it can be very useful for fast predictions of thermostable single point mutants. We identified two important properties obtained from MD simulations of GPCR structures, which can be used to improve thermostability prediction. Using microseconds of MD simulations of the GPCR crystal structures, we calculated the net force (stress) on each residue in the receptor structures. We also identified the residues that communicate multiple allosteric pipelines (allosteric hubs). In both β1AR and A2AR, mutating allosteric hubs lead to poor thermostability, and therefore, these positions can be eliminated from potential positions to mutate. We observed that mutating residues with high level of stress in the wild type receptors to alanine does not lead to thermostability of these single point mutants, showing a weak inverse correlation between thermostability and stress. These observations hold true for single point mutations to alanine and not to other amino acids. Thus, including the residue based stress and allosteric hub information improved the prediction of the thermostable mutation positions in GPCRs. Our method for predicting thermostability lays the groundwork for computational design of thermostabilizing mutants of GPCRs in future.

36 in total

Review 1. A crystal clear solution for determining G-protein-coupled receptor structures.

Authors: Christopher G Tate
Journal: Trends Biochem Sci Date: 2012-07-10 Impact factor: 13.807

2. Structural insights into conformational stability of wild-type and mutant beta1-adrenergic receptor.

Authors: Gouthaman S Balaraman; Supriyo Bhattacharya; Nagarajan Vaidehi
Journal: Biophys J Date: 2010-07-21 Impact factor: 4.033

Review 3. CHARMM: the biomolecular simulation program.

Authors: B R Brooks; C L Brooks; A D Mackerell; L Nilsson; R J Petrella; B Roux; Y Won; G Archontis; C Bartels; S Boresch; A Caflisch; L Caves; Q Cui; A R Dinner; M Feig; S Fischer; J Gao; M Hodoscek; W Im; K Kuczera; T Lazaridis; J Ma; V Ovchinnikov; E Paci; R W Pastor; C B Post; J Z Pu; M Schaefer; B Tidor; R M Venable; H L Woodcock; X Wu; W Yang; D M York; M Karplus
Journal: J Comput Chem Date: 2009-07-30 Impact factor: 3.376

4. GROMACS 4.5: a high-throughput and highly parallel open source molecular simulation toolkit.

Authors: Sander Pronk; Szilárd Páll; Roland Schulz; Per Larsson; Pär Bjelkmar; Rossen Apostolov; Michael R Shirts; Jeremy C Smith; Peter M Kasson; David van der Spoel; Berk Hess; Erik Lindahl
Journal: Bioinformatics Date: 2013-02-13 Impact factor: 6.937

5. Structure of a class C GPCR metabotropic glutamate receptor 1 bound to an allosteric modulator.

Authors: Huixian Wu; Chong Wang; Karen J Gregory; Gye Won Han; Hyekyung P Cho; Yan Xia; Colleen M Niswender; Vsevolod Katritch; Jens Meiler; Vadim Cherezov; P Jeffrey Conn; Raymond C Stevens
Journal: Science Date: 2014-03-06 Impact factor: 47.728

6. Constitutive activation of the neurotensin receptor 1 by mutation of Phe(358) in Helix seven.

Authors: Séverine Barroso; Françoise Richard; Delphine Nicolas-Ethève; Patrick Kitabgi; Catherine Labbé-Jullié
Journal: Br J Pharmacol Date: 2002-02 Impact factor: 8.739

Review 7. Nanobody stabilization of G protein-coupled receptor conformational states.

Authors: Jan Steyaert; Brian K Kobilka
Journal: Curr Opin Struct Biol Date: 2011-07-21 Impact factor: 6.809

8. Ligand-stabilized conformational states of human beta(2) adrenergic receptor: insight into G-protein-coupled receptor activation.

Authors: Supriyo Bhattacharya; Spencer E Hall; Hubert Li; Nagarajan Vaidehi
Journal: Biophys J Date: 2007-12-07 Impact factor: 4.033

9. Crystal structures of a stabilized β1-adrenoceptor bound to the biased agonists bucindolol and carvedilol.

Authors: Tony Warne; Patricia C Edwards; Andrew G W Leslie; Christopher G Tate
Journal: Structure Date: 2012-05-09 Impact factor: 5.006

10. Engineering a more thermostable blue light photo receptor Bacillus subtilis YtvA LOV domain by a computer aided rational design method.

Authors: Xiangfei Song; Yefei Wang; Zhiyu Shu; Jingbo Hong; Tong Li; Lishan Yao
Journal: PLoS Comput Biol Date: 2013-07-04 Impact factor: 4.475

13 in total

1. Rapid Bioinformatic Identification of Thermostabilizing Mutations.

Authors: David B Sauer; Nathan K Karpowich; Jin Mei Song; Da-Neng Wang
Journal: Biophys J Date: 2015-10-06 Impact factor: 4.033

2. Prediction of Conformation Specific Thermostabilizing Mutations for Class A G Protein-Coupled Receptors.

Authors: Suvamay Jana; Soumadwip Ghosh; Sanychen Muk; Benjamin Levy; Nagarajan Vaidehi
Journal: J Chem Inf Model Date: 2019-08-27 Impact factor: 4.956

Review 3. Theoretical identification of thermostabilizing amino acid mutations for G-protein-coupled receptors.

Authors: Takeshi Murata; Satoshi Yasuda; Tomohiko Hayashi; Masahiro Kinoshita
Journal: Biophys Rev Date: 2020-04-08

4. Machine Learning for Prioritization of Thermostabilizing Mutations for G-Protein Coupled Receptors.

Authors: Sanychen Muk; Soumadwip Ghosh; Srisairam Achuthan; Xiaomin Chen; XiaoJie Yao; Manbir Sandhu; Matthew C Griffor; Kimberly F Fennell; Ye Che; Veerabahu Shanmugasundaram; Xiayang Qiu; Christopher G Tate; Nagarajan Vaidehi
Journal: Biophys J Date: 2019-10-24 Impact factor: 4.033

5. Engineering Salt Bridge Networks between Transmembrane Helices Confers Thermostability in G-Protein-Coupled Receptors.

Authors: Soumadwip Ghosh; Tobias Bierig; Sangbae Lee; Suvamay Jana; Adelheid Löhle; Gisela Schnapp; Christofer S Tautermann; Nagarajan Vaidehi
Journal: J Chem Theory Comput Date: 2018-11-06 Impact factor: 6.006