Ground-State Destabilization by Active-Site Hydrophobicity Controls the Selectivity of a Cofactor-Free Decarboxylase.

Michal Biler¹, Rory M Crean¹, Anna K Schweiger², Robert Kourist², Shina Caroline Lynn Kamerlin¹.

Abstract

Bacterial arylmalonate decarboxylase (AMDase) and evolved variants have become a valuable tool with which to access both enantiomers of a broad range of chiral arylaliphatic acids with high optical purity. Yet, the molecular principles responsible for the substrate scope, activity, and selectivity of this enzyme are only poorly understood to date, greatly hampering the predictability and design of improved enzyme variants for specific applications. In this work, empirical valence bond and metadynamics simulations were performed on wild-type AMDase and variants thereof to obtain a better understanding of the underlying molecular processes determining reaction outcome. Our results clearly reproduce the experimentally observed substrate scope and support a mechanism driven by ground-state destabilization of the carboxylate group being cleaved by the enzyme. In addition, our results indicate that, in the case of the nonconverted or poorly converted substrates studied in this work, increased solvent exposure of the active site upon binding of these substrates can disturb the vulnerable network of interactions responsible for facilitating the AMDase-catalyzed cleavage of CO2. Finally, our results indicate a switch from preferential cleavage of the pro-(R) to the pro-(S) carboxylate group in the CLG-IPL variant of AMDase for all substrates studied. This appears to be due to the emergence of a new hydrophobic pocket generated by the insertion of the six amino acid substitutions, into which the pro-(S) carboxylate binds. Our results allow insight into the tight interaction network determining AMDase selectivity, which in turn provides guidance for the identification of target residues for future enzyme engineering.

Year: 2020; PMID: 33180505; PMCID: PMC7735706; DOI: 10.1021/jacs.0c10701

Source DB: PubMed; Journal: J Am Chem Soc; ISSN: 0002-7863; Impact factor: 15.419

Introduction

Enzymatic catalysis of the formation and breaking of C–C bonds is currently receiving increasing attention.[1] In this context, enzymatic decarboxylation in particular has become highly attractive for the synthesis of optically pure building blocks[2] and the synthesis of alkenes[1,3−5] and alkanes from biobased precursors.[6] The release of gaseous CO2 renders decarboxylases quasi-irreversible, which has been exploited to drive numerous enzymatic cascade reactions.[7−11] In general, enzymatic decarboxylation can proceed in both an oxidative[4] and a nonoxidative[1] manner. Most nonoxidative decarboxylases employ organic cofactors such as pyridoxyl phosphate, thiamine diphosphate, or an N-terminal pyruvyl group as electron sinks to accommodate the intermediary charge after cleavage of carbon dioxide. Interestingly, three different types of cofactor-independent decarboxylases use substrate-assisted catalysis and thus have the ability to cleave C–C bonds without an internal electron sink. With its highly unusual mechanism, orotidine-5′-phosphate decarboxylase has emerged as a model to study enzymes using ground-state destabilization as a catalytic principle.[12] Among several discussed mechanisms, one uses a so-called “Circe”-effect, in which binding of the phosphate group accommodates the substrate in a binding mode where unfavorable interactions lead to cleavage of a carboxylate group of the substrate. In this vein, the mechanism of phenolic acid decarboxylase (PAD) has been suggested to proceed via a quinone methide intermediate formed by protonation of the substrate double bond.[3] This explicitly requires hydrogen bonding of the p-hydroxy group of the substrate with two tyrosine residues. In both cases, the involvement of functional groups of the substrate strictly limits the substrate scope. 
For instance, PAD decarboxylates differently substituted cinnamic acid derivatives, but all substrates must bear a p-hydroxy group.[1,13] Bacterial arylmalonate decarboxylase from Bordetella bronchiseptica (AMDase, EC 4.1.1.76) was discovered by the Ohta group in the early 1990s, on the basis of a functional screen.[14,15] AMDase catalyzes the stereospecific decarboxylation of α-disubstituted malonic acids, resulting in pure enantiomers of the respective monoacids (Scheme ). While the acid-catalyzed decarboxylation of prochiral arylmalonates forms racemic product, AMDase catalyzes this reaction stereoselectively. Due to its outstanding stereoselectivity, AMDase has been utilized for the synthesis of a wide range of α-chiral carboxylic acids,[14] including several α-arylpropionates with pharmaceutical activity, such as naproxen[16,17] and flurbiprofen,[18−20] α-hydroxy and α-amino acids,[21] and α-heterocyclic[22] and α-alkenyl[23] propionates. Furthermore, combination with metal-catalyzed reduction allows for the synthesis of optically pure α-alkyl propionates.[9]

Scheme 1

Reaction Mechanism of Wild-Type AMDase and Its Variants with Inverted Enantioselectivity (When Introducing the G74C Substitution, i.e., Swapping the Catalytic Cysteine from Position 188 to Position 74) and Promiscuous Racemic Activity (When Introducing/Maintaining Cysteines at Both Positions 74 and 188 Simultaneously)

The pro-(R) carboxylate is shown in black, and the pro-(S) carboxylate in red.

Initial studies of AMDase, performed in the absence of a crystal structure, showed that it requires a substituent with a delocalized π-electron system,[15] which can be provided either by an aromatic group or an alkene. The smaller substituent can be a hydrogen or fluorine atom, a methyl group, or an amino or hydroxy group; larger substituents such as an ethyl group are not accepted.[2,15] Several AMDases have been isolated from different bacteria.[24−27] All show strict preference for the formation of the (R)-enantiomers. Using both enantiomers of pseudochiral 13C-labeled malonates, it was shown that AMDase exclusively cleaves the pro-(R)-carboxylate.[28] Following from this, the elucidation of several structures of AMDase in both its unliganded and ligand-bound forms[23,29−31] revealed the presence of two binding pockets in the active site. While the first contains several hydrogen-bond donors, the second is mostly composed of hydrophobic residues. Micklefield and co-workers suggested a mechanism that proceeds in two steps: (1) Binding of the pro-(S)-carboxylate in the former pocket, stabilized by several H-bonds, pushes the pro-(R)-carboxylate into a configuration with very unfavorable interactions in the hydrophobic pocket, leading to facile cleavage of the C–C bond and the formation of a planar intermediate.[31] (2) The donation of a proton by cysteine 188 from one side explains the formation of the pure (R)-products. Ohta and co-workers shifted the position of the catalytic cysteine to the other side, resulting in the formation of pure (S)-enantiomers[32] (Scheme ). 
While the stereoinversion led the G74C/C188S variant to lose its activity by 20 000-fold, iterative saturation mutagenesis of the hydrophobic pocket partly restored the activity.[33−35] Decarboxylation of isotope-labeled malonates confirmed that the (S)-selective variants also cleave the pro-(R)-carboxylate.[33] A variant with both catalytic cysteines present (i.e., C188 intact and the artificial C74 introduced by the G74C substitution) has racemizing activity, which allows for study of the second half-reaction of the mechanism.[36,37] Semiempirical QM/MM calculations[37] showed that the racemization proceeds in a stepwise fashion, through stepwise deprotonation and reprotonation of the planar intermediate shown in Scheme . Stabilization of this intermediate requires a delocalized π-electron system. The 3.5 kcal mol–1 energy barrier to the deprotonation step was lower than that of the initial deprotonation of the cysteines (at 25 kcal mol–1), which might explain the drastic pH-dependence of the G74C/C188G variant. A quantum mechanical model of AMDase[38] confirmed that in the decarboxylation of methylphenyl malonate 1a, C–C bond cleavage is rate-determining. It was argued that enantioselectivity is already determined during substrate binding, as only one binding mode was found to be energetically viable. In the case of a smaller vinyl malonate substrate, it was argued that due to the energetic accessibility of multiple binding modes, both the binding step and the subsequent transition states contribute to the observed selectivity. We note that these calculations were performed with truncated AMDase models, and the results were heavily dependent on model size. A smaller 81 atom model composed of only the substrate and residues forming the dioxyanion hole yielded a small energy difference of only 1.5 kcal mol–1 between the cleavage of the pro-(R) and the pro-(S) carboxylate groups. 
However, extension of the model to include several other key residues (to a total of 223 atoms) increased this energy difference to 18.3 kcal mol–1. A more recent computational study[39] has studied AMDase using the same two cluster models as that found in ref (38), but using soft harmonic confining potentials on the boundaries of the system, rather than the fixed atom model of ref (38). This yielded a smaller energy difference of 6.4 kcal mol–1 with the larger cluster model, which could also reproduce the enantioselectivity. These differences disclose the complexities found when modeling the system using truncated models. A full enzyme model would provide a better overview of the molecular origins of the observed selectivity. This can be achieved by a complete electrostatic and dynamic treatment within either a QM/MM, an empirical valence bond, or a related framework. In particular, the somewhat nonintuitive results obtained from iterative saturation mutagenesis require a model that takes into account at least the complete first coordination sphere. The hypothetical mechanism for AMDase presented in ref (38) explains the strict preference of AMDase for cleaving the pro-(R)-carboxylate, the inversion of stereopreference in the G74C/C188X variants, and the racemizing activity of the G74C variant. It also provides an energy profile for the reaction and indicates a plausible substrate binding mode. Yet, the predictability of the outcome of amino acid substitutions in the active site is very limited. Saturation mutagenesis of (R)-selective[18,23] and (S)-selective[34,35] AMDase variants allowed for significant increases in AMDase activity through very conservative substitutions in the active site. So far, it is very difficult to rationalize why exchanges like L40V, V43I/L, V156L and M159L exert such a remarkable effect on AMDase activity. 
Moreover, the substrate selectivity of AMDase (Scheme ) is very difficult to explain: that is, while AMDase catalyzes the decarboxylation of a large series of arylmalonates with a small second substituent (such as H, F, Me), α-ethyl arylmalonates are not converted.[2,15] In addition, while the second substituent might be quite large, AMDase does accept p-isobutylphenyl malonate (which would lead to optically pure ibuprofen) only with very poor catalytic efficiency.[35] In both poorly or nonconverted substrates, the inductive effect of the alkyl substituents might impede the stabilization of the planar, charged dienoate intermediate, or their size might lead to steric hindrance.

Scheme 2

Model Compounds Used in This Study and Their Experimentally Observed Acceptance by Wild-Type AMDase

The pro-(R) carboxylate is shown in black, and the pro-(S) carboxylate is shown in red. Also shown are the specific activities for each compound (U mg–1), based on data presented in refs (15, 18, 34, and 35). We note that 1d is not converted at all (n.c.) by AMDase, whereas 1e is converted, but only with very low efficiency, as shown in Table .

Table 1

Calculated Activation (ΔG‡) and Reaction Free Energies (ΔG0), Obtained Using the Empirical Valence Bond Approach, As Well As Relevant Corresponding Experimental Observables, For the Decarboxylation of Compounds 1a through 1e by Wild-Type AMDase and Variantsᵃ

compound	enzyme	ΔG‡ pro-(R)	ΔG0 pro-(R)	ΔG‡ pro-(S)	ΔG0 pro-(S)	selectivity (exp)	kcat (exp, s–1)	ΔG‡exp
1a	WT	15.6 ± 0.4	14.0 ± 0.6	26.6 ± 0.6	24.9 ± 0.6	(R)	279[23]	14.1[23]
	G74C/C188G	23.1 ± 0.6	21.4 ± 0.6	30.3 ± 0.7	28.7 ± 0.6	(S)	0.004[35]	21.6[35]
	CLG-IPL	26.8 ± 0.7	24.6 ± 0.7	18.1 ± 0.4	17.2 ± 0.4	(S)	3.8[35]	17.4[35]
1b	WT	15.9 ± 0.7	12.9 ± 0.9	20.2 ± 0.7	18.5 ± 0.8	(R)	15.1,[18] 31[20]	16.1,[18] 15.4[20]
	G74C/C188G	17.9 ± 0.9	14.1 ± 0.9	23.4 ± 0.7	21.7 ± 0.7	(S)
	G74C/C188A	20.7 ± 0.9	17.2 ± 1.0	21.2 ± 0.7	18.0 ± 0.9	(S)
	CLG-IPL	16.7 ± 0.5	14.0 ± 0.7	15.8 ± 0.6	12.5 ± 0.7	(S)	23.7,[18] 70[20]	15.9,[18] 15.0[20]
1c	WT	18.0 ± 0.3	17.1 ± 0.3	24.6 ± 0.9	21.5 ± 1.0	(R)	38.7[18]	15.6[18]
	G74C/C188G	22.7 ± 0.4	20.6 ± 0.5	26.9 ± 0.6	25.3 ± 0.7	(S)	0.077[34]	19.0[34]
	CLG-IPL	22.3 ± 0.7	20.3 ± 0.7	14.4 ± 0.4	13.7 ± 0.5	(S)	4.3[18]	16.9[18]
1d	WT	28.3 ± 0.8	25.8 ± 0.9	32.9 ± 1.8	29.9 ± 1.7
1e	WT	18.0 ± 0.4	16.3 ± 0.5	35.4 ± 0.7	33.7 ± 0.7	(R)	0.23[35]	19.1[35]
1e	CLG-IPL	34.4 ± 1.7	31.7 ± 1.5	17.1 ± 0.6	15.9 ± 0.6	(S)	0.56[35]	18.6[35]
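
As a reading aid for Table 1, the preference for cleaving one carboxylate over the other can be expressed as the difference between the two calculated activation free energies, ΔΔG‡ = ΔG‡(pro-(S)) − ΔG‡(pro-(R)). The minimal Python sketch below does this for the wild-type and CLG-IPL rows. The numbers are copied from the table; the dictionary layout and the function name `preference` are illustrative choices and not part of the original work.

```python
# Calculated activation free energies (kcal/mol) copied from Table 1:
# (compound, enzyme) -> (barrier for pro-(R) cleavage, barrier for pro-(S) cleavage)
BARRIERS = {
    ("1a", "WT"): (15.6, 26.6),
    ("1b", "WT"): (15.9, 20.2),
    ("1c", "WT"): (18.0, 24.6),
    ("1d", "WT"): (28.3, 32.9),
    ("1e", "WT"): (18.0, 35.4),
    ("1a", "CLG-IPL"): (26.8, 18.1),
    ("1b", "CLG-IPL"): (16.7, 15.8),
    ("1c", "CLG-IPL"): (22.3, 14.4),
    ("1e", "CLG-IPL"): (34.4, 17.1),
}


def preference(compound, enzyme):
    """Return (ddG, favoured) with ddG = barrier(pro-S) - barrier(pro-R).

    A positive ddG means cleavage of the pro-(R) carboxylate has the lower
    calculated barrier; a negative ddG favours the pro-(S) carboxylate.
    """
    dg_r, dg_s = BARRIERS[(compound, enzyme)]
    ddg = round(dg_s - dg_r, 1)  # table values carry one decimal place
    if ddg > 0:
        favoured = "pro-(R)"
    elif ddg < 0:
        favoured = "pro-(S)"
    else:
        favoured = "none"
    return ddg, favoured


if __name__ == "__main__":
    for compound, enzyme in BARRIERS:
        ddg, favoured = preference(compound, enzyme)
        print(f"{compound} {enzyme:8s} ddG = {ddg:+5.1f} kcal/mol -> {favoured}")
```

The sketch ignores the standard errors reported in the table, so small differences (for example, compound 1b with CLG-IPL) should be read with that uncertainty in mind.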

Obviously, the activity and selectivity of AMDase can be determined by very subtle interactions in the active site. In order to obtain a dynamic model of the decarboxylation, and to obtain insights into the factors determining substrate acceptance and activity of active-site variants, we investigated the rate-determining first half-reaction (the decarboxylation step) of the decarboxylation of substrates shown in Scheme as catalyzed by wild-type enzyme and substituted variants of AMDase, using the empirical valence bond (EVB) approach.[40] We have considered the cleavage of both the pro-(R) and pro-(S) carboxylate groups for each substrate and enzyme variant considered in this work, taking into account multiple potential binding modes of each substrate, and coupled this with metadynamics simulations to explore the relative stability of different binding modes at the Michaelis complex. We have also examined how each enzyme variant modulates the hydrophobicity/hydrophilicity throughout the active site to drive catalysis using analysis based on Grid Inhomogeneous Solvation Theory (GIST).[41] Our calculations produce convincing reaction pathways in agreement with experimental observables, pointing to a strongly favored binding mode leading to production of the (R)-enantiomer in wild-type AMDase and to the (S)-enantiomer in variants with the catalytic cysteine transferred to the opposite side of the active site. They rationalize the origins of the tremendous catalytic efficiency of this enzyme, as well as of mutational effects on this activity. Finally (and importantly), our EVB simulations are able to both reproduce and provide a rationale for the unusual substrate acceptance of this enzyme, laying the groundwork for future protein engineering effort on this enzyme.

Methodology

The empirical valence bond (EVB) approach[40] is our methodology of choice in this study, based on the previous successes of both ourselves and others in using this approach to describe enzyme selectivity.[42−45] Here, we have performed EVB simulations of the decarboxylation of compounds 1a through 1e (Scheme ) by wild-type and mutant variants of AMDase, specifically by the G74C/V156L/C188G/V43I/A125P/M159L (“CLG-IPL”) variant (compounds 1a, 1b, 1c, and 1e), the G74C/C188G and G74C/C188A variants (compound 1b), and the G74C/C188G variant (compounds 1a and 1c). These variants were selected based on the availability of experimental data,[18,20,23,34,35] with the exception of the G74C/C188A variant, for which experimental data is not available. An in-depth description of our simulation protocol and subsequent simulation analysis is provided in the Supporting Information (SI); we provide here a brief summary of our methodology. Our starting point for simulations of the wild-type enzyme was the structure of wild-type AMDase from Bordetella bronchiseptica, in complex with the potential mechanism-based inhibitor benzylphosphonate (PDB ID: 3IP8[23,46]). Due to the lack of structural data on the enzyme variants of interest to this work, all subsequent mutations were manually generated based on the wild-type crystal structure using the Dunbrack and Cohen backbone-dependent rotamer library,[47] as implemented in the PyMOL Molecular Graphics System.[48] The specific side chain rotamers used in the simulations were chosen based on visual inspection for proximity to nearby side chains (to avoid steric clashes), as well as the calculated percentage probability of finding each side chain in a given rotameric state. Substrates were docked into the active site using AutoDock Vina v. 1.1.2,[49] which resulted in numerous binding poses. 
These can be grouped into two representative highly ranked binding poses (Figure S1), the top ranking of which (“Mode I”) has been the focus of this work, for reasons described in the Supplementary Methodology. System setup was performed as described in the SI. Once system setup was complete, all enzyme–substrate complex variants of interest to this work were first equilibrated at the approximate EVB transition state (λ = 0.5) for 30 ns, followed by EVB simulations performed on the end points of the equilibration runs and propagated from the approximate EVB transition states, using the valence bond states shown in Figure S2. Each EVB simulation was performed in 51 individual mapping windows per trajectory of 200 ps length each. For each system, we performed two independent sets of equilibrations and EVB systems, taking into account the cleavage of each of the pro-(R) and pro-(S) carboxylate groups per compound (the separate equilibrations were necessary as we are propagating from the transition states). Each set of simulations for the cleavage of each carboxylate group was performed in 30 individual replicates (60 per substrate), leading to total cumulative equilibration and EVB simulation time scales of 1.8 and 0.612 μs per enzyme–substrate complex, respectively. Calibration of the EVB parameters was performed as described in Section S1 of the SI. All EVB simulations were performed using the Q6 simulation package[50] and the OPLS-AA force field,[51] and all EVB parameters necessary to reproduce our work can be found in the SI. As our EVB simulations appear to sample distinct binding poses for the cleavage of the pro-(R) and pro-(S) carboxylate groups, we also performed well-tempered metadynamics (WT-MetaD)[52] simulations to calculate the relative populations of the two reactive binding modes at the Michaelis complex. WT-MetaD simulations were performed on the same set of the substrates and enzymes as used in our EVB simulations. 
Following a standard MD system preparation and equilibration procedure (see the SI Methodology), WT-MetaD simulations were performed in the NPT ensemble (298 K, 1 atm) using the Amber ff14SB[53] and GAFF2[54] force fields (for protein and ligand atoms respectively) and the TIP3P[55] water model. WT-MetaD simulations were performed using AMBER 18[56] interfaced with PLUMED v2.7,[57] with subsequent MD simulation analysis performed using a combination of PLUMED v2.7[57] and CPPTRAJ.[58] We used a single collective variable (CV) for all WT-MetaD simulations, which was the mean angle of both carboxylate groups’ orientation in the active site (Figure S3). The combination of both carboxylate groups in a single CV allowed for discrimination of either binding pose independent of which (identical in simulation terms) carboxylate group was orientated where. To prevent the dissociation of any substrate from the active site (or a catalytically competent pose) we applied “Boresch style” restraints[59] (Figure S4) between atoms on each substrates’ 6-membered ring (which is conserved for all substrates) and Leu77 of the oxyanion hole. Convergence was assessed by monitoring the time evolution of the free energy profile (Figure S5) alongside checking for “diffusive dynamics” (Figure S6) along the CV for each system. To determine the thermodynamic properties of the water molecules within the AMDase active site, we performed grid inhomogeneous solvation theory (GIST)[41,60] analysis using CPPTRAJ[58] on the unliganded active sites of the four enzyme variants investigated in this manuscript, as well as three additional variants which are intermediates along the trajectory of improvement in iterative saturation mutagenesis[35] from G74C/C188G to CLG-IPL (see the SI Methodology). 
For this, an additional MD simulation was run for each enzyme for 100 ns, with all protein heavy atoms restrained (as is standard with this approach, see the SI Methodology).[60] The output of the GIST analysis was used to determine and project the “surface mapped hydrophobicity” onto each substrate atom, using the approach described by Kraml et al.[61] We note that as the GIST analysis was performed on the unliganded states of each enzyme (to identify how each enzyme modulates the active site environment), and the optimal positions of both carboxyl groups are essentially identical across the different substrates for the same binding pose, we focused our GIST analysis on only compound 1b (as this compound was studied by EVB and metadynamics simulations for all four enzymes).
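
The cumulative EVB time scales quoted in the protocol above (1.8 μs of equilibration and 0.612 μs of EVB sampling per enzyme–substrate complex) follow directly from the stated replicate counts and window lengths. The short Python sketch below reproduces that bookkeeping; the constant and function names are hypothetical and chosen only for illustration.

```python
# Protocol values as stated in the Methodology (per enzyme-substrate complex).
CARBOXYLATES = 2         # pro-(R) and pro-(S) cleavage are modelled separately
REPLICATES = 30          # independent replicates per carboxylate group
EQUILIBRATION_NS = 30.0  # equilibration at the approximate EVB transition state
WINDOWS = 51             # EVB mapping windows per trajectory
WINDOW_PS = 200.0        # simulation length of each mapping window


def cumulative_times_us():
    """Return (number of trajectories, equilibration time, EVB time) in microseconds."""
    trajectories = CARBOXYLATES * REPLICATES
    equilibration_us = trajectories * EQUILIBRATION_NS / 1.0e3   # ns -> us
    evb_us = trajectories * WINDOWS * WINDOW_PS / 1.0e6          # ps -> us
    return trajectories, equilibration_us, evb_us


if __name__ == "__main__":
    n_traj, equil_us, evb_us = cumulative_times_us()
    print(f"{n_traj} trajectories: {equil_us:.3f} us equilibration, {evb_us:.3f} us EVB")
```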

Results and Discussion

Empirical Valence Bond Simulations of AMDase Selectivity Toward Different Compounds

In this work we study decarboxylation of five π-conjugated compounds (Scheme ) differing in their degree of aromaticity and attached substituents, by both wild-type AMDase and its variants (CLG-IPL, G74C/C188G, and G74C/C188A). The choice of the enzyme to study was led by the fact that wild-type AMDase from B. bronchiseptica converts compounds 1a–c in an (R)-selective fashion,[15,18] whereas compounds 1d–e are curiously either not converted at all (1d) or only very poorly converted (1e).[15,35] The CLG-IPL variant, which carries six amino acid substitutions, was studied here because of its shift to (S)-selectivity[18,35] and the doubly substituted variants were studied for their overall low activity levels after introducing the substitutions.[34,35] Moreover, it has been experimentally demonstrated that even a simple interchange to glycine or alanine at position 188 can have a crucial influence on the enzyme kinetics,[32,34] and therefore we considered variants with both glycine and alanine present at position 188. The AMDase-catalyzed breakdown of compounds 1a through 1e to produce optically pure (R)- and/or (S)-products is a multistep reaction, initiated through the rate-limiting cleavage of a carboxylic group to yield an sp2-hybridized planar intermediate. This is followed by proton transfer to the intermediate from a nearby amino acid side chain. Critically, it is unclear which carboxylic group of the substrate is preferentially cleaved during this process, as this is not seen in the stereochemistry of the final product. On the basis of isotope-labeling experiments it would appear that, in both the wild-type enzyme[28,31] and the (S)-selective S36N/G74C/C188S variant of AMDase,[33] there is a strong preference for cleavage of the pro-(R) carboxylate group of the substrate. 
However, as described in the Methodology section, our docking simulations provided multiple possible binding modes in the active site for each substrate considered in this work, although only Mode I-like conformations such as that illustrated in Figure S1 are catalytically productive. Following from this, it can be argued that while variants with the G74C/C188S motif would produce (S)-enantiomers from the same binding mode as would produce (R)-enantiomers in the wild-type enzyme, multiple binding modes would lead to a mixture of the two enantiomers of the α-arylpropionates formed. In Mode I, the pro-(S) carboxylate of the substrate is closer to Cys188 and is stabilized by hydrogen bonding interactions from the dioxyanion hole of AMDase, while the pro-(R) carboxylate of the substrate is partly located in the hydrophobic pocket. Upon equilibration (Figure ), the substrate rotates slightly such that the pro-(R) carboxylate is fully in the hydrophobic pocket. In contrast, in Mode II, the substrate is rotated by 180° along the z-axis, such that the pro-(R) carboxylate group is instead closer to Cys188, and the pro-(S) carboxylate group is located in the hydrophobic pocket, contrary to what would be expected from experimental studies.[28,31,33] In addition, EVB simulations of enzyme–substrate complexes with the substrate bound in Mode II provided very high activation free energies in the range of 24–41 kcal mol–1, further suggesting that this is not a catalytically viable binding mode, and therefore we have not considered Mode II further for detailed analysis. Finally, we independently simulate the cleavage of each of the two carboxylate groups of the substrate, resulting in two different potential decarboxylation routes per compound, allowing us to obtain computational predictions of the pro-(R) vs pro-(S) preference of AMDase toward each compound studied here.

Figure 1

An illustration of the catalytically preferred binding mode of compound 1b, “Mode I”, after molecular dynamics equilibration in preparation for EVB simulations. (A) An overview of the AMDase binding pocket. (B) A detailed overview of the interactions between the substrate and oxyanion hole. (C) A detailed overview of substrate positioning in the hydrophobic pocket. The main chains of the corresponding amino acids are omitted from the figure for simplicity. As can be seen, after initial equilibration, the substrate rotates slightly compared to the initial docking pose (Figure S1) such that the pro-(S) carboxylate group of the substrate is stabilized by the dioxyanion hole, and the pro-(R) carboxylate group points toward the hydrophobic pocket. The initial docking poses for both Mode I and Mode II prior to equilibration are shown in Figure S1. We note that compound 1b is selected merely for illustration purposes, and similar binding modes were obtained for all compounds studied in this work.

The results of our EVB simulations of the decarboxylation of compounds 1a through 1e (Scheme ) by wild-type and variants of AMDase are summarized in Table and Figure . This table also shows the corresponding selectivities, kinetics (kcat), and activation free energies estimated based on experimentally measured activities of each variant toward each compound studied here, where experimental data is available.[18,20,23,34,35] From this data, it can be seen that our EVB models only show turnover of compounds 1a–c and 1e, in good agreement with experimental observables,[18,20,23,28,34,35] whereas the activation free energies for compound 1d are very high for the cleavage of both carboxylic groups, suggesting that 1d is not transformed by the enzyme. 
In cases where experimental data was available to allow for activation free energies to be estimated, we typically obtain activation barriers within ∼3 kcal mol–1 of the experimental value for cleavage of the energetically preferred carboxylate group. We consider this acceptable due to the lack of experimental data on the reference reaction, necessitating our calculations to be calibrated to density functional theory (DFT) calculations (see SI Section S2), thus introducing uncertainty. In addition, our calculations are able, with reasonable quantitative accuracy, to reproduce the experimentally observed loss of activity upon substitution of C188 to either glycine or alanine,[34,35] as observed in the G74C/C188G and G74C/C188A variants, as well as the fact that the substitution to alanine is more detrimental to the activity of the enzyme than the substitution to glycine.[32]
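
The experimental activation free energies used in this comparison are derived from kcat values through transition state theory. A minimal sketch of that conversion, using the Eyring equation and assuming a transmission coefficient of 1 (the exact form used is not spelled out in the text), is given below in Python. The function name `eyring_barrier` is hypothetical; the example inputs are kcat values and temperatures listed for compound 1a in Table 1 and its footnote.

```python
import math

K_B = 1.380649e-23     # Boltzmann constant, J/K
H = 6.62607015e-34     # Planck constant, J*s
R_KCAL = 1.987204e-3   # gas constant, kcal/(mol*K)


def eyring_barrier(kcat_per_s, temp_celsius=25.0):
    """Activation free energy (kcal/mol) from a rate constant (1/s).

    Uses dG = R*T*ln(k_B*T / (h*kcat)), i.e. the Eyring equation with a
    transmission coefficient of 1 (an assumption of this sketch).
    """
    t_kelvin = temp_celsius + 273.15
    return R_KCAL * t_kelvin * math.log(K_B * t_kelvin / (H * kcat_per_s))


if __name__ == "__main__":
    # Compound 1a: wild type (25 C), G74C/C188G and CLG-IPL (ref 35 data, 37 C).
    for label, kcat, temp in [("WT", 279.0, 25.0),
                              ("G74C/C188G", 0.004, 37.0),
                              ("CLG-IPL", 3.8, 37.0)]:
        print(f"1a {label:11s} kcat = {kcat:g} s^-1 -> {eyring_barrier(kcat, temp):.1f} kcal/mol")
```

Because the barrier depends only logarithmically on kcat, a tenfold error in the rate constant shifts the derived barrier by roughly 1.4 kcal mol–1 at room temperature, which is useful context for the ∼3 kcal mol–1 agreement discussed above.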

Figure 2

Calculated (pro-(R) and pro-(S)) and, where available, experimental (Exp) activation free energies (ΔG‡, kcal mol–1) for the decarboxylation of compounds 1a through 1e (Scheme ) by wild-type (WT) AMDase and its variants. All calculated values are averages and standard error of the mean over 30 individual EVB trajectories per system, as described in the SI Methodology section. The raw data is provided in Table .

ᵃ All calculated values are averages and standard error of the mean over 30 individual EVB trajectories per system, as described in the Methodology section, and shown here are data obtained from modeling the decarboxylation of each compound through cleavage of either the pro-(R) or pro-(S) carboxylate groups. WT denotes the wild-type enzyme. Both experimental and calculated activation and reaction free energies are presented in kcal mol–1. Shown here are also the experimentally observed selectivities for each compound, as well as the corresponding kinetics (kcat, s–1) and activation free energies (ΔG‡exp) derived from the experimentally observed activities toward each compound by each variant, as presented in refs (18, 20, 23, 34, and 35). The kcat values were either taken directly from the literature, or were estimated by using the relationship kcat = (specific activity × molecular weight). The experimental activation free energies (ΔG‡exp) were obtained from the kcat values using transition state theory at temperatures of 30 °C (for ref (18)), 37 °C (for ref (35)), and 25 °C for the rest. Note that the specific activities were obtained from bar graphs provided in ref (18) and therefore the experimental kinetics and energetics are only approximate. Blank cells denote that experimental data is not available for a given system.

In terms of selectivity, it is important to bear in mind that the preference for the cleavage of the bond to a given carboxylate group in the initial decarboxylation step (Scheme and Table ) does not translate directly to the final product selectivity. 
That is, all reactions proceed through a common planar intermediate, with the selectivity being determined in the second step of the reaction upon reprotonation of the planar intermediate. This, in turn, is dependent on the binding pose of the substrate in the Michaelis complex, which can, in principle, be any of the three theoretical substrate binding poses to the wild-type AMDase active site as discussed in Section S3 of the SI and illustrated in Figure S7. Nevertheless, we typically observe Michaelis complexes with the substrate in Pose A (Figure A) when we model cleavage of the pro-(R) carboxylate group, and Pose B (Figure B) when we model cleavage of the pro-(S) carboxylate group. We distinguish here between binding “Modes” (the initial conformations for the equilibration, Figures and S1) and “Poses” (the conformations obtained at the Michaelis complexes following EVB simulations, Figure S7). However, this distinction is purely semantic and made only for clarity of discussion. For representative structures of key stationary points for the cleavage of compounds 1a to 1e by wild-type AMDase, see Figures and S8–S11.

Figure 3

Representative structures of the Michaelis complexes (MC), transition states (TS), and intermediate states (IS), for cleavage of (A) the pro-(R) and (B) the pro-(S) carboxylate groups of compound 1a by wild-type AMDase, as obtained from EVB simulations of these reactions. For the full reaction mechanism, see Scheme . The structures shown here are the centroids of the top ranked cluster obtained from clustering on RMSD, performed as described in the SI. The labeled C–C distances are averages at each stationary point over all trajectories (see Table S1). Corresponding representative structures of key stationary points during simulations of the wild-type AMDase catalyzed decarboxylation of compounds 1b to 1e can be found in Figures S8–S11. The color-coding of key residues follows that of Figure A. For all compounds studied (Scheme and Table ), we observe preferential cleavage of the pro-(R) carboxylate by wild-type AMDase by 1.5–11 kcal mol–1 depending on the substrate, as is to be expected due to the destabilization of the pro-(R) carboxylate by unfavorable interactions in the hydrophobic pocket[31] (Figure ). We note that this preference is preserved in the case of compounds 1d and 1e, which are observed to be either not (1d) or only very poorly (1e) converted by AMDase.[15,35] On the basis of the schema presented in Figure S7 and the binding poses observed in Figures and S8–S11, this would be expected to lead to the (R)-product in all cases. This is in agreement with isotope-labeling experiments performed by two independent groups[28,31,33] on the (R)-selective wild-type and the (S)-selective variant S36N/G74C/C188S, which have shown that the preferred carboxylate to be cleaved is the pro-(R) carboxylate in both cases. 
The G74C/C188G and G74C/C188A variants would be expected to form pure (S)-enantiomers, because the proton-donating cysteine side chain is on the opposite face of the intermediate compared to the wild-type enzyme.[34,35] Once again, this stereoselective protonation is independent of which carboxylate group was cleaved beforehand. Our simulations show preferential cleavage of the pro-(R) carboxylate group (Table ) with the Michaelis complex bound in Pose A of Figure S7, which is in agreement with the finding that (S)-selective AMDase variants might also cleave the pro-(R) carboxylate.[33] Finally, in the case of the CLG-IPL variant (which carries six amino acid substitutions: G74C/M159L/C188G/V43I/A125P/V156L), we observe preferential cleavage of the bond to the pro-(S) carboxylate group, although as with the G74C/C188X double mutants, this would still be expected to lead to the (S)-product due to the Michaelis complex being bound in Pose B (Figure S7). We note that while no isotope labeling studies have been performed on the CLG-IPL variant, our modeled (S)-selectivity is in good agreement with the experimentally observed production of pure (S)-enantiomer products.[18,34] In addition, our calculations reproduce both the expected formation of the (S)-enantiomer and the experimental activation free energies for the decarboxylation of compounds 1a through 1c, and 1e by the CLG-IPL variant of AMDase with reasonable quantitative accuracy compared to experiment[18,20,34,35] (Table ).
We note that this is overall a particularly interesting AMDase variant, as each of the hydrophobic residues introduced into this variant (i.e., proline, leucine, isoleucine) has been shown to be a very important determinant of AMDase activity.[18,34,35] Following from this, in addition to an activity increase in the decarboxylation of flurbiprofen malonate 1b, this variant also showed remarkable differences in the relative activity toward differently substituted α-aryl propionates.[18]

Exploring the Molecular Origin of the Observed Effects on the Activation Free Energies

While our EVB models for the reactions catalyzed by wild-type AMDase and its variants do not provide perfect quantitative agreement with experiment, due to the uncertainties involved in the energetics of the corresponding nonenzymatic reactions (see Section S2 of the SI), they nevertheless appear to provide meaningful qualitative insights into both AMDase substrate preference as well as selectivity toward cleavage of a given carboxylate group. In particular, our model only shows turnover of compounds 1a through 1c and 1e, in good agreement with experiment. We also obtain very high activation barriers for compound 1d, in agreement with the fact that decarboxylation of this substrate is not experimentally observed. In addition, experimentally, the activity of AMDase toward substrate 1e is significantly lower than toward other substrates 1a through 1c.[18,20,23,34,35] This could be due to the presence of sterically bulky and/or flexible ethyl and isobutyl groups, which would make compounds 1d and 1e challenging to accommodate in the hydrophobic pocket of the AMDase active site, resulting in nonproductive binding modes. In our simulations, we observe larger motions of these substrates (RMSD of up to 1.9 Å compared to the starting structure) compared to substrates such as 1a, where the substrate RMSD over the course of the simulation is 1 Å or less compared to the starting structure (see Figures S12 and S13). In addition to this, the ethyl and isobutyl groups of compounds 1d and 1e, respectively, are also highly “floppy” and fluctuate extensively across the simulation time (Figure ), making it more challenging for these compounds to settle into a productive binding mode in the AMDase active site. In conjunction with this, in the case of compounds 1d and 1e we observe greater solvent penetration of the active site compared to the other compounds studied in this work, which will counteract the destabilizing effect of the hydrophobic pocket. 
Finally, the inductive effect of the alkyl substituents would be expected to destabilize the charged intermediate formed upon cleavage of either carboxylate group, thus making the corresponding decarboxylation also energetically unfavorable through a Hammond effect. Indeed, our EVB simulations (Table ) support this at least in the case of compound 1d, as the reaction free energy for formation of this charged intermediate is significantly higher (by up to 12.9 kcal mol–1, in the case of cleavage of the bond to the pro-(R) carboxylate group) for the decarboxylation of this compound compared to the other compounds studied in this work.
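
The substrate RMSD values discussed above can be illustrated with a short NumPy sketch. It assumes that the trajectory frames have already been superimposed on a common reference frame (for example, on the protein backbone), that coordinates are in Å, and that no further fitting of the substrate itself is wanted; the function name and array layout are hypothetical and do not describe the authors' analysis scripts.

```python
import numpy as np


def substrate_rmsd(traj, ref):
    """Per-frame RMSD of the substrate atoms relative to a reference structure.

    traj: array of shape (n_frames, n_atoms, 3), pre-aligned coordinates
    ref:  array of shape (n_atoms, 3), e.g. the starting structure
    No superposition is performed here (assumption: frames are pre-aligned).
    """
    traj = np.asarray(traj, dtype=float)
    ref = np.asarray(ref, dtype=float)
    diff = traj - ref[None, :, :]
    # Mean squared displacement over atoms, then square root, per frame.
    return np.sqrt((diff ** 2).sum(axis=2).mean(axis=1))


if __name__ == "__main__":
    # Toy data: 4 atoms, 2 frames; the second frame is shifted by 1 A along x.
    ref = np.zeros((4, 3))
    traj = np.stack([ref, ref + np.array([1.0, 0.0, 0.0])])
    print(substrate_rmsd(traj, ref))
```

Because no fitting is applied to the substrate, rigid-body motion within the pocket contributes to the value, which is the quantity of interest when asking whether a substrate stays in its starting binding mode.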

Figure 4

Joint distribution of the dihedral angles along the ethyl and isobutyl groups of compounds (A) 1d and (B) 1e, as well as the root-mean-square deviations of the substrate (RMSD), during 30 ns molecular dynamics simulations of each compound in complex with wild-type AMDase in preparation for subsequent EVB simulations. In the case of the dihedral angles, the C1–C2–C3–C4 and C1–C2–C3–H1 atoms of the ethyl group and of the isobutyl group of 1d and 1e, respectively, were chosen for analysis in each case (see Figure S2). Snapshots were taken every 100 ps of the 30 ns simulations, and thus this analysis was performed on 9000 discrete data points per plot.

In terms of structural effects, we considered the impact of substrate binding on the active site volume of AMDase, calculated at the Michaelis complexes of wild-type AMDase and its variants in complex with each of compounds 1a through 1e. These were calculated using POcket Volume MEasurer (POVME) 3.0,[62] as in our previous work.[63] As can be seen from Figure and Table S2, the calculated active site volumes largely follow substrate size. That is, the smallest active site volumes are observed in the case of compounds 1a and 1d, which differ only by substituent (methyl for 1a, ethyl for 1d). This is followed by compound 1e, which has an additional isopropyl group compared to compound 1a, and finally the multiring substrates 1b and 1c. The standard deviations on the calculated values also increase with increasing substrate size, but only slightly compared to the absolute volumes, suggesting the active site is flexible enough to also accommodate the bulkier substrates, without being excessively “floppy”.

Figure 5

Average active site volumes during simulations of wild-type AMDase and its variants in complex with compounds 1b to 1e, calculated using POcket Volume MEasurer (POVME) 3.0.[62] Data is presented as average values and standard deviations over structures obtained at the Michaelis complexes of 30 independent EVB trajectories, and analysis was performed on 600 snapshots per system (extracting data every 10 ps of the 200 ps mapping window corresponding to the Michaelis complex of each individual EVB trajectory). The corresponding raw data is presented in Table S2.

We also considered the solvent accessibility of the active site in our simulations, taking into account that one of the two carboxylate groups is stabilized by a dioxyanion hole while the other (more likely to be cleaved) carboxylate group is located in a hydrophobic pocket. As can be seen from Figure and Table S3, there is significant variation in the number of water molecules in close proximity (within 4 Å) of the carboxylate group being cleaved. Compounds that are turned over by AMDase typically have fewer than one water molecule close to the reacting group at the transition state, whereas this number increases to as many as two to four (from close to none) in the case of compounds 1d and 1e, which either do not react or are unlikely to react in the AMDase active site. This is likely due to the high flexibility of these substrates when in complex with AMDase (Figure ), which provides space for additional water molecules to enter the active site. We note that the number of water molecules for the G74C/C188X variants is up to two, which may also contribute unfavorably to their low activity.
The importance of sequestering the active site from solvent has been discussed in several prior studies,[64−67] and, in particular, a clear correlation between activity loss and increased active site solvation has been shown for several enzymes.[64,66,68,69] Therefore, it is perhaps unsurprising to see yet again for AMDase increased solvent exposure of the active site in conjunction with the binding of compounds 1d and 1e, which are either not turned over at all or only poorly converted by this enzyme, respectively, despite not being significantly structurally different from other compounds that are reactive (Table and Scheme ).
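
The water-count analysis described above (water molecules within 4 Å of the carboxylate group being cleaved) can be sketched for a single snapshot as follows. This is a minimal illustration under stated assumptions: each water molecule is represented by its oxygen atom, distances are measured to the nearest atom of the carboxylate group, and periodic boundary conditions are ignored. None of these choices is specified in the text, and the function name is hypothetical.

```python
import numpy as np


def waters_near_group(water_oxygens, group_atoms, cutoff=4.0):
    """Count water oxygens within `cutoff` (in A) of any atom of a group.

    water_oxygens: array of shape (n_waters, 3)
    group_atoms:   array of shape (n_group_atoms, 3), e.g. the C and two O
                   atoms of the carboxylate being cleaved
    Periodic boundary conditions are NOT applied (assumption).
    """
    water_oxygens = np.asarray(water_oxygens, dtype=float)
    group_atoms = np.asarray(group_atoms, dtype=float)
    # Pairwise distance matrix of shape (n_waters, n_group_atoms).
    deltas = water_oxygens[:, None, :] - group_atoms[None, :, :]
    dists = np.linalg.norm(deltas, axis=2)
    # A water counts once if its closest group atom lies within the cutoff.
    return int((dists.min(axis=1) <= cutoff).sum())


if __name__ == "__main__":
    # Toy coordinates, for illustration only.
    group = np.array([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
    waters = np.array([[3.0, 0.0, 0.0], [0.0, 0.0, 5.0], [5.9, 0.0, 0.0]])
    print(waters_near_group(waters, group))
```

Averaging this count over all snapshots of a trajectory, and then over independent trajectories, would give per-system averages of the kind reported in the text.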

Figure 6

Average number of water molecules within 4 Å of the carboxylate group being cleaved (either pro-(R) or pro-(S), as relevant) during the last 25 ns of our 30 ns equilibration runs at the transition state for each reaction modeled in this work. Data is presented as average values and standard error of the mean over 30 individual trajectories per system, with data collected every 10 ps of simulation time. For the corresponding raw data associated with this figure, see Table S3. Finally, although hydrophobic effects clearly dominate in determining the selectivity of AMDase (through destabilizing one carboxylate group and sequestering the active site from solvent), we have also considered the electrostatic contributions of individual amino acids to the calculated activation free energies (Figure and Table S4). This is of particular interest to us because, as discussed in Section S4 of the SI, any structural differences between the different transition states involved are minimal. This suggests that energetic differences between different substrates and variants are driven by differences caused by the initial binding pose of the substrate rather than structural effects at the transition state. Electrostatic contributions were estimated by applying the linear response approximation (LRA)[70,71] to our EVB trajectories, as in previous work.[64,66,72] From this data, it can be seen that in the case of wild-type AMDase, where the preferred carboxylate group being cleaved is the pro-(R) carboxylate, the T75 and Y126 side chains from the dioxyanion hole provide modest stabilizing contributions to the developing charge at the transition state, by stabilizing the pro-(S) carboxylate group, although this contribution is offset by a destabilizing contribution from the S76 side chain.

Figure 7

Electrostatic contributions of individual amino acids (ΔΔG‡elec, kcal mol–1) to the calculated activation free energies for the decarboxylation of compounds 1a to 1e by wild-type AMDase. Data is presented as average values over 30 individual trajectories per system. The corresponding raw data and associated standard error of the mean for each value are shown in Table S4. Amino acids forming the oxyanion hole are highlighted in red, those forming the hydrophobic pocket in blue, and the catalytically important residues at positions 74 and 188 in green. Shown here is data corresponding to the energetically preferred cleavage of the pro-(R) carboxylate group (Table ). The corresponding figure and raw data for the cleavage of the pro-(S) carboxylate group are shown in Figure S14 and Table S5.

In the case of cleavage of the pro-(S) carboxylate group (Figure S14 and Table S5), this is reversed, with stabilizing contributions from T75 and S76 offset by a destabilizing contribution from Y126. Similarly, in the case of the side chains forming the hydrophobic pocket, contributions from all residues but M159 are destabilizing to the cleavage of the pro-(R) carboxylate group (Figure ), whereas the inverse is observed for cleavage of the pro-(S) carboxylate (Figure S14), where the residues from the hydrophobic pocket make modest stabilizing contributions to the activation free energy for the decarboxylation reaction, and the side chain of M159 is destabilizing. Overall, these contributions are in conceptual agreement with how charge development is localized in the respective transition state. However, the fact that not all residues in the dioxyanion hole or hydrophobic pocket make stabilizing or destabilizing contributions for any given system also indicates that the residue contributions are more complex than that of a simple model where one set of residues stabilizes and the other set of residues destabilizes the decarboxylation reaction.
Finally, we also examined the corresponding contributions to the reactions catalyzed by the G74C/C188G, G74C/C188A, and CLG-IPL variants (Figures S15–S18 and Tables S6–S9). We note that while there are some subtle quantitative differences compared to the wild-type enzyme, these are not significant enough to account for the large energetic differences observed between different systems, as shown in Table . Rather, these appear to be determined by changes in solvent penetration of the active site between different variants (due to changes in active site volumes), as well as ground-state effects, as described in the subsequent section.

Exploring Ground-State Effects on the Observed Selectivities

To probe the role of ground-state destabilization in driving AMDase catalysis, we turned to grid inhomogeneous solvation theory (GIST)[41,60] to measure the local hydrophobicity/hydrophilicity throughout the active site. In GIST (see the Methodology for further details), molecular dynamics simulations are analyzed using inhomogeneous solvation theory to produce a detailed grid map of the thermodynamic properties of water for a defined region of interest (i.e., an active site). Here, we used GIST to calculate the solvation free energy of the active site and used this as a measure of the hydrophobicity.[61] This approach explicitly considers both nonadditive and cooperative effects on the local hydrophobicity,[41,60,61] both of which are known to play significant roles in modulating the hydrophobicity/solvation free energy.[61,73] We projected the local hydrophobicity onto both possible reactive binding Poses (A and B) of compound 1b for each enzyme (Figure A,B for the wild-type enzyme and the CLG-IPL variant, and Figure S19 for the G74C/C188G and G74C/C188A variants). We first note that the majority of the AMDase active site is hydrophobic, which not only complements its typical range of substrates (Scheme ) but also likely helps drive substrate binding (through the release of energetically unfavorable water molecules in the active site upon substrate binding). Focusing on the reacting carboxylate groups for wild-type AMDase in Pose A (Figure A), we identify clear evidence for ground-state destabilization driving AMDase catalysis, as the cleaving (pro-(R)) carboxylate group is placed into a destabilizing hydrophobic environment, while the pro-(S) carboxylate group is in a stabilizing hydrophilic environment created by the oxyanion hole residues. Consistent with our EVB simulations for wild-type AMDase (Table ), reactivity through Pose B to cleave the pro-(S) carboxylate group appears to be significantly less favorable.

Figure 8

Projection of the local active site hydrophobicity onto the two potentially reactive binding poses for (A) wild-type AMDase and (B) the CLG-IPL variant. For both enzyme variants, the local hydrophobicity surrounding each atom of compound 1b is colored according to the scale on the right-hand side, with more negative values indicating a more hydrophilic environment for that atom. For both variants, an overview picture is shown with the catalytic residues colored yellow, the oxyanion hole residues colored green, the (original) hydrophobic pocket residues colored brown, and residues in orange denoting those substituted to obtain the CLG-IPL variant. The smaller pictures associated with both variants describe the local hydrophobicity for either potentially reactive binding mode, with the pro-(R) and pro-(S) carboxylate groups labeled throughout. (C) Progressive construction of the second hydrophobic pocket to allow AMDase activity through binding Pose B. Each enzyme is shown in binding Pose B and colored as described in panels A and B, with the exception that point mutations accumulated along the pathway from G74C/C188G are progressively recolored from orange to red. Calculation and projection of the active site hydrophobicities onto each ligand atom was performed by determining the solvation free energy with GIST[41,60] and then using the mapping procedure described in ref (61). Equivalent projections as in panels (A) and (B) are provided in Figure S19 for the G74C/C188G and G74C/C188A AMDase variants. In contrast to wild-type AMDase, the CLG-IPL variant was determined by our EVB simulations to preferably react through binding Pose B to cleave the pro-(S) carboxylate group (Table ). 
Analysis of Figure B shows clear evidence of ground-state destabilization of the pro-(S) carboxylate group in binding Pose B, due to the fact that the six mutations introduced between the wild-type enzyme and the CLG-IPL variant have led to the formation of a new hydrophobic pocket, enabling the CLG-IPL variant to cleave the pro-(S) carboxylate group. Interestingly, the original hydrophobic pocket in the CLG-IPL variant does not appear to have been substantially impacted by these mutations, suggesting that binding Pose A could still be a reasonably reactive binding pose (Figure B). This is supported by our EVB simulations, which indicate that while cleavage of the pro-(S) carboxylate of compound 1b is energetically preferred in the CLG-IPL variant, the activation free energy for cleavage of the pro-(R) carboxylate group is only slightly higher than that obtained for cleavage of the pro-(S) carboxylate group. The CLG-IPL variant was generated from the G74C/C188G variant, using iterative rounds of simultaneous saturation mutagenesis (SSM) experiments,[35] in which after each SSM round, a single additional mutation was taken forward for the next round of screening. We aimed to see if we could reproduce the formation of this new hydrophobic pocket over its engineered evolutionary pathway, and therefore performed additional MD simulations and GIST analysis on the three intermediates connecting the G74C/C188G and CLG-IPL variants, projecting the obtained results onto compound 1b in its catalytically preferred binding Pose B (Figure C and Table ). Transitioning from the wild-type enzyme to the G74C/C188G variant removes the steric clash induced by the side chain of C188 with the pro-(S) carboxylate group, allowing the substrate to more optimally orient into the active site and improve the stabilization of the pro-(R) carboxyl in the oxyanion hole. 
The hydrophobicity of the environment surrounding the pro-(S) carboxylate group notably increases upon the introduction of the M159L substitution to the G74C/C188G variant, which is consistent with the experimentally observed large increase in activity upon mutation (∼1700-fold increase in kcat/Km[35]). The remaining substitutions from the triple mutant to the sextuple mutant (CLG-IPL) generally have a more subtle impact on the substrate’s environment, including alterations in nonreacting regions of the substrate (see e.g., the transition from the triple to quadruple mutant). Nevertheless, there is a clear gradual increase of the hydrophobicity over the evolutionary trajectory, demonstrating that ground-state destabilization through increasing active site hydrophobicity is used to both control selectivity toward cleavage of a given carboxylate group and enhance AMDase catalysis. We note that, in the case of the CLG-IPL variant, the generation of this new hydrophobic pocket was not by design but rather was a serendipitous outcome of the in vitro evolution.[35] However, engineering such pockets is clearly an example of a strategy that can also be harnessed in a targeted fashion for the rational engineering of challenging systems such as AMDase and related enzymes, where the selectivity is not being determined at the level of steric hindrance or specific hydrogen bonding interactions (which are much easier to target through rational design).
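
To put the ∼1700-fold increase in kcat/Km into energetic terms, a fold change in a rate or specificity constant can be converted into an apparent change in activation free energy via ΔΔG‡ = –RT ln(fold). The sketch below assumes a temperature of 25 °C (the text does not state the temperature for this measurement) and a hypothetical function name. Because kcat/Km reports on the barrier measured from free enzyme and substrate, the result is not directly comparable to the kcat-derived barriers discussed elsewhere in the text.

```python
import math

R_KCAL = 1.987204e-3  # gas constant, kcal/(mol*K)


def barrier_change_from_fold(fold_increase, temp_celsius=25.0):
    """Apparent change in activation free energy (kcal/mol) for a fold change.

    A fold_increase > 1 (faster reaction) gives a negative value,
    i.e. a lower apparent barrier. The default temperature is an assumption.
    """
    temp_k = temp_celsius + 273.15
    return -R_KCAL * temp_k * math.log(fold_increase)


if __name__ == "__main__":
    print(f"{barrier_change_from_fold(1700.0):.2f} kcal/mol")
```

Under these assumptions, a 1700-fold increase corresponds to an apparent barrier reduction of roughly 4.4 kcal mol–1.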

Differences in the Ground-State Binding Pose Populations

Alongside differences in activation free energies already explored by our EVB simulations, AMDase’s stereoselectivity could be (partially) regulated at the Michaelis complex, through the differential stabilization of the two plausible reactive binding Poses A and B. To determine the extent to which this controls AMDase selectivity, we performed well-tempered metadynamics (WT-MetaD)[52] simulations (see the Methodology section) to calculate the relative free energy difference between the two plausible binding poses (Figure ). Our WT-MetaD simulations used a single collective variable (CV, i.e., a reaction coordinate) to describe the relative orientation of both carboxylate groups, independent of which carboxylate group is oriented in which direction (see the SI Methodology and Figure S3 for further details). We note that these simulations calculate the relative favorability of either binding pose (which ultimately controls stereoselectivity), and therefore do not inform on differences in binding affinities.

Figure 9

(A) Free energy profiles describing the relative populations of either binding Pose A or B (Figure S7) for the same combinations of substrates and enzymes as used in our EVB simulations (Table ). The catalytically preferred binding pose, based on the calculated activation free energies from our EVB simulations, is denoted with a * [colored to match the line color for each enzyme variant, as shown in the color key of panel (A)]. Profiles were obtained using well-tempered metadynamics (WT-MetaD) simulations with a single collective variable (CV1) used to describe the relative orientation of both carboxylate groups of the substrate in the active site. The approximate regions of both binding poses are indicated on each graph. (B) Representative structures (obtained from clustering, see the SI Methodology) of both binding poses and the approximate transition state (TS) between them for wild-type AMDase in complex with compound 1b. Hydrogen-bonding interactions between the substrate and oxyanion hole residues are indicated by dashed lines. Our WT-MetaD simulations are presented in Figure , and show that different substrates and enzyme variants can clearly have a notable impact on the calculated free energy profiles. Regardless, in all cases, we identify two energy minima, which describe Pose A and B (Figure S7) respectively, alongside a TS barrier (located at ∼1.15 rad along the x-axis, Figure A) for interconversion between the two binding poses. This barrier describes the approximate point at which interactions between one carboxylate group and the oxyanion hole are breaking, while interactions between the other carboxylate group and the oxyanion hole are forming (Figure B). The global free energy minima for the wild-type enzyme and both doubly substituted variants are always located in binding Pose A, which is also their most catalytically favorable reactive pose based on our EVB simulations (Table ). 
In contrast, the CLG-IPL variant has its free energy minimum located in its EVB-determined reactive binding pose (Pose B) for all substrates but compound 1c. However, in the case of this compound, the free energy difference between Poses A and B is only ∼1.5 kcal mol–1, meaning that these two binding poses can easily interconvert. In fact, if we correct our EVB calculated activation free energy of 14.4 kcal mol–1 (Table ) by the approximate free energy required to reach Pose B from Pose A (∼1.5 kcal mol–1), then we obtain a corrected activation free energy of 15.9 kcal mol–1, which is in better quantitative agreement with the experimentally observed value of 16.9 kcal mol–1.[18] Our WT-MetaD results therefore suggest that the optimal reactive binding pose is also the free energy minimum (or very close in energy to it, as in CLG-IPL with compound 1c). Therefore, the preference of this enzyme for cleavage of one carboxylate group over another appears to be determined at multiple stages in the catalytic cycle: first through preferential binding of one substrate binding pose over another, then through selective destabilization of the carboxylate group that is preferentially cleaved by its placement in a solvent-excluded hydrophobic pocket which makes cleavage of this group facile, and finally, through differences in transition state stabilization for cleavage of the pro-(R) and pro-(S) carboxylate groups for each variant.
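
The pose-population argument above can be made concrete with a small two-state sketch. It assumes that only Poses A and B are populated, that their relative populations follow a Boltzmann distribution at 25 °C (the temperature is an assumption), and that the barrier correction is a simple sum of the EVB barrier and the free energy needed to reach the reactive pose, as done in the text. Function names are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant, kcal/(mol*K)


def pose_populations(dg_higher_pose_kcal, temp_celsius=25.0):
    """Two-state Boltzmann populations (lower pose, higher pose).

    dg_higher_pose_kcal: free energy of the less stable pose relative to
    the more stable one, in kcal/mol (must be >= 0).
    """
    rt = R_KCAL * (temp_celsius + 273.15)
    weight = math.exp(-dg_higher_pose_kcal / rt)
    return 1.0 / (1.0 + weight), weight / (1.0 + weight)


def corrected_barrier(evb_barrier_kcal, pose_penalty_kcal):
    """Add the free energy cost of reaching the reactive pose to the barrier."""
    return evb_barrier_kcal + pose_penalty_kcal


if __name__ == "__main__":
    # Values quoted in the text for CLG-IPL with compound 1c.
    p_low, p_high = pose_populations(1.5)
    print(f"populations: {p_low:.3f} / {p_high:.3f}")
    print(f"corrected barrier: {corrected_barrier(14.4, 1.5):.1f} kcal/mol")
```

Under these assumptions, a 1.5 kcal mol–1 difference leaves the less stable pose populated at the level of several percent, consistent with the statement that the two poses can readily interconvert.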

Conclusions

The unique capacity of cofactor-free decarboxylases to cleave C–C bonds under mild reaction conditions raises several questions regarding the destabilization of carbon–carbon bonds and the stabilization of an intermediary charge without the aid of an electron sink provided by an external cofactor. While its biological role has still not been clarified,[2] arylmalonate decarboxylase is a unique biocatalyst for the production of optically pure carboxylic acids from prochiral arylaliphatic malonic acids. However, despite increasing insight into the underlying molecular processes involved in the reaction mechanism,[2,23,30] several important questions remain unanswered. While a hydrophobic pocket in the active site was revealed to be a key determinant for AMDase activity,[30,31] the results of amino acid exchanges in this region have often been counterintuitive.[23,34,35] Similarly, restrictions on the substrate scope of this enzyme were difficult to understand.[2,15,35] It remained unclear to which extent steric effects or the reactivity of the substrate control the acceptance of different substrates by AMDase. We were particularly curious to which degree a possible “ground-state destabilization”-driven mechanism or “Circe-effect” guides substrate acceptance, activity, and selectivity. We note that prior computational work has suggested that the enantioselectivity is determined already at substrate binding, although in some cases the energy differences between the binding modes can be small enough for the decarboxylation transition state to contribute to the enantiodiscrimination.[38] Experimentally, the stereoselectivity of the enzyme is believed to depend on the binding mode of the substrate, something that was suggested as early as 1992, where it was also shown that only one carboxylate group is cleaved.[28] In contrast, and as we also show here, the substrate spectrum is difficult to rationalize on the basis of binding alone. 
That is, on the one hand, a delocalized π-electron system (either an aromate or an olefin) is required on the substrate,[15] indicating that transition state stabilization is crucial and that the transition state energy must not be too high for conversion. Here, the ground-state stabilization of flurbiprofen (1b) and naproxen (1c) should be similar, but we observe faster conversion of the former by several variants of AMDase,[18] which reflects the higher reactivity due to the electron-withdrawing fluorine substituent. On the other hand, a slight difference such as one additional carbon atom between 1a and 1d is decisive for conversion,[15] which is difficult to explain by electronic properties alone. Our EVB simulations help us obtain molecular-level insight into the drivers for the experimentally observed substrate acceptance of AMDase. We are able to reproduce activation free energies for the AMDase-catalyzed decarboxylation of compounds 1a to 1c and 1e by all AMDase variants studied to within 3 kcal mol–1 of the experimental value (where known). The quantitative accuracy of our calculations is limited by the lack of experimental data against which to calibrate the corresponding nonenzymatic reaction to in all cases, thus limiting us to calibration to quantum chemical calculations as outlined in the SI. However, despite this caveat, in all cases, our simulations are able to correctly predict both the product-selectivity and the substrate discrimination of AMDase. For all compounds, the preference for cleavage of the pro-(R) vs the pro-(S) carboxylate group appears to be driven by substrate positioning in the Michaelis complex, with preferential cleavage of the carboxylate group that interacts most closely with the hydrophobic pocket. In the case of compound 1d, where no turnover is observed experimentally, our EVB calculations also yield activation free energies of >28 kcal mol–1, depending on carboxylate group being cleaved. 
Further analysis of our simulations indicate that this is due to a combination of inadequate substrate binding in the active site due to the presence of the bulky ethyl group, combined with greater solvent penetration into the active site, which is unfavorable for the decarboxylation reaction, and ties in with other computational work[64−68] emphasizing the importance of sequestering the active site from solvent. Similar observations are made in the case of compound 1e which is only poorly converted by AMDase. Following from this, analysis of electrostatic effects have highlighted the complex interplay between individual stabilizing and destabilizing contributions of residues from the dioxyanion hole and hydrophobic pocket for cleavage of the pro-(R) and pro-(S) carboxylate groups. This interplay between stabilizing and destabilizing contributions from the dioxyanion hole and hydrophobic pocket reflect the fact that, in turn, enzyme catalysis can, in theory, be facilitated by either stabilization of the transition state or destabilization of the ground-state, for example by placing the charged carboxylate group to be cleaved in a hydrophobic pocket as in the case of AMDase. While this may seem counterintuitive, there exist many examples of ground-state destabilization playing an important role in catalysis by both natural[74−76] and designed enzymes.[77−81] In particular, the concept of ground-state destabilization in catalysis of decarboxylation reactions has been discussed extensively in the case of orotidine-5′-phosphate decarboxylase (OMPDC),[82−86] a tremendously proficient decarboxylase that provides 31 kcal mol–1 of transition state stabilization compared to the nonenzymatic reaction.[87] Like AMDase, OMPDC is one of the few cofactor-free decarboxylases. 
In the case of OMPDC, evidence has been put forward that catalysis is not due to desolvation effects or ground-state destabilization, but rather due to electrostatic stabilization of the transition state for the decarboxylation reaction,[12,88,89] as well as the involvement of a ligand-gated conformational change that drives catalysis.[90,91] In the present case, our data indicate that electrostatic interactions play a clear role in stabilizing the individual transition states for the AMDase-catalyzed decarboxylation reaction. However, ground-state destabilization appears to be critical for determining the selectivity between different potential transition states, leading to the observed substrate- and product-selectivities. That is, our GIST analysis provides clear evidence of AMDase’s use of ground-state destabilization (through the construction of a hydrophobic environment for the carboxylate group being cleaved) to drive enzyme catalysis. We also identified a newly formed hydrophobic pocket in the CLG-IPL variant, which enables catalysis through binding Pose B. Additional simulations of variants along the evolutionary pathway to CLG-IPL showed a progressive optimization of the hydrophobicity of the active site toward reacting via Pose B. Our WT-MetaD simulations show that, for all substrates considered in this work, the optimal reactive binding pose (determined from our EVB simulations) is in almost all cases either the most populated binding pose at the Michaelis complex or very close in energy to it. This indicates that there is already a preference for one binding pose over another at the Michaelis complexes of most variants studied here.
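The link between the relative free energies of binding poses and their populations at the Michaelis complex can be sketched with a simple Boltzmann weighting. This is an illustrative calculation, not the analysis protocol used in this work: the pose labels, the free energy values, the temperature of 298.15 K, and the function name `pose_populations` are all hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def pose_populations(free_energies_kcal, temperature=298.15):
    """Boltzmann populations from relative free energies (kcal/mol).

    free_energies_kcal maps a pose label to its free energy; only the
    differences between the values matter.
    """
    beta = 1.0 / (R_KCAL * temperature)
    g_min = min(free_energies_kcal.values())  # shift for numerical stability
    weights = {
        pose: math.exp(-beta * (g - g_min))
        for pose, g in free_energies_kcal.items()
    }
    total = sum(weights.values())
    return {pose: w / total for pose, w in weights.items()}


if __name__ == "__main__":
    # Hypothetical free energies, chosen only to illustrate the calculation.
    example = {"Pose A": 0.0, "Pose B": 1.0}
    for pose, population in pose_populations(example).items():
        print(f"{pose}: {population:.1%}")
```

In this hypothetical two-pose case, a free energy difference of only 1 kcal mol⁻¹ already shifts the populations to roughly 5:1, which illustrates why a pose that is "very close in energy" to the global minimum can still be substantially populated.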
Therefore, our simulations clearly demonstrate a role for ground-state destabilization through the creation of a hydrophobic cage for the carboxylate group being cleaved. The loss of activity in the case of compounds 1d and 1e is linked to increased stabilization of the carboxylate group being cleaved through greater solvent exposure of the active site, coupled with destabilization of the resulting carbanionic intermediate through inductive effects. Our simulations therefore provide clear insights into effects that can be easily manipulated in further engineering of this biocatalytically important enzyme. This is significant both for rationalizing the effect of amino acid substitutions on AMDase selectivity and for understanding the mechanistic principles in cofactor-free enzymes that have the capacity to cleave C–C bonds with the limited catalytic set of functional groups provided by the 20 canonical amino acids. This understanding is needed because enzyme design efforts on this system have been, in large part, hampered by the counterintuitive effects observed after the introduction of mutations, which have negatively impacted predictability. For example, while three substitutions in the active site pocket sufficed to alter the activity of the enzyme by 900-fold,[34] the tremendous effect of the substitution of a valine or methionine by a leucine or isoleucine on the enzymatic activity is difficult to understand.[18,23,34,35,92] In addition, the consequences of mutagenesis for the accommodation of water molecules in the active site, or for complex stabilizing and destabilizing interactions, are extremely difficult to predict. This explains the frequently observed counterintuitive effects, such as a decrease of the racemizing activity after creating space in the active site and an increase after introducing a larger hydrophobic side chain.[92] In the case of AMDase, site-directed random mutagenesis is currently the engineering method of choice.
The complexity of the active-site interactions demonstrated here, particularly in the hydrophobic pocket, indicates why this is the case. Still, our results point to concrete targets for improvement, such as the putative second hydrophobic pocket identified in our GIST analysis. That is, amino acid variation based on the assumption that a sequence space defined by some positions contains improved variants has been successfully demonstrated by us and others, whereas the outcome of defined amino acid substitutions is very hard to predict.[18,23,34] However, as in the case of the CLG-IPL variant shown here, it appears that targeting the ground-state destabilization of the substrate by engineering new hydrophobic cavities into which the substrate can bind could be one straightforward way to rationally manipulate the selectivity of this enzyme.

68 in total

1. Development and testing of a general amber force field.

Authors: Junmei Wang; Romain M Wolf; James W Caldwell; Peter A Kollman; David A Case
Journal: J Comput Chem Date: 2004-07-15 Impact factor: 3.376

2. A proficient enzyme revisited: the predicted mechanism for orotidine monophosphate decarboxylase.

Authors: J K Lee; K N Houk
Journal: Science Date: 1997-05-09 Impact factor: 47.728

3. GTP Hydrolysis Without an Active Site Base: A Unifying Mechanism for Ras and Related GTPases.

Authors: Ana R Calixto; Cátia Moreira; Anna Pabis; Carsten Kötting; Klaus Gerwert; Till Rudack; Shina C L Kamerlin
Journal: J Am Chem Soc Date: 2019-06-26 Impact factor: 15.419

4. POVME 3.0: Software for Mapping Binding Pocket Flexibility.

Authors: Jeffrey R Wagner; Jesper Sørensen; Nathan Hensley; Celia Wong; Clare Zhu; Taylor Perison; Rommie E Amaro
Journal: J Chem Theory Comput Date: 2017-08-30 Impact factor: 6.006

5. A proficient enzyme.

Authors: A Radzicka; R Wolfenden
Journal: Science Date: 1995-01-06 Impact factor: 47.728

Review 6. Catalytic proficiency: the unusual case of OMP decarboxylase.

Authors: Brian G Miller; Richard Wolfenden
Journal: Annu Rev Biochem Date: 2001-11-09 Impact factor: 23.643

7. Increased Diels-Alderase activity through backbone remodeling guided by Foldit players.

Authors: Christopher B Eiben; Justin B Siegel; Jacob B Bale; Seth Cooper; Firas Khatib; Betty W Shen; Foldit Players; Barry L Stoddard; Zoran Popovic; David Baker
Journal: Nat Biotechnol Date: 2012-01-22 Impact factor: 54.908

8. Improvement of the Process Stability of Arylmalonate Decarboxylase by Immobilization for Biocatalytic Profen Synthesis.

Authors: Miriam Aßmann; Carolin Mügge; Sarah Katharina Gaßmeyer; Junichi Enoki; Lutz Hilterhaus; Robert Kourist; Andreas Liese; Selin Kara
Journal: Front Microbiol Date: 2017-03-16 Impact factor: 5.640

9. Active Site Hydrophobicity and the Convergent Evolution of Paraoxonase Activity in Structurally Divergent Enzymes: The Case of Serum Paraoxonase 1.

Authors: David Blaha-Nelson; Dennis M Krüger; Klaudia Szeler; Moshe Ben-David; Shina Caroline Lynn Kamerlin
Journal: J Am Chem Soc Date: 2017-01-11 Impact factor: 15.419

10. Conformational diversity and enantioconvergence in potato epoxide hydrolase 1.

Authors: P Bauer; Å Janfalk Carlsson; B A Amrein; D Dobritzsch; M Widersten; S C L Kamerlin
Journal: Org Biomol Chem Date: 2016-04-06 Impact factor: 3.876
