Enzymes are conformationally dynamic, and their dynamical properties play an important role in regulating their specificity and evolvability. In this context, substantial attention has been paid to the role of ligand-gated conformational changes in enzyme catalysis; however, such studies have focused on tremendously proficient enzymes such as triosephosphate isomerase and orotidine 5'-monophosphate decarboxylase, where the rapid (μs timescale) motion of a single loop dominates the transition between catalytically inactive and active conformations. In contrast, the (βα)8-barrels of tryptophan and histidine biosynthesis, such as the specialist isomerase enzymes HisA and TrpF, and the bifunctional isomerase PriA, are decorated by multiple long loops that undergo conformational transitions on the ms (or slower) timescale. Studying the interdependent motions of multiple slow loops, and their role in catalysis, poses a significant computational challenge. This work combines conventional and enhanced molecular dynamics simulations with empirical valence bond simulations to provide rich details of the conformational behavior of the catalytic loops in HisA, PriA, and TrpF, and the role of their plasticity in facilitating bifunctionality in PriA and evolved HisA variants. In addition, we demonstrate that, similar to other enzymes activated by ligand-gated conformational changes, loops 3 and 4 of HisA and PriA act as gripper loops, facilitating the isomerization of the large bulky substrate ProFAR, albeit now on much slower timescales. This hints at convergent evolution on these different (βα)8-barrel scaffolds. Finally, our work reemphasizes the potential of engineering loop dynamics as a tool to artificially manipulate the catalytic repertoire of TIM-barrel proteins.
Enzymes are conformationally dynamic, and their dynamical properties play an important role in regulating their specificity and evolvability. In this context, substantial attention has been paid to the role of ligand-gated conformational changes in enzyme catalysis; however, such studies have focused on tremendously proficient enzymes such as triosephosphate isomerase and orotidine 5'-monophosphate decarboxylase, where the rapid (μs timescale) motion of a single loop dominates the transition between catalytically inactive and active conformations. In contrast, the (βα)8-barrels of tryptophan and histidine biosynthesis, such as the specialist isomerase enzymes HisA and TrpF, and the bifunctional isomerase PriA, are decorated by multiple long loops that undergo conformational transitions on the ms (or slower) timescale. Studying the interdependent motions of multiple slow loops, and their role in catalysis, poses a significant computational challenge. This work combines conventional and enhanced molecular dynamics simulations with empirical valence bond simulations to provide rich details of the conformational behavior of the catalytic loops in HisA, PriA, and TrpF, and the role of their plasticity in facilitating bifunctionality in PriA and evolved HisA variants. In addition, we demonstrate that, similar to other enzymes activated by ligand-gated conformational changes, loops 3 and 4 of HisA and PriA act as gripper loops, facilitating the isomerization of the large bulky substrate ProFAR, albeit now on much slower timescales. This hints at convergent evolution on these different (βα)8-barrel scaffolds. Finally, our work reemphasizes the potential of engineering loop dynamics as a tool to artificially manipulate the catalytic repertoire of TIM-barrel proteins.
Enzymes are conformationally
dynamic systems, and their dynamical
properties are clearly connected to their biological function.[1−7] Further, conformational dynamics, in particular, that of decorating
loops that cover enzyme active sites, can regulate substrate selectivity,
the evolution of new activities, and potentially also turnover rates.[3,4,8−29] Because of this, enzyme conformational ensembles can, in principle,
be engineered to allow enzymes to acquire new catalytic functions
and/or physiochemical properties.[23,24,29−33] Therefore, there is substantial interest in understanding loop dynamics
and its impact on selectivity and catalysis, to improve our ability
to manipulate enzyme conformational ensembles for engineering purposes.There have been extensive studies of a range of enzymes, such as
triosephosphate isomerase (TPI),[34,35] orotidine
5′-monophosphate decarboxylase, (OMPDC)[36] glycerol-3-phosphate dehydrogenase (GPDH),[37,38] 1-deoxy-d-xylulose-5-phosphate reductoisomerase,[39,40] and β-phosphoglucomutase,[41] that
are activated by ligand-gated conformational changes. Specifically,
interactions between a key “gripper” loop decorating
the active site and the nonreactive phosphodianion groups of these
enzymes’ substrates trigger substantial conformational catalytically
critical changes in the gripper loop, facilitating energetically unfavorable
transitions from inactive open to active closed conformations.It is noteworthy that several of these enzymes have TIM-barrel
((βα)8-barrel) folds. Most if not all TIM-barrel
proteins possess decorating loops,[8,12,21,42] the conformational
diversity of which likely plays an important role in regulating specificity
and function.[15,18] These flexible loops can vary
in length,[8] but are typically used to bind
substrate, and to sequester the active site from solvent by closing
over the active site.[43] However, these
well-characterized examples of proteins activated by ligand-gated
conformational changes all focus on the roles and importance of single
loops, such as gripper loop 6 in TPI. Studying the ligand-gated motion
of a single loop can already pose substantial challenges;[44] systems with multiple active site loops undergoing
substantial conformational changes are even more complex and therefore
understudied.We have sought to address this knowledge gap by
studying active
site loop dynamics in the (βα)8-barrels of
tryptophan and histidine biosynthesis. The enzymes HisA, TrpF, and
PriA are model systems for the evolution of enzyme specificity and
activity (Figure ).[18,45−48] HisA catalyzes the isomerization of N′-[(5′-phosphoribosyl)-formimino]-5-aminoimidazole-4-carboxamide-ribonucleotide
(ProFAR) into N′-[(5′-phosphoribulosyl)-formimino]-5-aminoimidazole-4-carboxamide-ribonucleotide
(PRFAR). TrpF catalyzes the same Amadori rearrangement on N-(5′-phosphoribosyl)anthranilate (PRA), producing
1-(2-carboxy-phenylamino)-1′-deoxyribulose-5′-phosphate
(CdRP). This rearrangement proceeds via a Schiff acid–base
mechanism, utilizing aspartate (and in the case of TrpF) cysteine
residues as acid–base pairs.[49]
Figure 1
Tertiary
structures of (A) SeHisA in complex with
ProFAR (PDB ID: 5A5W(54,55)), (B) MtPriA in complex with PRFAR
(PDB ID: 3ZS4(55)), and (C) TmTrpF in
complex with product analogue rCdRP (PDB ID: 1LBM(49)). Loop 1 (residues 11–29 in HisA, 15–33 in MtPriA), loop 5 (residues 142–147 in SeHisA, 141–151 in PriA and 30–39 in TmTrpF) (loop 3), and loop 6 (residues 172–182 in SeHisA, 172–181 in MtPriA and 128–139
in TmTrpF) are highlighted in light green, yellow,
and dark red, respectively. Loop 1 in TmTrpF is short
(four residues); thus, no corresponding loop is annotated on this
panel. For clarity, N7D and A176D reversions were applied to the structure
of SeHisA in complex with ProFAR (these reversions
were also applied in our simulations). (D) Proposed mechanism for
the Amadori rearrangement leading to the isomerization of substrates
ProFAR and PRA by the different enzymes.[49]
Tertiary
structures of (A) SeHisA in complex with
ProFAR (PDB ID: 5A5W(54,55)), (B) MtPriA in complex with PRFAR
(PDB ID: 3ZS4(55)), and (C) TmTrpF in
complex with product analogue rCdRP (PDB ID: 1LBM(49)). Loop 1 (residues 11–29 in HisA, 15–33 in MtPriA), loop 5 (residues 142–147 in SeHisA, 141–151 in PriA and 30–39 in TmTrpF) (loop 3), and loop 6 (residues 172–182 in SeHisA, 172–181 in MtPriA and 128–139
in TmTrpF) are highlighted in light green, yellow,
and dark red, respectively. Loop 1 in TmTrpF is short
(four residues); thus, no corresponding loop is annotated on this
panel. For clarity, N7D and A176D reversions were applied to the structure
of SeHisA in complex with ProFAR (these reversions
were also applied in our simulations). (D) Proposed mechanism for
the Amadori rearrangement leading to the isomerization of substrates
ProFAR and PRA by the different enzymes.[49]Many actinobacteria lack the trpF gene, but instead
possess a gene for a naturally bifunctional isomerase, PriA.[50] The PriA from Mycobacterium tuberculosis (MtPriA) has been well characterized, with kcat/KM values of
∼104 M–1 s–1 for HisA activity and ∼105–106 M–1 s–1 for TrpF activity.[15,51] In addition, promiscuous TrpF activity has been detected on ancestral
and extant specialist HisA enzymes, with kcat values ranging from 10–4 to 10–2 s–1.[52] HisA has also
been converted into TrpF by directed evolution,[53] and in serial passaging experiments with Salmonella enterica,[46] where 3,000 generations of laboratory evolution yielded an extensive
suite of mutations in the ProFAR-specific S. enterica HisA (SeHisA) resulting in specialist HisA enzymes,
specialist TrpF enzymes, and PriA-like bifunctional enzymes.As shown in Figure , HisA and PriA are decorated by three long catalytic loops, loops
1, 5, and 6 (or two analogous loops in the case of TrpF, which has
lost most of loop 1).[15,18] Of these, loop 5 carries key
residues that are important for substrate binding, loop 6 carries
the catalytically important aspartic acid side chain, and position
15 of loop 1 is important for interaction with substrate PRA.[18,45,54] In the case of PriA, structural
analysis[15] has suggested that bifunctionality
is facilitated by transitions between a “knot-like”
pro-ProFAR conformation of loop 5, with W145 pointing “in”
toward the substrate ProFAR (PDB ID: 3ZS4(55)), and a
pro-PRA β-hairpin conformation of loop 5, with R143 pointing
in toward the substrate PRA (PDB ID: 2Y85(15,55)). The evolved SeHisA variants generated by Näsvall et al.[46] have also been the subject of detailed structural
and biochemical analyses.[18] As with PriA,[15] this analysis indicated that bifunctionality
is driven by competition between not just the substrates ProFAR and
PRA but also between structurally distinct conformations of loops
1 and 5, in particular[18] (Figure S1).Although isomerization of both ProFAR and
PRA proceeds through
the same Amadori rearrangement (Figure ), ProFAR is a much larger molecule containing a second
phosphate group. This group forms hydrogen bonds with the side chains
of R83 from loop 3 and S103 from loop 4 of SeHisA
and with the corresponding side chains of R85 and T105 in MtPriA (Figure ). Although neither of these residues are on the primary mobile
loops of either enzyme (Figure ), these interactions are very similar to analogous interactions
in other enzymes activated by ligand-gated conformational changes,[34−41] suggesting that loops 3 and 4 may similarly act as “gripper
loops”, allowing these enzymes to attain relevant catalytically
active conformations for the isomerization of the larger substrate.
This effect would not be present when the smaller substrate PRA is
bound to the active site, as it lacks a nonreactive phosphodianion
group to interact with these loops.
Figure 2
Comparison of the active sites of (A) SeHisA in
complex with substrate ProFAR (PDB ID: 5A5W(54,55)), (B) MtPriA in complex with product PRFAR (PDB ID: 3ZS4(55)), and (C) TmTrpF in complex with product
analogue rCdRP (PDB ID: 1LBM(49,55) ). Highlighted are key catalytic
residues, including the catalytic active site aspartic acid (D176
in SeHisA, D175 in MtPriA, and D126
in TmTrpF), the active site tryptophan that forms
stacking interactions with the larger substrate ProFAR in SeHisA and MtPriA (W145 in both enzymes),
as well as the gripper residues that interact with the distal phosphate
group of the larger substrate/product in SeHisA and MtPriA (R83 and S103 in SeHisA and R85
and T105 in MtPriA). Key hydrogen bonding interactions
are also highlighted, using the distances (Å) found in the corresponding
crystal structures. For clarity, N7D and A176D reversions were applied
in SeHisA in complex with substrate ProFAR (these
reversions were also applied in our simulations).
Comparison of the active sites of (A) SeHisA in
complex with substrate ProFAR (PDB ID: 5A5W(54,55)), (B) MtPriA in complex with product PRFAR (PDB ID: 3ZS4(55)), and (C) TmTrpF in complex with product
analogue rCdRP (PDB ID: 1LBM(49,55) ). Highlighted are key catalytic
residues, including the catalytic active site aspartic acid (D176
in SeHisA, D175 in MtPriA, and D126
in TmTrpF), the active site tryptophan that forms
stacking interactions with the larger substrate ProFAR in SeHisA and MtPriA (W145 in both enzymes),
as well as the gripper residues that interact with the distal phosphate
group of the larger substrate/product in SeHisA and MtPriA (R83 and S103 in SeHisA and R85
and T105 in MtPriA). Key hydrogen bonding interactions
are also highlighted, using the distances (Å) found in the corresponding
crystal structures. For clarity, N7D and A176D reversions were applied
in SeHisA in complex with substrate ProFAR (these
reversions were also applied in our simulations).Here, we combine conventional molecular dynamics (MD), enhanced
sampling, and empirical valence bond (EVB) simulations to present
a comprehensive computational study of a number of wild-type and variant
forms of SeHisA, MtPriA, and the Thermotoga maritima TrpF (TmTrpF).
All variants studied in this work, and the corresponding structures
used, are summarized in Table S1. We chose
these systems because high-quality structural data were available
in unliganded and ligand-bound forms. For SeHisA,
we have studied the unliganded and substrate-bound forms,[54] as well as key SeHisA variants
from ref (18) that
were selected based on their specificity patterns (specialists vs
generalists, Table S2). MtPriA and TmTrpF were similarly selected on the basis
of structural data in both unliganded and product (PRFAR), or product
analogue (rCdRP), bound forms, respectively (Table S1).Prior simulations have emphasized that even studying
the motion
of one large dominating conformational change is computationally nontrivial,[22,44,56] and the current systems involve
the interdependent conformational rearrangements of multiple loops
simultaneously. Our simulations of HisA, PriA, and TrpF (1) provide
rich details of the conformational behavior of the catalytic loops
in the different systems and (2) provide insight into the link between
conformational dynamics, catalytic activity, and functional evolution
in the different enzymes.
Methodology
Methodological details
are presented here in brief. Full details
of all simulations and any nonconventional parameters used are provided
in the Supporting Information.
System Preparation
for Molecular Dynamics Simulations
Simulations were performed
on wild-type SeHisA, MtPriA, and TmTrpF, as well as relevant
enzyme variants, in both their unliganded forms and in complex with
various ligands (substrates ProFAR and PRA and, in the case of the
enhanced sampling simulations, products PRFAR and CdRP). A total of
14 crystal structures were used to generate starting points for these
simulations, and all simulations performed, as well as the structures,
are summarized in Table S1. Where present,
the D7N, D11N, and D176A substitutions were reverted to wild type
using the Dunbrack 2010 Rotamer Library,[57] as implemented in UCSF Chimera, v. 1.14.[58] Missing regions in the catalytic loops were reconstructed using
Modeller v. 9.23.[59] The catalytic aspartic
acid side chain in the active site of each enzyme (D176 in HisA, D175
in PriA and D126 in TrpF) was kept protonated in line with the mechanism
shown in Figure .
All other residues (except H50 in PriA, which was doubly protonated)
were kept in their default protonation states at physiological pH
determined by the use of PROPKA 3.1,[60] and
visual inspection. The substrates ProFAR and PRA were manually placed
into the relevant active sites in the same conformation as found in
the structure of the HisA wild-type enzyme in complex with ProFAR.
In the case of PRA (Figure D), the substrate was placed into the relevant active sites
by manual overlay of the reactive part of PRA with the reactive part
of ProFAR, and with the carboxylate group of PRA keeping key interactions
with active site residues. Partial charges for ligands ProFAR, PRA,
PRFAR, and CdRP were calculated using the standard restrained electrostatic
potential (RESP) protocol using Antechamber v. 17.3,[61] and based on the vacuum electrostatic potential calculated
at the HF/6-31G(d) level of theory, using Gaussian 09 Rev. E.01.[62] All other simulation parameters were described
using the general Amber force field 2 (GAFF2)[63] (see Tables S3–S6). Finally, to
keep the substrate stably bound in the enzyme active sites, weak distance
restraints were applied to protein–substrate distances in the
conventional molecular dynamics (MD) simulations, as described in Table S7.
Conventional Molecular
Dynamics Simulations
Conventional
MD simulations were performed using the CUDA version of the PMEMD
module of the AMBER 16 simulation package.[64] The protein, ligands, and solvent were described using the ff14SB
force field,[65] the General AMBER Force
Field 2 (GAFF2),[63] and the TIP3P water
model,[66] respectively. Following initial
minimization and equilibration, each system (summarized in Table S1) was subjected to 10 × 500 ns of
molecular dynamics simulations controlled by the Langevin thermostat
with a collision frequency of 2 ps–1,[67] and the Berendsen barostat with a 1 ps coupling
constant.[68] This led to a cumulative 5
μs of production simulations per system, and a cumulative total
of 70 μs of conventional MD simulations over all systems studied
(Table S1).
Enhanced Sampling Molecular
Dynamics Simulations
Steered
molecular dynamics simulations (sMD) were performed using GROMACS
2018.4 to pull products PRFAR and CdRP out of the active site of the SeHisA(dup13–15/D10G/G102A/Q24L) variant, as described
in the Supporting Information. System preparation
was performed as for the conventional MD simulations, using the same
force fields and water models. Following initial minimization and
equilibration, 10 × 50 ns production MD simulations were performed
on each system. The first 5 ns of production MD were unrestrained,
after which an external force with a force constant of 10 kcal mol–1 Å–2 was applied to pull the
product out of the active site. This external force was then released
for the last 5 ns of each MD simulation run.
Empirical Valence Bond
Simulations
We extended our
previous empirical valence bond (EVB) approach[69−73] to study the enzyme-catalyzed opening of the ribose
ring of substrates ProFAR and PRA (Figure ), as catalyzed by wild-type and variant
forms of HisA, PriA, and TrpF. This is the first step of the mechanism
shown in Figure D,
and used the valence bond states shown in Figure . Simulations were performed on wild-type SeHisA, MtPriA, and TmTrpF as well as selected variants, as described in the Supporting Information. All simulations were
performed using the Q6 simulation package,[74,75] using the all-atom optimized potentials for liquid simulations (OPLS-AA)
force field.[76] All EVB parameters and methods
necessary to reproduce our work can be found in the Supporting Information, with the full parameters also uploaded
to Zenodo (DOI: 10.5281/zenodo.5893598). Each system was simulated
using 30 individual replicas, with each replica first equilibrated
for 20 ns and the endpoint of that equilibration being used as the
starting point for propagating an EVB trajectory. Each EVB free energy
perturbation/umbrella sampling (EVB-FEP/US)[69] simulation was simulated using 51 individual mapping windows of
200 ps of simulation time each, leading to a total of 10.2 ns of simulation
time per individual EVB trajectory. The cumulative totals were 600
ns equilibration and 306 ns EVB simulation time per individual system,
and a cumulative 12 μs of equilibration and 6.12 μs of
EVB simulation time over all 20 systems studied.
Figure 3
Valence bond states used
to describe the ribose ring opening catalyzed
by the different enzyme variants studied in this work. Only the ribose
moiety of the substrate was placed in the EVB region, with the remainder
of the substrate treated purely classically. The corresponding EVB
parameters are provided in the Supporting Information and on Zenodo, DOI: 10.5281/zenodo.5893598.
Valence bond states used
to describe the ribose ring opening catalyzed
by the different enzyme variants studied in this work. Only the ribose
moiety of the substrate was placed in the EVB region, with the remainder
of the substrate treated purely classically. The corresponding EVB
parameters are provided in the Supporting Information and on Zenodo, DOI: 10.5281/zenodo.5893598.
Analysis of Conventional and Enhanced Sampling Molecular Dynamics
Simulations
Unless stated otherwise, all analysis of all
conventional and enhanced sampling MD simulations was performed using
CPPTRAJ.[77] Hydrogen bonds were defined
as formed if the donor–acceptor distance was ≤3.0 Å
and the donor–hydrogen-acceptor angle was within 180 ±
45°. Principal component analysis (PCA) was performed by first
RMS fitting to a whole protein Cα carbon atoms and
then performing PCA on the Cα carbon atoms of loops
1, 5, and 6, as well as loop 1 for HisA loop 1 elongated systems analysis.
Other analyses were performed as described in the Supporting Information.
Results and Discussion
Active
Site Plasticity and Substrate Binding in the Different
Enzymes
SeHisA and MtPriA
have similar structures, with a Cα-RMSD of 1.09 Å
when PDB IDs 3ZS4 [55] and 5A5W [54,55] are compared.
In contrast, TmTrpF is significantly smaller, with
40 fewer residues. We calculated the mean and standard deviations
of the active site volumes of each unliganded enzyme during 10 ×
500 ns conventional MD simulations using MDpocket,[78] obtaining average volumes of 745.4 ± 129.6, 1033.8
± 217.0, and 1173.5 ± 158.0 Å3, for TrpF,
PriA, and HisA, respectively (Table S8).
The TrpF-active site is more compact than those of PriA and HisA,
which have successively larger and more “flexible” active
site pockets (using the standard deviation on the volume as a proxy
for flexibility). For comparison, the substrates ProFAR and PRA have
volumes of 829 and 559 Å3, respectively, calculated
using Mol_Volume Version 1.0, with default radii of 1.7 Å and
a probe sphere of 0.5 Å. This confirms the structural data indicating
that the active site pocket of TrpF is too compact to accommodate
the larger substrate, ProFAR, leading to the selectivity of this enzyme
toward PRA.[79] Furthermore, the PriA active
site is the most flexible of the three, in line with structural data,[15] indicating that PriA is capable of significantly
rearranging its active site (in particular, loop 5 conformation) when
accommodating the different substrates ProFAR and PriA (Figure S1).Two key active site side chains
in HisA and PriA are important for binding ProFAR:[15,18,54,80] W145, which
forms a stabilizing stacking interaction with the substrate, and R83
(R85), which interacts with the second phosphodianion group of the
substrate (Figure ). To explore the conformational diversity of these side chains,
we examined their joint dihedral angle distributions in simulations
of unliganded HisA and PriA, as well as HisA and PriA in complex with
substrates ProFAR and PRA. The corresponding dihedral data are shown
in Figures and S2.
Figure 4
Joint distribution of the side-chain dihedral
angles of the side
chains of (A–C) W145 and (D–F) R83 in (A, D) unliganded
HisA, (B, E) HisA in complex with substrate ProFAR, and (C, F) HisA
in complex with substrate PRA. Data were extracted every 5 ps of 10
× 500 ns production simulations of each system, performed as
described in the Supporting Information.
Joint distribution of the side-chain dihedral
angles of the side
chains of (A–C) W145 and (D–F) R83 in (A, D) unliganded
HisA, (B, E) HisA in complex with substrate ProFAR, and (C, F) HisA
in complex with substrate PRA. Data were extracted every 5 ps of 10
× 500 ns production simulations of each system, performed as
described in the Supporting Information.These data show that the tryptophan
and arginine side chains are
both highly conformationally flexible in the unliganded enzymes. However,
while ProFAR binding to HisA restricts the conformational space of
the arginine side chain on the gripper loop 3 to a catalytically competent
position that helps stabilize the bound substrate, in PriA, the R143
side chain is only 4.8 Å from the gripper residue R85 (distance
between the two side-chain carbon atoms, based on PDB ID: 3ZS4 [55]). This creates electrostatic repulsion between
the two arginine side chains, destabilizing both loop 5 as well as
the interaction between ProFAR and the R85 side chain (Figure S3). In contrast, when PRA is bound to
either HisA or PriA, the R83/R85 side chains increase their conformational
flexibility, sampling roughly the same conformational space as in
the unliganded enzyme (Figure S2). Therefore,
the interaction of these residues with the larger substrate ProFAR
is likely playing an important role in the ability of these enzymes
to bind and isomerize it.The W145 side chain forms a stabilizing
stacking interaction with
the substrate and slightly increases its conformational flexibility
when the smaller substrate PRA is bound to HisA (note that PRA was
placed manually in the active site by overlay with ProFAR), although
the flexibility of this side chain is still substantially reduced
compared to the unliganded enzyme (Figure A). However, in the case of PriA, both the
W145 and R143 side chains are conformationally restricted to a catalytically
competent position due to a rearrangement of loop 5 upon PRA binding
that swaps the position between these two residues compared to when
ProFAR is bound to PriA, preventing the electrostatic repulsion between
the R85 and R143 side chains that is observed when ProFAR (PRFAR)
is bound to the active site (Figure S1).[15]As noted previously, the R83/R85 side
chain is one of a number
of key residues on the gripper loop that interact with the distal
phosphodianion group of the larger substrate ProFAR, contributing
to the stabilization of the substrate in the HisA active site. Hence,
in PriA, the loop 5 rearrangement required for ProFAR binding[15] prevents R85 from gripping the second phosphodianion
group, leading to a preference for the isomerization of PRA. These
gripper interactions are similar to corresponding interactions in
enzymes such as TIM, OMPDC, and GPDH,[34−38] where interactions with the nonreactive phosphodianion
group of the substrate drive catalytically important ligand-gated
conformational changes. Here, the interaction between the HisA gripper
residues and the remote phosphodianion group of ProFAR appears to
be similarly important for maintaining the closed conformation of
loop 5, and when this interaction is lost, as in PriA, we see the
corresponding opening of loop 5 (Figure S3). This supports the likelihood that HisA and PriA are also activated
by ligand-gated conformational changes, albeit with more complex loop
dynamics (due to the involvement of not one but three highly mobile
and long catalytic loops) than in other previously characterized systems.
Conformational Dynamics of Key Catalytic Loops of HisA, PriA,
and TrpF
To further explore the impact of ProFAR and PRA
binding on loop dynamics, we extended our simulations to include TrpF
in both its unliganded form, and in complex with PRA (Table S1). We then performed principal component
analysis (PCA) to characterize the motion of the catalytic loops during
our simulations of each system, similar to prior analysis we have
performed on TPI.[44] The PCA was performed
on the mass-weighted Cartesian coordinates of each enzyme compared
to the coordinates of the corresponding closed state, allowing us
to explore the variation of the conformations of these loops in coordinate
space.Figure shows an overview of structural changes in the catalytic loops along
the first two principal components, PC1 and PC2, as well as the minimum
and maximum deviations of each of the catalytic loops from the corresponding
closed reference state along each PC for each enzyme. PC1 and PC2
make contributions of 41.2, 70.9, and 51.0% and 16.4, 6.0, and 13.9%
to the overall variance in each of HisA, TrpF, and PriA, respectively.
Significant conformational motion is observed along both PCs; however,
PC1 primarily describes loop transitions between closed and open conformational
states, whereas PC2 describes conformational variation in the loops
during these transitions, including transitions between different
open conformations.
Figure 5
Overview of the variations in the conformational states
of key
catalytic loops of (A, B) HisA, (C, D) PriA, and (E, F) TrpF in coordinate
space, along the first two principal components (A, C, E) PC1 and
(B, D, F) PC2 obtained from PCA performed on our conventional MD simulations
of each system. PCA was performed relative to the corresponding loop
closed states for each enzyme, and the minimum and maximum deviations
of each loop from this closed state along each PC are highlighted
in blue and red, respectively, with an intermediate state observed
in our analysis highlighted in yellow.
Overview of the variations in the conformational states
of key
catalytic loops of (A, B) HisA, (C, D) PriA, and (E, F) TrpF in coordinate
space, along the first two principal components (A, C, E) PC1 and
(B, D, F) PC2 obtained from PCA performed on our conventional MD simulations
of each system. PCA was performed relative to the corresponding loop
closed states for each enzyme, and the minimum and maximum deviations
of each loop from this closed state along each PC are highlighted
in blue and red, respectively, with an intermediate state observed
in our analysis highlighted in yellow.We subsequently projected the free energies for each enzyme along
the most dominant motions, PC1 and PC2, from simulations of each of
HisA, PriA, and TrpF in their unliganded forms as well as in complex
with substrates ProFAR and PRA, respectively (where PRA was artificially
placed in the HisA active site by manual overlay with the reactive
part of ProFAR). This allowed us to compare both the free energy surfaces
defined by these two PCs between the different enzymes, and the effect
of ligand binding on these surfaces. The resulting combined motion
of all key catalytic loops along each principal component is illustrated
in Figure .
Figure 6
Projected free
energy surfaces (kcal mol–1) along
the first two principal components (PC1 and PC2) obtained from applying
Cartesian principal component analysis (PCA) to our conventional MD
simulations of (A–C) HisA in its (A) unliganded form, and in
complex with (B) ProFAR and (C) PRA, (D–F) PriA in its (D)
unliganded form and in complex with (E) ProFAR and (F) PRA, and (G,
H) TrpF in its (G) unliganded form and in complex with (H) PRA. Free
energies were estimated from the bin populations of the 2D histograms
(using 100 bins on each axis), using the formula Gi = −kBT(Ni/NMax),
where kB is the Boltzmann constant, T is the temperature at which the simulations were performed
(300 K), Ni is the population of each
bin, and NMax is the population of the
most populated bin. The crystallographic loop open and loop closed
states of the enzyme are indicated on this surface using the symbols
▲ and ▼. In the case of HisA, these states are defined
based on the loop conformations found in PDB IDs: 5AHE[54] and 5A5W,[54] for the unliganded open,
ProFAR-bound closed and PRA-bound closed conformations, where the
latter two systems have the same crystallographic closed conformation.
In the case of PriA, these states are defined based on PDB IDs: 2Y89,[15]3ZS4,[55] and 2Y85,[15,55] for the unliganded open, ProFAR-bound closed (yellow triangle),
and PRA-bound closed (blue triangle) conformations. For TrpF, the
conformations of the catalytic loops are defined based on PDB IDs: 1NSJ [55,81] and 1LBM.[49,55] The overlap of these triangles indicates that the loops are found
in similar positions in the crystal structure, irrespective of whether
the starting structure is liganded or unliganded. Note that this figure
considers the motion of all relevant catalytic loops along these two
PC, as illustrated in Figure .
Projected free
energy surfaces (kcal mol–1) along
the first two principal components (PC1 and PC2) obtained from applying
Cartesian principal component analysis (PCA) to our conventional MD
simulations of (A–C) HisA in its (A) unliganded form, and in
complex with (B) ProFAR and (C) PRA, (D–F) PriA in its (D)
unliganded form and in complex with (E) ProFAR and (F) PRA, and (G,
H) TrpF in its (G) unliganded form and in complex with (H) PRA. Free
energies were estimated from the bin populations of the 2D histograms
(using 100 bins on each axis), using the formula Gi = −kBT(Ni/NMax),
where kB is the Boltzmann constant, T is the temperature at which the simulations were performed
(300 K), Ni is the population of each
bin, and NMax is the population of the
most populated bin. The crystallographic loop open and loop closed
states of the enzyme are indicated on this surface using the symbols
▲ and ▼. In the case of HisA, these states are defined
based on the loop conformations found in PDB IDs: 5AHE[54] and 5A5W,[54] for the unliganded open,
ProFAR-bound closed and PRA-bound closed conformations, where the
latter two systems have the same crystallographic closed conformation.
In the case of PriA, these states are defined based on PDB IDs: 2Y89,[15]3ZS4,[55] and 2Y85,[15,55] for the unliganded open, ProFAR-bound closed (yellow triangle),
and PRA-bound closed (blue triangle) conformations. For TrpF, the
conformations of the catalytic loops are defined based on PDB IDs: 1NSJ [55,81] and 1LBM.[49,55] The overlap of these triangles indicates that the loops are found
in similar positions in the crystal structure, irrespective of whether
the starting structure is liganded or unliganded. Note that this figure
considers the motion of all relevant catalytic loops along these two
PC, as illustrated in Figure .In the unliganded form of each
enzyme, the catalytic loops are
highly conformationally diverse and explore a range of “wide-open”
conformations (Figure ). This is consistent with our prior simulation studies on both TPI,[44] and the protein tyrosine phosphatases PTP1B
and YopH,[22] as well as chimeric forms of
these enzymes.[82] However, and consistent
with structural data, ProFAR binding to HisA fully restricts the conformational
sampling of all three loops (Figure B). In the case of PriA, the binding of both ProFAR
and PRA also stabilizes the closed conformation of the three active
site loops, but still allows for some conformational flexibility in
these loops (Figure , based on both the topologies of the projected free energy surfaces
and the corresponding energies). In sharp contrast, in the liganded
form of TrpF, our MD simulations show that PRA is not stable in the
active site due to the flexibility of loop 6, which explores transitions
toward open conformations, similarly to the unliganded system (Figures E,F and 6G,H). This in turn increases the mobility of PRA
in the active site even with the restraints applied in the simulations
(Table S7), allowing it to sample noncatalytic
conformations with Asp-H...PRA ring oxygen distances as high as 10.8
Å (and averaging 4.4 ± 1.4 Å during our simulations).
In the crystal structure of the unliganded enzyme (PDB ID: 1NSJ(55,81)), this loop is present in a closed position but with missing density,
whereas in our simulations of both the liganded and unliganded form
of the enzyme, loop 6 samples open conformations, suggesting that
the crystal structure does not reveal the correct closed state to
stabilize the substrate PRA in the active site. We note that the structure
used for these simulations (PDB ID 1LBM(49,55)) was solved in complex
with the product analogue rCdRP. Our simulations suggest that the
loop conformations observed in this structure are a conformational
state on the trajectory to product release, rather than an ideal conformational
state for stabilizing the Michaelis complex.During our simulations
of HisA and PriA, we observed the formation
of a stacking interaction between ProFAR and the side chain of W145
(Figure S4), with an average distance of
3.9 ± 0.3 Å and an angle γ = 11.7 ± 5.5°
between the center of mass of the ProFAR imidazole ring and the W145
indole ring. Our results thus confirm the role for W145 that was proposed
based on experimentally determined structures,[54] and is consistent with experiments indicating that HisA
activity is abolished in SeHisA(dup13–15)
because the extended conformation of loop 1 blocks this side chain
from interacting with ProFAR.[18]In
contrast, in the case of PriA, and again in agreement with prior
structural analysis,[15] we observe two possible
conformations of loop 5, depending on which substrate is bound to
the active site. When ProFAR is bound, we sample a conformation similar
to that observed in wild-type HisA (Figure S1A) with a similar stacking interaction between ProFAR and W145 (Figure S4). However, the loop 5 rearrangement
required to optimize the stacking position of W145 with ProFAR creates
transient electrostatic repulsion between loop 5 and the rest of the
enzyme, making this loop more conformationally dynamic, which we observe
in our analysis in the form of an increased standard deviation in
the distance and angle of the corresponding stacking interaction (d = 4.1 ± 0.6 Å, γ = 19.2 ± 13.9°)
(Figures S3 and S4). The greater plasticity
of this interaction, in turn, decreases substrate stability in the
active site (the ProFAR RMSF increases from 18.4 Å in our simulations
of wild-type HisA to 22.2 Å in our simulations of wild-type PriA),
and thus the corresponding ProFAR isomerization activity of PriA.For comparison, in our simulations of PRA-bound PriA, we sampled
an active site conformation in which the R143 side chain (loop 5 residue)
forms salt bridges with the side chains of D130 and D175, and the
arginine acts as a “shield”, dampening the electrostatic
repulsion between the D130 side chain and the anthranilate carboxylate
group of PRA. This interaction also stabilizes the catalytic aspartic
(D175), placing it in an optimal position for catalysis (Figure S5 and Table S9), as shown in previous
studies.[15] This PriA conformation is similar
to the “TrpF-active” conformation observed in the SeHisA(dup13–15/D10G/Q24L/G102A) crystal structure[18] (PDB ID: 5AB3,[18,55]Figure S6, with manual placement of PRA in the active site),
where the arginine is close to residue D129. We do not observe the
formation of a corresponding interaction in our simulations of PRA-bound
wild-type HisA, as the negative charge repulsion between PRA and the
D129 side chain destabilizes the PRA position in the active site (Figure S7), as well as the stability of loop
6 carrying the key catalytic aspartate (Figure S7). We do, however, observe a similar interaction with the
R169 side chain in the SeHisA(L169R) variant, with
interactions with D129 and, in this case, a salt-bridge interaction
with the anthranilate carboxylate group of PRA (Figure S7 and Table S9), consistent with experimental work
demonstrating that the introduction of the L169R substitution in HisA
induces TrpF activity (Table S2).[46]Finally, in the case of PRA-bound TrpF,
we observe a salt-bridge
interaction between E184 and the side chain of R36 on loop 3 (Figure S6 and Table S9), and we can see that
as for HisA and PriA, R36 is again acting as a shield avoiding possible
negative repulsion interactions between the substrate and the negatively
charged side chain. However, we do not observe clear interactions
between the R36 side chain and the substrate PRA (Figure S6 and Table S9, with the fraction of simulation time
in which this interaction is observed being <0.1).Overall,
we observed that this arginine plays an essential role
in the introduction of TrpF activity, by shielding electrostatic repulsion
between negatively charged side chains and the anthranilate carboxylate
group of PRA. This is in agreement with experiments where introducing
an arginine or removing the negative residue introduces TrpF activity
in HisA systems.[46]
Conformational Dynamics
of Loop 1 in HisA and PriA and Its Impact
on Selectivity
The importance of loops 5 and 6 of HisA and
PriA for binding and catalysis is documented.[15,18,54] In contrast, the precise catalytic role
of loop 1 remains unclear, although substitutions at position 15 on
this loop are important for facilitating the TrpF activity of this
enzyme,[18,45,54] and extending
the conformation of loop 1 through duplication of residues 13–15
(HisA(dup13–15)) plays an important role in the acquisition
of bifunctionality.[18,46,83] In this context, SeHisA(dup13–15/D10G) is
a bifunctional enzyme that can catalyze the isomerization of both
ProFAR and PRA with modest catalytic efficiencies,[18] and the corresponding crystal structure (PDB ID: 5AC7(18,55)) shows the enzyme in a “PRA-active” conformation with
loops 1 and 6 in a closed state and loop 5 in an open state.When initiating simulations of the unliganded bifunctional SeHisA(dup13–15/D10G) variant starting from the loop
1 closed conformation, we did not observe any opening of loop 1. This
contrasts with our simulations of the unliganded wild-type enzyme,
where we sampled open conformations of this loop when we started from
the closed conformation (PDB ID: 5A5W(54)) and removed
ProFAR. This provides evidence, in addition to the discussion in ref (18), that the elongation of
loop 1 heavily stabilizes this PRA-active conformation. This is further
supported by examining the root-mean-square fluctuations (RMSF) of
all Cα atoms in our simulations of these two enzymes
starting from loop 1 in the closed conformation (Figure ), where we observe that loop
1 is more flexible than either of loops 5 or 6 in simulations of the
wild-type, but has reduced flexibility in simulations of the SeHisA(dup13–15/D10G) variant.
Figure 7
(A) Structural comparison
of wild-type HisA (PDB ID: 5A5W(54,55)) and the SeHisA(dup13–15/D10G)
variant (PDB
ID: 5AC7(18,55)). The position of the key catalytic loops 1 and 5 is highlighted
(loop 6 is not clearly visible as it is behind loops 1 and 5 but has
the same conformation in both systems). A comparison between the loop
1 sequences in wild-type HisA and in the SeHisA(dup13–15/D10G)
variant is provided as an inset. (B) Root-mean-square fluctuation
(RMSF, Å) of all backbone Cα atoms in each enzyme.
(A) Structural comparison
of wild-type HisA (PDB ID: 5A5W(54,55)) and the SeHisA(dup13–15/D10G)
variant (PDB
ID: 5AC7(18,55)). The position of the key catalytic loops 1 and 5 is highlighted
(loop 6 is not clearly visible as it is behind loops 1 and 5 but has
the same conformation in both systems). A comparison between the loop
1 sequences in wild-type HisA and in the SeHisA(dup13–15/D10G)
variant is provided as an inset. (B) Root-mean-square fluctuation
(RMSF, Å) of all backbone Cα atoms in each enzyme.It is possible that product release could trigger
a change from
a closed to a wide-open conformation of loop 1, so we also performed
steered molecular dynamics (sMD) simulations of the only crystal structure
of a loop-elongated HisA variant with product bound to the active
site (SeHisA(dup13–15/D10G/G102A/Q24L), PDB
ID: 5AB3(18)). Specifically, we performed pulling simulations
of products PRFAR and CdRP out of the enzyme from the bottom of the
TIM-barrel. In doing so, we observed that the larger molecule, PRFAR,
always induced a conformational change in loops 1 and 6 from a closed
to an open and/or wide-open conformation upon pulling it out of the
active site (Figure S8A,C, backbone RMSD
of loop residues of up to ∼4.0 Å in loop 1 and ∼3.5
Å in loop 6, compared to the starting closed conformation). In
contrast, CdRP is able to leave the active site without inducing large
conformational changes in loop 1 (loop 1 opens in only one out of
ten replicates), but pulling it out from the active site still induces
a conformational change in loop 6 (Figure S8B,D, backbone RMSD of loop residues of up to ∼2.0 Å in loop
1 and ∼3.5 Å in loop 6, compared to the starting closed
conformation). One interpretation consistent with this observation
is that substantial loop rearrangement is required for efficient product
release, and this in turn makes it possible that a slow (rate-limiting)
product release step might be the reason for the low turnover numbers
observed for catalyzing an intrinsically facile reaction[84] (Table S2).Interestingly, while the elongation of loop 1 through the duplication
of residues 13–15 (VVR) appears to be essential for the change
of specificity toward the isomerization of PRA (facilitated by the
presence of a new stabilizing arginine side chain close to the active
site, Figure S1),[18] simply the loop duplication by itself is not enough to induce bifunctionality.
That is, while the duplication elongates loop 1, it also rigidifies
it such that SeHisA(dup13–15) does show some
ability to isomerize PRA (kcat > 0.15
s–1), but at the expense of losing all ability to
isomerize ProFAR.[18] Critical to bifunctionality
is the inclusion of an additional substitution, D10G (SeHisA(dup13–15/D10G)).In all HisA variants we studied,
when performing conventional MD
simulations starting from the closed conformation of loop 1, this
loop is very stable and remains closed over our simulation timescales
(Table S1). We therefore also initiated
trajectories starting from the loop 1 open conformations of these
variants to see whether the loop prefers to remain open, transition
to wide-open conformations, or transition back to a closed conformation.
PCA was then performed on the mass-weighted Cartesian coordinates
of each enzyme compared to the coordinates of the corresponding loop
1, with closed, open, and wide-open conformations defined as described
in the captions to Figures and S9.
Figure 8
(A) Global projected
loop 1 distance surfaces (Å) along the
first two principal components (PC1 and PC2) obtained from applying
Cartesian PCA to combined analysis of our conventional MD simulations
of unliganded SeHisA(dup13–15) (PDB ID: 5G2I [18]), SeHisA (dup13–15/D10G)
(PDB ID: 5AC7(18,55)), SeHisA(dup13–15/D10G/G102A)
(PDB ID: 5AC8 [18]), and SeHisA(dup13–15/D10G/G102A/Q24L/V15[b]M)
(PDB ID: 5G1Y [18]). The starting SeHisA(dup13–15) loop 1 open state is indicated on this surface
by a yellow triangle, ▲, and the same starting loop 1 conformation
is used for simulations of all four systems. Note that this figure
considers the motion of loop 1 along the PCs, as shown for PC1, projections
along which include transitions from closed to wide-open states of
loop 1 (B). PCA was performed on the mass-weighted Cartesian coordinates
of the HisA variants compared to the coordinates of the corresponding
open state, allowing us to explore the variation of the conformations
of this loop in coordinate space. The loop 1 distance is defined as
the center of mass of residues 15–25 of loop 1 and residue
129 from the barrel scaffold. Here, we considered distances <22.0
Å as corresponding to closed states (blue), between 23.0 and
28.0 Å as corresponding to open states (yellow), and >29.0
Å
as corresponding to wide-open conformations (red), based on a combination
of visual inspection and comparison of the closed-state crystal structures
(PDB IDs: 5AB3(18,55) and 5AC7(18,55)), and using the SeHisA(dup13–15)
variant as a reference. A 1 Å difference between states was applied
to avoid fuzzy states between transition from closed to open state
and from open to wide-open state. (C–F) Snapshots from our
simulations, illustrating the conformational sampling of loop 1 in
(C) wild-type SeHisA, (D) SeHisA(D10G),
(E) SeHisA(dup13–15), and (F) SeHisA (dup13–15/D10G). The loops are colored mapped from red
(most flexible) through white and to blue (least flexible), according
to their calculated Cα RMSF.
(A) Global projected
loop 1 distance surfaces (Å) along the
first two principal components (PC1 and PC2) obtained from applying
Cartesian PCA to combined analysis of our conventional MD simulations
of unliganded SeHisA(dup13–15) (PDB ID: 5G2I [18]), SeHisA (dup13–15/D10G)
(PDB ID: 5AC7(18,55)), SeHisA(dup13–15/D10G/G102A)
(PDB ID: 5AC8 [18]), and SeHisA(dup13–15/D10G/G102A/Q24L/V15[b]M)
(PDB ID: 5G1Y [18]). The starting SeHisA(dup13–15) loop 1 open state is indicated on this surface
by a yellow triangle, ▲, and the same starting loop 1 conformation
is used for simulations of all four systems. Note that this figure
considers the motion of loop 1 along the PCs, as shown for PC1, projections
along which include transitions from closed to wide-open states of
loop 1 (B). PCA was performed on the mass-weighted Cartesian coordinates
of the HisA variants compared to the coordinates of the corresponding
open state, allowing us to explore the variation of the conformations
of this loop in coordinate space. The loop 1 distance is defined as
the center of mass of residues 15–25 of loop 1 and residue
129 from the barrel scaffold. Here, we considered distances <22.0
Å as corresponding to closed states (blue), between 23.0 and
28.0 Å as corresponding to open states (yellow), and >29.0
Å
as corresponding to wide-open conformations (red), based on a combination
of visual inspection and comparison of the closed-state crystal structures
(PDB IDs: 5AB3(18,55) and 5AC7(18,55)), and using the SeHisA(dup13–15)
variant as a reference. A 1 Å difference between states was applied
to avoid fuzzy states between transition from closed to open state
and from open to wide-open state. (C–F) Snapshots from our
simulations, illustrating the conformational sampling of loop 1 in
(C) wild-type SeHisA, (D) SeHisA(D10G),
(E) SeHisA(dup13–15), and (F) SeHisA (dup13–15/D10G). The loops are colored mapped from red
(most flexible) through white and to blue (least flexible), according
to their calculated Cα RMSF.Inclusion of the D10G substitution (SeHisA(dup13–15/D10G))
increases the conformational space sampled by the elongated loop 1
(Figure ) compared
to SeHisA(dup13–15), allowing for the loop
to take on wide-open configurations which in turn facilitate the entry
and binding of ProFAR to the active site (Figure B, wide-open conformation). More specifically,
as can be seen in Figure , the inclusion of the D10G substitution alone is enough to
increase the conformational space sampled by loop 1 in SeHisA(D10G). The dup13–15 elongation rigidifies the whole loop
compared to the wild type, and adding the D10G variant on the background
of the elongation (SeHisA(dup13–15/D10G))
once again expands the conformational space sampled by the loop.Our conventional MD simulations (500 ns per trajectory) are short,
taking into account the slow turnover numbers of these enzymes (that
suggest loop motions on the ms to s timescale).[18] However, our observation that SeHisA(D10G)
has an expanded conformational space sampled by loop 1, compared to SeHisA(dup13–15/D10G), agrees with prior NMR relaxation
dispersion experiments.[18] These detected
μs to ms motions at 14 backbone 15N positions in
the SeHisA(dup13–15/D10G) variant, compared
to only three positions for the SeHisA(dup13–15)
variant, and with two resonances that are unique to SeHisA(dup13–15/D10G). As a result, adding this substitution
is sufficient to convert SeHisA(dup13–15/D10G)
back to a bifunctional enzyme, through the exploitation of conformational
dynamics, with kcat of 0.09 s–1 for the isomerization of PRA, and 0.05 s–1 for
the isomerization of ProFAR (Table S2).[18]To explore this further, we examined simulations
of four loop-elongated
variants, specifically: dup13–15 (PDB ID: 5G2I(18)), dup13–15/D10G (PDB ID: 5AC7(18,55)), dup13–15/D10G/G102A
(PDB ID: 5AC8(18)), and dup13–15/D10G/G102A/Q24L/V15[b]M
(PDB ID: 5G1Y(18)). The first and last of these variants
are only active toward the isomerization of PRA, whereas the middle
two variants are bifunctional toward both PRA and ProFAR (Table S2).Our simulations revealed how
all of these enzymes are able to populate
closed, open, and wide-open conformations (Table ). However, the relative populations of these
states depend strongly on enzyme variant: already, the D10G substitution
by itself appears to be sufficient to cause a conformational shift
toward a closed conformation compared to the distribution observed
in the wild-type enzyme, and the population of the wide-open conformation
is further reduced in the SeHisA(dup13–15)
variant, consistent with the rigidification of loop 1. Introduction
of the D10G and G102A substitutions reinstates sampling of the wide-open
conformation and results in bifunctional variants that are also able
to isomerize ProFAR. Finally, the SeHisA(dup13–15/D10G/Q102A/Q24L/V15[b]M)
variant, which shows the highest TrpF activity of all variants studied
in ref (18) (Table S2), also shows the most significant population
shift toward sampling a closed conformation of loop 1.
Table 1
Relative Populations of the Closed,
Open, and Wide-Open Conformations of Loop 1 Sampled in 10 × 500
ns MD Simulations of Wild-Type SeHisA and Variantsa
system
function
closed
open
wide-open
wild-type
HisA
13
60
27
HisA(D10G)
HisA
33
49
18
HisA(dup13–15)
TrpF
35
59
6
HisA(dup13–15/D10G)
HisA + TrpF
29
53
18
HisA(dup13–15/D10G/G102A)
HisA + TrpF
13
65
22
HisA(dup13–15/D10G/G102A/Q24L/V15bM)
TrpF
64
31
5
The relative populations (%) of
the different loop 1 conformational states sampled during our simulations
with HisA dup13–15 variants were determined by the number of
frames with distances <22.0 Å defined as closed states, between
23.0 and 28.0 Å defined as open states, and >29.0 Å defined
as wide-open conformations, as shown schematically in Figure , divided by the sum of all
states considered for this analysis. A 1 Å difference between
states was applied to avoid fuzzy states between transition from closed
to open state and from open to wide-open state for the SeHisA(dup13–15) variants.
The relative populations (%) of
the different loop 1 conformational states sampled during our simulations
with HisA dup13–15 variants were determined by the number of
frames with distances <22.0 Å defined as closed states, between
23.0 and 28.0 Å defined as open states, and >29.0 Å defined
as wide-open conformations, as shown schematically in Figure , divided by the sum of all
states considered for this analysis. A 1 Å difference between
states was applied to avoid fuzzy states between transition from closed
to open state and from open to wide-open state for the SeHisA(dup13–15) variants.These observations are consistent with structural analyses[18] indicating that the Q24L substitution is important
because it introduces a new stabilizing interaction with V15b, as
well as an even better interaction with V15[b]M such that the Q24L
interaction is just as important as the VVR duplication for the adaptive
benefit of the V15[b]M substitution to be realized. Both this variant
and the SeHisA(dup13–15) variant, which show
the lowest population of the hyper-open conformation, are TrpF specialists.
We hypothesize that in the case of these variants, this is due to
the presence of a Gly-Gly dyad in the hinge of loop 1, which provides
the loop with enough flexibility to explore these wide-open conformations.
Empirical Valence Bond Simulations of the Enzyme-Catalyzed Ribose
Ring-Opening Step in the Isomerization of Substrates ProFAR and PRA
To further probe the role of loop 1 in HisA and PriA, we performed
empirical valence bond (EVB)[69] simulations
of the initial ribose ring opening of substrates ProFAR and PRA (Figure ), as catalyzed by
wild-type and variant forms of the two enzymes. The calculated activation
free energies for this step are not trivial to directly compare with
the experimental turnover numbers; unfortunately, no kinetic data
exist for the rates of the individual chemical steps. Furthermore,
the observed kcat values for these systems
are extremely low—on the order of 1 s–1 (or
lower) for both the HisA and TrpF reactions catalyzed by HisA and
its variants.[18] However, in the case of Escherichia coli TrpF, which has a higher turnover
number than any of the enzymes studied here (kcat of 30–40 s–1, see Table S2 for comparison), the rate-limiting step
is an uncatalyzed keto-enol tautomerization that occurs after the
ring-opening step (Figure ).[85]When also taking into
account the potential involvement of loop dynamics in determining
turnover rates, as seen in other enzymes with catalytically important
conformational changes such as protein tyrosine phosphatases[16,19,22,82] and indole-3-glycerol phosphate synthase,[86] this means the experimental turnover numbers for the isomerization
of ProFAR and PRA by the enzymes of interest do not correspond to
a chemical step occurring in the enzyme active site. Furthermore,
the Amadori reaction that occurs between the enzyme-catalyzed ring-opening
reaction and the nonenzymatic tautomerization is likely to be fast
(the uncatalyzed reaction occurs spontaneously at 25 °C).[84] Thus, the ring-opening reaction is likely the
slowest enzyme-catalyzed chemical step in the catalytic cycle, as
supported by QM/MM studies of the HisA-catalyzed reaction.[83] However, even if the experimentally measured
turnover numbers (kcat) do not directly
correspond to this step, they do produce a lower limit for the rate
of this step (and thus an upper limit for the corresponding activation
free energy for the rate-limiting step of the enzymatic reaction).
Thus, comparing the calculated activation free energies for the ring
opening in different variants, and in different conformational states
of loop 1, can provide insight into the impact of loop dynamics and
key amino acid substitutions on the rate of the slowest enzyme-catalyzed
step.Taking these limitations into account, the experimental
and calculated
activation energies are shown in Tables S10 and S11 and Figure , with representative Michaelis complexes (MC), transition states
(TS), and ring-open intermediates for the ribose ring-opening reaction
catalyzed by each wild-type enzyme shown in Figure (ProFAR) and Figure S10 (PRA). We note that PRA has much more conformational freedom
in the HisA and PriA active sites, which are both evolved to (also)
accommodate the larger substrate ProFAR, and is thus capable of sampling
a broad number of different conformations at the Michaelis complex,
leading to an increased standard error of the mean in comparison to
systems where ProFAR is bound to the active site of these enzymes.
Curiously, our calculations reproduce the experimental trend in activation
free energies derived from the turnover numbers relatively well for
both substrates ProFAR (Table S10) and
PRA (Table S11) for nearly all variants,
despite the fact that observed kcat does
not correspond to a chemical step in the enzyme, as discussed above.
Figure 9
Calculated
and experimentally derived activation energies for the
isomerization of substrates (A, D) ProFAR (orange bars) and (B, C)
PRA (blue bars), as catalyzed by (A, C) wild-type SeHisA and variants and (B, D) MtPriA and variants.
The calculated free energies (ΔGcalc‡) correspond to the ribose ring-opening step (Tables S10 and S11). The experimental activation
free energies (ΔGexp‡) were derived from the kinetic data presented in refs (15, 18).
Figure 10
Representative
structures of stationary points at the Michaelis
complexes (MC), transition states (TS), and intermediate states (IS),
respectively, for the ribose ring-opening step of the (A–C)
HisA- and (D–F) PriA-catalyzed isomerization of ProFAR (by
the wild-type enzymes). For EVB simulation details, see the Methodology section. Structures were selected based
on clustering analysis using the hierarchical agglomerative algorithm,
as implemented in CPPTRAJ.[77] Note that
the annotated catalytic distances are average values over 6000 snapshots
extracted for each state from our EVB trajectories (from 30×
individual 200 ps EVB mapping windows per stationary point/system).
For a full list of reacting distances across all variants, see Tables S12 and S13.
Calculated
and experimentally derived activation energies for the
isomerization of substrates (A, D) ProFAR (orange bars) and (B, C)
PRA (blue bars), as catalyzed by (A, C) wild-type SeHisA and variants and (B, D) MtPriA and variants.
The calculated free energies (ΔGcalc‡) correspond to the ribose ring-opening step (Tables S10 and S11). The experimental activation
free energies (ΔGexp‡) were derived from the kinetic data presented in refs (15, 18).Representative
structures of stationary points at the Michaelis
complexes (MC), transition states (TS), and intermediate states (IS),
respectively, for the ribose ring-opening step of the (A–C)
HisA- and (D–F) PriA-catalyzed isomerization of ProFAR (by
the wild-type enzymes). For EVB simulation details, see the Methodology section. Structures were selected based
on clustering analysis using the hierarchical agglomerative algorithm,
as implemented in CPPTRAJ.[77] Note that
the annotated catalytic distances are average values over 6000 snapshots
extracted for each state from our EVB trajectories (from 30×
individual 200 ps EVB mapping windows per stationary point/system).
For a full list of reacting distances across all variants, see Tables S12 and S13.We used MtPriA in its pro-ProFAR conformation
as a reference state to calibrate our EVB simulations of the initial
ring-opening of ProFAR, obtaining an activation free energy of 18.3
± 0.5 kcal mol–1 for this reaction. The resulting
EVB parameters were then used unchanged in all relevant systems, to
obtain the data presented in Table S10.
We obtained a slightly lower activation free energy of 17.5 ±
0.6 kcal mol–1 for the analogous reaction catalyzed
by wild-type HisA, which is in agreement with the experimental observation
that SeHisA is a better catalyst of ProFAR isomerization
than MtPriA (Table S10). To further validate our results, we performed single amino acid
substitutions in each enzyme (R19A and D130A in PriA and S202A, D129N
and D10G in HisA), and performed EVB simulations on these variants.
In nearly all cases, we observed similar or higher activation free
energies for the ring-opening step compared to the respective wild-type
enzymes, in agreement with experimentally observed loss of activity
upon these substitutions.[15,18] We note, however, that
this effect is less pronounced than the experiment in the case of
the HisA(S202A) and PriA(D130A) variants, suggesting that the experimentally
observed loss of activity is due to either a change in substrate positioning
or loop conformation or dynamics, which we are not necessarily able
to capture in our simulations when simply starting from the wild-type
crystal structure and manually truncating these residues.In
the case of PRA (Table S11), we again
used the reaction catalyzed by MtPriA as our EVB
reference state, this time with loop 5 in its pro-PRA conformation.
While wild-type SeHisA does not show TrpF activity,[18] we nevertheless modeled this reaction in SeHisA for comparison, starting from an idealized position
of PRA in the active site (based on overlay with substrate ProFAR),
and obtained an activation free energy of 19.5 ± 0.6 kcal mol–1. This is higher than the value obtained for ProFAR
ring-opening by SeHisA (17.5 ± 0.6 kcal mol–1, Table S10), but is very
similar to the value obtained for SeHisA(dup13–15)
(19.9 ± 1.2 kcal mol–1, Table S11). This suggests that, in theory, SeHisA could catalyze PRA ring-opening if all loops are in the correct
conformation and the substrate is optimally positioned, and that the
lack of activity is not because of a high barrier to the chemical
reaction catalyzed by this enzyme. All subsequent variants show a
steady decrease in calculated activation free energies for the ring-opening
reaction catalyzed from the loop closed conformation, in line with
the increased population of the loop 1 closed conformation (Table ) compensating for
the lack of interactions between the substrate PRA and the stabilizing
gripper loop 3 (Figure ).In the case of MtPriA we modeled three
individual
substitutions (R19A, D130A, and R143A) and extended our EVB simulations
(Table S11 and Figure ), to specifically capture the impact of
losing the electrostatic contribution of each truncated side chain
on the activation free energy. In the case of MtPriA(R19A),
we obtained a slight increase in activation free energy that trends
with the loss of activity observed experimentally.[15] Structurally, this side chain is located on loop 1, and
has been suggested to be important for the substrate-specific formation
of the active site into pro-ProFAR and pro-PRA conformations.[15] Our EVB calculations on both ProFAR and PRA
ring-opening as catalyzed by MtPRA(R19A) (Tables S10 and S11) suggest an important electrostatic
role for this residue. Similarly, kinetic and structural data suggest
that truncation of R143 and D130 to alanine negatively impacts PRA
isomerization (in the case of R143A, kcat/KM is reduced from 1.7 × 105 to 6.0 × 103 M–1 s–1 on the introduction of this substitution).[15] In contrast, we obtain lower activation free
energies for PRA ring-opening catalyzed by both the D130A and R143A
variants. However, especially in the case of the R143A variant, it
is unclear if the experimentally observed effect is expressed on kcat, KM, or both,
and it is plausible that the loss of activity is due to structural
effects that prohibit productive substrate binding, and therefore
are not captured in our simulations. This scenario would be similar
to our observations from an analogous system activated by a ligand-gated
conformational change, glycerol-3-phosphate dehydrogenase (GPDH).[38]Finally, to examine how important loop
1 closure is for PRA isomerization,
we performed EVB simulations of ProFAR and PRA ring-opening as catalyzed
by SeHisA (both wild-type and the SeHisA(dup13–15/D10G/G102A/Q24L/V15[b]M variant)) and MtPriA (wild-type) with loop 1 in the open conformation
(Tables S10 and S11). Due to the lack of
crystal structures, we extracted structures from our conventional
MD simulations as starting points for these EVB simulations. For PRA
isomerization, we obtained much higher activation free energies for
ring-opening when loop 1 was modeled in an open conformation than
a closed conformation, due to a combination of the loss of key interactions
between loop 1 and the substrate, and also extra solvent exposure
of the active site when this loop opens. However, we observed no impact
on the activation free energy when modeling ProFAR isomerization starting
with loop 1 open. Therefore, while the catalytic importance of loops
5 and 6 is well established,[15,18,54] our EVB calculations clearly show that, in addition, full closure
of loop 1 into a catalytically competent conformation is essential
for efficient PRA isomerization.
Overview and Conclusions
In the present work, we combined conventional and enhanced sampling
MD simulations, with EVB calculations, to explore the role of loop
dynamics in dictating the selectivity and evolvability of the evolutionarily
important model enzymes, HisA, TrpF, and PriA.[18,45−48,87] The roles of loop dynamics and
ligand-gated conformational changes in TIM-barrel proteins and proteins
from other folds have been a topic of substantial research interest
(e.g., refs (8, 34−38, 88−97)). However, what makes the current enzymes stand out from these prior
studies is the importance of not one but two (TrpF) or even three
(HisA and PriA) long, mobile loops (Figure ), the specific conformations of which have
been suggested to play an important role in facilitating the substrate
selectivity of PriA and evolved HisA variants.[15,18,54] Thus, these enzymes undergo particularly
complex loop dynamics. In addition, prior enzymes that have been characterized
as being activated by ligand-gated conformational changes, such as
TPI and OMPDC, are extremely proficient.[34−38,98−100] In contrast, the enzymes studied here are relatively inefficient
(Table S2),[15,18,79] with turnover numbers of ∼10 s–1 or less,[18,52,101] despite catalyzing an intrinsically facile reaction.[84] Furthermore, while loop motions in TIM-barrel
enzymes such as TPI are relatively fast (on the μs timescale),[96] motions on the ms timescale have been detected
in the evolved HisA variants,[18] and thus
loop motions are likely to be (at least partially) rate-limiting in
these enzymes.At the simplest level, our simulations agree
with structural data[15] that TrpF, PriA,
and HisA have increasingly
large (in terms of active site volume) and “breathable”
active sites (Table S8), allowing for the
accommodation of ProFAR by HisA and PriA but not PRA-specific TrpF,
the active site of which is too small to accommodate the larger substrate.[15] More importantly, HisA and PriA both possess
“gripper residues” interacting with the nonreactive
phosphodianion group of ProFAR (R83 and S103 in HisA; R85 and T105
in PriA, Figure ),
that are very similar to analogous interactions in other enzymes activated
by ligand-gated conformational changes, such as TPI.[34,35] Of note, however, is that the HisA/PriA gripper residues are contributed
from the less mobile loops 3 and 4, unlike the primary gripper loop
in TPI (loop 6), which undergoes a substantial conformational change
upon ligand binding.[12] Other TIM-barrel
proteins such as OMPDC possess analogous gripper loops to TPI loop
6,[12] showing evidence for convergent evolution
on these different (βα)8-barrel scaffolds.Our simulations show that while the gripper interaction is stable
in HisA throughout the simulations, in PriA, there is electrostatic
repulsion between R85 and an additional active site arginine, R143,
which causes instabilities in the catalytic loops (in particular loop
5, Figure S3), as well as in the substrate
positioning in the active site such that ProFAR is less stable in
the PriA active site than in the HisA active site. This is in effect
a ligand-gated effect, where interaction with the nonreactive phosphodianion
(which is not present in the smaller substrate, PRA) facilitates the
stability of catalytically important loop 5. Thus, the underlying
principles driving loop stability are similar to those of other enzymes
that are activated by ligand-gated changes.PCA on our simulations
showed that the active site loops in these
enzymes can sample a range of wide-open conformations with transitions
between them, with their conformational flexibility being stabilized
by ligand binding (although less so in the bifunctional MtPriA than the ProFAR-specific SeHisA). In contrast,
the active site loops in TmTrpF, which binds a smaller
substrate and lacks the gripper, remain dynamic. In particular, loop
6 (Figure ) samples
a range of open conformations even with PRA bound to the active site.
Additionally, all HisA variants from the real-time evolution experiment[46] studied here also sample a range of open and
wide-open conformations. In this context, however, the D10G substitution
on loop 1 appears to be sufficient by itself to increase the population
of the closed conformation sampled during our simulations (Table ), and the highest
proportion of closed conformation is observed in simulations of the SeHisA(dup13–15/D10G/Q102A/Q24L/V15[b]M) variant,
which has the highest TrpF activity[18] (Table S2).Pulling simulations, where we
pulled products PRFAR and CdRP out
of the SeHisA(dup13–15/D10G/G102A/Q24L) active
site, showed significant conformational changes in both loops 1 and
6 when pulling PRFAR out, whereas CdRP was pulled out with loop 1
remaining stable and loop 6 opening. This suggests that loop 1 dynamics
are more important for binding of ProFAR and subsequent release of
PRFAR, than for the smaller substrate PRA and its product CdRP, whereas
loop 1 dynamics appears to be critical to catalysis (Table S11). In addition, as mentioned above, these enzymes
are relatively inefficient (Table S2).[18,52,101] When taking this into account,
the substantial rearrangements that we observe for both compounds
suggests that a slow (potentially rate-limiting) product release step
that requires substantial loop rearrangement may be the reason for
the low turnover numbers observed for an otherwise facile reaction,
further providing evidence that turnover rates are being regulated
by loop dynamics.In contrast to HisA, which undergoes conformational
changes of
loop 1 during the real-time evolution experiment that changes its
selectivity from ProFAR-specific to PRA-specific,[18,46] the bifunctional PriA is already able to rearrange its active site
in its wild-type form, to accommodate the different substrates through
alternation between pro-ProFAR and pro-PRA conformations of loop 5.[15] These conformational changes both reduce repulsion
between the two active site arginine side chains that in turn destabilize
loop 5 dynamics (Figure S3), as well as
disrupting the stacking interaction between the W145 side chain and
the substrate ProFAR (Figure S4). This
rationalizes the preference of this enzyme toward PRA rather than
ProFAR, despite its similarities with the HisA active site. We also
note that loss of the stacking interaction between W145 and ProFAR
was one aspect of the gain of PRA isomerization activity in the SeHisA(dup13–15) variant from the real-time evolution
experiment.[18,46]Finally, we performed EVB
simulations of the first ring-opening
step of ProFAR and (where relevant) PRA isomerization (Figure ) by wild-type HisA, PriA,
and variants. As described above, the actual rate of the chemical
step in these enzymes is unknown, since the rate-limiting step in,
at least tryptophan biosynthesis, appears to be an uncatalyzed keto-enol
tautomerization step.[85] Nevertheless, we
saw that our calculated activation free energies for the ring-opening
reaction trend well with the differences in activation free energies
derived using the measured turnover numbers as an upper limit for
this value, suggesting there is both a chemical and a dynamical component
to the observed changes in activity upon substitution and/or duplication
of key residues. Furthermore, to reproduce the relevant PriA activation
free energies, it was necessary to start from different structures
of loop 5, following earlier structural analysis that demonstrated
the loop can exist in either a knot-like pro-PRA or β-hairpin
pro-ProFAR conformation (Figure S1), depending
on which substrate is bound.[15] Clearly,
the ease with which this rearrangement can occur will also impact
the selectivity of this enzyme. Finally, EVB simulations of wild-type SeHisA, MtPriA and the SeHisA(dup13–15/D10G/G102A/Q24L/V15[b]M) variant with loop 1
in an open conformation all yielded substantially higher energies
for PRA isomerization, whereas ProFAR isomerization (by wild-type
HisA) seems to be unaffected (Tables S10 and S11). This further emphasizes the importance of the correct closure
of loop 1 for isomerization of the smaller substrate PRA. This is
in contrast to substrate binding, where the conformational plasticity
of loop 1 appears to be far more important for facilitating the correct
binding of ProFAR than PRA (Figure S8).Taken together, these observations highlight the critical role
of multiple decorating loops of HisA, PriA, and TrpF in facilitating
catalysis. These enzymes stand out from prior systems that have been
demonstrated to be activated by ligand-gated conformational changes[34−41] due to a number of factors. First, we have shown the interdependent
motion of three long loops (or two in TrpF), none of which dominates
and each of which is capable of undergoing substantial conformational
changes to facilitate the turnover of different substrates. Second,
unlike prior systems which show substantial rate accelerations compared
to the uncatalyzed reactions with comparatively rapid loop motions,
in these enzymes, the catalyzed reaction is already intrinsically
fast,[84] whereas loop motion is slow and
appears to be controlling the reaction rate.It has been argued
that a PriA-like gene product could have been
the common evolutionary ancestor for both HisA and TrpF.[53,87,102] Ancestral sequence reconstruction
has also been used to suggest that ancient HisA precursors were likely
bifunctional and that this bifunctionality persisted over at least
a 2 billion year time span.[52] However,
as shown in Figure , HisA and PriA exploit loops 1, 5, and 6 to facilitate activity,
whereas TrpF lacks an analogue for loop 1, and isomerizes PRA harnessing
just two catalytic loops 3 and 6. Our results suggest that an evolutionary
trajectory from a PriA-like ancestor to an extant TrpF would be surprisingly
complex. Loop 1 would need to be truncated (and not extended, as when SeHisA was artificially evolved into a TrpF),[18] and the interdependency of this third loop would
need to be lost, raising the question of what evolutionary path would
take a PriA-like precursor to TrpF, while completely abolishing this
loop. We note that the natural or evolved bifunctionality in these
enzymes being driven by the ability to adapt different conformational
states is consistent with prior studies on the laboratory evolved
bifunctionality of a phosphotriesterase from Pseudomonas
diminuta,[30,103] suggesting that it
may be a more common phenomenon not limited to these enzymes.Despite the novel aspects of the systems studied here, a key similarity
with prior systems is the generalizability of ligand-gated conformational
changes across a wide range of systems, in particular TIM-barrel proteins,[34] which tend to possess flexible loops decorating
their active sites. The conservation of such ligand-gated conformational
changes—albeit triggered in different loops—suggests
that these loops evolve independently of the barrel providing a starting
point for the emergence and divergence of new enzyme activities.[21,47] In addition, TrpF, for example, has been shown to be highly tolerant
of variations in loop 6 sequence such that grafting sequences from
related enzymes such as TrpA, HisA, and PriA onto the TrpF scaffold
did not abolish activity.[104] This is significant
considering the high evolvability of this scaffold,[8] and the wide range of chemistry it supports,[47] which makes it very desirable as a starting
point for protein engineering efforts. In addition, it could be argued
that the real-time evolution experiment that bestowed PRA isomerization
activity to HisA[46] effectively performed
“natural” loop-engineering by altering the conformations
of key active site loops.[18] Our work therefore
suggests that, more broadly, loop grafting and engineering is a powerful
tool for generating novel enzymes with tailored activities and specificities,
even in complex systems with multiple highly mobile and interdependent
catalytic loops.