Mechanistic and bioinformatic investigation of a conserved active site helix in α-isopropylmalate synthase from Mycobacterium tuberculosis, a member of the DRE-TIM metallolyase superfamily.

Ashley K Casey, Michael A Hicks, Jordyn L Johnson, Patricia C Babbitt, Patrick A Frantom.

Abstract

The characterization of functionally diverse enzyme superfamilies provides the opportunity to identify evolutionarily conserved catalytic strategies, as well as amino acid substitutions responsible for the evolution of new functions or specificities. Isopropylmalate synthase (IPMS) belongs to the DRE-TIM metallolyase superfamily. Members of this superfamily share common active site elements, including a conserved active site helix and an HXH divalent metal binding motif, associated with stabilization of a common enolate anion intermediate. These common elements are overlaid by variations in active site architecture resulting in the evolution of a diverse set of reactions that include condensation, lyase/aldolase, and carboxyl transfer activities. Here, using IPMS, an integrated biochemical and bioinformatics approach has been utilized to investigate the catalytic role of residues on an active site helix that is conserved across the superfamily. The construction of a sequence similarity network for the DRE-TIM metallolyase superfamily allows for the biochemical results obtained with IPMS variants to be compared across superfamily members and within other condensation-catalyzing enzymes related to IPMS. A comparison of our results with previous biochemical data indicates an active site arginine residue (R80 in IPMS) is strictly required for activity across the superfamily, suggesting that it plays a key role in catalysis, most likely through enolate stabilization. In contrast, differential results obtained from substitution of the C-terminal residue of the helix (Q84 in IPMS) suggest that this residue plays a role in reaction specificity within the superfamily.

Year: 2014 PMID: 24720347 PMCID: PMC4025573 DOI: 10.1021/bi500246z

Source DB: PubMed Journal: Biochemistry ISSN: 0006-2960 Impact factor: 3.162

The enzyme isopropylmalate synthase (IPMS) catalyzes the first step in the biosynthesis of l-leucine in bacteria, archaea, and some eukaryotes. This pathway is absent in mammals, making IPMS a possible target for the development of novel antibiotic and antifungal therapeutics.[1] IPMS also serves as a model system for the study of allosteric mechanisms, as it is subject to allosteric feedback inhibition by l-leucine.[2] The enzyme catalyzes a Claisen-like condensation between α-ketoisovalerate (KIV) and acetyl-CoA (AcCoA), forming α-isopropylmalate and CoA (Scheme 1). Structural studies show that IPMS utilizes a distinct active site architecture to accomplish this type of chemistry when compared with malate synthase[3] and citrate synthase,[4] which catalyze similar reactions. In fact, the active site architecture exhibited by IPMS is more similar to that of a collection of enzymes catalyzing a diverse set of reactions, including 3-hydroxy-3-methylglutaryl-CoA (HMG-CoA) lyase, 4-hydroxy-2-ketovalerate aldolase, and pyruvate carboxylase (Scheme 1). It has been proposed that these enzymes belong to a mechanistically diverse group known as the DRE-TIM metallolyase superfamily, a group of evolutionarily related enzymes that catalyze different reactions using distinct mechanisms.[5]

Scheme 1

Despite this diversity in function, enzymes in a superfamily share a common mechanistic aspect in the stabilization of an intermediate mediated through a set of conserved residues.[6] Members of the DRE-TIM metallolyase superfamily share a TIM-barrel fold (Figure 1A), a D-R-E active site motif, and rely on a divalent cation for activity. Catalytically, members of this superfamily are hypothesized to stabilize an enolate intermediate in their respective reactions using the conserved arginine in the DRE motif (Scheme 2). Additionally, several characterized members of the superfamily respond to allosteric regulation.[7,8] The active site architecture for the DRE-TIM metallolyase superfamily is highlighted by a D-R-E motif composed of a well-conserved active site α-helix containing the R and D residues adjacent to one another (Figure 1B). The conserved glutamate residue is found on an adjacent β sheet and is proposed to orient the arginine residue. The aspartate acts as a ligand to a required divalent cation along with two well-conserved histidine residues in a HXH motif. At least one additional metal ligand in each enzyme is provided by a substrate; however, the nature of the substrate–metal interaction is not conserved.

Figure 1

Scheme 2

(A) Ribbon diagram of the DRE-TIM metallolyase catalytic domain from MtIPMS (pdb id: 1sr9[63]). Conserved active site residues are shown as sticks. (B) Conserved active site architecture of the DRE-TIM metallolyase superfamily. The superposition was created using the Matchmaker algorithm in Chimera with the following pdb files: 1sr9,[63] IPMS (tan); 1ydo,[5] HMG-CoA lyase (blue); 2qf7,[23] pyruvate carboxylase (pink); 1nvm,[7] 4-hydroxy-2-ketovalerate aldolase (green). Labels designate the residues of the DRE-motif, the differentially conserved Q/X residue, and the HXH motif. The divalent metal is also indicated.

While the aspartate and arginine residues have been subjected to site-directed mutagenesis in several DRE-TIM metallolyase superfamily members,[9−13] they have not been investigated in IPMS. Recently, the active site helix containing the conserved aspartate and arginine was implicated as the target of an inhibitory allosteric signal in IPMS from Mycobacterium tuberculosis (MtIPMS), raising additional questions about the role of the helix in catalysis and regulation in this enzyme.[14] To address these questions, site-directed mutagenesis has been carried out on MtIPMS, and the effects of substitutions on catalysis and regulation have been determined. Analysis of the effects of residue substitution with respect to other superfamily members provides a mechanism for the identification of conserved catalytic strategies and characterization of structure/function relationships responsible for differences in reactivity, substrate selectivity, and regulation. Thus, parallel to the biochemistry studies, a bioinformatics investigation of the DRE-TIM metallolyase superfamily has been initiated and the results illustrated using sequence similarity networks for the DRE-TIM metallolyase superfamily.

Sequence similarity networks have been successfully used to organize functionally diverse enzyme superfamilies into subgroups and families of sequences representing discrete reaction specificities.[15] The language of superfamily hierarchies used here is as follows: superfamily, a set of evolutionary related enzymes that share a common mechanistic step, such as stabilization of the same type of intermediate, but whose overall reactions may be different; subgroup, a subset of a superfamily whose members share more similarity in sequence with one another than they do with proteins in other subgroups; family, a subset of a subgroup whose members catalyze the same reaction in essentially the same way. This organization allows for the rapid detection of conserved residues at differing hierarchies within the superfamily. For instance, more recently evolved residues (such as those conserved at the subgroup or family level) may be critical specificity determinants or provide information for unique regulatory mechanisms.[16] Applying this methodology to the DRE-TIM metallolyase superfamily provides insight into the conservation and diversity of residues in the DRE active site helix and aids in teasing out differentially conserved interactions in each reaction class.

Materials and Methods

Materials

Oligonucleotides for the mutagenesis of MtIPMS were obtained from Eurofins MWG Operon (Huntsville, AL). Acetyl CoA (AcCoA) and ketoisovalerate (KIV) were purchased from Sigma-Aldrich. 4,4′-Dithiodipyridine (DTP) was purchased from Acros Organics. All other buffers and reagents were obtained from VWR or were of the highest quality available. The HisTrap HP column was purchased from GE Healthcare. Competent cells (BL21(DE3)pLysS and Top 10) were obtained from Invitrogen.

MtIPMS Variant Construction and Purification

Wild-type MtIPMS and all variants reported here were constructed and isolated as previously described.[17] Briefly, QuikChange Lightning site-directed mutagenesis (Stratagene) was used to create point mutations in the pET28a(+)::leuA plasmid. Primers used for mutagenesis are shown in Table S1 (Supporting Information). BL21(DE3)pLysS cells containing the plasmids were grown in autoinduction media. Overexpressed proteins were purified via metal affinity chromatography using a HisTrap HP column (5 mL). Protein expression and purity were checked by SDS–PAGE analysis. The peak fractions were pooled and dialyzed three times against 1 L of 20 mM triethanolamine (TEA) (pH 7.8). The protein was stored in 10% glycerol at −20 °C.

Enzymatic Assays

A typical reaction mixture contained 50 mM potassium phosphate buffer (pH 7.5), 12 mM MgCl2, 50 μM DTP, and 5–10 times the Km value for the nonvaried substrate (AcCoA or KIV). Initial velocities were determined using DTP to detect the formation of CoA at 324 nm (ε = 19.8 mM⁻¹ cm⁻¹). Kinetic constants were determined by fitting the data to the Michaelis–Menten equation (eq 1) using Kaleidagraph (Synergy Software), where v is the velocity, [E]t is the total enzyme concentration, [S] is the concentration of the substrate being varied, Km is the Michaelis–Menten constant, and kcat is the turnover number (the maximal velocity divided by [E]t).
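Eq 1 is not reproduced in this text. The sketch below assumes the standard Michaelis–Menten form, v/[E]t = kcat[S]/(Km + [S]), which matches the symbols defined above, and shows how an absorbance slope at 324 nm could be converted to a CoA formation rate with the stated extinction coefficient. The function names and the 1 cm path length are illustrative assumptions, not details taken from the paper.

```python
# Illustrative sketch only: function names and the 1 cm path length are assumptions.
EPSILON_PER_MM_CM = 19.8  # extinction coefficient at 324 nm stated in the text (mM^-1 cm^-1)


def rate_from_absorbance(slope_abs_per_s, path_cm=1.0):
    """Convert an absorbance slope (A324 per second) into a CoA formation rate in uM/s.

    Beer-Lambert: A = epsilon * c * l, so dc/dt = (dA/dt) / (epsilon * l).
    The factor of 1000 converts mM to uM.
    """
    return slope_abs_per_s / (EPSILON_PER_MM_CM * path_cm) * 1000.0


def michaelis_menten(s_um, kcat_per_s, km_um):
    """Return v/[E]t (s^-1) at substrate concentration s_um (assumed standard form of eq 1)."""
    return kcat_per_s * s_um / (km_um + s_um)
```

At [S] = Km the function returns half of kcat, which is a quick sanity check on any fitted parameter pair.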

Inhibition Assays

Inhibition assays were performed using a standard reaction mixture of 50 mM potassium phosphate buffer (pH 7.5), 12 mM MgCl2, 50 μM DTP, 5–10 times the Km value of the nonvaried substrate (AcCoA or KIV), and varied concentrations of leucine (0–120 μM). Reactions were initiated by the addition of MtIPMS. For assays that displayed nonlinear (biphasic) kinetics, the steady state and final velocities were determined using eq 2, where [P]t is the total product formed (CoA) at time, t, [E]t is the total enzyme concentration, vi is the initial velocity, vf is the final velocity, t is time, kb is the exponential rate constant, and C is a constant.[18] The inhibition parameters were then determined by replotting the velocities versus leucine concentration and fit to eq 3 (for Ki values) or eq 4 (for Ki* values) where Ki is the dissociation constant for the initial enzyme–inhibitor complex, Ki* is the overall dissociation constant for the two-step slow-onset mechanism, vi and vf are the initial and final velocity, respectively, and β is the fractional velocity at saturating concentrations of the inhibitor, I.[19]
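Eq 2 is likewise missing from this text. A commonly used form for slow-onset progress curves, consistent with the symbols defined above, is [P]t/[E]t = vf·t + (vi − vf)(1 − e^(−kb·t))/kb + C. The sketch below assumes that form; the function name is hypothetical. Eqs 3 and 4 are not sketched because their exact form cannot be recovered from the definitions alone.

```python
import math


def slow_onset_product(t, vi, vf, kb, c=0.0):
    """Product formed per enzyme at time t for a biphasic (slow-onset) progress curve.

    Assumed form: vf*t + (vi - vf) * (1 - exp(-kb*t)) / kb + c
    vi: initial velocity, vf: final (steady state) velocity,
    kb: exponential rate constant for the transition, c: offset constant.
    """
    return vf * t + (vi - vf) * (1.0 - math.exp(-kb * t)) / kb + c
```

Under this form the slope of the curve starts at vi and relaxes exponentially to vf, which is what makes the progress curve biphasic when vi and vf differ.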

Sequence Similarity Network Data Set Sources and Curation

Four structures representing the known reaction diversity in the DRE-TIM metallolyase superfamily[5] (pdb ids[20]: 1nvm,[7] 1rqb,[21] 3hq1,[22] and 2qf7[23]) were used as a starting point for identifying new homologous sequences through a series of BLAST[24] searches against UniRef100 (Uniprot Release 2012_02).[25] All hits from an initial BLAST search with an E-value less than 1 × 10⁻⁵ were kept. This new data set of 13230 sequences was filtered using HMMER 3.0 beta[26] such that any sequence that did not score against the Pfam[27] HMM for PF00682: HMGL-like was dropped. This new data set (4889 unique sequences) was then filtered to 40% identity using CD-HIT v4.5.6,[28] resulting in 54 sequences, which were used as a query against UniRef100 again (73764 hits) and filtered in the same manner to produce the final superfamily set (8817 unique sequences).
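The first curation step, keeping hits with an E-value below 1 × 10⁻⁵, can be illustrated with a minimal sketch. The `(subject_id, e_value)` tuple layout and the function name are assumptions made for illustration; this is not the authors' pipeline, and the later HMMER and CD-HIT steps are not modeled.

```python
E_VALUE_CUTOFF = 1e-5  # threshold stated in the text


def keep_significant_hits(hits, cutoff=E_VALUE_CUTOFF):
    """Return the sorted, de-duplicated subject IDs whose E-value is below the cutoff.

    hits: iterable of (subject_id, e_value) pairs (hypothetical layout).
    A subject is kept if any of its hits passes the cutoff.
    """
    return sorted({subject_id for subject_id, e_value in hits if e_value < cutoff})
```

De-duplicating at this stage mirrors the text's emphasis on counting unique sequences after each filter.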

Generation of Sequence Similarity Networks

Sequence similarity networks were generated using the Pythoscape program.[29] Briefly, the final superfamily set of proteins (8817 unique sequences) were imported into a MySQL database and defined as nodes. All-by-all Blast2seq[24] runs were performed to define edges between nodes, and this information was uploaded into the MySQL database. Additional information associated with the proteins, such as taxonomic information and pdb structures, was mapped to the MySQL database through the Pythoscape interface. A Cytoscape[30]-readable xgmml file was exported from Pythoscape to produce networks that could be explored interactively using Cytoscape. Mapping the network nodes with additional data, such as kinetic characterization data, was done within the Cytoscape program. Because of the large number of edges at the superfamily level, representative nodes and edges were required for visualization of the full network in Cytoscape. A representative node was defined by CD-HIT v4.5.6[28] as a cluster of sequences sharing greater than 60% identity. Representative edges, defined here as the mean BLAST E-value between the set of sequences contained within two connected representative nodes, are shown only if their BLAST scores have a statistical significance value less than (better than) a user-defined E-value cutoff. (See network figure legends for specific cutoffs used.)
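The representative-edge rule described above can be sketched as follows. This is a minimal illustration of the stated definition (the mean BLAST E-value between the sequences in two representative nodes, drawn only if better than a cutoff); it is not Pythoscape code. The data layout, the function name, and the choice to ignore sequence pairs with no reported hit are all assumptions.

```python
from itertools import product
from statistics import mean


def representative_edge(node_a, node_b, pairwise_evalues, cutoff):
    """Return the mean E-value between two representative nodes, or None if no edge is drawn.

    node_a, node_b: lists of sequence IDs grouped into each representative node.
    pairwise_evalues: dict mapping (id1, id2) to a BLAST E-value (hypothetical layout).
    Pairs without a recorded E-value are ignored, which is an assumption of this sketch.
    """
    values = []
    for a, b in product(node_a, node_b):
        # Look the pair up in either orientation.
        e_value = pairwise_evalues.get((a, b), pairwise_evalues.get((b, a)))
        if e_value is not None:
            values.append(e_value)
    if not values:
        return None
    mean_e = mean(values)
    # Smaller E-values are more significant, so "better than" means "less than".
    return mean_e if mean_e < cutoff else None
```

Raising the stringency (a smaller cutoff) removes edges, which is why clusters separate as the threshold is tightened from the superfamily network to the subgroup network.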

Structure-Guided Sequence Alignments

The best aligning chains of MtIPMS (pdb id: 3u6w[31]), LiCMS (3blf[9]), and SpHCS (2zyf[32]) from the Claisen condensation-like (CC-like) subgroup were aligned using the Needleman–Wunsch algorithm[33] as implemented in the Matchmaker program[34] in Chimera.[35] A multiple sequence alignment based on the structure alignment was created through a companion program, Match → Align. The sequence alignment was then refined by eye using the aligned structures as a guide. Sequence-based alignments of each cluster were generated based on cluster membership using MAFFT, version 6.[36]

Results

Steady State Kinetics

To investigate the catalytic role of the active site helical residues, site-directed mutagenesis was performed. All enzyme variants were characterized using circular dichroism spectroscopy to ensure that substitutions did not affect the overall fold of the enzyme (Figure S1, Supporting Information). The kinetic parameters determined for the enzyme variants are listed in Table 1. The substitutions made to R80 and D81 abolished IPMS activity. However, D81A MtIPMS has the ability to catalyze the hydrolysis of AcCoA producing acetate, KIV, and free CoA (data not shown). In the absence of KIV, kinetic parameters for the hydrolysis of AcCoA were determined (kcat = 4.6 ± 0.3 min⁻¹ and KAcCoA = 283 ± 54 μM). The L79A variant displayed a 60-fold decrease in kcat/KKIV when compared to that of the wild-type enzyme. The N83A and Q84A variants displayed a 150–250-fold decrease in kcat/KAcCoA, while the kcat/KKIV parameters are relatively unchanged. Q84A MtIPMS also exhibits a 30-fold decrease in kcat. N83E MtIPMS was determined to be inactive.

Table 1

Kinetic Parameters Determined for MtIPMS Variants^a

enzyme	k_cat (s^–1)	K_AcCoA (μM)	k_cat/K_AcCoA (μM^–1 s^–1)	K_KIV (μM)	k_cat/K_KIV (μM^–1 s^–1)
WT	3.4 ± 0.3	42 ± 12	0.08 ± 0.02	6.4 ± 1.4	0.50 ± 0.13
L79A	1.0 ± 0.1	30 ± 5	0.03 ± 0.01	120 ± 30	0.008 ± 0.002
R80A/K	nd^b	nd	nd	nd	nd
D81A/H	nd	nd	nd	nd	nd
N83A	0.31 ± 0.04	990 ± 290	0.0003 ± 0.0001	51 ± 14^c	0.006 ± 0.002
N83E	nd	nd	nd	nd	nd
Q84A	0.10 ± 0.02	170 ± 60	0.0006 ± 0.0002	19 ± 6	0.005 ± 0.002

^a Determined using 4,4′-dithiodipyridine (DTP) to detect the formation of CoA at 324 nm (ε = 19.8 mM^–1 cm^–1) at 25 °C. Standard reaction conditions consisted of 50 mM potassium phosphate buffer (pH 7.5), 12 mM MgCl2, 50 μM DTP, and at least 5–10 times the Km value for the nonvaried substrate (AcCoA or KIV).

^b nd: activity could not be determined above the detection limit of the assay.

^c Determined using 1 mM AcCoA.
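The fold changes quoted in the text can be recomputed from the central values in Table 1. The sketch below does only that arithmetic; the dictionary layout and function name are illustrative, and the reported uncertainties are not propagated.

```python
# Central values copied from Table 1 (uncertainties omitted in this sketch).
TABLE_1 = {
    "WT":   {"kcat": 3.4,  "kcat_over_K_AcCoA": 0.08,   "kcat_over_K_KIV": 0.50},
    "L79A": {"kcat": 1.0,  "kcat_over_K_AcCoA": 0.03,   "kcat_over_K_KIV": 0.008},
    "N83A": {"kcat": 0.31, "kcat_over_K_AcCoA": 0.0003, "kcat_over_K_KIV": 0.006},
    "Q84A": {"kcat": 0.10, "kcat_over_K_AcCoA": 0.0006, "kcat_over_K_KIV": 0.005},
}


def fold_decrease(variant, parameter, table=TABLE_1):
    """Return the wild-type value divided by the variant value for one kinetic parameter."""
    return table["WT"][parameter] / table[variant][parameter]
```

For example, `fold_decrease("L79A", "kcat_over_K_KIV")` evaluates 0.50/0.008, about 62-fold, in line with the roughly 60-fold decrease quoted for L79A. Because the tabulated values are rounded, ratios computed this way are approximate.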

Effect of Mutations on l-Leucine Inhibition

Kinetic assays were performed to determine the effect of l-leucine binding on the active enzyme variants. In the presence of l-leucine, all of the enzyme variants displayed slow-onset inhibition kinetics (nonlinear biphasic progression curves) similar to the results seen with the wild-type enzyme.[37] Inhibition parameters for two-step, slow-onset inhibitors are described by two terms, Ki, which describes the inhibition constant for the initial enzyme–inhibitor complex, and Ki*, which describes the overall inhibition constant for the two-step inhibition mechanism.[18] The initial and steady state velocities were determined by varying concentrations of leucine (as described in the Materials and Methods) and fit to eqs 3 and 4 to determine the inhibition parameters Ki and Ki*, respectively (Figure S2, Supporting Information). Inhibition parameters are shown in Table 2. There is no drastic change in the values for the inhibition constants Ki and Ki* relative to those determined with the wild-type enzyme.

Table 2

Leucine Inhibition Parameters^a

enzyme	K_i (μM)	K_i* (μM)	β
WT	12 ± 3	2.3 ± 0.2	0.1
L79A	19 ± 3	4.0 ± 0.5	0
N83A^b	8.3 ± 2.2	3.2 ± 0.5	0.2
Q84A	13 ± 4	4.8 ± 0.5	0.04

^a Determined using a standard reaction mixture of 50 mM potassium phosphate buffer (pH 7.5), 12 mM MgCl2, 50 μM DTP, 5–10 times the Km values of AcCoA and KIV, and varied concentrations of leucine (0–120 μM).

^b Determined using 750 μM AcCoA.

Sequence Similarity Network for the DRE-TIM Metallolyase Superfamily

The overall sequence similarity network for representative members of the DRE-TIM metallolyase superfamily (1261 representative nodes representing 8817 unique sequences) is shown in Figure 2. Nodes represent sets of proteins sharing greater than 60% identity and are colored if at least one protein in the node has been annotated in Swiss-Prot[38] with a functional activity. Nodes shown as diamonds contain at least one sequence for which a three-dimensional structure has been reported in the Protein Databank.[20] The vast majority of proteins have not been biochemically characterized and are thus left as proteins of unknown function (“unknowns”, uncolored nodes). Correct annotation of these unknowns will require identification of their specificity determinants and is beyond the scope of this work. Four identifiable subgroups are defined based on clustering patterns and characterized functions: Claisen condensation-like (CC-like), carboxylase-like, lyase-like, and aldolase-like.

Figure 2

Representative sequence similarity network for the DRE-TIM metallolyase superfamily. Each node (1261 representative nodes) represents a group of protein sequences sharing greater than 60% identity (8817 unique sequences). Edges are drawn if the mean similarity between a pair of nodes is better than an E-value threshold cutoff of 1 × 10⁻²⁶ (median alignment length = 379, and median percent identity of pairwise comparisons = 29%). The network is displayed using the organic layout in Cytoscape. Disconnected nodes have been moved for clarity, as the distance between disconnected nodes has no meaning in this layout. The relative sizes of the nodes represent the number of sequences in the representative cluster (small nodes, 1–9 sequences; medium nodes, 10–99 sequences; large nodes, 100–208 sequences). Colored nodes represent sets of proteins in which at least one protein has been annotated in Swiss-Prot with a functional activity according to the inset color key. Conversely, nodes are left uncolored if no protein has been reported to be characterized in Swiss-Prot. Diamond shaped nodes contain at least one protein with a solved crystal structure in the PDB.[20] Dotted boxes display the four subgroups of the superfamily, as defined by clustering patterns: HMG-CoA lyase-like, CC-like (comprising proteins of the IPMS, homocitrate synthase, and citramalate synthase families, along with unknowns), aldolase-like, and carboxylase-like.

Sequence Similarity Network for the Claisen Condensation-Like Subgroup

The 583 representative nodes (representing 4298 nonidentical proteins) of the CC-like subgroup defined in Figure 2 were selected for the creation of a new network that includes all of the representative nodes of the subgroup and subjected to a more stringent E-value cutoff filter (Figure 3). The edges shown in this figure represent a mean pairwise alignment score for all-by-all comparisons of these 4298 sequences that is better than an E-value cutoff of 1 × 10⁻⁸⁰, with a median percent identity of 43% and a median alignment length of 377 residues. Analyses of the sequences found in this network reveal enzymes with at least six unique substrate specificities for the Claisen condensation reaction with AcCoA (Scheme 3). Representative nodes in Figure 3 are colored on the basis of having at least one protein reported to have in vitro characterization of enzymatic activity for IPMS,[39−44] citramalate synthase (CMS),[9,45,46] homocitrate synthase (HCS),[47,48] methylthioalkylmalate synthase (MAM),[49] R-citrate synthase (R-CS),[50] and 2-phosphinomethylmalic synthase (PMMS).[51] A full table of characterized enzymes with Uniprot identifiers is shown in Table S2 (Supporting Information). Functional assignments shown in Figure 2 are in good agreement with reported Swiss-Prot functional annotation (Figure S3, Supporting Information). The largest cluster contains significant functional diversity, with IPMS, CMS, MAM, and HCS activity represented. Interestingly, reported IPMS, CMS, and HCS activities can be found in multiple clusters. This is consistent with a report proposing multiple origins for IPMS[52] and could be suggestive of additional functional promiscuity.

Figure 3

Representative sequence similarity network for the CC-like subgroup. Each node (583 representative nodes) represents a group of protein sequences sharing greater than 60% identity (4298 unique sequences). Edges are drawn if the similarity between a pair of nodes is better than an E-value threshold cutoff of 1 × 10⁻⁸⁰ (median alignment length = 377, and median percent identity of pairwise comparisons = 43%). Colored nodes represent sets of proteins in which at least one protein has confirmed in vitro activity according to the inset color key. Organism names indicate which protein within the representative node has been characterized. Conversely, nodes are left uncolored if no protein has been reported to be characterized in the literature. Diamond shaped nodes contain at least one protein with a solved crystal structure in the PDB.[20]

Scheme 3

Discussion

Targeting evolutionarily conserved residues for site-directed mutagenesis is a common approach used to investigate the chemical mechanism and specificity determinants of an enzyme. In light of the rapid increase in genomic sequences available, the identification of “strictly” conserved residues in a given enzyme is complicated by misannotation in public databases[53] and the identification of functionally diverse enzyme superfamilies that utilize conserved active site architecture associated with a fundamental catalytic strategy or partial reaction to catalyze a variety of chemical reactions. The addition of “genomic enzymology,”[6] describing enzyme catalysis from the context of structure–function relationships among homologous members of enzyme superfamilies, to the toolbox of mechanistic enzymology provides an organizational framework, aids in our understanding of the evolution of function, and helps interpret structure-based mechanisms studies. Here, sequence-similarity networks have been used to summarize some of these relationships across the entire membership of the DRE-TIM metallolyase superfamily and assist in the interpretation of our mechanistic analysis of a conserved active-site helix in MtIPMS.

Analysis of the DRE-TIM Metallolyase Superfamily Network

The superfamily network shown in Figure 2 depicts some of the benefits and challenges associated with a network-based interpretation. The four enzyme activities originally proposed in the DRE-TIM metallolyase superfamily are easily identified in the Claisen condensation-like (CC-like), carboxylase-like, lyase-like, and aldolase-like subgroups. Functional diversity within each subgroup is well established for the CC-like and carboxylase-like subgroups including the identification of two less-documented activities: 2-phosphinomethylmalic acid synthase[51] and α-ketoglutarate carboxylase.[54] Identification of reported functional diversity provides a strong foundation for the discovery of differentially conserved residues specific to each reported activity. Once the roles of these residues are confirmed, sequences of unknown function containing the confirmed residues can be annotated. Additionally, identification of sequences containing unique residues at positions important for substrate selectivity can offer new hypotheses for screening for new functionalities. Functional diversity is also suggested by inspection of the aldolase-like cluster in Figure 2. All of the nodes containing a Swiss-Prot reviewed sequence for 4-OH-2-ketovalerate aldolase activity are located in a single region, while the other half of the subgroup remains unexplored and may contain new functionalities.

Analysis of the CC-Like Subgroup Network

As the stringency for drawing edges between representative nodes is increased, the CC-like subgroup breaks into multiple clusters (Figure 3). One of the more interesting results is the identification of the main three activities (IPMS, CMS, and HCS) in multiple clusters. Results from the analysis of other superfamilies indicate that multiple clusters of an activity can be the result of multiple evolutions from distinct but related progenitors described as “pseudoconvergent evolution”.[55,56] In these cases, different cluster memberships correlate with differences in substrate or stereochemical specificity and differential conservation of residues involved in substrate selectivity. Currently, the IPMS clusters contain the most characterized members. Despite differences in sequence identity (<20% average sequence identity between clusters and 50% average sequence identity within a cluster), no significant differences in function or specificity have been identified, and residues shown to interact with KIV are conserved in both clusters. However, membership in the two IPMS clusters is similar to that reported from a phylogenetic analysis of IPMS sequences suggesting multiple origins for IPMS genes,[52] and differential conservation of residues not directly involved in substrate selectivity can be identified as described below. In the cases of the HCS- and CMS-containing clusters, differential conservation of active site residues predicted to be in contact with the α-ketoacid substrates suggests these activities are examples of pseudoconvergent evolution, but more rigorous phylogenetic and functional analyses are required to further support this hypothesis.

The distribution of regulatory domains in the CC-like subgroup also provides a framework for the identification of boundaries within the LeuA dimer regulatory domain. Recoloring the CC-like subgroup network on the basis of predicted domain architecture indicates there are four main clusters which contain sequences predicted to contain an N-terminal DRE-TIM metallolyase catalytic domain and a C-terminal LeuA dimer regulatory domain: IPMS1/CMS1/MAM, IPMS2, CMS2, and CMS3 (Figure S4, Supporting Information). Unfortunately, only two structures of the LeuA dimer domain have been reported (MtIPMS in IPMS2 cluster and CMS from Leptospira interrogans in CMS3 cluster), limiting the ability to draw structural comparisons. Phenomenological mechanisms of allosteric regulation (i.e., V-type or K-type) have been reported for members of the IPMS1, IPMS2, and CMS3 clusters and suggest that multiple mechanisms of regulation are represented within each cluster, ruling out allosteric mechanism as a characteristic driving cluster membership.[2] The allosteric mechanism of V-type allosteric inhibition in MtIPMS is due to the perturbation of the hydrolysis step in the reaction.[57] Current work is focused on determining if this mechanism is conserved in members of other clusters exhibiting V-type allosteric regulation.

Superfamily Analysis of Helix Residues in Catalysis

An examination of conservation patterns of residues in the active-site α helix found in characterized members of the superfamily indicates the only amino acid strictly conserved is an arginine residue (Figure 4), suggesting a critical role in catalysis. As described above, substitution of R80 in MtIPMS with alanine or lysine results in inactive enzymes. Our results are in agreement with similar studies on substitution at the arginine position in representative members of the carboxylase-like,[12] aldolase-like,[13] and lyase-like subgroups.[11] Thus, as predicted by conservation, the active site helix arginine is critical for activity across the superfamily. Mechanistically, the arginine residue is predicted to assist in the stabilization of the enolate intermediate in each of the subgroups. However, structural comparison of four representative superfamily members indicates differences in the location of the enolate ion stabilized by the arginine residue, with aldolase-like enzymes having an alternate location (Figure S5, Supporting Information). This suggests that despite differences in substrates and reactions, the arginine is essential to maintain the electronic and catalytic requirements fundamental to catalysis in all members of the superfamily.

Figure 4

Hidden-Markov Model logos for the active site helix of DRE-TIM metallolyase superfamily members. Logos were generated from alignments of sequences in each boxed cluster in Figure 2 using Web Logo.[64] Arrows indicate the main residues discussed in the article. A carboxylic group is strictly conserved in the position adjacent to arginine in the DRE-TIM metallolyase superfamily (D81 in MtIPMS). In HMG-CoA lyase, substitution of the aspartate with alanine, glycine, or histidine results in a 104–105-fold decrease in specific activity in the human enzyme, while substitution with glutamate decreases activity 10-fold.[58] In MtIPMS, substitution with alanine or histidine results in an inactive enzyme; however, a 105 decrease would be at the limit of the spectrophotometric assay for CoA detection. While the aspartate (the “D” in the DRE motif) is favored, in some superfamily sequences a glutamate is found adjacent to the conserved arginine instead (Figure 4). This is discussed below with respect to epistatic interactions within the CC-like subgroup. The other strongly conserved residue in the helix is a glutamine residue found three positions after the aspartate (Q84 in MtIPMS) (Figures 1 and 4). The role of the residue at this position appears to be a major factor in the development of reaction diversity for the DRE-TIM metallolyase superfamily. Members of the carboxylase-like and CC-like subgroups both have conserved glutamine residues at this position. However, substitutions at this position produce drastically different catalytic outcomes. In pyruvate carboxylase from R. etli, substitution of glutamine with alanine or asparagine inactivates pyruvate carboxylase activity and the ability of the enzyme to enolize pyruvate.[12] In sharp contrast to pyruvate carboxylase, substitution of the conserved glutamine with alanine in MtIPMS and homocitrate synthase from S. 
pombe results in enzymes with only a 10- to 30-fold decrease in the value for kcat relative to the wild-type enzymes.[10] This result suggests the residue plays unique roles in the mechanism of each enzyme. In pyruvate carboxylase, the glutamine is hypothesized to orient pyruvate and maintains an interaction with the helix arginine residue even in the absence of pyruvate.[59,60] In the CC-like subgroup, the glutamine residue can interact with the carbonyl of AcCoA. AcCoA makes additional binding interactions with the enzyme such that disruption of interaction with Q84 is well tolerated. Additionally, reported crystal structures of MtIPMS exhibit different conformations for Q84 (including interactions with AcCoA, R80, R427, and E317) or are missing electron density for the side chain suggesting flexibility at this position. Residues R427 and E317 are strictly conserved in the CC-like subgroup, and future mutagenesis studies will investigate the importance of interactions of the glutamine with E317 and R427 in MtIPMS. A second piece of evidence concerning the importance of the glutamine position in reaction specificity is seen in the aldolase-like subgroup. From the similarity networks, it can be seen that the aldolase-like subgroup contains tyrosine, histidine, or leucine in place of glutamine (Figures 1 and 4). In the class II pyruvate aldolase BphI (see Scheme 1 for the 4-OH-2-ketovalerate aldolase reaction), the histidine residue acts as a general base to deprotonate the C-4 hydroxyl group in the first step of the reaction as substitution with alanine decreases the kcat value by 50-fold relative to that of the wild-type and results in the loss of a pKa value in the pH rate profile.[13] The unique use of histidine as the general base by the aldolase-like subgroup is consistent with the alternate position of the enolate ion relative to the other subgroups. 
Aldolase-like subgroup members with tyrosine or leucine at this position, however, would be unable to utilize this mechanism. This suggests that these systems utilize architecturally distinct amino acids as the catalytic base or that members lacking the histidine residue catalyze a different reaction. In support of the latter possibility, a member of the aldolase-like subgroup from M. tuberculosis (Rv3469c) containing a tyrosine at this position was reported to lack aldolase activity.[61] The protein did exhibit oxaloacetate decarboxylase activity, albeit with a very low catalytic efficiency (1.6 × 10² M⁻¹ s⁻¹). As a β-keto decarboxylation, this reaction does not require a base to generate the enolate intermediate.

The residues flanking the arginine and aspartate on the helix are highly similar across the DRE-TIM metallolyase superfamily, with a hydrophobic residue preceding the arginine residue and a glycine/alanine following the aspartate residue. These residues are likely important in stabilizing the overall architecture of the active site α helix. The main effect of the L79A substitution in MtIPMS is in agreement with this conjecture, as the only significant perturbation is a 20-fold increase in the Km value determined for KIV. This is the only report of a substitution at this position, and the alanine/glycine position has not previously been investigated; therefore, limited context is available to interpret the significance of this result with respect to other members of the superfamily.
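The conservation statements above rest on how often each residue occurs at each helix position within a cluster alignment. As a minimal sketch of how such conservation can be quantified, the following computes the Shannon information content of each alignment column, the quantity conventionally plotted as stack height in a standard sequence logo. The four sequences are invented for illustration only, the function name is hypothetical, and the calculation omits the small-sample correction and the HMM-based treatment that logo tools may apply, so it should not be read as the procedure used to generate Figure 4.

```python
import math
from collections import Counter


def column_information(column, alphabet_size=20):
    """Shannon information content (bits) of one alignment column, ignoring gaps."""
    residues = [r for r in column if r != "-"]
    if not residues:
        return 0.0
    counts = Counter(residues)
    total = len(residues)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    # Maximum possible entropy for 20 amino acids minus the observed entropy.
    return math.log2(alphabet_size) - entropy


# Hypothetical toy alignment of a six-residue helix segment (NOT real sequences).
toy_alignment = [
    "LRDGNQ",
    "IRDAEQ",
    "VRDGNQ",
    "LREANQ",
]

# zip(*alignment) transposes rows (sequences) into columns (positions).
for position, column in enumerate(zip(*toy_alignment), start=1):
    print(position, "".join(column), round(column_information(column), 2))
```

In this toy input, the invariant arginine and glutamine columns reach the maximum of log2(20) bits, while the mixed aspartate/glutamate column scores lower, mirroring the distinction between strictly conserved and differentially conserved positions.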

Differential Conservation of Helix Residues in the Claisen Condensation-Like Subgroup

Analysis of the activities reported for members of the CC-like subgroup (Scheme 3) indicates that this subgroup is specificity diverse, i.e., these enzymes catalyze a common reaction with varying substrate specificities. The majority of sequences are annotated as either IPMS, homocitrate synthase (HCS), or R-citramalate synthase (CMS) (Figure S3, Supporting Information). The arginine and aspartate (or glutamate) have been previously investigated in HCS and CMS with kinetic results similar to those reported here.[9,10] This is not surprising, as the helix residues appear to be required for catalysis, while residues involved in substrate selectivity are located elsewhere in the active site. As noted above, each activity has been experimentally demonstrated in multiple clusters of the network shown in Figure 3, suggesting differences in the evolutionary conservation of amino acids between clusters for enzymes catalyzing identical reactions, including possible epistatic constraints within each cluster. Analysis of differential conservation can provide insight into strategies for substrate selectivity or account for differences in other properties, such as allosteric regulation. With respect to the active site helix under investigation, two examples of differential conservation within the CC-like subgroup can be readily identified and are described below.

While the DRE motif aspartate is commonly found as a ligand to the metal ion, enzymes involved in fungal lysine biosynthesis (HCS (Lys) cluster of Figure 3) have a strictly conserved glutamate residue in the analogous position (i.e., ERE-motif) (Figure 5A).
Structural studies indicate glutamate acts as a ligand to the divalent cation, similar to the more common aspartate.[10] A comparison of structures from HCS (Lys) and the IPMS clusters indicates that the glutamate substitution is linked to a conserved compensatory/epistatic substitution of an isoleucine in place of an asparagine (D81/N321 in MtIPMS and E44/I251 in HCS from S. pombe) (Figure 5B). Without the N → I substitution, the glutamate residue would cause substantial steric clashes in the active site. Moreover, the asparagine residue is well-conserved within the CC-like and HMG-CoA lyase-like subgroups and can act as an additional ligand to the metal ion in IPMS[41] and HMG-CoA lyase.[62] Future mutagenesis studies on MtIPMS will determine if the (D → E)/(N → I) mutations are tolerated in enzymes from other clusters in the CC-like subgroup and explore a role for the alternative metal architecture.
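The paired substitution described above (D81/N321 in MtIPMS versus E44/I251 in HCS from S. pombe) can be written as a simple lookup. The sketch below is illustrative only: the function name and labels are hypothetical, and any residue pair not named in the text is flagged rather than predicted, since mixed combinations have not been characterized here.

```python
# Residue pairs taken from the text: (helix metal ligand, partner position).
OBSERVED_PAIRS = {
    ("D", "N"): "DRE-type (e.g., D81/N321 in MtIPMS)",
    ("E", "I"): "ERE-type (e.g., E44/I251 in HCS from S. pombe)",
}


def metal_site_architecture(helix_residue, partner_residue):
    """Label a helix/partner residue pair; unobserved pairs are flagged, not predicted."""
    key = (helix_residue.upper(), partner_residue.upper())
    return OBSERVED_PAIRS.get(key, "combination not described in the text")


print(metal_site_architecture("D", "N"))
print(metal_site_architecture("E", "I"))
# Glutamate without the N -> I change: the pairing the text expects to clash sterically.
print(metal_site_architecture("E", "N"))
```

Applied across a cluster alignment, a lookup of this kind would show whether the two positions always change together, which is the pattern expected for a compensatory substitution.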

Figure 5

Examples of differential conservation of helix residues in the CC-like subgroup. (A) Alignment of helix residues generated from sequences belonging to IPMS1, IPMS2, and HCS (Lys) clusters. Residues exhibiting differential conservation and mentioned in the discussion are highlighted with a diamond. (B) Superposition of MtIPMS (1sr9,[63] brown) and HCS from S. pombe (3ivt,[10] blue). (C) Superposition of MtIPMS (1sr9, brown) and NmIPMS (3rmj,[31] blue).

Comparing active site helices from the two IPMS clusters provides a second example of differential conservation. In IPMS enzymes, the residue corresponding to N83 in MtIPMS is most often an asparagine or glutamate. From the sequence similarity network, it can be concluded that the identity of the residue at this position can be used to categorize each sequence into either the IPMS1 (glutamate) or IPMS2 (asparagine) cluster (Figure 5A). In MtIPMS, the residue is an asparagine (N83), in agreement with its placement in the IPMS2 cluster. Substitution of N83 with alanine results in a 25-fold elevation in the Km value for AcCoA and a 10-fold decrease in the kcat value. Surprisingly, when the N83E substitution is made, no activity is detectable. A comparison of structures from MtIPMS and IPMS from Neisseria meningitidis (NmIPMS, a member of IPMS1) fails to identify a conserved compensating substitution in the region near the asparagine/glutamate residue (Figure 5C). In fact, in silico modeling of the substitution suggests that the glutamate can be accommodated in the MtIPMS scaffold without causing steric clashes. Circular dichroism studies are consistent with a properly folded enzyme, ruling out a large change in structure as the cause of the loss of activity. One possibility for the loss of activity in the N83E variant of MtIPMS is that this position is constrained by long-range epistatic factors specific to each cluster.
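The two kinetic changes reported above for the N83A variant can be combined into a single effect on catalytic efficiency with respect to AcCoA. The following is a rough worked example rather than an analysis from the article: it multiplies the fold changes in kcat and Km and converts the result into an apparent free-energy difference using RT ln(fold). The function names are hypothetical, 298.15 K is an assumed temperature, and the energetic interpretation assumes that kcat/Km reports on a single rate-limiting barrier.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def efficiency_fold_decrease(kcat_fold_decrease, km_fold_increase):
    """Fold decrease in kcat/Km given a fold decrease in kcat and a fold increase in Km."""
    return kcat_fold_decrease * km_fold_increase


def apparent_ddg_kcal_per_mol(fold_decrease, temperature_k=298.15):
    """Apparent free-energy penalty, RT * ln(fold decrease); temperature is an assumption."""
    return R_KCAL * temperature_k * math.log(fold_decrease)


# N83A in MtIPMS: 10-fold lower kcat and 25-fold higher Km for AcCoA (from the text).
n83a_fold = efficiency_fold_decrease(kcat_fold_decrease=10, km_fold_increase=25)
print(n83a_fold)  # 250
print(round(apparent_ddg_kcal_per_mol(n83a_fold), 2))
```

Under these assumptions, the alanine substitution lowers kcat/Km for AcCoA by 250-fold, a penalty on the order of 3 kcal/mol, which makes the complete loss of detectable activity with the seemingly conservative N83E substitution all the more striking.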

Role of Helix Residues in the Allosteric Mechanism of MtIPMS

Residues on the conserved helix were recently implicated in the allosteric mechanism of MtIPMS on the basis of L-leucine-induced changes in dynamics measured by backbone amide hydrogen/deuterium exchange.[14] These results suggested a plausible mechanism for the allosteric regulation of MtIPMS involving perturbation of the helix by L-leucine binding. However, all active variants exhibit slow-onset inhibition in the presence of L-leucine and give similar values for inhibition parameters relative to those determined with the wild-type enzyme. As the slow-onset mechanism has recently been linked to the movement of a loop in the regulatory domain,[17] it is not surprising that biphasic kinetics is exhibited. However, the lack of perturbation to the inhibition parameters suggests that residues on the helix do not directly participate in the allosteric mechanism, although direct involvement of R80 or D81 cannot be ruled out, as substitution at these positions creates inactive enzymes. More recently, kinetic isotope effects were used to identify the hydrolysis step as the allosteric target of inhibition.[57] From the integrated mutagenesis and bioinformatics results, the main role of the conserved active site helix is stabilization of the common enolate intermediate, suggesting that the catalytic machinery for hydrolysis (and the target for allosteric regulation) in the CC-like subgroup lies elsewhere in the active site.

Conclusions

This work represents the first large-scale bioinformatics analysis of the DRE-TIM metallolyase superfamily, used here to provide enhanced context for understanding the mechanistic contributions of residues on an active site α helix conserved across characterized members of the superfamily. Taken together, the experimental and bioinformatics results indicate a critical role for the DRE motif arginine residue for all of the different reactions currently known across the superfamily, most likely through the stabilization of the common enolate ion. The DRE motif aspartate is also essential for metal ion interactions, although alternate architectures using glutamate instead of aspartate are possible. It will be of interest to determine if evolution of glutamate-containing helices provides unique properties to enzymes when compared with the canonical aspartate-containing homologues. Finally, with the similarity network in place, hypotheses concerning mechanisms of substrate and reaction specificity can be investigated for members of the other subgroups.

62 in total

1. Mapping of the allosteric network in the regulation of alpha-isopropylmalate synthase from Mycobacterium tuberculosis by the feedback inhibitor L-leucine: solution-phase H/D exchange monitored by FT-ICR mass spectrometry.

Authors: Patrick A Frantom; Hui-Min Zhang; Mark R Emmett; Alan G Marshall; John S Blanchard
Journal: Biochemistry Date: 2009-08-11 Impact factor: 3.162

2. Recent developments in the MAFFT multiple sequence alignment program.

Authors: Kazutaka Katoh; Hiroyuki Toh
Journal: Brief Bioinform Date: 2008-03-27 Impact factor: 11.622

3. Domain architecture of pyruvate carboxylase, a biotin-dependent multifunctional enzyme.

Authors: Martin St Maurice; Laurie Reinhardt; Kathy H Surinya; Paul V Attwood; John C Wallace; W Wallace Cleland; Ivan Rayment
Journal: Science Date: 2007-08-24 Impact factor: 47.728

4. Crystal structure and functional analysis of homocitrate synthase, an essential enzyme in lysine biosynthesis.

Authors: Stacie L Bulfer; Erin M Scott; Jean-François Couture; Lorraine Pillus; Raymond C Trievel
Journal: J Biol Chem Date: 2009-12-18 Impact factor: 5.157

5. Using sequence similarity networks for visualization of relationships across diverse protein superfamilies.

Authors: Holly J Atkinson; John H Morris; Thomas E Ferrin; Patricia C Babbitt
Journal: PLoS One Date: 2009-02-03 Impact factor: 3.240

6. Molecular basis of the substrate specificity and the catalytic mechanism of citramalate synthase from Leptospira interrogans.

Authors: Jun Ma; Peng Zhang; Zilong Zhang; Manwu Zha; Hai Xu; Guoping Zhao; Jianping Ding
Journal: Biochem J Date: 2008-10-01 Impact factor: 3.857

7. Evolution of enzymatic activities in the enolase superfamily: stereochemically distinct mechanisms in two families of cis,cis-muconate lactonizing enzymes.

Authors: Ayano Sakai; Alexander A Fedorov; Elena V Fedorov; Alexandra M Schnoes; Margaret E Glasner; Shoshana Brown; Marc E Rutter; Kevin Bain; Shawn Chang; Tarun Gheyi; J Michael Sauder; Stephen K Burley; Patricia C Babbitt; Steven C Almo; John A Gerlt
Journal: Biochemistry Date: 2009-02-24 Impact factor: 3.162

8. Kinetic evidence for interdomain communication in the allosteric regulation of alpha-isopropylmalate synthase from Mycobacterium tuberculosis.

Authors: Luiz Pedro S de Carvalho; Patrick A Frantom; Argyrides Argyrou; John S Blanchard
Journal: Biochemistry Date: 2009-03-10 Impact factor: 3.162

9. The Pfam protein families database.

Authors: Marco Punta; Penny C Coggill; Ruth Y Eberhardt; Jaina Mistry; John Tate; Chris Boursnell; Ningze Pang; Kristoffer Forslund; Goran Ceric; Jody Clements; Andreas Heger; Liisa Holm; Erik L L Sonnhammer; Sean R Eddy; Alex Bateman; Robert D Finn
Journal: Nucleic Acids Res Date: 2011-11-29 Impact factor: 16.971

10. Annotation error in public databases: misannotation of molecular function in enzyme superfamilies.

Authors: Alexandra M Schnoes; Shoshana D Brown; Igor Dodevski; Patricia C Babbitt
Journal: PLoS Comput Biol Date: 2009-12-11 Impact factor: 4.475
