Azul Zorzoli1, Benjamin H Meyer1, Elaine Adair2, Vladimir I Torgov3, Vladimir V Veselovsky3, Leonid L Danilov3, Dusan Uhrin2, Helge C Dorfmueller4. 1. Division of Molecular Microbiology, School of Life Sciences, University of Dundee, Dundee, DD1 5EH, United Kingdom. 2. EaStCHEM School of Chemistry, University of Edinburgh, Edinburgh, EH9 3FJ, United Kingdom. 3. N. D. Zelinsky Institute of Organic Chemistry, Russian Academy of Sciences, Moscow 119334, Russia. 4. Division of Molecular Microbiology, School of Life Sciences, University of Dundee, Dundee, DD1 5EH, United Kingdom h.c.z.dorfmueller@dundee.ac.uk.
Abstract
Group A carbohydrate (GAC) is a bacterial peptidoglycan-anchored surface rhamnose polysaccharide (RhaPS) that is essential for growth of Streptococcus pyogenes and contributes to its ability to infect the human host. In this study, using molecular and synthetic biology approaches, biochemistry, radiolabeling techniques, and NMR and MS analyses, we examined the role of GacB, encoded in the S. pyogenes GAC gene cluster, in the GAC biosynthesis pathway. We demonstrate that GacB is the first characterized α-d-GlcNAc-β-1,4-l-rhamnosyltransferase that synthesizes the committed step in the biosynthesis of the GAC virulence determinant. Importantly, the substitution of S. pyogenes gacB with the homologous gene from Streptococcus agalactiae (Group B Streptococcus), Streptococcus equi subsp. zooepidemicus (Group C Streptococcus), Streptococcus dysgalactiae subsp. equisimilis (Group G Streptococcus), or Streptococcus mutans complemented the GAC biosynthesis pathway. These results, combined with those from extensive in silico studies, reveal a common phylogenetic origin of the genes required for this priming step in >40 pathogenic species of the Streptococcus genus, including members from the Lancefield Groups B, C, D, E, G, and H. Importantly, this priming step appears to be unique to streptococcal ABC transporter-dependent RhaPS biosynthesis, whereas the Wzx/Wzy-dependent streptococcal capsular polysaccharide pathways instead require an α-d-Glc-β-1,4-l-rhamnosyltransferase. The insights into the RhaPS priming step obtained here open the door to targeting the early steps of the group carbohydrate biosynthesis pathways in species of the Streptococcus genus of high clinical and veterinary importance.
Group A carbohydrate (GAC) is a bacterial peptidoglycan-anchored surface rhamnose polysaccharide (RhaPS) that is essential for growth of Streptococcus pyogenes and contributes to its ability to infect the human host. In this study, using molecular and synthetic biology approaches, biochemistry, radiolabeling techniques, and NMR and MS analyses, we examined the role of GacB, encoded in the S. pyogenesGAC gene cluster, in the GAC biosynthesis pathway. We demonstrate that GacB is the first characterized α-d-GlcNAc-β-1,4-l-rhamnosyltransferase that synthesizes the committed step in the biosynthesis of the GAC virulence determinant. Importantly, the substitution of S. pyogenesgacB with the homologous gene from Streptococcus agalactiae (Group B Streptococcus), Streptococcus equi subsp. zooepidemicus (Group C Streptococcus), Streptococcus dysgalactiae subsp. equisimilis (Group G Streptococcus), or Streptococcus mutans complemented the GAC biosynthesis pathway. These results, combined with those from extensive in silico studies, reveal a common phylogenetic origin of the genes required for this priming step in >40 pathogenic species of the Streptococcus genus, including members from the Lancefield Groups B, C, D, E, G, and H. Importantly, this priming step appears to be unique to streptococcal ABC transporter-dependent RhaPS biosynthesis, whereas the Wzx/Wzy-dependent streptococcal capsular polysaccharide pathways instead require an α-d-Glc-β-1,4-l-rhamnosyltransferase. The insights into the RhaPS priming step obtained here open the door to targeting the early steps of the group carbohydrate biosynthesis pathways in species of the Streptococcus genus of high clinical and veterinary importance.
Streptococcus pyogenes is a versatile Gram-positive bacterium that infects only humans and is responsible for a remarkable number of mild to severe illnesses. At least 700 million individuals are affected each year worldwide by diseases as varied as impetigo, pharyngitis, scarlet fever, necrotizing fasciitis, meningitis, and toxic shock syndrome, among others (1–4). Moreover, autoimmune postinfection sequelae with high mortality rates, such as acute rheumatic fever, acute glomerulonephritis, or rheumatic heart disease, can affect individuals who had previously suffered from Group A Streptococcus infections, extending the list of clinical manifestations caused by this severe pathogen (5, 6).As suggested by its diverse clinical manifestations, S. pyogenes rely on different mechanisms to withstand the host's defenses (7–11). These mechanisms are supported by the synthesis of a wide array of virulence factors, among which is the Group A carbohydrate (GAC), a surface polysaccharide that constitutes between 40 and 60% of the bacterial cell wall (12–15). GAC is composed of a [→3)α-Rha(1→2)α-Rha(1→] rhamnose polysaccharide (RhaPS) backbone with a β-d-GlcNAc (1→3) side chain modification on every α-1,2–linked rhamnose (Rha) (15–17). Recent structural examinations and composition analysis of the GAC also suggest the presence of glycerol phosphate (GroP) (18), an observation that remained unnoticed for over 50 years (19, 20). Further, Edgar et al. (18) demonstrated that ∼25% of GAC side-chain GlcNAcs are decorated with GroP, imparting a negative charge to this polymer that has implications for S. pyogenes biology and defense mechanisms (19, 21). This feature, previously identified in other surface glycans (22, 23), provided new insight into the structural composition, biosynthesis, and function of GAC.GAC is proposed to be synthesized by 12 proteins, GacABCDEFGHIJKL, encoded in one gene cluster (i.e. MGAS5005_spy0602–0613) that has been found in all S. pyogenes species identified so far (7, 24). Through sequencing of transposon mutant libraries, Le Breton et al. (10, 25) discovered that eight of these genes, gacABCDEFG and gacL, are essential for S. pyogenes survival. This information supports the observation by van Sorge et al. (7), who identified via insertional mutagenesis that the first three genes of the cluster (gacABC) are essential.The present hypothesis is that the GAC is formed in six consecutive steps, depicted in Scheme 1: (i) lipid-linked acceptor initiation, (ii) [→3)α-Rha(1→2)α-Rha(1→] RhaPS backbone synthesis, (iii) membrane translocation, (iv) synthesis of sidechain precursor, (v and vi) post-translocational chain modification steps. This product is linked to peptidoglycan (15). The cytoplasmic pool of TDP-β-l-rhamnose (TDP-Rha) necessary for RhaPS backbone synthesis is supplied by enzymes encoded in two separate gene clusters rmlABC and gacA/rmlD (24). Step (i) involves the addition of a Rha residue to a GlcNAc-PP-undecaprenyl (Und, WT2C8-). Rush et al. proposed that either GacB or GacC are a rhamnosyltransferase that transfers the first Rha residue onto the 3-OH group of the GlcNAc-PP-Und (14). The formation of the [→3)α-Rha(1→2) α-Rha(1→] RhaPS backbone constitutes the elongation step (ii), presumably achieved at the inner leaflet of the membrane by the glycosyltransferases (GTs) GacC, GacF, and GacG (14). However, the precise role of each GTs in the synthesis of the RhaPS backbone remains unknown. Consistent with this hypothesis are findings related to the biosynthesis of the Streptococcus mutans serotype Ccarbohydrate (SCC, previously referred to as RGP). Shibata et al. (26) reported that the genes rgpA/sccB, rgpB/sccC, and rgpF/sccG of this dental pathogen (homologs of gacB, gacC, and gacG, respectively) are required for the biosynthesis of the RhaPS backbone, although the structural or mechanistic examination of these enzymes was never conducted. According to the current gene annotation, the translocation step (iii) is proposed to be catalyzed by an ATP-dependent ABC transporter encoded by gacD and gacE (7, 14, 15). Concerning the post-translocational RhaPS modifications (v, vi), recent insights revealed that GacI synthesizes the sugardonor precursor GlcNAc-P-Und (iv) and that GacJ forms a complex with GacI to enhance the catalytic efficiency of this process (14). The lipid-linked monophosphate sugarGlcNAc-P-Und is proposed to be flipped across the membrane by GacK (14, 15). Furthermore, GacL was shown to utilize the extracellular GlcNAc-P-Und by transferring the GlcNAc onto the RhaPS backbone, to insert the antigenic β-d-GlcNAc (1→3) side chain modification on every α-1,2-linked rhamnose. Finally, GacH is a GroP-transferase that decorates the RhaPS-GlcNAc side chain (18). Enzymes conducting the transfer of GAC to the peptidoglycan as well as its linkage, presumably through phosphodiester bond (18), remain to be elucidated.
SCHEME 1.
Current model for the GAC biosynthesis pathway from
Current model for the GAC biosynthesis pathway fromOften, the production of bacterial cell wall structures determines the fate of a bacterium in a given environment. Glycopolymers in particular, such as cell wall teichoic acids and capsular polysaccharides, are involved in growth, cell division, and bacterial homeostasis (15, 27–34). In the context of human health, these glycopolymers have been shown to play a significant role in biofilm formation, antibiotic resistance, host cell adhesion, and resistance to the host immune response by, for example, blocking antibacterial peptides and evading phagocytosis (8, 24, 35–39).Many species within the Lactobacillales order present a surface RhaPS intertwined with additional components of the thick cell wall that is typical of Gram-positive bacteria. These surface glycans, as with other critical peptidoglycan-anchored structures, are essential for cell viability and play an essential role in pathogenesis (7, 9, 15, 24, 36, 40–42). In the Streptococcus genus, in particular in S. pyogenes, S. agalactiae, and S. mutans, genetic studies conducted on S. pyogenes strains showed that the absence of rhamnoseglycopolymers results in severe growth defects associated with an altered cell wall and an increased susceptibility to muralytic enzymes (7, 25, 43, 43–45). Moreover, in vivo studies revealed that RhaPS depletion or its modification reduced the ability of S. pyogenes, S. agalactiae, and S. mutans to evade the immune system and attenuated their virulence (7, 25, 46–49).Based on its essentiality, its role in pathogenesis, and the current evidence of the importance of RhaPS in bacterial homeostasis, we are confident that the detailed elucidation of all steps in the GAC biosynthesis pathway will play a decisive role in the development of antimicrobial drugs and anti-streptococcal vaccines in the near future (7, 46, 50). Despite the recent progress, several pressing questions remain unanswered regarding the biosynthesis of GAC. For example, the products of six of the 12 genes that constitute the GAC cluster (gacBCDEFG) have not yet been characterized, including the essential steps of GAC initiation, RhaPS backbone biosynthesis, and GAC translocation.As a means of attaining more information on the GAC initiation step, we conducted an in-depth examination of the second enzyme encoded in the GAC gene cluster. Here we demonstrate that GacB, in disagreement with its preliminary genetic annotation and currently proposed mechanism of action (14), is the first rhamnosyltransferase in this pathway that transfers l-rhamnose from TDP-β-l-rhamnose onto the acceptor GlcNAc-PP-Und. GacB thereby forms a β-1,4-glycosidic bond between rhamnose and the lipid-linked GlcNActhrough a metal-independent mechanism. Strikingly, our research on phylogenetically related homologs from other important human pathogenic streptococci, in particular from the Lancefield Groups B, C, and G, reveals that the role of GacB is conserved within the Streptococcus genus, suggesting a common first committed step for the production of RhaPS from all Lancefield groups. Thus, the implication of this research is relevant for those seeking to understand the pathogenesis and potential therapeutic interventions against streptococcal species important to human and animal health.
Results and discussion
Protein sequence analysis of the putative rhamnosyltransferase GacB
The current gene annotation indicates that the second gene of the GAC cluster encodes a putative α-d-GlcNAc-α-1,2-l-rhamnosyltransferase, named GacB (7, 15). Recent findings revealed that GlcNAc-PP-Und is essential for the GACRhaPS backbone biosynthesis in S. pyogenes (14), supporting the current hypothesis that the addition of rhamnose during the initiation step occurs directly onto this lipid-disphosphate sugar acceptor (7, 15, 18). The gacB gene product is a 384-amino-acid-long polypeptide with a theoretical molecular mass of 44 kDa; this is consistent with the size of other known single-reaction GTs (51, 52).In silico topological predictions conducted with SpOctopus- and TMHMM-based algorithms (53–55) indicate that GacB does not contain a signal peptide sequence but possesses a short hydrophobic region at its N terminus end, suggesting that this protein is not secreted but is tethered to the cytosolic side of the membrane (Fig. S1). This prediction is in agreement with the proposed function catalyzing the reaction onto a lipid-linked acceptor, as well as with the observed membrane association during the heterologous protein expression.The most relevant structural information provided by the domain architecture analysis is that GacB belongs to the RfaB superfamily of GTs, a group of nucleotide sugar-dependent enzymes known to be involved in cell wall biosynthesis (54, 55). In the absence of any known function, GacB has been preliminarily classified based on its protein sequence into the GT4 family, a group of GTs with diverse biological functions with a characteristic GT-B type fold. The enzymes composing this group act through a retaining catalytic mechanism, meaning that the resulting stereochemistry at the anomeric carbon is conserved between the sugardonor and the product (56–61).The sequence analysis revealed that GacB has two conserved domains, the GT_1_like6 domain covering the entire primary sequence (residues 1–384) and the domain of unknown function DUF1972 at the N terminus (residues 1–185). The DUF1972 domain is widely spread among glycosyltransferases (and rhamnosyltransferases) but has no defined functionality (55).GacB shares 68% sequence amino acid identity and the same type of predicted domains as S. mutans SccB (Table 1), which has been proposed to initiate the biosynthesis of the SCC in S. mutans, a polysaccharide that also contains a RhaPS backbone (14, 26, 47, 62). Importantly, SccB also requires GlcNAc-PP-Und as an acceptor, as the biosynthesis of the surface polysaccharide is abrogated in an E. coli ΔwecA deletion strain (26, 56). WecA is the enzyme responsible for the GlcNAc-PP-Und biosynthesis in E. coli. GacO is the functional homolog of WecA in S. pyogenes, whereas S. mutans and several other streptococci contain similar homologs (26, 32, 63).
Table 1
Sequence conservation in percentage for GacB and GacO homologous enzymes from 48 species of the
Carbohydrate gene clusters containing either ABC or Wzx/Wzy-encoding genes for carbohydrate translocation and polymerization are marked.
Sequence conservation in percentage for GacB and GacO homologous enzymes from 48 species of theCarbohydrate gene clusters containing either ABC or Wzx/Wzy-encoding genes for carbohydrate translocation and polymerization are marked.Based on GacB's preliminarily annotation as a nucleotide sugar-dependent rhamnosyltransferase, its genomic location within the GAC cluster, and its similarity to SccB, we selected GacB as the most likely candidate catalyzing the first rhamnosylation step in the GAC biosynthesis process. According to our hypothesis, GacB would attach a single rhamnose residue onto a membrane-bound GlcNAc to form Rha-GlcNAc-PP-Und through an unknown glycosidic linkage.
GacB is required for the biosynthesis of the GAC RhaPS chain
To further investigate the GacB function and to identify potential catalytic residues, we used E. coli as a heterologous expression system to study the GACRhaPS backbone biosynthesis. Previously, the transformation of E. coli cells with the S. mutans genes sccA–G was sufficient to produce the SCC RhaPS backbone, whereas the absence of sccB, encoding the GacB homolog, abrogated the RhaPS backbone biosynthesis (26, 64). Inspired by this approach, we constructed two vectors carrying the homologous genes from S. pyogenes, gacACDEFG (gacA–G; ΔgacB) and gacB (Fig. 1A).
Figure 1.
A, complementation strategy and map of S. pyogenes (purple) and S. mutans (red) genes required to produce the RhaPS polyrhamnose backbone. S. mutans cluster: sccA (Smu0824), sccB (Smu0825), sccC (Smu0826), sccD (Smu0827), sccE (Smu0828), sccF (Smu0829), sccG (Smu0830). S. pyogenes cluster: gacA (M5005_Spy_0602), gacB (M5005_Spy_0603), gacC (M5005_Spy_0604), gacD (M5005_Spy_0605), gacE (M5005_Spy_0606), gacF (M5005_Spy_0607), gacG (M5005_Spy_0608). B, bacterial complementation assay in E. coli CS2775 cells. Immunoblot of a Tricine SDS-PAGE with whole-cell samples probed with anti-Group A carbohydrate antibody.
A, complementation strategy and map of S. pyogenes (purple) and S. mutans (red) genes required to produce the RhaPS polyrhamnose backbone. S. mutans cluster: sccA (Smu0824), sccB (Smu0825), sccC (Smu0826), sccD (Smu0827), sccE (Smu0828), sccF (Smu0829), sccG (Smu0830). S. pyogenes cluster: gacA (M5005_Spy_0602), gacB (M5005_Spy_0603), gacC (M5005_Spy_0604), gacD (M5005_Spy_0605), gacE (M5005_Spy_0606), gacF (M5005_Spy_0607), gacG (M5005_Spy_0608). B, bacterial complementation assay in E. coli CS2775 cells. Immunoblot of a Tricine SDS-PAGE with whole-cell samples probed with anti-Group A carbohydrate antibody.The RhaPS chain is presumed to be translocated to the outer membrane in E. coli, which naturally contains rhamnose attached to the lipopolysaccharides. Thus, to avoid nonspecific binding of the anti-GAC antibody, all transformations were made using an rfaS-deficient strain (26). The interruption of the rfaS gene impedes the attachment of rhamnose to the LPS on the bacterial outer membrane, rendering a strain that lacks endogenous rhamnose on its surface (26, 62, 65, 66). The role of GacB was investigated using a traditional complementation strategy (Fig. 1).We investigated the production of RhaPS by gacA–G from our complementation approach using immunoblots of total cell lysates (Fig. 1B). If the expression of GacBCDEFG is sufficient to produce the RhaPS chain, then we should be able to detect the synthesized RhaPS using a specific anti-GAC antibody. The results showed that E. coli cells lacking the gacA-G gene cluster (empty vector) did not produce RhaPS (Fig. 1, lane 2). Likewise, transformants bearing the ΔgacB or ΔsccB plasmids lost reactivity with the GAC antibody (Fig. 1, lanes 3 and 5). Instead, co-transformation of sccB + ΔsccB or gacB + ΔgacB restored the RhaPS production, underlining the essentiality of sccB and gacB for the biosynthesis of the GAC backbone (Fig. 1, lanes 4 and 6).To investigate whether GacB and SccB are catalyzing the same reaction, we tested the ability of GacB to functionally substitute SccB and vice versa by co-transforming ΔsccB + gacB and ΔgacB + sccB. In all cases, SccB and GacB were interchangeable (Fig. S2). GacB's predicted initiation codon was different from S. mutans SccB, with the latter using TTG instead of ATG (Fig. S2). The use of TTG as initiation codon has been identified as a factor that decreases the efficiency of translation in both E. coli and the nematode Caenorhabditis elegans (67, 68). Therefore, it is conceivable that this is a strategy used by the cell to limit the expression of the proteins involved in the RhaPS biosynthesis.Interestingly, the gacB homolog in Streptococcus gallolyticus subsp. gallolyticus (ATCC_43143) is annotated as a shortened, 235-amino-acid-long polypeptide that also contains a TTG codon 153 nucleotides upstream of the annotated GTG start codon. In light of this discovery, a genetic reannotation of S. mutans' sccB and its homolog in S. gallolyticus subsp. gallolyticus would be required, as well as a re-evaluation of these species' genome for potentially new or larger gene products that are initiated by a TTG codon.We decided to test two versions of SccB: one with a TTG as the initiation codon and the other one with an ATG. Both versions rendered an active enzyme that could complement either ΔsccB and ΔgacB (Fig. S2). Unless stated otherwise, all further work was conducted using sccB constructs with the native TTG start codon.
GacB extends a lipid-linked precursor
We investigated whether GacB is a GT that uses GlcNAc-PP-Und as an acceptor. Through an in vivo experiment, we generated radiolabeled lipid-linked saccharides (LLSs), which were isolated from the bacterial membrane and separated via thin-layer chromatography (TLC). Based on the annotation as a rhamnosyltransferase, radiolabeled TDP-β-l-rhamnose would be the preferred sugardonor for GacB. However, this compound is not commercially available; therefore, tritiated glucose was chosen as an alternative. Inside the bacterial cell, glucose is used as a substrate to synthesize a wide array of organic components, including TDP-β-l-rhamnose (69).We hypothesized that GacB transfers an activated sugar from a (radiolabeled) nucleotide sugardonor to a membrane-bound acceptor monosaccharide-PP-Und (e.g. GlcNAc-PP-Und). Therefore, we expect a lower chromatographic mobility (R) value of the membrane-bound acceptor, compared with the R of the monosaccharide lipid-linked acceptor after running the samples in a TLC plate. As a negative control, we used E. coli CS2775 (ΔrfaS) transformed with the empty vector. This transformant showed a chromatographic zone consistent with the generation of monosaccharide-PP-Und (Fig. 2, lane 1). Upon expression of either the gacB or sccB genes, we observed the accumulation of a radioactive signal that migrated more slowly on the TLC plate, suggesting a higher molecular mass for these compounds (Fig. 2, lanes 3 and 4). The same R shift was observed for the sccAB-DEFG (ΔsccC) construct (Fig. 2, lane 2), demonstrating that sccB and gacB can glycosylate a lipid-linked precursor. Based on the literature, we assume that the upper radiolabeled chromatographic signal corresponds to GlcNAc-PP-Und and the lower one to Rha-GlcNAc-PP-Und (14, 15).
Figure 2.
TLC analysis of radiolabeled LLSs extracted from
Below is the symbolic representation of the predicted synthesis products (Und-PP). Gray circle, phosphate; blue square, GlcNAc; green triangle, rhamnose.
TLC analysis of radiolabeled LLSs extracted from
Below is the symbolic representation of the predicted synthesis products (Und-PP). Gray circle, phosphate; blue square, GlcNAc; green triangle, rhamnose.
GacB is a glycosyltransferase transferring rhamnose from TDP-β-l-Rha onto GlcNAc-PP-lipid acceptors
The observed band shift suggested that GacB adds a monosaccharide to a lipid-linked precursor, most likely GlcNAc-PP-Und. We investigated this hypothesis using recombinantly produced and purified GacB WT and amino acid mutants (Fig. S3). We established an in vitro assay using the predicted nucleotide sugardonor, TDP-β-l-rhamnose, and a synthetic acceptor substrate. We tested two of these synthetic substrates designed to mimic the native lipid-linked acceptor: CH3(CH2)12-PP-GlcNAc (acceptor 1) and PhO(CH2)11-PP-GlcNAc (acceptor 2) (Fig. 4C). The enzymatic reaction products were purified and characterized using MALDI-MS in positive-ion mode.
Figure 4.
A, 800-MHz 1H NMR spectra of acceptor substrate 1 (a), product 1 (b), acceptor substrate 2 (c), and product 2 (d). B, partial 2D ROESY spectrum of product 1 showing the correlations between the H1 of β-l-Rha and the protons of rhamnose (R) and protons of GlcNAc (G). The F2 cross-section through H1 of Rha is shown in red. C, chemical structures with proton numbering.
The MALDI-MS spectra of the enzymatic reaction products (Fig. 3) confirmed that GacB catalyzes the addition of one rhamnose residue to both acceptor substrates when incubated with TDP-β-l-Rha (Fig. 3, B and E). Acceptor 1 possesses a molecular mass of 563 Da and is detected at both m/z = 608 [M − 1H + 2Na]+ and m/z = 630 [M − 2H + 3Na]+ (Fig. 3A). GacB-GFP and GacB lacking the GFP tag modified the acceptor, resulting in one predominant peak at m/z = 776 [M − 2H + 3Na]+ (Fig. 3, B and C). In this spectrum, we can also observe an additional peak of lower intensity at m/z = 754 [M − 1H + 2Na]+, corresponding to the modified acceptor 1 coupled with two Na+ ions instead of three Na+ ions. In both cases, the products are shifted by m/z = 146 compared with the unmodified acceptor, which is consistent with the addition of one rhamnose residue via a glycosidic linkage. The same mass shift was observed for the second acceptor; the peaks of the unmodified acceptor 2 (Fig. 3D) were detected at m/z = 672 [M − 1H + 2Na]+ and m/z = 694 [M − 2H + 3Na]+, whereas the product peaks emerge at m/z = 818 [M − 1H + 2Na]+ and m/z = 840 [M − 2H + 3Na]+ (Fig. 3, E and F). We also tested the ability of GacB to catalyze the rhamnosylation of GlcNAc-α-1-P, but the reaction rendered no detectable product (data not shown), suggesting that the enzyme not only interacts with the GlcNAc-α-1-P, but might require the second phosphate and the lipid component to recognize the acceptor substrate.
Figure 3.
Spectra were obtained from the enzymatic reaction sample with the substrate TDP-Rha and acceptor 1 (CH3(CH2)12-PP-GlcNAc) (A), acceptor 1 + GacB-GFP (B), acceptor 1 + GacB cleaved (no GFP fusion) (C), acceptor 2 (PhO(CH2)11-PP-GlcNAc) (D), acceptor 2 + GacB-GFP (E), acceptor 2 + GacB cleaved (no GFP) (F), acceptor 2 + GacB-D160N GFP (G), or acceptor 2 + GacB-Y182N-GFP (H).
Spectra were obtained from the enzymatic reaction sample with the substrate TDP-Rha and acceptor 1 (CH3(CH2)12-PP-GlcNAc) (A), acceptor 1 + GacB-GFP (B), acceptor 1 + GacB cleaved (no GFP fusion) (C), acceptor 2 (PhO(CH2)11-PP-GlcNAc) (D), acceptor 2 + GacB-GFP (E), acceptor 2 + GacB cleaved (no GFP) (F), acceptor 2 + GacB-D160N GFP (G), or acceptor 2 + GacB-Y182N-GFP (H).We further investigated GacB's specificity toward the sugar–nucleotide donor. In particular, we tested whether GacB is selective for thymidine-based nucleotides and tolerates uridine-based nucleotides, such as UDP-Glc, UDP-GlcNAc, and UDP-Rha. As shown before, in the presence of TDP-β-l-Rha, two products consistent with the incorporation of rhamnose residue plus either two or three sodium cations were observed in the spectrum (Fig. S4). In contrast, no product peaks were observed with the α-d-configured substrates UDP-α-d-Glc or UDP-α-d-GlcNAc (Fig. S4, ), whereas residual activity was detected for UDP-β-l-Rha (Fig. S4). These data demonstrate that GacB does not tolerate α-d–configured nucleotide sugars. Furthermore, GacB has specificity toward the deoxythymidine moiety (TDP-Rha) or requires binding of the thymine methyl group.Finally, we assessed metal ion dependence in vitro. Compared with the control reaction (Fig. S5), we noticed no significant differences in the rhamnosylation activity of the enzyme when GacB was supplemented with MgCl2, MnCl2, or EDTA as a metal chelator (Fig. S5, ), indicating that GacB does not require a divalent metal ion for its activity. These data are in agreement with the lack of a conserved DXD motif found in metal-dependent GTs (70, 71).Together, these data confirmed our previous conclusions drawn from the LLSs radiolabeled assay (Fig. 2). These results provide the first in vitro evidence revealing that GacB is a metal-independent rhamnosyltransferase that catalyzes the initiation step in the GACRhaPS backbone biosynthesis by transferring single rhamnose to GlcNAc-PP-Und using TDP-β-l-Rha as the exclusive activated nucleotide sugardonor.
Investigation of GacB's catalytic residues
There are no available crystal structures of proteins with a high degree of identity to GacB. Therefore, we constructed a GacB structural model based on two enzymes that belong to the GT4 family of GTs: Bacillus anthracis' BaBshA (PDB entry 3MBO) (72) and Corynebacterium glutamicum's MshA (PDB entry 3C4V) (57). Both proteins have low amino acid sequence identity compared with GacB, making the identification of potential catalytic residues to mutate in GacB very challenging. BaBshA shares only 15% identity in only 64 of 424 amino acids. MshA is an equally distant “homologous” GT that shares 16% identical residues in a sequence stretch of only 71 residues of 446. Based on the scarce information provided by the structural models and the multiple-sequence alignment described in detail below, we mutated several residues that are highly conserved in over 40 pathogenic streptococci species (Fig. S7).Our in vitro E. coli system is the first one that enables the study of GacB mutant proteins, allowing the identification of those mutants that abrogate or reduce the production of RhaPS backbone. Conducting this in S. pyogenes is not possible, as gacB is essential (7, 26). We used the information available from the GT models mentioned above and the sequence alignment of multiple streptococci to select residues that might be involved in substrate binding, which tends to be conserved among GT. Through in situ mutagenesis, we constructed nine recombinant versions of GacB containing the following amino acid substitutions: D126A, D126N, E222A, E222Q, D160A, D160N, Y182A, Y182F, and K131R. The latter mutation was included as a negative control because it is a conserved predicted surface residue that presumably is not engaged in the catalytic activity or could inactivate the enzyme otherwise.Our results show that substitution of Asp-160 with an asparagine (Fig. S6, lane 5) led to a drastic reduction in the production of the RhaPS chain, whereas an alanine residue did not cause such a significant effect. This suggests that the Asp-160 carboxyl group might be required for catalysis, which potentially can be replaced in the alanine mutant by a water molecule. A more severe effect was observed with mutations of Tyr-182. The alanine substitution of Tyr-182 (Y182A) impeded the RhaPS backbone biosynthesis significantly, whereas Y182F completely inactivated GacB (Fig. S6, lanes 9 and 10), suggesting an essential role for the Tyr-182 hydroxyl group in GacB's enzymatic activity.We further investigated the mutants D160N and Y182F in an in vitro assay using recombinantly expressed and purified GacB-GFP fusions (Fig. S3). The MALDI-MS analysis of the reaction products from GacB-D160N-GFP and GacB-Y182F-GFP revealed that both mutants lacked an enzymatic activity in vitro (Fig. 3, G and H). These results support the hypothesis that the residues Asp-160 and Tyr-182 play a role in substrate binding or catalysis.Finally, we created three truncated versions of GacB at the N-terminal end as an attempt to determine whether the enzyme remains active in the absence of the residues predicted to be associated with the membrane. Our results showed that truncations of the first 22 (GacB23–385), 75 (GacB76–385), and 118 residues (GacB119–385) led to inactivation of the enzyme when assessed through the complementation assay (Fig. S6, lanes 11–13). Their inability to complement ΔgacB suggest that the N-terminal domain is required for activity and supports the hypothesis that GacB is a membrane-associated rhamnosyltransferase.
GacB is a retaining β-1,4-rhamnosyltransferase
The current gene annotation suggests that GacB is an inverting α-1,2-rhamnosyltransferase (7, 14). This annotation is incompatible with the acceptor sugarGlcNAc because its carbon at position C2 is already decorated with the N-acetyl group. Therefore, GacB can only transfer the rhamnose onto the available hydroxyl groups on C3, C4, or C6. The GAC backbone is composed of disaccharide repeating units of rhamnose connected via α-1,3-α-1,2 glycosidic bonds (15, 18, 85), suggesting that GacB could be using a retaining mechanism of action, synthesizing an α-1,3, α-1,4, or α-1,6 glycosidic linkage. According to the CAZy database, the GacB sequence is classified as a GT4 family member, which are classified as retaining GTs (73). If that classification is correct for GacB, the stereochemical configuration at the anomeric center of the sugardonor, TDP-β-l-rhamnose, should be retained in the final product (56, 64, 74).To elucidate whether GacB is an inverting or a retaining rhamnosyltransferase, we conducted NMR spectroscopy on the purified reaction products 1 and 2. 1H NMR spectra were collected at 800 MHz to both establish the structural integrity of acceptors 1 and 2 (Fig. 4A) and determine the chemical structure of their products after the enzymatic reaction (products 1 and 2). The NMR parameters were determined through one- and two-dimensional (1D and 2D) and 2D total correlation spectroscopy (TOCSY) experiments (Fig. 4B); their chemical shifts are summarized in Table 2. For both acceptors, the anomeric proton of the α-d-GlcNAc residue appeared as a doublet of doublets with 3J(H1,H2) = 3.4 Hz and 3J(H1,P) = 7.2 Hz. Proton H2 of α-d-GlcNAc was also split by a 3J(H2,P) = 2.4 Hz coupling with P. A 2D 1H, 31P HMQC spectrum (data not shown) revealed a correlation of both of these H-1′ protons with P at −13.5 ppm. Another correlation appeared between the 31P at −10.6 ppm and protons of the adjacent CH2 groups of the alkyl chain, confirming the integrity of the acceptor substrate. For acceptor 2, a typical pattern of signals of monosubstituted benzene with integral intensities of 2:2:1 was observed.
Table 2
A, 800-MHz 1H NMR spectra of acceptor substrate 1 (a), product 1 (b), acceptor substrate 2 (c), and product 2 (d). B, partial 2D ROESY spectrum of product 1 showing the correlations between the H1 of β-l-Rha and the protons of rhamnose (R) and protons of GlcNAc (G). The F2 cross-section through H1 of Rha is shown in red. C, chemical structures with proton numbering.The addition of rhamnose to both acceptor substrates was accompanied by the appearance of a characteristic signal in the anomeric region of the spectrum (4.88 ppm, H1) next to the water signal. The anomeric configuration of this monosaccharide was established in several ways. The measured 3J(H1,H2) coupling constant of 1.0 Hz indicated a β-l-configuration (1.1 and 1.8 Hz reported) (75) for β-l- and α-l-Rha, respectively). Furthermore, we complemented our studies by investigating the product rotating-frame NOE (ROESY) spectrum (Fig. 4B). Importantly, the ROESY spectrum showed spatial proximity of H1 of rhamnose with four other protons. Among these were H2, H3, and H5 protons of rhamnose, the latter two confirming a 1,3-diaxial arrangement between H1, H3, and H5 that is indicative of a β-l-Rha configuration. Finally, a comparison of 1H chemical shifts of rhamnose with those of α-l- and β-l-rhamnopyranose (Fig. 4C) showed a good agreement with those of β-l-rhamnose but not α-l-rhamnose (75) (Table 2), thus further confirming the anomeric configuration of this ring. The forth ROESY cross-peak of H1 of rhamnose was with H4 of GlcNAc, revealing the presence of a (1→4) linkage between the two monosaccharides. This observation was further supported by a comparison of GlcNAc 1H chemical shifts of acceptor substrates and products. Here, an increased chemical shift (+0.21 ppm) was observed for H4 upon glycosylation, whereas the average of the absolute values of the differences between the chemical shifts of the other corresponding protons of GlcNAc was 0.03 ppm. As expected, the signals of the alkyl and aryl radicals practically did not change in the respective acceptor-product pairs.In conclusion, 1H NMR spectroscopy revealed the formation of a β-l-Rha (1→4) d-GlcNAc moiety and the integrity of the product. Other enzymes from Gram-negative and Gram-positive bacteria that are involved in polysaccharide biosynthesis also use lipid-linked GlcNAc as the initial acceptor sugar, whereas their nucleotide-sugardonor is either TDP-l- or GDP-d-rhamnose. However, their respective reaction products are α-1,3 or α-1,4 glycosidic bonds (76–81). Product analysis, therefore, undoubtedly reveals that GacB is the first retaining GT that synthesizes a novel d-GlcNAc-β-1,4-l-rhamnose disaccharide using TDP-β-l-rhamnose as the exclusive nucleotide sugardonor.
Group A, B, C, and G Streptococcus share a common RhaPS initiation step
In addition to S. mutans SccB, GacB homologs with a high degree of sequence identity are found in other streptococcal species of clinical importance, such as the Streptococcus species from Group B (GBS), Group C (GCS), and Group G (GGS). All homologous enzymes are situated in the corresponding gene clusters encoding the biosynthesis of their Lancefield antigens (i.e. the Group B, C, and G carbohydrate) (15). The homologous gene products share 67, 89, and 89% amino acid identity to GacB, respectively (Table 1 and Fig. 5). With varying degrees of evidence, depending on the species, there is a general understanding of the chemical structure of the RhaPS of these streptococci (15). The currently accepted structures for GAC, GBC, GCC, GGC, and SCC are summarized in Fig. 5 (14–16, 23, 82, 83). Remarkably, none of the investigations that led to the understanding of the surface carbohydrate structures include data describing the mechanism of action of the enzymes involved in the priming step of each RhaPS biosynthesis.
Figure 5.
RhaPS biosynthesis is initiated on Und-P by GacO (green background), followed by the action of GacB (turquoise), generating the conserved core structure Und-PP-GlcNAc-Rha. Percentage of the amino acid sequence identity, positive amino acids, and gaps within the sequence compared with GacO or GacB are given below each homolog: S. mutans Xc serotype C SccB, S. agalactiae (GBS) RfaB, Streptococcus dysgalactiae subsp. equisimilis 167 (GCS) RgpAc, and S. dysgalactiae subsp. equisimilis ATCC 12394 (GGS) Rs03945. The specific carbohydrate composition extending the lipid-linked core structure of each group is depicted on the right. Carbohydrate repeat units (RU) are highlighted (light pink background), and symbolic representations of the sugars are shown in the key.
RhaPS biosynthesis is initiated on Und-P by GacO (green background), followed by the action of GacB (turquoise), generating the conserved core structure Und-PP-GlcNAc-Rha. Percentage of the amino acid sequence identity, positive amino acids, and gaps within the sequence compared with GacO or GacB are given below each homolog: S. mutans Xc serotype C SccB, S. agalactiae (GBS) RfaB, Streptococcus dysgalactiae subsp. equisimilis 167 (GCS) RgpAc, and S. dysgalactiae subsp. equisimilis ATCC 12394 (GGS) Rs03945. The specific carbohydrate composition extending the lipid-linked core structure of each group is depicted on the right. Carbohydrate repeat units (RU) are highlighted (light pink background), and symbolic representations of the sugars are shown in the key.Based on the high-sequence identity to GacB, we hypothesized that the carbohydrate biosynthesis of the Group A, Group B, Group C, and Group G Streptococcus possesses a conserved initiation step, in which the first rhamnose residue is transferred onto the lipid-linked acceptor forming α-d-GlcNAc-β-1,4-l-rhamnose-PP-Und. We tested the ability of the homologs from GBS, GCS, and GGS (GbsB, GcsB, and GgsB, respectively) to functionally substitute GacB in the production of the RhaPS chain (Fig. 6). Our results show that all homologous proteins were able to restore the RhaPS backbone when their genes were co-expressed with the ΔgacB expression plasmid, suggesting that these enzymes can perform the same enzymatic reaction.
Figure 6.
Bacterial complementation assay in Shown is immunoblotting of Tricine SDS-PAGE with whole-cell samples probed with anti-Group A carbohydrate antibody and membranes after Coomassie stain.
Bacterial complementation assay in Shown is immunoblotting of Tricine SDS-PAGE with whole-cell samples probed with anti-Group A carbohydrate antibody and membranes after Coomassie stain.We showed that GacB requires GlcNAc-PP-Und as acceptor (Fig. 6), but it is possible that the enzymes from GBS, GCS, and GGS use a different lipid-linked acceptor substrate, such as Glc-PP-Und. Thus, to determine whether the GacB homologs require GlcNAc-PP-Und as lipid acceptor, we conducted the complementation assay using E. coli ΔwecA cells, which lack GlcNAc-PP-Und (62). As a positive control, we identified Streptococcus pneumoniae WchF, a α-d-Glc-β-1,4-l-rhamnosyltransferase that uses Glc-PP-Und exclusively as substrate (84). As expected, GacB was unable to restore the RhaPS chain when co-transformed with the ΔgacB vector in the absence of the GlcNAc-PP-Und (Fig. 6A, lane 2). Importantly, the GacB homologs from GBS, GCS, and GGS also failed to produce the RhaPS backbone (Fig. 6A, lanes 4–6) but could replace GacB function in the ΔrfaS strain (Fig. 6B). Only WchF, which uses a Glc-PP-Und acceptor for the transfer of a rhamnose residue (83), restored the RhaPS biosynthesis in the absence of GlcNAc-PP-Und (Fig. 6A, lane 3). Combined with the data from our in vitro enzymatic reactions, these results strongly suggest that the GacB homologs from GBS, GCS, and GGS are also α-d-GlcNAc-β-1,4-l-rhamnosyltransferase that require GlcNAc-PP-Und as membrane-bound acceptor.
d-GlcNAc- or d-Glc-1,4-β-l-rhamnosyl-transferases define translocation mechanism in streptococcal pathogens
S. pneumoniae serotype 2 wchF encodes a α-d-Glc-β-1,4-l-rhamnosyltransferase that requires Glc-PP-Und as acceptor (84). The WchF enzyme conducts the first committed step for Wzx/Wzy-dependent capsule biosynthesis. Interestingly, it shares only 51% amino acid identity to GacB, compared with 67–89% for the homologous enzymes from GBS, GCS, GGS, and S. mutans (Table 1). Toward a better understanding of the conservation of GacB in the Streptococcus genus, we extended our bioinformatics analysis to search for other strains that harbor gacB homologous genes. We found 48 human/veterinary pathogenic Streptococcus species with a single GacB homolog, sharing 50–94% amino acid sequence identity (Table 1 and Fig. 7). Interestingly, only five of our 48 identified species showed a percentage identity equal to or lower than 51% (Streptococcus mitis, S. pneumoniae, Streptococcus oralis subsp. tigurinus, S. peroris, and Streptococcus pseudopneumoniae), whereas all other encoded proteins presented more than 65% homology to GacB. For simplicity, we will refer to the five Streptococcus strains with low amino acid identity as the “low-identity” subgroup, and the rest of the species as the “high-identity” subgroup. The sequence analysis paired with the complementation assay led us to hypothesize that all GacB homologs encompassed in the high-identity subgroup possess d-GlcNAc-β-1,4-l-rhamnosyltransferase activity. In contrast, the low-identity subgroup contains S. pneumoniae WchF, a known d-Glc-β-1,4-l-rhamnosyltransferase (84). Interestingly, all five members of the low-identity subgroup exhibit very high sequence identity (>90%) when compared with WchF, therefore strongly suggesting that they are WchF homologs that also synthesize Rha-β-1,4-Glc-PP-Und.
Figure 7.
A, phylogenetic tree of GacB homolog sequences identified from 48 partially or completely sequenced streptococcal pathogens. The tree was constructed based on multiple-sequence alignment of GacB homologs using the default neighbor-joining clustering method of Clustal Omega. The tree was plotted using iTOL online tool. Black squares at the branches indicate species with fully sequenced genomes. B, bar charts associated with each node indicate the amino acid identity (percentage) of the respective homologs to GacO (magenta) or GacB (blue) from S. pyogenes.
A, phylogenetic tree of GacB homolog sequences identified from 48 partially or completely sequenced streptococcal pathogens. The tree was constructed based on multiple-sequence alignment of GacB homologs using the default neighbor-joining clustering method of Clustal Omega. The tree was plotted using iTOL online tool. Black squares at the branches indicate species with fully sequenced genomes. B, bar charts associated with each node indicate the amino acid identity (percentage) of the respective homologs to GacO (magenta) or GacB (blue) from S. pyogenes.GacO from S. pyogenes, the WecA homolog, was shown to be responsible for the biosynthesis of the GlcNAc-PP-Und (14, 15), the substrate for GacB. We therefore hypothesized that the low- and high-identity subgroups utilize different substrates and therefore investigated whether a similar discrepancy should be observed when comparing the sequence identity of the GacO homologs. Strikingly, within the 48 pathogenic streptococci genomes (Table 1 and Fig. S7), we found that all strains from the high-identity subgroup share a GacO homolog with 63–92% sequence identity. Importantly, any genome from the low-identity subgroup contains a gene product with ≤30% sequence identity to GacO. This subgroup contains gene products that have high homology to S. pneumoniae Cps2E, which transfers Glc-1-P to P-Und, to generate Glc-PP-Und (51, 84). S. mitis, S. oralis subsp. tigurinus, S. peroris, and S. pseudopneumoniae homologs share 98% sequence identity to Cps2E. Thus, we propose that their capsule polysaccharides are also built on Glc-PP-Und.The degree of phylogenetic conservation of GacB in the Streptococcus genus highlights the importance of this gene, and perhaps of the entire RhaPS gene cluster, for survival and pathogenesis of streptococcal pathogens. Overall, these results lead us to propose that those streptococcal species that have GacB homologs with a high degree of identity (>65%) are d-GlcNAc-β-1,4-l- rhamnosyltransferases that catalyze the first committed step in the biosynthesis of their surface RhaPS by transferring rhamnose from TDP-β-l-rhamnose to the membrane-bound GlcNAc-PP-Und. In contrast, we postulate that the species encompassed in the low-identity subgroup, following the function of S. pneumoniae serotype 2 WchF, contain a rhamnosyltransferase that acts on lipid-linked Glc-PP-Und.It is worth noticing that the capsule biosynthesis in S. pneumoniae and other Gram-positive bacteria occurs via the Wzx/Wzy-dependent pathway, and the polymerization of the glycan occurs after the export of the repeating unit by Wzy (84–86). Importantly, ∼75% of all S. pneumoniae capsular polysaccharides are synthesized onto the reducing end of lipid-linked Glc, and strikingly, the elongation of the repeat unit occurs on that very same glucose (87, 88). This information strongly suggests that a reducing end GlcNAc is not compatible with the elongation step via the polymerase Wzy. The Wzx translocation mechanism is commonly used for the synthesis of several surface carbohydrates of lactic acid bacteria, as well as for the production of the lipopolysaccharide and capsules in Gram-negative species (78–80). In contrast, the elongation step of the GAC biosynthesis pathway is proposed to take place in the cytoplasm before the translocation via the hypothesized ABC transporter. The species encompassed within the low-identity subgroup all contain Wzx/Wzy genes downstream of the WchF homolog, whereas the high-identity subgroups share the same type of ABC transport genes with their respective surface RhaPS gene clusters (Table 1).
GacB's N-terminal domain encodes specificity for the GlcNAc acceptor
We decided to further investigate the role of the N- and C-terminal domains in acceptor substrate recognition. GTs show a high degree of specificity toward both the sugar acceptor and the nucleotide-activated sugardonor (89). Although the majority of GTs possess either a GT-A or GT-B type structure fold, the direct function of the GTs cannot be predicted, as they do not tend to share conserved residues associated with the enzymatic activity or have a common mechanism of action (64, 89). Nevertheless, it has been observed that the N-terminal domain of GT-B folded GTs is responsible for the acceptor substrate recognition, whereas the C-terminal domain is involved in sugar–nucleotide donor binding (90). Despite the poor sequence identity to known GT4 protein sequences, we trusted our bioinformatic analysis indicating that GacB has structural similarity to the GT4 family, a group of retaining GTs that have a GT-B fold. We have revealed that GacB is a retaining GT, and our sequence analysis revealed that the protein has two distinct domains. We investigated whether this observation correlates to a lower sequence percentage identity in the N-terminal domain than in the C-terminal domain when compared with WchF. If this were the case, then these differences should also extend to the other species encompassed in the high-identity subgroup.We performed multiple-sequence alignment of the GacB homologs from all 48 streptococcal pathogens to identify the most variable and conserved regions in the protein sequence. We observed a higher discrepancy between the high-identity and the low-identity subgroups in their N-terminal domains (Table 1 and Fig. S7). More precisely, a low sequence conservation region is identifiable between the GacB amino acid residues 40 and 80, suggesting that this section of the domain is either involved in the GlcNAc acceptor sugar recognition or essential protein–protein interactions.We knew from our previous experiment that GacB could not initiate the RhaPS biosynthesis on a wecA deletion background (Fig. 6A, lane 2). Based on this information and to identify residues involved in sugar acceptor recognition, we introduced mutations in the GacB amino acid sequence. The goal was to salvage the RhaPS initiation step in a wecA-deficient E. coli strain in which GacB mutants recognize a lipid-linked sugar acceptor other than GlcNAc-PP-Und. E. coli contains the wcaJ gene that encodes the enzyme that synthesizes Glc-PP-Und.Therefore, we investigated a structural model based on the GacB homolog from Bacillus anthracis, BaBshA (PDB entry 3MBO), which suggested that residues Leu-128 and Arg-131 and Gly-100, Asn-101, Thr-102 (GNT100) may potentially be involved in sugar acceptor recognition. We mutated these residues to mimic those found in WchF. Complementation assays using GacBL128H_R131L failed to complement ΔgacB in a ΔwecA background (Fig. 8, lane 2). Following a sequential approach, we modified GacB primary sequence by introducing additional amino acid substitutions that corresponded to those found in WchF: L128H_R131L_GNT100ARC and L128H_R131L_GNT100ARC_A105P. None of these mutant proteins were able to initiate the rhamnose chain on Und-PP-Glc and, thus, did not restore GacB's activity. Finally, we replaced the first 178 residues of GacB with the corresponding WchF amino acids (residues 1–186). When expressed in a wecA deletion background, this WchF-GacB chimera was able to synthesize the RhaPS backbone on the exclusive acceptor substrate Glc-PP-Und (Fig. 8, lane 5).
Figure 8.
Bacterial complementation assay in
A, immunoblot of whole-cell samples probed with anti-Group A carbohydrate antibody and Coomassie staining (right) (B).
Bacterial complementation assay in
A, immunoblot of whole-cell samples probed with anti-Group A carbohydrate antibody and Coomassie staining (right) (B).In light of our findings, we postulate that the C-terminal domain of GacB is involved in the GT reaction, whereas the N-terminal domain, which includes the uncharacterized DUF1972 domain, participates in lipid-linked substrate recognition and binding. A distinction in domain functionality has been suggested for other GTs of the GT4 family. For instance, in MshA, a GT4 in C. glutamicum, the donor substrate binding occurs at the C-terminal domain through the residues located at the β12/α10 loop, whereas the N-terminal domain is involved in acceptor binding (57). Likewise, biochemical and structural studies on WsaF, a rhamnosyltransferase involved in elongating the polysaccharide's backbone chain in the Gram-positive bacterium Geobacillus stearothermophilusNRS 2004/3a, indicate that the acceptor-binding region is located at the N-terminal domain, whereas the C-terminal domain is associated with sugardonor transfer (85, 86). Interestingly, WsaF's C-terminal end is involved in TDP-Rha binding and contains an EX7E motif, also found among its homologs and other putative rhamnosyltransferases. Remarkably, all GacB streptococcal homologs contain a similar GX7E motif that may be involved in the GT activity. This sequence contains neither the proline-rich motif nor a conserved tyrosine (Tyr-329) that is suggested to be involved in the rhamnose binding described for WsaF (85) (Fig. 8).Furthermore, GacB-related homologs have a fully conserved GGTNPGLLE sequence, whereas the S. pneumoniae WchF-like subgroup has a GGTNPSLLE sequence, suggesting that yet another mechanism may be used to bind and select for TDP-Rha in streptococcal Lancefield species. In terms of the mechanism of action, kinetic and structural data collected from other retaining rhamnosyltransferases of the GT-B family suggest that the stereochemical outcome is a result of a double-displacement mechanism facilitated by interdomain flexibility (66, 72, 76, 78, 87–90). However, further biochemical and structural work is required to pinpoint the catalytic residues in GacB and its homologs.Altogether, these results, although insufficient to pinpoint the individual residues that recognize the lipid sugar acceptor, demonstrate that GacB belongs to the GT-B superfamily and has a high substrate selectivity toward the GlcNAc acceptor and TDP-β-l-rhamnose as the nucleotide sugardonor. Importantly, our data strongly suggest that the acceptor recognition takes places at the N-terminal domain, opening the possibility of assigning a role in acceptor binding to the domain of unknown function DUF1972 as the GlcNAc-PP–binding site.
Conclusions
In recent years, the biosynthesis of GAC, a major structural component of the cell wall in S. pyogenes (14, 15), has received increased attention. The work of van der Beek et al. in 2015 (24) shed light on the role of GacA in the production of the nucleotide sugar TDP-β-l-rhamnose, which is used to build the GAC backbone chain (7). Collectively, the research of Rush et al.(14) and Edgar et al. (18) revealed the role of GacHIJKL, the enzymes involved in chain modification after the translocation step. Despite this progress, the production of the RhaPS backbone, which involves the initiation and elongation stages of the GAC biosynthesis, remained unexplored. Our work unveils the function of GacB, the first GT of this pathway, which conducts a conserved priming step in the RhaPS biosynthesis.Several research groups have shown that studying the GAC cluster directly in S. pyogenes, including the generation of conditional knockouts, constitutes a real technical challenge (7, 24). The genes encoding GacB, as well as GacA, GacC–G, and GacL, were indispensable for S. pyogenes survival in transposon mutagenesis experiments (7, 10, 25).It remains unknown whether the GAC is essential for Group A Streptococcus survival or if the lethal phenotype observed after deletion of genes of the gene cluster is a result of the accumulation of GlcNAc-PP-Und, GlcNAc-P-Und, or any other intermediate LLS. For instance, the accumulation of the lipid-linked precursor has been observed after the deletion of gacABC, gacG, and gacL homologous genes involved in S. pneumoniae serotype 2 capsule biosynthesis and in Bacillus subtilis and Staphylococcus aureus wall teichoic acid biosynthesis (22, 73, 84). Importantly, until today, no gene has been identified in Streptococcus species that could convert/recycle these lipid-linked intermediates, as has been described for other bacteria (99).In B. subtilis and S. aureus, secondary deletions of the early-acting gene, such as the one involved in GlcNAc-PP-Und biosynthesis, tagO/tarO (73), reverse the lethal phenotype observed as a consequence of interrupting the polymerization of these secondary cell wall structures. Under those circumstances, the genes involved in wall teichoic acid polymerization are no longer essential, albeit their deletion leads to cells with compromised fitness, and this phenomenon could also be the case for the GAC biosynthesis pathway. Importantly, disruption of the gacO gene has been shown to cause cell death (25). However, van Sorge et al. (7) showed that low concentrations of tunicamycin (0.025%), a specific inhibitor of undecaprenyl-phosphate GlcNAc-1-phosphate transferase (WecA/GacO) producing GlcNAc-PP-Und, lead to viable S. pyogenes cells with no RhaPS in their cell walls (25). Therefore, it would be interesting to investigate whether the gac gene deletions mentioned earlier cause the same lethal phenotype in tunicamycin-treated S. pyogenes strains.We have demonstrated that GacB is a functional homolog of SccB, the proposed first rhamnosyltransferase in the RhaPS synthesis pathway of S. mutans serotype c (26). Previous research on S. mutans and in silico studies of the Group B Streptococcus polysaccharide biosynthesis clusters showed that GacB homologs are likely to be rhamnosyltransferases, but no biochemical evidence was ever provided to support these hypotheses (26, 82, 91). Here, we provide the first biochemical confirmation that GacB is a β-1,4-l-rhamnosyltransferase, in opposition to its preliminary annotation (7, 14, 15). We also reveal that GacB does not require divalent cations for its activity, which is consistent with the absence of a DXD motif in its primary sequence (70, 89) (Fig. S5).The combined results presented here encourage GacB's inclusion in the GT4 subfamily, although the enzymes of this CAZy family are currently limited to those who utilize α-configured nucleotide sugars. Furthermore, the data presented in this work support the hypothesis that the N-terminal domain is responsible for the binding of the lipid-linked substrate as well as the association to the bacterial membrane in the cytosol. We speculate that the domain of unknown function DUF1972 is fulfilling this role.We observed that mutations in the highly conserved residues Asp-160 and Tyr-182 abrogated GacB's priming activity in both the in vivo complementation and the in vitro enzymatic activity assays (Fig. 3 and Fig. S6). In the absence of a crystal structure or one from closely related homologs, we can only speculate on the role of these residues based on the existing literature of other GT-B retaining GT. Tyrosine residues have been described in the literature as essential active-site residues for the catalytic activity of several glycosyltransferases. Their proposed roles are diverse and range from acting as a catalytic base, where the Tyr side chain can activate the acceptor hydroxyl group (92), to constituting nucleotide sugarribose-binding residues (93). In this context, we hypothesize that the hydroxyl group of Tyr-182 might fulfill one of the two proposed roles in the GAC initiation process. In the absence of a crystal structure, or a suitable homologous structure, we cannot reveal the mechanistic details of GacB and the role of this conserved Tyr-182.Another exciting outcome of this research has been the finding that the reaction catalyzed by GacB, the formation of Rha-β-1,4-GlcNAc-PP-Und, is highly conserved in the Streptococcus genus. The bacterial complementation assays and the bioinformatics analysis lead us to hypothesize that this initiation step is shared by all streptococcal pathogens expressing a RhaPS surface carbohydrate (Table 1 and Figs. 6 and 7), including but not limited to species from the Lancefield Groups A, B, C, D, E, F, G, H, K, L, M, N, O, P, and R.Overall, this work unveils the function of GacB, the first metal-independent, retaining, and nonprocessive α-d-GlcNAc-β-1,4-l-rhamnosyltransferase reported, and sheds light on the first committed step of the RhaPS biosynthesis in Group B, Group C, and Group G Streptococcus.Importantly, GacB's essentiality and the lethal phenotype of GAC-depleted strains, in addition to the high-sequence conservation observed for GacB homologs from several streptococcal pathogens, may have significant implications for the future development of multitarget drugs in polypharmacology, because humans lack both TDP-Rha and related glycosyltransferases. Finally, our insights regarding the ability to synthesize streptococcal RhaPS in a recombinant expression system, namely E. coli, onto different lipid-linked acceptor sugars by substituting the enzyme GacB (or its homologs) provide exciting new opportunities for glycoconjugate vaccine development.
Experimental procedures
Bioinformatics analysis
Alignments of protein sequences were performed using NCBI Blast Global align (https://blast.ncbi.nlm.nih.gov/Blast.cgi) and ClustalOmega (https://www.ebi.ac.uk/Tools/msa/clustalo/) (94). Molecular weight predictions were obtained using the ProtParam tool at the Expasy server (http://www.expasy.org/) (101).5 Protein topological predictions were generated using both SpOctopus (http://octopus.cbr.su.se/)5 (102) and the TMHMM algorithms (www.cbs.dtu.dk/services/TMHMM/).5Secondary structure predictions were generated using either Phyre2 (http://www.sbg.bio.ic.ac.uk/phyre2/html/page.cgi?id=index)5 (103) or RaptorX (raptorx.uchicago.edu)5 and visualized using PyMOL (educational version 1.8, Schrödinger, LLC). The Carbohydrate-Active enZymes database (CAZy) (http://www.cazy.org/)5 (73) was examined to obtain information about the classification and characterization of carbohydrate active enzymes. Phylogeny relationships were established using Clustal Omega, Clustal X, and iTOL (95).
Bacterial strains and growth conditions
All bacterial strains used in this study are listed in Table 3. E. coli strains DH5α and MC1061 were used indistinctively as host strains for the propagation of recombinant plasmids and plasmid integration. E. coli CS2775, a strain sourced from Prof. Rodney Welsh that lacks the Rha modification on the lipopolysaccharide, was used as the host strain to evaluate the production of RhaPS. E. coli 21548 is a Und-PP-GlcNAc–deficient strain that contains a wecA deletion, serving as a negative control for the production of RhaPS. The strain was sourced from Dr. Jeffrey Rush. E. coli strain C43 (DE3) was used for the production of recombinant protein. All E. coli strains were grown in lysogeny broth media. Unless otherwise indicated, all bacterial cultures were incubated at 37 °C in a shaking incubator at 200 rpm. Where necessary, media were supplemented with one or more antibiotics to the following final concentration: carbenicillin (Amp) at 100 μg/μl, erythromycin (Erm) at 300 μg/μl or kanamycin (Kan) at 50 μg/ml.
Table 3
Plasmid IDs, genes, primers, and vector details
Plasmid IDs, genes, primers, and vector details
Molecular genetic techniques
Table S1 shows the DNA sequence of the forward and reverse oligonucleotide primer pairs used to amplify, delete, or mutagenize the genes of interest. All primers were obtained from Integrated DNA Technologies. All PCRs were performed using a SimpliAmp Thermal Cycler from Thermo Fisher Scientific with standard procedures. Constructs were cloned using standard molecular biology procedures, including restriction enzyme digest and ligation. All constructs were validated with DNA sequencing.
Determination of RhaPS production
Aliquots of 50 μl of A600-normalized overnight cultures grown at 37 °C were mixed with 50 μl of 6× SDS-loading buffer and resolved in 20% Tricine-SDS gels (96). Assessment of the RhaPS production was performed via immunoblotting on polyvinylidene difluoride membranes following the traditional immunoblotting technique. Primary antibody was rabbit-raised anti-S. pyogenes Group A carbohydrate polyclonal antibody (Abcam, ab21034). Secondary antibody was goat-raised anti-rabbit IgG horseradish peroxidase conjugate (Bio-Rad, 170-6515). Immunoreactive signals were captured using either GENESYSTM 10S UV-visible spectrophotometer (Thermo Scientific) after exposure to the Clarity Western ECL (Bio-Rad).
Extraction and radiolabeling of lipid-linked oligosaccharides
Radiolabeled LLSs of induced E. coli CS2775 cells bearing the selected plasmids were extracted using CHCl3/CH3OH and water-saturated butan-1-ol (1:1 v/v) solution to determine the addition of sugar residues in vivo after glucose D[6s3H] (N) (PerkinElmer Life Sciences) supplementation (1 mCi/ml). The incorporated radioactivity was measured in a Beckman LS6000SE scintillation counter. The organic phase containing the LLSs was normalized to 0.05 μCi/μl. The samples were separated via TLC on an HPTLC Silica Gel 60 plate (Merck) using a C:M:AC:A:W mobile phase (180 ml of chloroform + 140 ml of methanol + 9 ml of 1 m ammonium acetate + 9 ml of 13 m ammonia solution, 23 ml of distilled water), the plates were dried and sprayed with En3Hance liquid (PerkinElmer Life Sciences). Radioautography images were obtained using Carestream® Kodak® BioMax® XAR Film and MS intensifying screens after 5–10 days.
Purification of recombinantly expressed membrane-associated proteins
The purification was conducted following the established protocol from Waldo et al. (97) with the following modifications. Overnight cultures of E. coliC43 (DE3) cells expressing C-terminal GFP fusion proteins were diluted 1:100, incubated for 3 h until A600 = 0.6, induced with 0.5 mm isopropyl 1-thio-β-d-galactopyranoside, and shifted to room temperature overnight, all at 200 rpm shaking. GFP expression was detected through in-gel fluorescence using a Fuji FLA-5000 laser scanner. For cloning, expression, and purification of GacB-WT, GacB-D160N-GFP, and GacB-Y182-GFP, plasmids containing GFP-His8-tagged recombinant proteins were constructed as described in Table 3 into the vector pWaldo-E (97). For protein production and purification purposes, the vectors were transformed into E. coliC43 (DE3) cells and expressed as described above. The cells were fractionated using an Avestin C3 high-pressure homogenizer (Biopharma) and spun down at 4000 × g. Further centrifugation of the supernatant at 200,000 × g for 2 h rendered 2–3 g of membrane containing the GacB-GFP proteins. Membranes were solubilized in Buffer 1 (500 mm NaCl, 10 mm Na2HPO4, 1.8 mm KH2PO4, 2.7 mm KCl, pH 7.4, 20 mm imidazole, 0.44 mm tris(2-carboxyethyl)phosphine) with the addition of 1% DDM (Anatrace) for 2 h at 4 °C and bound to a 1-ml nickel-Sepharose 6 Fast Flow (GE Healthcare) column, prewashed with buffer 1 plus 0.03% DDM. Elution was conducted using Buffer 1 supplemented with 250 mm imidazole and 0.03% DDM. Imidazole was removed using a HiPrep 26/10 desalting column (GE Healthcare) equilibrated with buffer (PBS, 0.03% DDM, 0.4 mm tris(2-carboxyethyl)phosphine). The GFP-His tag was removed with PreScission Protease cleavage in a 1:100 ratio overnight at 4 °C. Cleaved GacB proteins were collected after negative IMAC. Protein identity and purity was determined by tryptic peptide mass fingerprinting, matrix-assisted laser desorption/ionization TOF MS (MALDI-TOF), respectively (University of Dundee “Fingerprints” proteomics facility).
Synthesis of acceptor 1 and 2
Acceptor 2 (P1-(11-phenoxyundecyl)-P2-(2-acetamido-2-deoxy-α-d-glucopyranosyl) diphosphate, PhO(CH2)11-PP-GlcNAc) was synthesized as sodium salt from phenoxyundecyl dihydrogen phosphate and 2-acetamido-2-deoxy-3,4,6-tri-O-acetyl-α-d-glucopyranosyl dihydrogen phosphate according to the procedure by Druzhinina et al. (98). Acceptor 1 (P1-tridecyl-P2-(2-acetamido-2-deoxy-α-d-glucopyranosyl) diphosphate, CH3(CH2)12-PP-GlcNAc) was synthesized from tridecyl dihydrogen phosphate (obtained similarly to phenoxyundecyl dihydrogen phosphate) by the same procedure as described for acceptor 2.
GacB in vitro enzymatic reaction
Purified GacB-WT-GFP, GacB-D160N-GFP, GacB-Y182F-GFP, and the GacB (tagless) protein (0.15 mg/ml final concentration) were mixed in a 100-μl TBS buffer supplemented with 1 mm TDP-Rha as sugardonor and 1 mm acceptor 1 (CH3(CH2)12-PP-GlcNAc) or 1 mm acceptor 2 (PhO(CH2)11-PP-GlcNAc)) as acceptor substrate. The reaction was incubated for 3–24 h at 30 °C. The assay mixture was adjusted with the exchange of the nucleotide sugardonor to UDP-Rha or UDP-GlcNAc and with the addition of either 1 mm MgCl2, 1 mm MnCl2, or 1 mm EDTA to define the essentiality of metal dependence.
MS analysis
MALDI-TOF was used to analyze the acceptors and products of the GacB in vitro assay. 100-μl reaction samples were purified over 100-μl Sep-Pak C18 cartridges (Waters), pre-equilibrated with 5% EtOH. The bound samples were washed with 800 μl of H2O and 800 μl of 15% EtOH, eluted in two fractions with (a) 800 μl of 30% EtOH and (b) 800 μl of 60% EtOH. The two elution fractions were dried in a SpeedVac and resuspended in 20 μl of 50% MeOH. 1 μl of sample was mixed with 1 μl of 2,5-dihydroxybenzoic acid matrix (15 mg/ml in 30:70 acetonitrile, 0.1% TFA), and 1 μl was added to the MALDI grid. Samples were analyzed by MALDI in an Autoflex speed mass spectrometer set up in reflection positive ion mode (Bruker, Germany).
NMR analysis
The purified GacB in vitro assay products (0.5–2 mg) were dissolved in D2O (550 μl) and measured at 300 K. The spectra were acquired on a four-channel Avance III 800-MHz Bruker NMR spectrometer equipped with a 5-mm TCI CryoProbeTM with automated matching and tuning. 1D spectra were acquired using the relaxation and acquisition times of 5 and 1.8 s, respectively. Between 32 and 512 scans were acquired using the spectral width of 11 ppm. J connectivities were established in a series of 1D and 2D TOCSY experiments with mixing times between 20 and 120 ms. Selective 1D TOCSY spectra (99) were acquired using 40-ms Gaussian pulses and DIPSI-2 sequence (100) (γB1/2π = 10 kHz) for spin lock of between 20 and 120 ms. The following parameters were used to acquire 2D TOCSY and ROESY experiments: 2048 and 768 complex points in t2 and t1, respectively, spectral widths of 11 and 8 ppm in F2 and F1, yielding t2 and t1 acquisition times of 116 and 60 ms, respectively. 16 scans were acquired for each t1 increment using a relaxation time of 1.5 s. The overall acquisition time was 6–7 h/experiment. A forward linear prediction to 4096 points was applied in F1. A zero filling to 4096 was applied in F2. A cosine square window function was used for apodization prior to Fourier transformation in both dimensions. The ROESY mixing time was applied in the form of a 250-ms rectangular pulse at γB1/2π = 4167 Hz. DIPSI-2 sequence (γB1/2π = 10 kHz) was applied for a 20, 80, and 120 ms spin lock. 2D magnitute mode heteronuclear multiple-bond correlation (HMBC) experiments were conducted using the following parameters: 2048 and 128 complex points in t2 and t1, respectively, spectral widths of 6 and 500 ppm in F2 and F1, yielding t2 and t1 acquisition times of 0.35 s and 0.6 ms, respectively. Two scans were acquired for each of 128 t1 increments using a relaxation time of 1.2 s. The overall acquisition time was 8 min. A forward linear prediction to 512 points was applied in F1; zero filling to 4096 was applied in F2. A sine square window function was used for apodization prior to Fourier transformation in both dimensions.
Author contributions
A. Z., B. H. M., E. A., D. U., and H. C. D. conceptualization; A. Z., B. H. M., and E. A. data curation; A. Z., B. H. M., E. A., D. U., and H. C. D. formal analysis; A. Z., B. H. M., E. A., D. U., and H. C. D. validation; A. Z., B. H. M., E. A., V. I. T., V. V. V., L. L. D., D. U., and H. C. D. investigation; A. Z., B. H. M., E. A., D. U., and H. C. D. visualization; A. Z., B. H. M., E. A., V. I. T., V. V. V., L. L. D., D. U., and H. C. D. methodology; A. Z., B. H. M., E. A., V. I. T., V. V. V., D. U., and H. C. D. writing-original draft; A. Z., B. H. M., E. A., L. L. D., D. U., and H. C. D. writing-review and editing; D. U. and H. C. D. supervision; D. U. and H. C. D. funding acquisition; H. C. D. resources; H. C. D. project administration.
Authors: Jeffrey S Rush; Rebecca J Edgar; Pan Deng; Jing Chen; Haining Zhu; Nina M van Sorge; Andrew J Morris; Konstantin V Korotkov; Natalia Korotkova Journal: J Biol Chem Date: 2017-10-11 Impact factor: 5.157
Authors: Lawrence A Kelley; Stefans Mezulis; Christopher M Yates; Mark N Wass; Michael J E Sternberg Journal: Nat Protoc Date: 2015-05-07 Impact factor: 13.491
Authors: Anna Henningham; Mark R Davies; Satoshi Uchiyama; Nina M van Sorge; Sean Lund; Kelsey T Chen; Mark J Walker; Jason N Cole; Victor Nizet Journal: mBio Date: 2018-01-30 Impact factor: 7.867
Authors: Benjamin H Meyer; Panagiotis S Adam; Ben A Wagstaff; George E Kolyfetis; Alexander J Probst; Sonja V Albers; Helge C Dorfmueller Journal: Elife Date: 2022-04-08 Impact factor: 8.713
Authors: Catherine T Chaton; Jeffrey S Rush; Svetlana Zamakhaeva; Sowmya Ajay Castro; Cameron W Kenner; Alexander E Yarawsky; Andrew B Herr; Nina M van Sorge; Helge C Dorfmueller; Gregory I Frolenkov; Konstantin V Korotkov; Natalia Korotkova Journal: Nat Chem Biol Date: 2021-05-27 Impact factor: 16.174