Aminael Sánchez-Rodríguez, Hanne L P Tytgat, Joris Winderickx, Jos Vanderleyden, Sarah Lebeer, Kathleen Marchal.
Abstract
BACKGROUND: Bacterial interactions with the environment and/or host largely depend on the bacterial glycome. The specificities of a bacterial glycome are largely determined by glycosyltransferases (GTs), the enzymes that transfer sugar moieties from an activated donor to a specific substrate. The coding regions of these GTs, and especially their substrate specificities, remain largely unannotated, because most sequence-based annotation flows suffer from a lack of characterized sequence motifs that can aid in predicting substrate specificity.
Year: 2014 PMID: 24885406 PMCID: PMC4039749 DOI: 10.1186/1471-2164-15-349
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Summary of the Hidden Markov Models (HMMs) used to screen for glycosyltransferases in the proteomes of C. jejuni NCTC 11168 and L. rhamnosus GG
| HMM group | Description | Database | Reference |
|---|---|---|---|
| I | Rossmann-fold domains | SUPERFAMILY | Ha; Egelund; Lairson; Hansen |
| II | Sugar transferase | SUPERFAMILY | Egelund; Hansen |
| II | UDP-Glycosyltransferase | SUPERFAMILY | Egelund; Hansen |
| III | Transglycosylase (PF00912) | Pfam/CAZy | Di Guilmi |
| III | Glycosyltransferase WecB/TagA/CpsF (PF03808) | Pfam/CAZy | Maldonado-Barragán |
| III | Bacterial sugar transferase (PF02397) | Pfam | Yoshida; Provencher |
| III | Oligosaccharyltransferase STT3 subunit (PF02516) | Pfam/CAZy | Baïet |
| III | DAD family (PF02109) | Pfam | Silberstein |
| III | OST3/OST6 family (PF04756) | Pfam/CAZy | Knauer |
| III | Glycosyltransferase family 25 (PF01755) | Pfam/CAZy | Campbell |
| III | Glycosyltransferase family 28 (PF04101) | Pfam/CAZy | Mengin-Lecreulx |
| III | Glycosyltransferase family 9 (PF01075) | Pfam/CAZy | Campbell et al., 1997 |
HMM group: HMMs were grouped in increasing order of their expected specificity for glycosyltransferase activity. Description: description of the HMM. The Pfam model ID is also provided. Database: source of the model. Reference: bibliographic citation supporting the inclusion of the corresponding HMM in the analysis.
Figure 1. Glycosyltransferase annotation flow. A: Genome-wide annotation of glycosyltransferases (GTs). Glycosyltransferases are predicted by scanning the proteomes of the studied species for GT-specific signatures using Hidden Markov Models (HMMs) from SUPERFAMILY, CAZy and Pfam. An additional fold recognition filtering step retains only those genes containing a three-dimensional fold (inferred by the PGenTHREADER algorithm) with significant homology to folds present in experimentally confirmed GTs (deposited in the SCOP database). B: Predicting GT substrate class and putative mode of action (bottom panel). The local network neighborhood of each query GT (black node) in a functional interaction network (STRING) is used to extract a GT-specific local subnetwork for each query GT. The local subnetwork of a GT comprises predicted functional partners (proteins functionally related to the query GT). The substrate class of the query GT is derived from a GO enrichment analysis of the genes in this local subnetwork. To gain information on the mode of glycosylation, the GT-specific local subnetwork is further annotated with membrane associations between a query GT and a predicted transmembrane protein (blue edges) and with relations indicative of protein glycosylation (yellow edges).
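The subnetwork extraction step in panel B can be illustrated with a minimal sketch. It assumes the functional interaction network is available as a list of scored, undirected edges. The function name `local_subnetwork`, the 0.7 score cut-off, the edge scores and the locus tag `LGG_00001` are all hypothetical and are not taken from the study.

```python
from collections import defaultdict

def local_subnetwork(edges, query, min_score=0.7):
    """Return the partners of `query` whose edge score is at least `min_score`.

    `edges` is an iterable of (protein_a, protein_b, score) tuples.
    The 0.7 default is an assumed cut-off, not the one used in the study.
    """
    neighbours = defaultdict(dict)
    for a, b, score in edges:
        # Functional interaction edges are treated as undirected.
        neighbours[a][b] = score
        neighbours[b][a] = score
    return {p: s for p, s in neighbours[query].items() if s >= min_score}

# Hypothetical edges and scores, for illustration only.
edges = [
    ("LGG_02047", "LGG_02043", 0.95),
    ("LGG_02047", "LGG_02045", 0.80),
    ("LGG_02047", "LGG_00001", 0.40),  # below the cut-off, so excluded
    ("LGG_02045", "LGG_02046", 0.90),  # not adjacent to the query
]
partners = local_subnetwork(edges, "LGG_02047")
print(sorted(partners))
```

The returned partner set is what a GO enrichment analysis would then be run on, one query GT at a time.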
Figure 2. Annotated glycosyltransferases. Results for the model system Campylobacter jejuni are shown on the left panel and for L. rhamnosus GG on the right panel. Putative GTs were predicted using an HMM-based screening. A: results obtained with an HMM recognizing ‘Rossmann-fold domains’, expected to be the HMM with the lowest specificity towards GTs (Table 1, class I). B and C: results obtained with a family of HMMs of intermediate specificity for GTs (Table 1, class II). D: results obtained with the class of HMMs most specific for GTs (Table 1, class III). Pie charts indicate the extent to which different functional classes were enriched amongst the predictions obtained with the respective classes of HMMs. Slices indicated in red on the pie chart correspond to the functional classes of the predictions that were retained after the fold recognition filtering step. For each group of HMMs, the total number of predictions is denoted in black on top of every pie chart and the number of predictions retained after applying the fold recognition step is denoted in red.
Updated annotation of glycosyltransferases predicted in the genome of L. rhamnosus GG
| Locus tag | Current annotation | Proposed annotation | HMM | Evidence | Reference |
|---|---|---|---|---|---|
| LGG_00279 | | | Sugar transferase | Conservation | Kankainen |
| LGG_00280 | | | Sugar transferase | Conservation | Kankainen |
| LGG_00281 | | | Sugar transferase | Conservation | Kankainen |
| LGG_00283* | | | UDP-Glycosyltransferase | - | - |
| LGG_00295 | Glycosyltransferase, group 2 | Putative glycosyltransferase | Sugar transferase | Conservation | Kankainen |
| LGG_00348 | | Putative glycosyltransferase | UDP-Glycosyltransferase | Conservation | Kankainen |
| LGG_00349 | | Putative glycosyltransferase | UDP-Glycosyltransferase | Conservation | Kankainen |
| LGG_00645 | Glycosyltransferase, group 2 | Putative glycosyltransferase | Sugar transferase | Conservation | Kankainen |
| LGG_00695 | | Putative glycosyltransferase | Sugar transferase | Conservation | Kankainen |
| LGG_00794 | | | Pfam/CAZy | Conservation | Kankainen |
| LGG_00825 | | Putative glycosyltransferase | UDP-Glycosyltransferase | Conservation | Kankainen |
| LGG_00826 | | Putative glycosyltransferase | UDP-Glycosyltransferase | Conservation | Kankainen |
| LGG_00928* | | Putative glycosyltransferase | Rossmann-fold domains | - | - |
| LGG_00985* | Integral membrane protein | Putative glycosyltransferase | Pfam/CAZy | - | - |
| LGG_00998 | | Putative glycosyltransferase | Sugar transferase | Conservation | Kankainen |
| LGG_00999 | | Putative glycosyltransferase | Rossmann-fold domains | Conservation | Kankainen |
| LGG_01057 | Glycosyltransferase, group 2 | Putative glycosyltransferase | Sugar transferase | Conservation | Kankainen |
| LGG_01062# | | UTP-glucose-1-phosphate uridylyltransferase | Sugar transferase | Conservation | Kankainen |
| LGG_01069 | | Putative glycosyltransferase | Sugar transferase | Conservation | Kankainen |
| LGG_01147 | Glycosyltransferase, group 1 | Putative glycosyltransferase | Rossmann-fold domains | Conservation | Kankainen |
| LGG_01195* | | ABC transporter, putative bifunctional glycosyltransferase | Pfam | - | - |
| LGG_01283 | | | UDP-Glycosyltransferase | Conservation | Mengin-Lecreulx |
| LGG_01412* | | tRNA uracil-5-methyltransferase, putative bifunctional glycosyltransferase | Rossmann-fold domains | - | - |
| LGG_01487 | | | Pfam/CAZy | Conservation | Kankainen |
| LGG_01538 | Phage-related glycosyltransferase | Putative glycosyltransferase | Sugar transferase | Conservation | Kankainen |
| LGG_01586 | | Putative glycosyltransferase | UDP-Glycosyltransferase | Conservation | Kankainen |
| LGG_01587 | | Putative glycosyltransferase | Rossmann-fold domains | Conservation | Kankainen |
| LGG_01783 | | | Pfam/CAZy | Conservation | Di Guilmi |
| LGG_01991* | UDP-N-acetylglucosamine 2-epimerase | Epimerase, putative bifunctional glycosyltransferase | UDP-Glycosyltransferase | - | - |
| LGG_01992* | UDP-N-acetylglucosamine 2-epimerase | Epimerase, putative bifunctional glycosyltransferase | Sugar transferase | - | - |
| LGG_01999 | | | Sugar transferase | Conservation | Kankainen |
| LGG_02004 | | Putative glycosyltransferase | Pfam | Conservation | Kankainen |
| LGG_02023# | | | UDP-Glycosyltransferase | Conservation | Kankainen |
| LGG_02024 | | | UDP-Glycosyltransferase | Conservation | Kiel |
| LGG_02025# | | | Sugar transferase | Conservation | Ballicora |
| LGG_02026# | | | Sugar transferase | Conservation | Ballicora |
| LGG_02040$ | | | Sugar transferase | Conservation | Kankainen |
| LGG_02042 | | | Sugar transferase | Conservation | Kankainen |
| LGG_02043 | | | Pfam | Experimental validation | Lebeer |
| LGG_02044 | | | UDP-Glycosyltransferase | Conservation | Kankainen |
| LGG_02045 | | | UDP-Glycosyltransferase | Conservation | Kankainen |
| LGG_02046 | | | Sugar transferase | Conservation | Kankainen |
| LGG_02047 | | | UDP-Glycosyltransferase | Experimental validation | This work |
| LGG_02284 | Glycosyltransferase, group 1 | Putative glycosyltransferase | UDP-Glycosyltransferase | Conservation | Kankainen |
| LGG_02285 | | Putative glycosyltransferase | UDP-Glycosyltransferase | Conservation | Kankainen |
| LGG_02347* | Hypothetical protein | Putative glycosyltransferase | Pfam | - | - |
| LGG_02562# | | UDP-N-acetylglucosamine pyrophosphorylase | Sugar transferase | Conservation | Kankainen |
| LGG_02869 | Glycosyltransferase, group 1 | Putative glycosyltransferase | UDP-Glycosyltransferase | Conservation | Kankainen |
Locus tag: gene identifier of the predicted GT. Genes for which a GT activity was predicted in this study that was not present in the current annotation are marked with a star (*). Potential false positive results are indicated with a hash (#). Current annotation: functional annotation as in current genome release of GenBank (NC_013198.1). Proposed annotation: new annotation proposed based on the results of our analysis. HMM: description of the Hidden Markov Model (HMM) with which the indicated GT was identified. Evidence: Level of evidence for the GT activity. Conservation: shows significant sequence conservation with an experimentally validated GT in a closely related species. Experimental validation: the GT activity has been experimentally validated in Lactobacillus rhamnosus GG. Reference: reference to the publication(s) supporting the evidence.
Proposed substrate classes of predicted glycosyltransferases in L. rhamnosus GG
| Query-GT locus tag | Query-GT localization | Enriched GO categories | Membrane association | Partner GTs | Proposed substrate class of the query-GT | Potential protein substrate | Evidence | Reference |
|---|---|---|---|---|---|---|---|---|
| LGG_00280 | C | EPS biosynthesis | LGG_00278 (hypothetical protein) | LGG_02043 LGG_00281 LGG_00283 LGG_00295 LGG_00279 LGG_01999 | Extracellular saccharides | - | Conservation | Kankainen |
| LGG_00281 | C | EPS biosynthesis; PS transport | LGG_00278 (hypothetical protein) | LGG_00280 LGG_00295 LGG_01057 LGG_00279 | Extracellular saccharides | - | Conservation | Kankainen |
| LGG_00295 | C | EPS biosynthesis | LGG_00296 (integral membrane protein) | LGG_00280 LGG_02043 LGG_00281 LGG_02869 LGG_01057 | Extracellular saccharides | - | Conservation | Kankainen |
| LGG_01062* | C | EPS biosynthesis | - | LGG_02026 LGG_02023 LGG_02025 | Extracellular saccharides | - | - | - |
| LGG_02040$ | C | EPS biosynthesis; nucleotide-sugar metabolism | - | LGG_02042 LGG_02046 | Extracellular saccharides | - | Conservation | Kankainen |
| LGG_02042 | C | EPS biosynthesis; nucleotide-sugar metabolism | - | LGG_02040 | Extracellular saccharides | - | Conservation | Kankainen |
| LGG_02043 | TM | Peptidyl-tyrosine dephosphorylation; regulation of catalytic activity; EPS biosynthesis | - | LGG_01992 LGG_02047 | Extracellular saccharides | - | Experimental validation | Lebeer |
| LGG_02045 | C | Polysaccharide biosynthesis; polysaccharide transport | LGG_00282 (polysaccharide transporter) | LGG_00998 LGG_00999 LGG_02046 LGG_02047 | Extracellular saccharides | - | Conservation | Kankainen |
| LGG_02046 | C | EPS biosynthesis; polysaccharide transport | LGG_02049 (polysaccharide transporter) | LGG_02045 LGG_02047 LGG_01999 | Extracellular saccharides | - | Conservation | Kankainen |
| LGG_02047 | C | Polysaccharide biosynthesis; polysaccharide transport | LGG_02043 (undecaprenyl-P-β-glucosephosphotransferase) | LGG_02043 LGG_02045 LGG_02046 LGG_02869 LGG_00295 LGG_01057 | Extracellular saccharides | - | Experimental validation | This work |
| LGG_01062* | C | Glycogen biosynthesis | - | LGG_02026 LGG_02023 LGG_02025 | Intracellular saccharides | - | - | - |
| LGG_02023 | C | Glycogen biosynthesis; pyrimidine nucleoside metabolism | - | LGG_02026 LGG_01062 LGG_02024 LGG_02025 | Intracellular saccharides | - | Conservation | Kankainen |
| LGG_02024 | C | Glycogen biosynthesis; response to antibiotic | - | LGG_02023 LGG_02025 LGG_02026 | Intracellular saccharides | - | Conservation | Kiel |
| LGG_02025 | C | Glycogen biosynthesis | - | LGG_02023 LGG_02024 LGG_02026 | Intracellular saccharides | - | Conservation | Ballicora |
| LGG_02026 | C | Glycogen biosynthesis | - | LGG_02023 LGG_02024 LGG_02025 | Intracellular saccharides | - | Conservation | Ballicora |
| LGG_00998 | C | Carbohydrate metabolism; lipids metabolism | LGG_00995 (hypothetical protein) | LGG_02045 LGG_00999 | Lipid | - | Conservation | Kankainen |
| LGG_00999 | C | Carbohydrate metabolism; lipids metabolism | LGG_00995 (hypothetical protein) | LGG_02045 LGG_00998 | Lipid | - | Conservation | Kankainen |
| LGG_01057* | C | Carbohydrate metabolism; lipids metabolism | LGG_02004 (sugar or LPS synthesis transferase) | LGG_02004 LGG_00280 LGG_02043 LGG_02869 LGG_00295 LGG_02046 LGG_02047 | Lipid | - | - | - |
| LGG_00794 | TM | PG-based cell wall biogenesis | - | - | Peptidoglycan | - | Conservation | Kankainen |
| LGG_01283 | C | PG-based cell wall biogenesis | LGG_01192 (rod shape-determining protein RodA) | LGG_01487 | Peptidoglycan | - | Conservation | Mengin-Lecreulx |
| LGG_01487 | TM | PG-based cell wall biogenesis | - | LGG_01283 | Peptidoglycan | - | Conservation | Kankainen |
| LGG_01538* | TM | PG biosynthetic process; regulation of cell shape; dephosphorylation; response to antibiotics | - | LGG_00280 | Peptidoglycan | - | - | - |
| LGG_01783 | TM | PG-based cell wall biogenesis | | - | Peptidoglycan | | Conservation | Di Guilmi |
| LGG_00794* | TM | Regulation of cell shape; cell cycle | | | Protein | LGG_01280 (cell division protein FtsI) | - | - |
| LGG_00825* | C | Protein translation | LGG_00751 (SNARE associated golgi protein) | LGG_00826 | Protein | LGG_00829 (YkuJ protein) | - | - |
| LGG_00826* | C | Protein translation; amino acid transport | LGG_00751 (SNARE associated golgi protein) | LGG_00825 LGG_02047 | Protein | LGG_00829 (YkuJ protein) | - | - |
| LGG_01147* | C | DNA metabolic process | LGG_01146 (predicted ORF) | - | Protein | LGG_01145 (DNA-entry nuclease) | - | - |
| LGG_01283* | C | Regulation of cell shape; response to antibiotic, cell division | LGG_01192 (rod shape-determining protein RodA) | LGG_01487 | Protein | LGG_01280 (cell division protein FtsI) | - | - |
| LGG_01487* | TM | Regulation of cell shape; cell division | - | LGG_01283 | Protein | LGG_01706 (cell division protein/penicillin-binding protein 2); LGG_01280 (cell division protein FtsI); LGG_00254 (D-alanyl-D-alanine carboxypeptidase) | - | - |
| LGG_01783* | TM | Regulation of cell shape; cell cycle | - | - | Protein | LGG_01280 (cell division protein FtsI) | | |
Locus tag: gene identifier of the predicted GT used as query in STRING to obtain a query-dependent subnetwork. Localization: indicates whether the query-GT was predicted to be a cytoplasmic (C) or a transmembrane protein (TM). Enriched GO categories: GO categories enriched amongst the members of the query-dependent subnetwork of the indicated query-GT. Only categories showing an enrichment value of p < 0.05 are shown (according to a hypergeometric test corrected for multiple testing using False Discovery Rate). Membrane association: refers to edges between the query-GT and members of its subnetwork predicted to be transmembrane proteins. Partner GTs: predicted/experimentally validated GTs that belong to the subnetwork of the query-GT. Proposed substrate class of the query-GT: inferred from the GO enrichment analysis of the query-dependent subnetwork of the indicated query-GT derived from STRING. Novel substrate predictions derived from this study are indicated by a star (*) next to the locus tag of the corresponding query-GT. Potential protein substrate: refers to edges between the query-GT and members of its subnetwork predicted to have N- or O-glycosylation sites. Such proteins are therefore suggested to be potential substrates of the query-GT in the cases where proteins are the proposed substrate. Evidence: level of evidence for the predicted substrate class of the query-GT. Conservation: shows significant sequence conservation with a GT for which the substrate specificity has been experimentally validated in closely related species. Experimental validation: the substrate specificity of the GT has been experimentally validated in Lactobacillus rhamnosus GG. Reference: publication(s) supporting the predicted substrate class of the query-GT.
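The enrichment criterion in this legend (hypergeometric test with False Discovery Rate correction) can be sketched as follows. This is a minimal illustration that assumes a one-sided test for over-representation and the Benjamini-Hochberg procedure as the FDR method. The legend does not state which FDR procedure was used, and all counts and function names below are hypothetical.

```python
from math import comb

def hypergeom_pvalue(k, n, K, N):
    """P(X >= k): chance of seeing at least k annotated genes in a subnetwork
    of n genes, drawn from a genome of N genes of which K carry the GO term."""
    upper = min(n, K)
    total = comb(N, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total

def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical counts: (hits in subnetwork, subnetwork size,
# genes with the term in the genome, genome size).
terms = {
    "EPS biosynthesis": (4, 10, 20, 3000),
    "Glycogen biosynthesis": (1, 10, 15, 3000),
    "Protein translation": (2, 10, 150, 3000),
}
raw = [hypergeom_pvalue(*counts) for counts in terms.values()]
adj = benjamini_hochberg(raw)
enriched = [term for term, q in zip(terms, adj) if q < 0.05]
print(enriched)
```

Exact integer binomials keep the sketch free of third-party dependencies; a statistics library would normally be used for genome-scale runs.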
Figure 3. Consensus networks derived for each of the predicted substrate classes of putative GTs in L. rhamnosus GG. Consensus networks show all GTs having the same substrate class, together with their protein neighbors that are hypothesized to contribute to the same glycosylation mechanism as the one the GTs are involved in. On the consensus networks, nodes are proteins that can be GTs (green nodes), transmembrane proteins (orange nodes) or proteins containing glycosylation signals (violet nodes). Membrane associations established between GTs and transmembrane proteins are represented by blue edges, while predicted substrate relations between GTs and proteins containing glycosylation signals are represented by yellow edges. Black edges refer to interactions between predicted GTs. If the local network neighborhood of GTs (local subnetwork) belonging to the same substrate class shows enrichment in more than one GO category (e.g. both the GO terms of EPS and glycogen biosynthesis), the consensus network is shown for each of the enriched GO categories. A: consensus networks involving GTs predicted to glycosylate saccharides. Note that here two independent consensus networks were derived, corresponding to extracellular and intracellular PS biosynthesis, respectively. B: consensus network involving GTs predicted to glycosylate peptidoglycan (PG). C: consensus network involving GTs predicted to glycosylate lipids. D: consensus networks involving GTs predicted to glycosylate proteins. Three independent consensus networks were derived, corresponding to cell cycle regulation, protein translation and DNA metabolic processes, respectively. Our analysis suggests substrate promiscuity for MurG, PBP1A, PBP1B and PBPA, all of which were predicted to be involved in the glycosylation of both peptidoglycan and proteins.
Figure 4. Protein glycosylation of the cell division machinery. Schematic overview of the cell division machinery of L. rhamnosus. PBP1A, PBP1B, PBPB2A and MurG are predicted to be putative GTs. Our network-based analysis predicted PBP3, FtsI and PBP2B as putative substrates of the indicated GTs. The Msp1 cell wall hydrolase is the experimentally validated glycoprotein in L. rhamnosus GG [36].
Figure 5. Experimental validation of the EPS network hierarchy. A: Total cell wall polysaccharides were extracted from LGG wild-type, a ΔwelE::TcR gene deletion mutant (CMPG5351) and a ΔwelI::TcR gene deletion mutant (CMPG10811). The total amount of EPS was measured. Error bars indicate standard deviations (of three repeats). One-way ANOVA rendered a p-value smaller than 0.05 for the variation of EPS across strains. B: Sugar monomer composition. The data are expressed as relative amounts, taking the total amount of detected monomeric sugars as 100%. Error bars indicate standard deviations (of three repeats). One-way ANOVA analyses (performed independently on each of the three datasets) rendered significant p-values (<0.05) for the variation of each sugar monomer across strains. C: Adhesion capacity. The adhesion capacity of wild type and mutants to Caco-2 cells is compared. Error bars indicate standard deviations (of three repeats). A one-way ANOVA rendered a significant p-value (<0.05) for the variation of the adhesion capacity of the strains.
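The one-way ANOVA used for the three-strain comparisons can be illustrated with a minimal sketch that computes the F statistic from first principles. The measurements below are made-up placeholders, not data from the study, and the function name `one_way_anova_f` is hypothetical.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of sample groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Variation of each observation around its own group mean.
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Placeholder triplicate measurements for three strains (illustrative only).
wild_type = [1.0, 2.0, 3.0]
mutant_a = [2.0, 3.0, 4.0]
mutant_b = [5.0, 6.0, 7.0]
f_stat, df_b, df_w = one_way_anova_f([wild_type, mutant_a, mutant_b])
print(f_stat, df_b, df_w)
```

The p-value follows from the F distribution with the returned degrees of freedom; `scipy.stats.f_oneway` computes both the statistic and the p-value in one call.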