Stefan Schulze, Friedhelm Pfeiffer, Benjamin A Garcia, Mechthild Pohlschroder.
Abstract
Glycosylation is one of the most complex posttranslational protein modifications. Its importance has been established not only for eukaryotes but also for a variety of prokaryotic cellular processes, such as biofilm formation, motility, and mating. However, comprehensive glycoproteomic analyses are largely missing in prokaryotes. Here, we extend the phenotypic characterization of N-glycosylation pathway mutants in Haloferax volcanii and provide a detailed glycoproteome for this model archaeon through the mass spectrometric analysis of intact glycopeptides. Using in-depth glycoproteomic datasets generated for the wild-type (WT) and mutant strains as well as a reanalysis of datasets within the Archaeal Proteome Project (ArcPP), we identify the largest archaeal glycoproteome described so far. We further show that different N-glycosylation pathways can modify the same glycosites under the same culture conditions. The extent and complexity of the Hfx. volcanii N-glycoproteome revealed here provide new insights into the roles of N-glycosylation in archaeal cell biology.
Year: 2021 PMID: 34138841 PMCID: PMC8241124 DOI: 10.1371/journal.pbio.3001277
Source DB: PubMed Journal: PLoS Biol ISSN: 1544-9173 Impact factor: 8.029
Fig 1. Colony morphology and motility phenotypes of N-glycosylation pathway mutants.
(A) WT, Δagl15, and ΔaglB strains were streaked out on Hv-Cab plates with 1.5% agar. Colonies of Δagl15 and ΔaglB mutants appear smaller, and for the Δagl15 mutant, also darker than WT colonies. (B) WT, Δagl15, and ΔaglB strains were stabbed into motility plates (Hv-Cab with 0.35% agar) and imaged after 5 days of incubation. While the WT and Δagl15 strains show normal motility, the ΔaglB mutant is nonmotile. WT, wild-type.
Fig 2. Growth and cell shape phenotypes of N-glycosylation pathway mutants.
(A) The growth of WT (green), Δagl15 (blue), and ΔaglB (orange) strains was analyzed by measuring the OD600 of cultures over the course of 3 days. Data points and error bars represent the mean and standard deviation of 3 biological replicates. (B) Samples of early- (left), mid- (middle), and late-logarithmic (right) growth phase cultures for each strain were imaged using DIC microscopy. The majority of WT cells in the early-logarithmic growth phase are rod-shaped, and the ratio of rod- to disk-shaped cells decreases over time. A higher ratio of rod- to disk-shaped cells was observed in mid- and late-logarithmic growth phase for the Δagl15 mutant in comparison to the WT, while almost no rod-shaped cells were visible for the ΔaglB strain. It should be noted that OD600 measurements for cell shape samples were performed with a path length of 1.5 cm, while growth curve measurements were performed in 96-well plates with 250 μl of culture. Images in A, B, and C (bottom) are representative of at least 4 biological replicates. The scale bar (bottom left) indicates 10 μm and applies to all DIC microscopy images. (C) A quantitative cell shape analysis was performed using CellProfiler [34]. Boxplots for each growth phase depict the eccentricity of cells for each strain across 3 biological replicates. Individual cells are shown as a scatter plot, while the center line, box limits, and whiskers of each boxplot represent the median, upper/lower quartiles, and 1.5× interquartile range, respectively. A t test with subsequent Benjamini–Hochberg correction for multiple testing was performed, and statistically significant differences (p < 10⁻⁵) are indicated by asterisks (*). The underlying source data for A and C can be found in S1 Data. DIC, differential interference contrast; WT, wild-type.
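The Benjamini–Hochberg correction mentioned for panel C can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the function name `benjamini_hochberg`, the raw p-values, and the reuse of the 10⁻⁵ cutoff on adjusted values are assumptions made for illustration only.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values from pairwise t tests (not taken from S1 Data).
raw = [1e-8, 2e-6, 0.03, 0.5]
adj = benjamini_hochberg(raw)
# Assumed threshold, mirroring the cutoff quoted in the caption.
significant = [p < 1e-5 for p in adj]
```

Each raw p-value is scaled by the number of tests divided by its rank, and the running minimum keeps the adjusted values from decreasing as the raw values increase.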
Fig 3. Glycoproteomic analysis of Hfx. volcanii reveals concurrent AglB- and Agl15-dependent N-glycosylation.
Cellular fractions of WT, Δagl15, and ΔaglB strains were analyzed by LC–MS/MS. The overlap of identified N-glycopeptides (A) and N-glycoproteins (B) between the 3 strains is represented as Venn diagrams. No N-glycopeptides were identified for ΔaglB. Since Hfx. volcanii SLG has been extensively studied previously, the glycosites (numbers indicating the amino acid position after signal peptide cleavage) and corresponding glycans that have been described so far ([6,21,24,36], figure adapted from [37]) are depicted schematically (C). N-glycosites and corresponding N-glycans that have been identified in this study are shown schematically (D). For some N-glycosites, multiple N-glycans were identified, indicated by diagonal lines. Furthermore, N-glycans that have not been identified previously are highlighted by red boxes. It should be noted that N-glycopeptides with shorter versions of the AglB-dependent pentasaccharide have been described previously and identified here as well, but are not depicted separately. LC–MS/MS, liquid chromatography tandem mass spectrometry.
Summary of identified N-glycoproteins.
| HVO ID | Name | Description | N-glycosites | N-glycan type(s) | PSMs | Dataset(s) | Predicted processing |
|---|---|---|---|---|---|---|---|
| HVO_0307 | - | Conserved hypothetical protein | 3 | AglB | 201 | PXD021874; PXD006877; PXD010824; PXD011012; PXD011050; PXD011056 | Sec (SPI) |
| HVO_0504 | - | DUF192 family protein | 1 | AglB | 32 | PXD021874 | Sec (SPI) |
| HVO_0778 | Ths3 | Thermosome subunit 3 | 1 | Agl15 | 4 | PXD011056 | Cyt |
| HVO_0892 | NosD | ABC-type transport system periplasmic substrate-binding protein (probable substrate copper) | 4 | AglB | 193 | PXD021874; PXD011012; PXD011218 | Sec (SPI) |
| HVO_0972 | PilA1 | Pilin PilA | 3 | AglB | 424 | PXD021874; PXD006877; PXD009116; PXD010824; PXD011012; PXD011015; PXD011050; PXD011056; PXD014974 | Pil (SPIII) |
| HVO_1014 | CoxB1 | Cox-type terminal oxidase subunit II | 2 | AglB | 13 | PXD021874 | Sec (SPI) |
| HVO_1030 | - | DUF4382 domain protein | 2 | AglB | 55 | PXD021874; PXD011050 | Sec (lipobox) |
| HVO_1176 | - | Conserved hypothetical protein | 2 | AglB | 54 | PXD021874; PXD009116; PXD011218; PXD013046; PXD014974 | Sec (lipobox) |
| HVO_1210 | ArlA1 | Archaellin A1 | 2 | AglB | 63 | PXD021874; PXD011012; PXD011015; PXD011050 | Pil (SPIII) |
| HVO_1211 | ArlA2 | Archaellin A2 | 1 | AglB | 6 | PXD021874; PXD011012; PXD011050 | Pil (SPIII) |
| HVO_1259 | - | Conserved hypothetical protein | 2 | AglB | 47 | PXD021874 | TM N-term |
| HVO_1530 | AglB | Dolichyl-monophosphooligosaccharide—protein glycotransferase AglB | 2 | AglB | 256 | PXD021874; PXD010824; PXD011012; PXD011050; PXD011056; PXD014974 | ≥ 2 TM |
| HVO_1624 | - | Conserved hypothetical protein | 1 | AglB | 3 | PXD021874 | Tat (lipobox) |
| HVO_1673 | - | Conserved hypothetical protein | 2 | AglB | 116 | PXD021874; PXD009116; PXD011012; PXD011050; PXD011218 | Sec (lipobox) |
| HVO_1749 | - | Conserved hypothetical protein | 2 | AglB | 153 | PXD021874; PXD010824; PXD011012; PXD011050; PXD011056 | Pil (SPIII) |
| HVO_1802 | - | Peptidase M10 family protein | 1 | AglB | 2 | PXD021874 | Sec (lipobox) |
| HVO_1806 | - | Conserved hypothetical protein | 1 | AglB | 8 | PXD021874 | Sec (lipobox) |
| HVO_1870 | - | M50 family metalloprotease | 2 | AglB | 97 | PXD021874; PXD010824; PXD011050 | ≥ 2 TM |
| HVO_1944 | - | Probable transmembrane glycoprotein/HTH domain protein | 1 | AglB | 16 | PXD021874 | Sec (SPI) |
| HVO_1945 | - | Conserved hypothetical protein | 4 | AglB | 183 | PXD021874 | Tat (SPI) |
| HVO_1976 | SecD | Protein-export membrane protein SecD | 2 | AglB | 33 | PXD021874; PXD010824; PXD011012; PXD013046; PXD014974 | Sec (SPI) |
| HVO_1988 | - | GATase domain protein | 1 | AglB | 31 | PXD021874 | Sec (SPI) |
| HVO_2062 | PilA2 | Pilin PilA | 2 | AglB | 56 | PXD021874; PXD011012; PXD011015; PXD011050 | Pil (SPIII) |
| HVO_2066 | - | Conserved hypothetical protein | 1 | AglB, Agl15 | 27 | PXD006877 | Sec (SPI) |
| HVO_2070 | - | Conserved hypothetical protein | 2 | AglB | 100 | PXD021874; PXD011012; PXD011015; PXD013046 | Sec (SPI) |
| HVO_2071 | - | Probable secreted glycoprotein | 4 | AglB | 134 | PXD021874; PXD006877; PXD011012; PXD011050 | Sec (SPI) |
| HVO_2072 | SLG | SLG | 6 | AglB, Agl15 | 2364 | PXD021874; PXD006877; PXD007061; PXD009116; PXD010824; PXD011012; PXD011015; PXD011050; PXD011056; PXD011218; PXD013046; PXD014974 | Sec (SPI) |
| HVO_2074 | - | Probable secreted glycoprotein | 1 | AglB | 2 | PXD021874; PXD011050 | Sec (SPI) |
| HVO_2076 | - | Probable secreted glycoprotein (nonfunctional) | 3 | AglB | 42 | PXD021874 | Sec (SPI) |
| HVO_2081 | - | Pectin lyase domain protein | 3 | AglB | 52 | PXD021874; PXD011050 | Sec (SPI) |
| HVO_2082 | - | Conserved hypothetical protein | 2 | AglB | 117 | PXD021874; PXD006877; PXD011012; PXD011050 | Sec (SPI) |
| HVO_2084 | - | ABC-type transport system permease protein (probable substrate macrolides) | 1 | AglB | 50 | PXD021874 | ≥ 2 TM |
| HVO_2160 | - | Probable secreted glycoprotein | 19 | AglB | 1561 | PXD021874; PXD006877; PXD007061; PXD009116; PXD010824; PXD011012; PXD011050; PXD011056; PXD011218; PXD013046; PXD014974 | Sec (SPI), ArtA |
| HVO_2161 | - | Probable secreted glycoprotein | 1 | AglB | 5 | PXD021874; PXD011012 | Sec (SPI) |
| HVO_2167 | - | Conserved hypothetical protein | 1 | AglB | 12 | PXD021874; PXD006877 | Sec (SPI) |
| HVO_2172 | - | Conserved hypothetical protein | 3 | AglB | 82 | PXD021874 | Sec (SPI) |
| HVO_2173 | - | DUF1616 family protein | 4 | AglB | 83 | PXD021874 | ≥ 2 TM |
| HVO_2533 | - | Conserved hypothetical protein | 5 | AglB | 103 | PXD021874; PXD007061; PXD009116; PXD011012; PXD011056; PXD011218; PXD013046 | Sec (SPI), ArtA |
| HVO_2535 | - | Conserved hypothetical protein | 1 | AglB | 2 | PXD021874 | Sec (lipobox) |
| HVO_2634 | - | Conserved hypothetical protein | 1 | AglB | 40 | PXD011012 | Sec (SPI) |
| HVO_A0039 | - | Conserved hypothetical protein | 2 | AglB | 93 | PXD021874; PXD007061; PXD013046 | Sec (SPI) |
| HVO_A0466 | - | Conserved hypothetical protein | 1 | AglB | 9 | PXD021874 | Sec (SPI) |
| HVO_A0499 | - | Conserved hypothetical protein | 2 | AglB | 27 | PXD021874 | Sec (SPI) |
| HVO_B0194 | - | LppX domain protein | 4 | AglB | 46 | PXD021874 | Sec (SPI) |
| HVO_C0054 | - | Hypothetical protein | 3 | AglB | 29 | PXD021874 | Tat (SPI) |
For each protein that was identified to be N-glycosylated in this study, the HVO ID, name, and description are given together with the number of identified N-glycosites, the N-glycan type(s), the number of corresponding PSMs, and the dataset(s) in which it was identified to be N-glycosylated, and the predicted processing. Datasets are given as PRIDE IDs, and it should be noted that PXD021874 corresponds to the dataset generated for this manuscript, while all other PRIDE IDs correspond to datasets of the ArcPP.
*The processing for this entry has been corrected upon manual curation.
ArcPP, Archaeal Proteome Project; ArtA, archaeosortase A substrate; Cyt, cytosolic; HTH, helix–turn–helix; lipobox, conserved cleavage site motif for lipoproteins; Pil, type IV pilin pathway; PSM, peptide spectrum match; Sec, Sec pathway; SLG, S-layer glycoprotein; SPI, signal peptidase I; SPIII, signal peptidase III; Tat, twin arginine translocation pathway; TM, transmembrane domain.
Fig 4. Glycoproteomic analysis of ArcPP datasets extends the Hfx. volcanii N-glycoproteome.
Datasets included in the ArcPP, supplemented by dataset PXD021874 from the current study, were reanalyzed including Hfx. volcanii glycans as potential modifications. (A) The number of identified N-glycopeptides (light gray) and N-glycoproteins (dark gray) for each dataset is shown as a barplot (sorted by the total number of identified proteins; see S2 Fig). Datasets for which the enzymatic digest was performed with trypsin as well as GluC are marked with an asterisk (*). (B) For each identified N-glycoprotein, the number of whole proteome datasets that share this identification was determined. The number of N-glycoproteins identified in the given number of datasets is represented as a barplot. This analysis was performed taking into account either only N-glycopeptides (dark gray) or all peptides identified for a protein that was determined to be N-glycosylated in any of the datasets (white). The underlying source data for A and B can be found in S1 Data. ArcPP, Archaeal Proteome Project.
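The per-protein dataset count behind panel B can be approximated from the Dataset(s) column of the N-glycoprotein summary table. The Python sketch below uses three rows copied from that table. It counts every listed PRIDE ID, whereas the figure considers only whole proteome datasets, so it is an illustration of the counting step under that simplifying assumption rather than a reproduction of the figure. The helper name `count_datasets` is hypothetical.

```python
from collections import Counter

# Dataset(s) entries copied from three rows of the N-glycoprotein summary table.
datasets_by_protein = {
    "HVO_0504": "PXD021874",
    "HVO_1030": "PXD021874; PXD011050",
    "HVO_2066": "PXD006877",
}


def count_datasets(field):
    """Count distinct PRIDE IDs in a semicolon-separated table cell."""
    return len({entry.strip() for entry in field.split(";") if entry.strip()})


# Number of datasets in which each protein was found to be N-glycosylated.
per_protein = {hvo: count_datasets(f) for hvo, f in datasets_by_protein.items()}

# Number of proteins seen in exactly k datasets (the quantity a barplot would show).
histogram = Counter(per_protein.values())
```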
Fig 5. Phylogenetic analysis of identified N-glycoproteins reveals conservation within Halobacteria.
The HVO IDs of all identified N-glycoproteins have been subjected to OrthoDB analysis in order to determine orthologues across the archaeal domain. For each N-glycoprotein encoding gene, the number of genera within the different archaeal taxonomic classes is given as a heatmap ranging in color from yellow (48 genera with orthologous proteins) to purple (0 genera). Genes were sorted by the overall number of genera with orthologous proteins, while clusters with multiple N-glycoprotein encoding genes (see S1 Table) were grouped separately from the remaining genes. Genera were grouped by phylum and sorted by the number of orthologues within each phylum, with phyla abbreviated as follows: C., Crenarchaeota; K., Korarchaeota; M., Micrarchaeota. The underlying source data can be found in S1 Data.
Summary of identified N-glycopeptides with noncanonical N-glycosite.
| HVO ID | Name | Description | N-glycan type | Peptide sequence | Dataset(s) | PSMs | Predicted processing |
|---|---|---|---|---|---|---|---|
| HVO_0806 | PykA | Pyruvate kinase | Agl15 | IERAGAVD | PXD006877 | 2 | Cyt |
| HVO_2160 | - | Probable secreted glycoprotein | AglB | MPS | PXD021874 | 25 | Sec (SPI) |
| HVO_2173 | - | DUF1616 family protein | AglB | LVRGEPASLVLGVG | PXD021874 | 2 | ≥ 2 TM |
| HVO_2703 | PanB2 | 3-methyl-2-oxobutanoate hydroxymethyl-transferase | AglB | AHAEAGAFSLVLEHVPA | PXD006877 | 2 | Cyt |
| HVO_A0633 | PilA6 | Pilin PilA | AglB | VVWTSESGS | PXD021874 | 20 | Pil (SPIII) |
For each noncanonical N-glycopeptide that was identified, the HVO ID, name, and description are given together with the peptide sequence (N-glycosite is marked in bold), the N-glycan type, the number of corresponding PSMs, the dataset(s) in which it was identified to be N-glycosylated, and the predicted processing.
Cyt, cytosolic; Pil, type IV pilin pathway; PSM, peptide spectrum match; Sec, Sec pathway; SPI, signal peptidase I; SPIII, signal peptidase III; TM, transmembrane domain.
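As a point of comparison for the noncanonical sites listed above, the following Python sketch locates canonical N-glycosylation sequons in a protein sequence. It assumes that "canonical" refers to the widely used N-X-S/T motif, where X is any residue except proline. That definition is not stated in this record. The function name `sequon_positions` and the example sequence are hypothetical.

```python
import re

# Assumed canonical sequon: Asn, then any residue except Pro, then Ser or Thr.
# The lookahead keeps matches zero-width after N, so overlapping sequons are found.
SEQUON = re.compile(r"N(?=[^P][ST])")


def sequon_positions(sequence):
    """Return 1-based positions of Asn residues that start an N-X-S/T sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence)]


# Hypothetical sequence: N3 (NGT) and N12 (NVS) are canonical, N8 (NPS) is not.
example = "MANGTAPNPSQNVS"
positions = sequon_positions(example)
```

A glycosylated asparagine whose position is absent from such a list would fall into the noncanonical category under this assumed definition.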
Summary of identified O-glycoprotein candidates.
| HVO ID | Name | Description | O-glycopeptides | Dataset(s) | PSMs | Predicted processing |
|---|---|---|---|---|---|---|
| HVO_0154 | - | Conserved hypothetical protein | 3 | PXD021874 | 20 | Tat (lipobox) |
| HVO_0306 | - | Probable transmembrane glycoprotein/HTH domain protein | 1 | PXD021874 | 13 | Sec (SPI) |
| HVO_0349 | RpoA1 | DNA-directed RNA polymerase subunit A’ | 1 | PXD006877 | 3 | Cyt |
| HVO_0359 | Tef1a1 | Translation elongation factor aEF-1 alpha/peptide chain release factor aRF-3 | 1 | PXD006877 | 7 | Cyt |
| HVO_0654 | Rpl43e | 50S ribosomal protein L43e | 1 | PXD006877 | 4 | Cyt |
| HVO_0677 | AspS | Aspartate–tRNA(Asp/Asn) ligase | 1 | PXD011056 | 6 | Cyt |
| HVO_0778 | Ths3 | Thermosome subunit 3 | 1* | PXD013046 | 8 | Cyt |
| HVO_0869 | GltB | Glutamate synthase (ferredoxin) large subunit | 1 | PXD006877 | 2 | Cyt |
| HVO_1148 | Rps15 | 30S ribosomal protein S15 | 1 | PXD006877 | 4 | Cyt |
| HVO_1198 | - | UspA domain protein | 1 | PXD006877 | 8 | Cyt |
| HVO_1597 | - | Conserved hypothetical protein | 1 | PXD021874 | 4 | Tat (lipobox) |
| HVO_2071 | - | Probable secreted glycoprotein | 2* | PXD021874 | 12 | Sec (SPI) |
| HVO_2160 | - | Probable secreted glycoprotein | 1* | PXD021874 | 5 | Sec (SPI), ArtA |
| HVO_2172 | - | Conserved hypothetical protein | 1* | PXD021874 | 3 | Sec (SPI) |
| HVO_2226 | TrpD2 | Probable phosphoribosyltransferase (homolog to anthranilate phosphoribosyltransferase) | 1 | PXD006877 | 6 | Cyt |
| HVO_2413 | Tef1a2 | Translation elongation factor aEF-1 alpha/peptide chain release factor aRF-3 | 1 | PXD006877 | 2 | Cyt |
| HVO_2487 | Asd | Aspartate-semialdehyde dehydrogenase | 1 | PXD006877 | 3 | Cyt |
| HVO_2580 | NadB | L-aspartate oxidase | 1 | PXD006877 | 3 | Sec (SPI) |
| HVO_A0380 | DppA8 | ABC-type transport system periplasmic substrate-binding protein (probable substrate dipeptide/oligopeptide) | 1 | PXD021874 | 2 | Tat (lipobox) |
| HVO_B0050 | CobN | ATP-dependent cobaltochelatase subunit CobN | 1 | PXD006877 | 2 | Cyt |
For each protein that has been identified to be likely O-glycosylated in this study, the HVO ID, name and description are given together with the number of identified O-glycopeptides, the number of corresponding PSMs and dataset(s), and the predicted processing.
ArtA, archaeosortase A substrate; Cyt, cytosolic; HTH, helix–turn–helix; lipobox, conserved cleavage site motif for lipoproteins; PSM, peptide spectrum match; Sec, Sec pathway; SPI, signal peptidase I; Tat, twin arginine translocation pathway.
Glycans included as potential modifications in protein database searches.
| Amino acid | Glycan composition | Chemical composition | Unimod ID |
|---|---|---|---|
| N | Hex(1) | C6H10O5 | 41 |
| N | Hex(1)HexA(1) | C12H18O11 | 1427 |
| N | Hex(1)HexA(2) | C18H26O17 | - |
| N | Hex(1)HexA(2)MeHexA(1) | C25H36O23 | - |
| N | Hex(2)HexA(2)MeHexA(1) | C31H46O28 | - |
| N | SO3Hex(1) | C6H10O8S1 | - |
| N | SO3Hex(1)Hex(1) | C12H20O13S1 | - |
| N | SO3Hex(1)Hex(2) | C18H30O18S1 | - |
| N | SO3Hex(1)Hex(2)dHex(1) | C24H40O22S1 | - |
| S/T | Hex(2) | C12H20O10 | 512 |
Given are the glycan composition (dHex, deoxyhexose; Hex, hexose; HexA, hexuronic acid; MeHexA, methyl-hexuronic acid; SO3Hex, sulfated hexose), the chemical composition, the Unimod ID, and the amino acid that has been specified as potential site of modification.
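The chemical compositions in this table are additive over the monosaccharide residues, which the Python sketch below illustrates. The Hex and SO3Hex formulas are taken from the table. The HexA, MeHexA, and dHex residue formulas are not listed individually and are assumed here from standard residue compositions, which agree with the differences between table rows. The function names and the rounded monoisotopic atomic masses are illustrative assumptions, not part of the original search configuration.

```python
import re
from collections import Counter

# Monoisotopic atomic masses in Da (rounded standard values).
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "O": 15.99491462, "S": 31.97207100}

# Residue compositions. Hex and SO3Hex appear as table rows; the others are
# assumed and are consistent with differences between table rows.
RESIDUES = {
    "Hex": "C6H10O5",
    "HexA": "C6H8O6",
    "MeHexA": "C7H10O6",
    "SO3Hex": "C6H10O8S1",
    "dHex": "C6H10O4",
}


def parse_formula(formula):
    """Turn a formula such as 'C6H10O5' into a Counter of element counts."""
    atoms = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        atoms[element] += int(number) if number else 1
    return atoms


def parse_glycan(glycan):
    """Turn a composition such as 'Hex(1)HexA(2)' into a Counter of atoms."""
    atoms = Counter()
    for name, count in re.findall(r"([A-Za-z0-9]+)\((\d+)\)", glycan):
        for element, n in parse_formula(RESIDUES[name]).items():
            atoms[element] += n * int(count)
    return atoms


def monoisotopic_mass(atoms):
    """Sum the monoisotopic masses of all atoms in a composition."""
    return sum(MONOISOTOPIC[element] * n for element, n in atoms.items())


glycan_atoms = parse_glycan("Hex(2)HexA(2)MeHexA(1)")
hexose_mass = monoisotopic_mass(parse_formula("C6H10O5"))
```

Comparing `glycan_atoms` with `parse_formula("C31H46O28")` checks a table row against the sum of its residues, and `hexose_mass` gives the mass shift a search engine would apply for a single hexose.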