Nicholas M Riley1, Stacy A Malaker1, Marc D Driessen1, Carolyn R Bertozzi1,2. 1. Department of Chemistry, Stanford University, Stanford, California 94305-6104, United States. 2. Howard Hughes Medical Institute, Stanford, California 94305-6104, United States.
Abstract
Site-specific characterization of glycosylation requires intact glycopeptide analysis, and recent efforts have focused on how to best interrogate glycopeptides using tandem mass spectrometry (MS/MS). Beam-type collisional activation, i.e., higher-energy collisional dissociation (HCD), has been a valuable approach, but stepped collision energy HCD (sceHCD) and electron transfer dissociation with HCD supplemental activation (EThcD) have emerged as potentially more suitable alternatives. Both sceHCD and EThcD have been used with success in large-scale glycoproteomic experiments, but they each incur some degree of compromise. Most progress has occurred in the area of N-glycoproteomics. There is growing interest in extending this progress to O-glycoproteomics, which necessitates comparisons of method performance for the two classes of glycopeptides. Here, we systematically explore the advantages and disadvantages of conventional HCD, sceHCD, ETD, and EThcD for intact glycopeptide analysis and determine their suitability for both N- and O-glycoproteomic applications. For N-glycopeptides, HCD and sceHCD generate similar numbers of identifications, although sceHCD generally provides higher quality spectra. Both significantly outperform EThcD methods in terms of identifications, indicating that ETD-based methods are not required for routine N-glycoproteomics even if they can generate higher quality spectra. Conversely, ETD-based methods, especially EThcD, are indispensable for site-specific analyses of O-glycopeptides. Our data show that O-glycopeptides cannot be robustly characterized with HCD-centric methods that are sufficient for N-glycopeptides, and glycoproteomic methods aiming to characterize O-glycopeptides must be constructed accordingly.
Keywords:
ETD; EThcD; N-glycopeptides; O-glycopeptides; electron transfer dissociation; fragmentation; glycoproteomics; sceHCD; stepped collision energy high-energy collisional dissociation; tandem MS
Protein
glycosylation is a complex post-translational modification
that governs a diverse range of biological functions, serving as a
biophysical and biochemical interface at the cell surface.[1] Glycosylation can be grouped into two main classes, N- and O-linked, where glycans are attached
at asparagine or serine/threonine residues, respectively. The pool
of glycans that decorate proteins is heterogeneous, which leads to
extensive microheterogeneity across glycosites; moreover, N-glycosites fundamentally differ from O-glycosites in both the glycans that modify them and the regions
of proteins where they generally occur.[2] Thus, intact glycopeptide characterization, which provides the opportunity
to probe microheterogeneity by localizing glycan modifications to
specific residues, is an imperative yet challenging component of
modern glycoproteomic analysis. Tandem mass spectrometry (MS/MS) serves
as the centerpiece in these efforts, but the path to glycopeptide
identification is not one-dimensional. Numerous approaches comprise
the glycoproteomics toolkit, and efforts to improve our analytical
methods are ongoing.[3−9]

Beam-type collisional activation, termed higher-energy collisional
dissociation (HCD) on Orbitrap systems,[10] and electron transfer dissociation (ETD) are two of the more widely
used MS/MS dissociation methods for glycopeptide characterization.[10−15] They are complementary to each other; ETD generates mostly c/z·-type
peptide backbone fragments that retain intact glycan moieties with
few glycan dissociation events, while HCD fragments glycans and also
produces b/y-type peptide backbone fragments that tend to lose all
or part of their glycan modifications during the activation process.[16−20] Many approaches pair the two dissociation methods within the same
analysis to capitalize on their complementary nature.[21−29] In fact, HCD followed by product-dependent ETD (HCD-pd-ETD) has
become arguably the most common glycoproteomic method to incorporate
ETD. Here, glycopeptide-specific oxonium ions derived from glycan
fragmentation in “scout HCD” scans are used to trigger
subsequent ETD fragmentation of the putative glycopeptide precursor.[30−32]

One challenge of ETD-based fragmentation is poor dissociation
efficiency,
especially for low-charge-density precursors like glycopeptides.[15] To address this issue, hybrid methods that use
vibrational activation to provide supplemental energy for ETD reactions
have emerged and gained traction in glycoproteomics, with the most
popular being ETD followed by supplemental HCD (EThcD).[33−45] Even so, HCD remains widely used in N-glycoproteomics.[46−57] Tryptic N-glycopeptides tend to harbor only one
potential glycosite, as defined by its sequon N-X-S/T, where X represents
any amino acid other than proline, which limits dependence on peptide
fragments that retain intact glycans. HCD of N-glycopeptides
also often generates b/y-type fragments that retain an N-acetylglucosamine (GlcNAc) moiety, which provide clues to glycosite
localization. Conversely, O-glycopeptides generally
have multiple serine and/or threonine residues that serve as potential
glycosites, so O-glycoproteomic methods largely utilize
ETD and EThcD to localize modified residues using c/z·-type fragments
that retain the intact glycan.[58−74]

Recently, several groups have observed that higher HCD collision
energies tend to provide better peptide backbone fragmentation, while
lower collision energies are often advantageous for glycan fragmentation
and, as such, have opted for stepped collision energy HCD (sceHCD)
methods.[75−77] In the sceHCD regime, total precursor ion accumulation
time per scan is divided into multiple (usually three) equal parts,
and ions accumulated in each separate event are fragmented at different
HCD collision energies. Product ions from each dissociation step are
collected in the same reaction cell prior to mass analysis and are
then analyzed together in one MS/MS scan. In 2017, Liu et al. used
sceHCD methods for the identification of ∼10,000 N-glycosites from five mouse tissues,[78] which contributed to its popularity in recent N-glycopeptide analysis.[79−86] A few studies have looked to extend the application of sceHCD to O-glycopeptides,[75,76,79,87,88] but this has not been as widespread. Limited comparisons between
sceHCD and ETD methods have been performed for N-glycopeptides,[78] as have comparisons of HCD and EThcD for O-glycopeptides;[64,69] however, a comprehensive
head-to-head comparison of standard HCD, sceHCD, ETD, and EThcD has
not been reported.

Here, we systematically explore the advantages
and disadvantages
of sceHCD and EThcD for intact glycopeptide analysis, put the methods
in context with their canonical HCD and ETD counterparts, and comment
on their suitability for both N- and O-glycoproteomic applications. We test 14 product-dependent triggering
methods (i.e., HCD-pd-X, where X is an MS/MS dissociation type) and
also evaluate several HCD-pd-X methods relative to traditional data-dependent
acquisition (DDA). We compare standard HCD, sceHCD, ETD, and EThcD
for tryptic N-glycopeptides generated from a panel
of glycoprotein standards, in addition to N-glycopeptides
enriched from tryptic digests of HEK 293 whole cell lysates. We also
test each method using O-glycopeptides generated
with the recently characterized mucinase, StcE, to comment on O-glycoproteomic performance.[89] In all, we show that while HCD and sceHCD are sufficient for most N-glycoproteomic applications, they are ill-suited for site-specific O-glycoproteomic analysis. Instead, EThcD is the premier
choice for O-glycopeptide characterization, despite
excitement about the potential of sceHCD. We also discuss how these
results affect continued efforts toward improving our analytical toolkit,
including choices of instrument platforms and software development
for data analysis.
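The sequon rule described above (N-X-S/T, where X is any residue other than proline) and the contrast with the many serine/threonine candidates in O-glycopeptides can be illustrated with a minimal Python sketch. The function names and the example peptide strings are hypothetical and are not taken from this work.

```python
import re

# N-X-S/T sequon: asparagine, any residue except proline, then serine or threonine.
# A lookahead is used so that overlapping sequons (e.g., "NNST") are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(peptide):
    """Return 0-based positions of asparagines that sit in an N-X-S/T sequon."""
    return [m.start() for m in SEQUON.finditer(peptide.upper())]


def count_o_candidates(peptide):
    """Count serine/threonine residues, i.e., potential O-glycosites."""
    return sum(1 for aa in peptide.upper() if aa in "ST")


if __name__ == "__main__":
    # Hypothetical sequences used only for illustration.
    print(find_sequons("AANGSK"))          # one candidate N-glycosite
    print(count_o_candidates("PTSTPSAK"))  # several candidate O-glycosites
```

A peptide with a single sequon can be localized by sequence alone, whereas a peptide with several serine/threonine residues needs fragment-level evidence to decide among them.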
Experimental Section
A standard
glycoprotein mixture, a pool of N-glycopeptides
enriched from HEK293 whole cell lysate, and a mixture of recombinant
mucins were analyzed using multiple MS/MS dissociation methods. The
standard glycoprotein mixture consisted of eight glycoproteins: bovine
fetuin (P12763), bovine alpha-1-acid glycoprotein (Q3SZR3), recombinant
human hemopexin (P02790), recombinant human CD14 (P08571), human fibronectin
(P02751), human plasma protease C1 inhibitor (C1inh) (P05155), recombinant
human CD59 (P13987), and recombinant human platelet glycoprotein 1b
alpha (GP1ba) (P07359). Figure S1 depicts
glycosites in these standard glycoproteins. Twenty micrograms of each
protein was combined prior to tryptic digestion, and approximately
2 μg of total peptide was injected per LC–MS/MS analysis.
HEK293 cells were lysed, 1 mg was digested with trypsin using an S-trap
protocol,[90] and glycopeptides were enriched
using an SAX-ERLIC solid-phase extraction method[24] prior to LC–MS/MS. The mucin mixture consisted of
recombinant human GP1ba (P07359), recombinant human leukosialin (CD43)
(P16150), recombinant human MUC16 (Q8WXI7.3), and recombinant human P-selectin glycoprotein ligand 1 (PSGL1) (Q14242). Proteins (10 μg
each) were digested individually similar to previously described methods
using a 3 h StcE digestion, followed by an overnight PNGaseF incubation
and a 12 h tryptic digestion.[89] After digestion,
peptides were combined in equal parts by mass for the four proteins
and analyzed by LC–MS/MS (approximately 2 μg total peptides
per injection). Fourteen product-dependent methods were constructed
using different dissociation types as the triggered scan, i.e., HCD-pd-X,
where X is a dissociation type defined in Figure 1a. The numbers used in all methods indicate
normalized collision energy (nce) settings used for collisional dissociation,
and “A ± B” values for sceHCD methods indicate the central nce (A) and the step
size (B) in either direction from the central value. To construct
these methods, we explored nine HCD collision energies individually
to understand how each collision energy used in standard HCD and sceHCD
contributes to performance (Figure S2).
(Glyco)peptide mixtures were separated using an Easy-Spray column
packed with C18 PepMap material and a Dionex UltiMate 3000 LC pump.
All LC–MS/MS methods were 90 min total, and each method was
run in technical triplicate, except for the HEK293 glycopeptides,
which we only injected once per dissociation method. Scout HCD scans
used a normalized collision energy of 36, a resolving power of 30,000
at 200 m/z, and an automatically
determined scan range (Auto Normal) calculated based on precursor m/z with the first mass set to 100 m/z. Triggered MS/MS scans utilized the
Orbitrap high mass range (120 to 4000 m/z), which has been shown to benefit glycopeptide analysis,[91] and a resolving power of 30,000. ETD and EThcD
methods used calibrated charge-dependent parameters for calculating
reagent AGC targets and ion–ion reaction times.[92] Product-dependent triggering required at least
two ions from the following list to be present in the top 20 most
abundant peaks in a spectrum within a 25 ppm tolerance: 126.055, 138.0549,
144.0655, 168.0654, 186.076, 204.0865, 274.0921, 292.1027, and 366.1395 m/z. Several 90 min standard DDA methods
were also tested, where the desired dissociation method was used for
all precursors without a scout HCD or triggering event. All raw data
were searched using Byonic.[93] For the standard
glycoprotein mix, N- and O-glycopeptide
searches were conducted separately,[94] each
using the same fasta sequence file specific to the mixture. Glycopeptides
from the HEK293 lysate were searched using a focused database[95] created from prior data-dependent proteomic
analyses. Mucin O-glycopeptides were searched using
a specific mucin fasta sequence file. The N-glycan
database for both the standard glycoprotein mixture and HEK293 N-glycopeptide searches consisted of 286 unique compositions,
which did not include HexNAc(1). The O-glycan
database used for O-glycopeptide searches consisted
of nine common O-glycans. After Byonic searches,
result files were filtered and fragmentation statistics were calculated
using scripts written in C# using the C# Mass Spectrometry Language
(CSMSL, https://github.com/dbaileychess/CSMSL). Filtering Byonic search results is necessary to retain only high-quality
identifications and minimize false positives.[96] Filtering metrics included a Byonic score greater than or equal
to 200, a logProb value greater than or equal to 2, and a peptide
length greater than 4 residues. A maximum of three glycosites were
allowed for any one glycopeptide. Data were graphed using OriginPro
2018. For box plots, median and quartile values are provided by the
center line and box boundaries, respectively. Whiskers show 10th and
90th percentiles, and the small square indicates the average. More
method details are available in the Supporting Information.
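The product-dependent triggering rule described above (at least two oxonium ions from the listed set among the 20 most abundant peaks, within 25 ppm) can be sketched in Python as follows. This is only an illustration of the stated criteria, not the instrument's acquisition logic; the function names and the (m/z, intensity) peak representation are assumptions.

```python
# Oxonium ion m/z values listed in the Experimental Section.
OXONIUM_MZ = [126.055, 138.0549, 144.0655, 168.0654, 186.076,
              204.0865, 274.0921, 292.1027, 366.1395]


def within_ppm(observed, theoretical, tol_ppm=25.0):
    """True if the observed m/z lies within tol_ppm of the theoretical m/z."""
    return abs(observed - theoretical) / theoretical * 1e6 <= tol_ppm


def should_trigger(peaks, top_n=20, min_ions=2, tol_ppm=25.0):
    """Decide whether a scout HCD spectrum would trigger a follow-up scan.

    peaks: iterable of (mz, intensity) tuples (hypothetical representation).
    Only the top_n most intense peaks are considered, and each oxonium ion
    is counted at most once.
    """
    top = sorted(peaks, key=lambda p: p[1], reverse=True)[:top_n]
    matched = {
        ox for ox in OXONIUM_MZ
        if any(within_ppm(mz, ox, tol_ppm) for mz, _ in top)
    }
    return len(matched) >= min_ions


if __name__ == "__main__":
    # Hypothetical peak list: two oxonium ions plus an unrelated fragment.
    scout = [(204.0866, 1.0e5), (366.1390, 5.0e4), (500.3000, 1.0e6)]
    print(should_trigger(scout))
```

Counting distinct oxonium ions rather than matching peaks avoids triggering on two nearby peaks that both fall within tolerance of the same ion.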
Figure 1
Comparing glycopeptide
dissociation methods for N- and O-glycopeptides. (a) Several product-dependent
(pd) methods, i.e., HCD-pd-X, were constructed to investigate glycopeptide
fragmentation quality, where X refers to the different dissociation
types shown. We compared ETD, EThcD with several collision energies,
several conventional HCD collision energies, and several stepped collision
energy (sce) HCD methods, where the number after the method indicates
the normalized collision energy used and “±” in
sceHCD methods indicates the step size in energy from the center value
provided. Each method has an assigned letter (A–N), which is
used for identification in subsequent figures. Schematics illustrate
(b) sceHCD30 ± 10 and (c) EThcD fragmentation prior to mass analysis.
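The "A ± B" naming convention for sceHCD methods, together with the equal division of accumulation time across steps, can be made concrete with a short Python sketch. It assumes the usual three-step case (A − B, A, A + B) described in the text; the function names and the 120 ms accumulation time are hypothetical.

```python
import re

# Matches names such as "sceHCD30 +/- 10"; the plus/minus sign (U+00B1) is also accepted.
_SCE_PATTERN = re.compile(r"sceHCD\s*(\d+)\s*(?:\u00b1|\+/-)\s*(\d+)")


def sce_energies(method_name):
    """Return the three normalized collision energies for an 'A +/- B' method."""
    m = _SCE_PATTERN.fullmatch(method_name.strip())
    if m is None:
        raise ValueError("not an sceHCD method name: %r" % method_name)
    center, step = int(m.group(1)), int(m.group(2))
    return [center - step, center, center + step]


def sce_plan(method_name, total_accumulation_ms):
    """Pair each collision energy with an equal share of the accumulation time."""
    energies = sce_energies(method_name)
    share = total_accumulation_ms / len(energies)
    return [(nce, share) for nce in energies]


if __name__ == "__main__":
    # 120 ms is an arbitrary illustrative accumulation time.
    for nce, ms in sce_plan("sceHCD30 +/- 10", 120):
        print(nce, ms)
```

Under this reading, sceHCD30 ± 10 corresponds to energies of 20, 30, and 40, and sceHCD25 ± 15 to 10, 25, and 40, with products from all three steps pooled into one MS/MS scan.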
Results and Discussion
We systematically
compare HCD, sceHCD, ETD, and EThcD for their
performance in characterizing intact N- and O-glycopeptides. With variations in normalized collision
energies among these four dissociation types, we created 14 product-dependent
methods. Figure 1a
shows the method structure we chose for comparing these methods, where
we use product-dependent triggering to maximize the time spent on
glycopeptide analysis. In these HCD-pd-X methods, a scout HCD scan
provides product ions from collisional dissociation of precursors
in a data-dependent fashion. The presence of glycopeptide-specific
oxonium ions (see Experimental Section and
Supporting Information) then triggers a scan of specific dissociation
type X to fragment the glycopeptide, where X is one of the 14 methods
shown. Note that the letters next to each method are used as identification
codes in subsequent figures. Figure 1b,c depicts sceHCD and EThcD scan events, respectively,
to illustrate what happens to ions prior to mass analysis. We consider
several figures of merit beyond merely numbers of identifications
as we compare methods, including (1) degree of peptide backbone sequence
coverage, (2) degree of glycan sequence coverage, (3) proportion of
signal in different fragment ion types (i.e., oxonium ions, Y-type
ions, and peptide backbone fragment ions), (4) percentage of spectra
that enable confident glycosite localization, (5) percentage of spectra
that contain fragments with glycans (intact or fragments) retained,
and (6) proportions of total ion current that can be confidently annotated/explained
in identified spectra. In order to ensure quality identifications,
glycopeptide spectral matches (glycoPSMs) returned from Byonic for
all methods were filtered to have a Byonic score greater than or equal
to 200, a logProb score greater than or equal to 2, and a peptide
length equal to five residues or greater.
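The post-search filtering criteria stated above can be expressed as a simple predicate. The authors' filtering was done with C# scripts built on CSMSL; the following is an independent Python sketch of the same thresholds, with a hypothetical record type whose field names are not Byonic's column names. The limit of three glycosites comes from the Experimental Section.

```python
from dataclasses import dataclass


@dataclass
class GlycoPSM:
    """Hypothetical minimal record for one glycopeptide spectral match."""
    peptide: str        # peptide sequence, one letter per residue
    score: float        # Byonic score
    log_prob: float     # logProb value as reported in the text
    n_glycosites: int   # number of glycosites assigned to the peptide


def passes_filter(psm, min_score=200.0, min_log_prob=2.0,
                  min_length=5, max_sites=3):
    """Apply the score, logProb, peptide length, and glycosite count thresholds."""
    return (psm.score >= min_score
            and psm.log_prob >= min_log_prob
            and len(psm.peptide) >= min_length
            and psm.n_glycosites <= max_sites)


if __name__ == "__main__":
    # Hypothetical matches used only to exercise the thresholds.
    psms = [GlycoPSM("NGSAK", 250.0, 3.1, 1), GlycoPSM("NGSK", 250.0, 3.1, 1)]
    print([passes_filter(p) for p in psms])
```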
N-Glycopeptides
The average number
of localized N-glycopeptide spectral matches (N-glycoPSMs) for each method is summarized in Figure 2a. Localization here, and throughout
this work, is defined as the unambiguous assignment of a glycosite
within a glycopeptide (discussed further below). The advantage for
generating identifications is clear for HCD and sceHCD methods, and
both HCD35 and HCD40 outperform sceHCD methods in terms of identification
numbers. Looking at peptide sequence coverage and glycan sequence
coverage in Figure 2b,c, however, it is clear that HCD methods sacrifice peptide fragmentation
quality for glycan fragmentation quality or vice versa, while sceHCD
methods provide quality fragmentation for both moieties. sceHCD30
± 10, sceHCD30 ± 18, and sceHCD35 ± 15 all provide
good peptide and glycan sequence coverage with similar identification
numbers, with a slight identification advantage for sceHCD30 ±
10. EThcD15, EThcD25, and EThcD35 generate the highest peptide sequence
coverage of all methods, and EThcD25 also excels at glycan fragmentation.
However, ETD and EThcD scans are significantly slower than HCD and
sceHCD scans, resulting in fewer MS/MS acquisitions (Figure S3). This speed issue limits their effectiveness compared
to the collision-based alternatives despite the superior fragmentation
quality.
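Peptide backbone sequence coverage, taken here as the proportion of peptide backbone bonds that can be explained by fragment ions, can be computed as in the following Python sketch. The fragment indexing convention (N-terminal ion i and C-terminal ion j counted in residues) and the function name are assumptions; the exact bookkeeping used in this work is not specified in the text.

```python
def backbone_coverage(peptide_length, n_term_ions, c_term_ions):
    """Fraction of backbone bonds explained by at least one fragment ion.

    peptide_length: number of residues (n), giving n - 1 backbone bonds.
    n_term_ions: indices i of observed N-terminal ions (b- or c-type);
                 ion i explains the bond after residue i.
    c_term_ions: indices j of observed C-terminal ions (y- or z-type);
                 ion j explains the bond after residue n - j.
    """
    n_bonds = peptide_length - 1
    if n_bonds <= 0:
        raise ValueError("peptide must have at least two residues")
    covered = set()
    for i in n_term_ions:
        if 1 <= i <= n_bonds:
            covered.add(i)
    for j in c_term_ions:
        if 1 <= j <= n_bonds:
            covered.add(peptide_length - j)
    return len(covered) / n_bonds


if __name__ == "__main__":
    # Hypothetical six-residue peptide with b2, b3, y3, and y1 observed.
    print(backbone_coverage(6, [2, 3], [3, 1]))
```

In the hypothetical example, b3 and y3 explain the same bond, so three of the five bonds are covered. Glycan sequence coverage can be treated analogously as the number of observed glycosidic bond cleavages divided by the number of glycosidic bonds.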
Figure 2
Collision-based methods are sufficient for N-glycopeptides.
(a) Average number of localized N-glycopeptide spectral
matches (N-glycoPSMs) is shown for technical triplicate
analyses of tryptic peptides generated from a mixture of eight glycoproteins.
Error bars show one standard deviation. Box plots show the distribution
of (b) peptide backbone sequence coverage (i.e., the proportion of
peptide backbone bonds that can be explained by fragment ions) and
(c) glycan sequence coverage (i.e., the proportion of glycosidic bond
cleavages observed) for N-glycopeptides identified
with each method. Letters on the x-axes (A–N)
correspond to the labels in Figure 1 and are grouped by method type.
We next compared the types of localization evidence each method
generated for N-glycosites (Figure 3). There are three ways to localize glycosites:
(1) intact fragments, where peptide backbone fragments retain glycan
moieties to enable unambiguous glycosite assignment; (2) HexNAc-retaining
fragments, where peptide backbone fragments lose most of the glycan
moiety but retain the initiating HexNAc monosaccharide for a mass
shift of +203.0794 Da (for N-glycopeptides, this
is a GlcNAc residue) to show which amino acid harbored the glycan;
and (3) the presence of only one potential glycosite in the peptide
sequence. More than 90% of the total N-glycoPSMs
that passed the post-Byonic filtering were localized successfully
for EThcD methods, sceHCD methods, and HCD30-40 methods (Figure S4a). Figure 3a shows that the majority (>∼90%)
of localized N-glycoPSMs from ETD, EThcD15, and EThcD25
spectra have evidence for localization via intact (c/z·-type)
fragments, compared to only ∼60% of EThcD35 localized N-glycoPSMs. Recent work has shown that the site of glycan
attachment and the glycan itself can affect localization with ETD.[97,98] Here, we see that EThcD methods identify N-glycopeptides
with glycosites distributed more evenly across the peptide backbone
(Figure S5), potentially mitigating some
glycan/glycosite localization dependency of ETD.
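The three kinds of localization evidence described above can be sketched as a small classifier over fragment mass shifts. This is a simplified illustration under stated assumptions: each mass shift is the difference between an observed peptide backbone fragment and its unmodified mass, the fragments supplied are assumed to span the candidate residue, and the 0.01 Da tolerance, the function names, and the example glycan mass are hypothetical.

```python
HEXNAC_SHIFT = 203.0794  # Da, mass shift of a retained HexNAc (from the text)


def classify_fragment(mass_shift, glycan_mass, tol=0.01):
    """Label a backbone fragment by how much of the glycan it retains."""
    if abs(mass_shift - glycan_mass) <= tol:
        return "intact"
    if abs(mass_shift - HEXNAC_SHIFT) <= tol:
        return "hexnac"
    if abs(mass_shift) <= tol:
        return "bare"
    return "other"


def localization_evidence(fragment_shifts, glycan_mass, n_candidate_sites):
    """Return the set of evidence types available for one spectral match."""
    kinds = {classify_fragment(s, glycan_mass) for s in fragment_shifts}
    evidence = set()
    if "intact" in kinds:
        evidence.add("intact fragments")
    if "hexnac" in kinds:
        evidence.add("HexNAc-retaining fragments")
    if n_candidate_sites == 1:
        evidence.add("single glycosite")
    return evidence


if __name__ == "__main__":
    # Hypothetical example: a HexNAc(2)Hex(5) glycan, about 1216.4228 Da.
    shifts = [0.0, 203.0794]
    print(sorted(localization_evidence(shifts, 1216.4228, 1)))
```

A match can carry more than one evidence type at once, which is why the percentages for intact fragments, HexNAc-retaining fragments, and single-sequon localization need not sum to 100%.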
Figure 3
Evidence for localized
glycosites in N-glycopeptides.
Three panels at the top provide the percentage total localized identifications
that can be explained using (a) intact peptide backbone fragments
(i.e., that have no glycan neutral losses), (b) peptide backbone fragments
that retain the +203.0794 Da mass shift to indicate a remaining HexNAc
fragment, or (c) presence of only one potential N-glycosite (i.e., one N-X-S/T sequon). Letters on the y-axes (A–N) correspond to the labels in Figure 1 and are grouped by method type. (d) Example
of localization using intact fragments in an EThcD25 spectrum (precursor m/z: 1107.2990, z: 4).
Ions with a red star are important for localization, the majority
of which retain the intact glycan mass. (e) Example of localization
using HexNAc-retaining (+203.0794) fragments in a sceHCD30 ±
10 spectrum (precursor m/z: 1040.9621, z: 4). Red stars show peptide ions that usefully retain
the HexNAc moiety to show where the glycosite is. Blue circles indicate
peptide fragments that did not retain the HexNAc mass and are not
useful for localization, a common phenomenon in HCD and sceHCD spectra.
Note, a fragment with a “∼” denotes a peptide
backbone fragment that does not retain any glycan, and both panels
(d) and (e) show tryptic peptides from C1inh. Byonic color coding
of annotated fragments shows N-terminal peptide fragment
ions in blue, C-terminal peptide fragment ions in
red, and glycan-derived fragment ions in green.
HCD methods steadily decrease their proportion of b/y-type fragments
that retain intact glycans as collision energies increase, while sceHCD25
± 15 generates the most localized N-glycoPSMs
with intact glycan-retaining fragments (∼20% of spectra), followed
by sceHCD30 ± 10. On the other hand, HCD30 provides the largest
proportion of N-glycoPSMs that can be localized with
HexNAc-retaining fragments (∼80%), and sceHCD30 ± 10 and
sceHCD35 ± 5 are highest of the sceHCD methods (just under ∼70%)
(Figure 3b). Regardless
of the spectral evidence provided by intact or HexNAc-retaining peptide
backbone fragments, the majority (>93%) of all N-glycoPSMs
could be localized purely by the presence of only one N-glycosite, except for ETD and HCD (∼82% each), meaning that
little spectral evidence is needed for a confident localization (Figure 3c). That said, N- and O-glycosites can be contained within
the same glycopeptide, which would confound this one glycosite assumption
and would thus require spectral evidence for localization. Furthermore,
for longer glycopeptides that are characterized using middle-down
approaches, the presence of multiple N-glycosites
necessitates the use of electron-driven activation to generate intact
fragments to properly localize each glycan.[99] Only a handful of N-glycoPSMs here were identified
with multiple N-glycosites (mostly using ETD-based
methods), with approximately five identifications having spectral
evidence to localize both glycan modifications (Figure S6). In this dataset, >90% of ETD, EThcD15, and
EThcD25
localized N-glycoPSMs have spectral evidence for
localization (i.e., intact and/or HexNAc-retaining peptide backbone
fragments), while only ∼60–80% of localized N-glycoPSMs are supported by spectral evidence for EThcD35,
HCD methods, and sceHCD methods (Figure S4b). Example spectra from C1inh-derived N-glycopeptides
with similar glycan modifications illustrate the two different spectral
evidence types, i.e., intact fragments in EThcD25 (Figure 3d) and HexNAc-retaining fragments
in sceHCD30 ± 10 (Figure 3e).

Figure S7 illustrates
the number of
different product ion types each method generates in N-glycoPSMs, including peptide backbone fragments, peptide backbone
fragments that have a glycan neutral loss, peptide backbone fragments
that retain a HexNAc moiety, Y-type ions (which represent an intact
peptide attached to a fragment of the original glycan broken along
glycosidic bonds), and oxonium/B-type ions that represent only glycan
moieties. Note that “peptide backbone fragments” include
both those that are not expected to harbor a glycan and those that
are seen with an intact glycan, while “glycan neutral loss
fragments” include peptide fragments that have fully lost the
glycan or retain only a HexNAc remnant. Interestingly, while peptide
backbone fragments retaining the intact glycan mass can be observed
in HCD and sceHCD, they are far more the exception than the rule (Figure S7b). Peptide fragments retaining the
intact glycan are readily observed in ETD and EThcD spectra but become
less frequent as supplemental HCD collision energy increases in EThcD.
Also, while Y-type fragments can be useful for indicating some glycan
structural information, some Y-type ions observed here are likely
the result of more than one glycosidic cleavage, which are not as
useful or reliable for structural determination. ETD and EThcD methods
generate more peptide backbone fragments, with a small fraction of
HexNAc-remnant fragments being present, while HCD and sceHCD methods
can produce nearly as many neutral loss fragments as standard peptide
fragments. Approximately half of the neutral loss fragments in HCD
and sceHCD spectra are HexNAc-remnant fragments, although this differs
slightly based on the method. Figure S8 summarizes these distributions by comparing the median number of
fragments for each method, delineated by fragment type. The trends
in numbers of fragments explain the sequence coverages seen in Figure 2, and they translate
to the amount of explainable signal (total ion current) in spectra
from each dissociation type (Figure S9). Figure S10 shows the distribution of explainable
signal between four different fragment types. EThcD25 distributes
signal between peptide backbone fragments, Y-type ions, and oxonium
ions most evenly of any dissociation method while also minimizing
the signal from peptide backbone fragments with neutral losses. The
majority of signal in HCD and sceHCD spectra is in glycan-related
channels, i.e., Y-type fragments and oxonium ions, although peptide
backbone fragment signal generally increases with higher collision
energies. In general, EThcD methods provide the highest quality fragmentation.

We repeated our comparison of methods for N-glycopeptides
enriched from HEK293 whole cell lysate using eight of the methods
tested for the standard glycoprotein mixture. Figure 4a shows the number of N-glycoPSMs
identified for the eight methods, and it also provides a comparison
to standard DDA analyses for two HCD and two sceHCD methods. The superior
performance of the HCD and sceHCD is again evident, but perhaps more
striking is the significantly higher number of identifications with
standard DDA methods compared to product-dependent methods. One reason
for this is that the identifications in the scout HCD scans in HCD-pd-X
methods have not been included in our results so far (so as not to confound
data interpretation of each individual method), which removes one-third
to one-half of total N-glycoPSMs. Figure 4b demonstrates that the two
approaches are more evenly matched when also including N-glycoPSMs from scout HCD scans, although the standard DDA methods
still have the slight advantage. Note that the advantages of fragmentation
quality, especially for sceHCD methods, are not applicable to the
identifications from scout HCD scans. These results highlight that
product-dependent methods may not be necessary in samples that have
been enriched for N-glycopeptides; the majority of
precursors in such samples are glycopeptides, and thus screening precursors
via the scout HCD scan is unnecessary. This does not hold true for
the standard glycoprotein mix, where HCD-pd-X methods significantly
outperform standard DDA methods (Figure 4c). The standard glycoprotein mixture was
not enriched, meaning that many nonglycosylated peptides are present
along with glycopeptides. This discrepancy highlights how product-dependent
methods are advantageous for samples with low N-glycopeptide
enrichment efficiency (i.e., where little to no enrichment is performed),
but they are not always necessary for high enrichment efficiency samples.
In all, our data from both the standard glycoprotein mixture and the
HEK293 N-glycopeptides show that HCD and sceHCD methods
are sufficient for standard N-glycoproteomics, despite
the superior spectral quality of EThcD methods.
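The screening role of the scout HCD scan can be illustrated with a minimal Python sketch. It assumes a product-dependent rule that triggers a second scan when known glycan oxonium ions appear among the most intense peaks of the scout spectrum. The oxonium ion m/z values are standard monoisotopic values, but the function names, the `top_n`, `tol_ppm`, and `min_ions` settings, and the example peak lists are hypothetical and are not the instrument parameters used in this study.

```python
# Monoisotopic m/z of common singly charged glycan oxonium ions.
OXONIUM_MZ = {
    "HexNAc fragment": 138.0550,
    "HexNAc": 204.0867,
    "NeuAc - H2O": 274.0921,
    "NeuAc": 292.1027,
    "HexNAc(1)Hex(1)": 366.1395,
}


def matched_oxonium_ions(peaks, top_n=20, tol_ppm=15.0):
    """Return names of oxonium ions found among the top_n most intense peaks.

    peaks: list of (mz, intensity) tuples from a scout HCD scan.
    top_n and tol_ppm are illustrative (hypothetical) settings.
    """
    top = sorted(peaks, key=lambda p: p[1], reverse=True)[:top_n]
    found = []
    for name, target in OXONIUM_MZ.items():
        tol = target * tol_ppm / 1e6  # ppm tolerance converted to m/z units
        if any(abs(mz - target) <= tol for mz, _ in top):
            found.append(name)
    return found


def should_trigger(peaks, min_ions=1, top_n=20, tol_ppm=15.0):
    """Decide whether the scout scan justifies a triggered (X) scan."""
    return len(matched_oxonium_ions(peaks, top_n, tol_ppm)) >= min_ions


# Hypothetical scout scans: one glycopeptide-like, one unmodified-peptide-like.
glyco_scan = [(138.0551, 5e5), (204.0866, 9e5), (366.1398, 2e5), (500.3, 1e4)]
plain_scan = [(175.119, 8e5), (500.3, 1e5)]

print(should_trigger(glyco_scan))  # True
print(should_trigger(plain_scan))  # False
```

In a sample where nearly every precursor is a glycopeptide, a rule like this returns True for almost all scout scans, which is consistent with the observation that the scout scan adds little for well-enriched samples.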
Figure 4
Trends hold true for N-glycopeptides enriched
from complex lysate, but HCD-pd-X methods are not always necessary.
(a) Numbers of localized N-glycoPSMs from glycopeptides
enriched from HEK293 whole cell lysate are shown for a select number
of HCD-pd-X experiments. Also shown are four methods where product-dependent
triggering was not used, but instead, the dissociation method was
used for every precursor (i.e., a standard DDA method, gray box).
(b) Numbers of localized N-glycoPSMs from enriched
HEK293 lysate are shown for standard DDA methods (red) and for HCD-pd-X
methods, where identifications are delineated as including only those
from X fragmentation (yellow) or from both the scouting HCD and X
spectra (blue). (c) Same comparison as in panel (b) of standard DDA
and HCD-pd-X methods is shown for localized N-glycoPSMs
from the mixture of standard glycoproteins. Note that the y-axes for all three panels are the same, with the definition
at the left, and the three-color legend is only for panels (b) and
(c).
O-Glycopeptides
We first searched
the standard glycoprotein mixture data set for O-glycopeptides
because several of these glycoproteins are known to have O-glycosites. While some spectra, especially EThcD spectra, were confidently
identified, the number of localized O-glycoPSMs was
lower than desired to draw conclusions (Figure S11). Instead, we opted to generate a new sample for O-glycopeptide interrogation using the professional mucinase,
StcE, which cleaves specifically in glycosylated mucin domains.[89] StcE is particularly important for characterizing
densely O-glycosylated mucin proteins because mucin
domains are largely impervious to other proteases. Canonical proteolysis
of mucins (e.g., with trypsin, chymotrypsin) generates O-glycopeptides tens to hundreds of residues in length, comprising
mostly serine, threonine, and proline residues (so-called PTS domains).
Furthermore, the majority of serine and threonine residues in these
stretches are O-glycosylated. These O-glycopeptides are effectively impossible to sequence at all, much
less with any site specificity of O-glycosite localization.
StcE recognizes glycosylated serine and threonine residues in these
PTS domains, cleaving to produce O-glycopeptides
more amenable to MS analysis. Using PNGaseF to deglycosylate N-glycosites and a combination of StcE and trypsin for peptide
backbone proteolysis, we digested four recombinant mucins and analyzed
them with 12 HCD-pd-X methods (Figure ). Contrary to the N-glycopeptide
analysis above, EThcD significantly outperformed all other methods
for O-glycopeptide identification (Figure a), even with similar differences
in acquisition rate seen in the N-glycopeptide data
set (Figure S3). Surprisingly, peptide
sequence coverage was consistently good across EThcD, HCD, and sceHCD
data for O-glycopeptide spectra (Figure b). Glycan sequence coverage
was moderate for EThcD and HCD methods, was nonexistent for ETD (which
generates virtually no Y-type fragments), and was most favorable for
sceHCD methods except sceHCD35 ± 5 (Figure c).
Figure 5
EThcD methods are significantly better at O-glycopeptide
characterization. (a) Average number of localized O-glycopeptide spectral matches (O-glycoPSMs) is shown for O-glycopeptides generated from four recombinant mucin glycoproteins
after enzymatic treatment with PNGaseF, trypsin, and StcE. Error bars
show one standard deviation. Box plots show the distribution of (b)
peptide backbone sequence coverage and (c) glycan sequence coverage
for O-glycopeptides identified with each method.
Letters on the x-axes (A–D, F–I, and
K–N) correspond to the labels in Figure and are grouped by method type.
The superior performance of EThcD was enabled by the retention
of intact glycan moieties on peptide backbone fragment ions (Figure ). ETD, EThcD, HCD,
and sceHCD all produced sufficient numbers of peptide backbone fragments
(Figure b, Figure S7a), but the majority of HCD and sceHCD peptide fragments had glycan neutral losses (Figure S7b). In contrast, ∼99% of localized O-glycoPSMs could be localized using intact peptide backbone fragments
for ETD, EThcD15, and EThcD25 (∼94% for EThcD35). While some
HexNAc-retaining fragments were detected in EThcD, HCD, and sceHCD
spectra (Figure S7c), these are often not
sufficient for glycosite localization in O-glycopeptides
because multiple serine and/or threonine residues lead to ambiguity.
This is further supported by the lower percentage of O-glycoPSMs that could be localized due to the presence of only one
potential glycosite (Figure b). This is largely expected for O-glycopeptides
derived from mucins, which have dense regions of glycosylation and
repeating domains rich in serines and threonines, but 68% of tryptic
peptides from the standard glycoprotein mixture also harbor more than
one serine or threonine (Figure S12b),
indicating that this is a phenomenon common to O-glycopeptides.
Consequently (and similarly to N-glycopeptides, Figure S6), multiply glycosylated O-glycopeptides were detected in ETD and EThcD methods while the O-glycopeptides that were identified by HCD and sceHCD were
exclusively singly modified species (Figure c).
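The localization ambiguity described above can be made concrete with a short Python sketch. It simply enumerates the serine and threonine residues of a peptide and counts the distinct sets of sites that a given number of glycans could occupy. The sequence used is the GP1b alpha peptide reported in this data set; the function names are hypothetical, and the count deliberately ignores which glycan composition sits on which site, so it is a lower bound on the number of competing assignments.

```python
from itertools import combinations


def candidate_o_sites(peptide):
    """Return 1-based positions of Ser/Thr residues (potential O-glycosites)."""
    return [i for i, aa in enumerate(peptide, start=1) if aa in "ST"]


def site_assignments(peptide, n_glycans):
    """Return every set of distinct Ser/Thr positions that could carry n_glycans."""
    return list(combinations(candidate_o_sites(peptide), n_glycans))


def is_unambiguous(peptide, n_glycans):
    """True only when the sequence alone fixes the glycosite positions."""
    return len(site_assignments(peptide, n_glycans)) == 1


peptide = "TKPVSLLESTKKTIPELDQPPK"
print(candidate_o_sites(peptide))         # [1, 5, 9, 10, 13]
print(len(site_assignments(peptide, 2)))  # 10
print(len(site_assignments(peptide, 3)))  # 10
print(is_unambiguous(peptide, 2))         # False
```

With five candidate residues, a doubly or triply glycosylated form of this peptide has ten possible site sets, so fragments that retain intact glycans are needed to choose among them.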
Figure 6
O-glycopeptides require localization
using intact
fragments, which enables localization of multiple O-glycosites per peptide. (a) Percentage of total localized O-glycopeptide identifications that can be explained using
intact peptide backbone fragments. (b) Percentage of total localized O-glycopeptide identifications that can be explained by
the presence of only one potential O-glycosite. (c)
Proportions of localized O-glycoPSMs that were identified
with one, two, or three glycosites. Letters (A–D, F–I,
and K–N) on the y-axes for panels (a) and
(b) (and on the x-axis for panel (c)) correspond
to the labels in Figure and are grouped by method type.
Figure 7
Examples
of why HCD fails and EThcD succeeds at O-glycosite
localization. The peptide sequence TKPVSLLESTKKTIPELDQPPK,
generated by combined StcE and trypsin cleavage of GP1b alpha, was
detected with many different glycoforms in all methods. For HCD methods,
the glycosites and respective glycan compositions were defined without
any spectral evidence, as exemplified in panel (a), leading to incorrect
localization (precursor m/z: 1014.4911, z: 4). No localized glycoPSM of this precursor was identified
in any HCD or sceHCD analyses. Note that a fragment with a “∼”
denotes a peptide backbone fragment that does not retain any glycan.
Panel (b) is an EThcD spectrum of the same precursor, where two glycosites
are confidently localized with defined glycan masses based on direct
observation of intact peptide backbone product ions. Panels (c) and
(d) show the EThcD spectra of different precursors (precursor m/z: 943.0402, z: 5; and
precursor m/z: 1001.4602, z: 5, respectively) that provide confident localization
of three glycosites in the same peptide sequence with different combinations
of glycans, highlighting the need for localization in O-glycopeptide characterization. Annotation labels follow the same
scheme as noted at the end of Figure .
Figure provides
an illustrative example of how HCD fails and EThcD succeeds at characterizing O-glycopeptides. The peptide TKPVSLLESTKKTIPELDQPPK from
platelet glycoprotein 1b alpha (GP1ba, CD42) is the result of combined
StcE and trypsin cleavage at the N- and C-terminus, respectively.
HCD and sceHCD spectra generate high scoring spectral matches that
have numerous peptide backbone fragments (Figure a), but all of the b/y-type fragments that
would explain the assigned glycosites are missing glycan modifications
(as indicated by the “∼” symbol). The correct
total glycan composition, HexNAc(2)Hex(2)NeuAc(3), is assigned to
the sequence but the localization assignments are entirely incorrect.
An EThcD spectrum of the same precursor shows extensive peptide backbone
fragmentation, and the peptide fragments retain intact glycan(s) (Figure b). This spectral
evidence enables confident, unambiguous assignment of glycan compositions
to two threonine sites (indicated in red). The need for unambiguous
glycosite assignment is further emphasized by the presence of multiple
glycoforms of this peptide, as shown in Figure c,d. EThcD correctly localizes glycans, including
a di-sialylated core-1 structure, to three different glycosites in
different glycoforms. HCD and sceHCD are blind to the locations of
each glycan, making glycoform analysis impossible, whereas EThcD has
the ability to assign site specificity even for multiply sialylated O-glycopeptides. We did not see evidence for positional
isomers of the reported glycopeptides in these spectra, but multiple
glycoforms of the same peptide sequence can complicate spectral interpretation.
This underscores the need for continued development of analysis tools
to interpret complex glycopeptide spectra resulting from multiple
glycoforms. Note that O-glycan structures were not
determined from the spectra but rather depict the most common structures
known for these glycans (with linkage information purposefully omitted).
Interestingly, GP1ba was also in the standard glycoprotein mixture
that was digested with trypsin only. In that data set, where only
a handful of localized O-glycopeptides were confidently
identified, EThcD provided localized O-glycoPSMs
only for singly glycosylated O-glycopeptides from
this same region of GP1ba. The ability to confidently characterize
the doubly and triply glycosylated species in the StcE+trypsin mucin O-glycopeptide mixture highlights the value StcE adds to O-glycoproteomic workflows.
Our data show that HCD
and sceHCD are generally not reliable at
generating fragment ion types sufficient for robust O-glycopeptide characterization. This shortcoming of HCD and sceHCD
for O-glycopeptides is underscored by the reliance
on ETD-based methods for O-glycopeptides even when O-glycans are simplified to truncated forms, i.e., the SimpleCell
system,[100−105] and by the inability to localize the O-glycopeptide
spectra in limited previous studies investigating O-glycopeptides with HCD and sceHCD spectra.[69,79,87] Some studies have reported the retention
of O-glycans on b/y-type peptide backbone fragments
during collision-based O-glycopeptide fragmentation.[76,106] Indeed, the tens of O-glycoPSMs localized by HCD
and sceHCD methods in this study were able to be localized mainly
due to HexNAc-retaining b/y-type ions (which is an N-acetylgalactosamine, or GalNAc, residue in mucin-type O-glycopeptides). However, this represents less than ∼8–15%
of the total O-glycoPSM identifications retained
after post-Byonic filtering for HCD and sceHCD methods, compared to
a ∼65% localization rate of total O-glycoPSMs
for EThcD25 (Figure S12a). Others have
used HCD in combination with trypsin and proteinase K or pronase proteolysis
to make short peptides with few possible glycosites with some success.[107−109] While this strategy may be effective at generating short glycopeptides
that can be successfully characterized with HCD, nonspecific digestions
create issues with database searching, both in increasing search space
and time requirements and also in increased rates of false identifications.
Thus, the more straightforward approach is to utilize EThcD methods.
A recent development that may mitigate this requirement is the O-glycoprotease called OpeRATOR, which cleaves N-terminally to O-glycosylated serine and threonine
residues.[110,111] This is an exciting proposition
that could have significant impact on O-glycoproteomic
methods, allowing researchers to capitalize on the benefits of HCD
and sceHCD methods. That said, O-glycoproteomic applications
with OpeRATOR likely need further testing to understand how many missed
cleavages occur that would create internally glycosylated residues
to confound localization in HCD or sceHCD spectra.
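The concern about missed cleavages can be sketched in a few lines of Python. The sketch assumes an idealized protease that cuts immediately N-terminal to every glycosylated serine or threonine, so that each product carries its glycosite at position 1, and it shows how a single missed cleavage leaves a glycosite internal to the product. The function names, the example sequence, and the glycosite positions are hypothetical, and real enzyme specificity is more nuanced than this model.

```python
def cleave_n_terminal_to_glycosites(peptide, glycosites, missed=()):
    """Idealized digestion that cuts N-terminal to each glycosylated residue.

    glycosites: 1-based positions of glycosylated Ser/Thr residues.
    missed: glycosite positions where the cut is assumed to fail.
    Returns a list of (product_sequence, local_glycosite_positions).
    """
    # A cut before residue p corresponds to string index p - 1.
    cut_points = sorted(p - 1 for p in glycosites if p not in missed and p > 1)
    bounds = [0] + cut_points + [len(peptide)]
    products = []
    for start, end in zip(bounds, bounds[1:]):
        # Renumber the glycosites that fall inside this product.
        local = [p - start for p in sorted(glycosites) if start < p <= end]
        products.append((peptide[start:end], local))
    return products


def has_internal_glycosite(product):
    """True if any glycosite in the product is not at its N-terminus."""
    _, local = product
    return any(pos > 1 for pos in local)


seq = "AAPTPSAAPTR"   # hypothetical sequence
sites = [4, 10]       # hypothetical glycosylated threonines

print(cleave_n_terminal_to_glycosites(seq, sites))
# [('AAP', []), ('TPSAAP', [1]), ('TR', [1])]

print(cleave_n_terminal_to_glycosites(seq, sites, missed=(10,)))
# [('AAP', []), ('TPSAAPTR', [1, 7])]
```

In the complete digest every glycosite is N-terminal, so localization follows from the cleavage rule alone. After one missed cleavage the second product holds a glycosite at position 7, which would again require glycan-retaining fragments to localize.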
Comparisons between N- and O-Glycopeptide Data
Beyond the intraclass comparison of methods
for N- and O-glycopeptide mixtures,
our data allow comparisons across data sets to identify spectral
features inherent to each class of glycopeptide. Perhaps one of the
most intriguing differences between N- and O-glycopeptides is the generation of peptide backbone fragments
under different conditions. N-glycopeptides show
a dependency on collision energy for the number of peptide sequencing
ions generated (Figure S6) and the subsequent
peptide sequence coverage achieved (Figure b). O-glycopeptides, on
the other hand, generate a larger number of peptide backbone fragments
than N-glycopeptides (Figures S7 and S8) and have higher peptide sequence coverage values,
with less variation based on collision energy or HCD versus sceHCD
(Figure b). More peptide
backbone fragments retaining intact glycan masses were observed in
EThcD spectra of O-glycopeptides compared to N-glycopeptides, which is likely because more glycosites
are present throughout the peptide sequence (Figure c and Figures S6 and S7). The number of neutral loss-associated peptide backbone
fragments was also greater for O-glycopeptides, including
with EThcD methods. This supports a recent report by Kelly and Dodds, where
they found that O-glycopeptides require lower collision
energies for precursor depletion for a small pool of O-glycopeptides.[88] Although some increase
in the number of neutral loss-associated backbone fragments can likely
be attributed to the greater number of potential glycosites, this
also shows that the dissociation thresholds for GalNAc-Ser/Thr may
be lower than those for GlcNAc-Asn. Conversely, N-glycopeptides
generated more Y-type and oxonium/B-type ions than O-glycopeptides, likely explained by the larger size of N-glycans.
Figures S14–S16 provide distributions of precursor peptide lengths, m/z values, and charge states of identified N- and O-glycoPSMs for each method. As
expected, EThcD methods extend the m/z range of ETD for successfully identified glycopeptides, making their
distributions more similar to HCD and sceHCD methods. Peptide lengths
of identified glycopeptides are generally similar between the different
methods, and identified O-glycopeptides tend to be
slightly longer than N-glycopeptides on average.
Even though all methods across both glycopeptide classes had the same
settings for precursor charge state selection, fewer z = 2 N-glycopeptides were identified relative to O-glycopeptides while more highly charged N-glycopeptides were sequenced. This observation could be both peptide
sequence-dependent (as O-glycopeptides tend to be
less enriched for basic residues) and glycan-dependent (as smaller O-glycans are less likely to carry a positive charge). The
majority of identifications from HCD and sceHCD methods for both classes
of glycopeptides were z = 3 precursors, although
this was generally more prevalent for O-glycopeptides,
while ETD-based methods broadened the charge state distributions of N- and O-glycoPSMs.
Given these complementary
trends in peptide and glycan fragment
generation, the amount of signal that could be explained for the different
fragmentation methods was approximately the same for both classes
(Figure S9). The distribution of that signal,
however, varied greatly between N-glycopeptides (Figure S10) and O-glycopeptides
(Figure S13). HCD and sceHCD spectra of N-glycopeptides were dominated by oxonium ions, and the
proportion of Y-type ion signal steadily decreased with increasing
collision energy, accompanied by an increase in peptide backbone fragment
signal. HCD and sceHCD of O-glycopeptides had more
balanced signal distributions, with noticeably larger proportions
of peptide backbone fragments that had neutral losses. N-glycopeptide EThcD spectra had more signal occupied by oxonium ions
and Y-type ions at higher collision energies, while O-glycopeptide EThcD spectra had more than half of their signal in
peptide fragment channels. Again, this is likely due to larger N-glycans compared to O-glycans, but these
are important spectral features to consider when developing algorithms
to score N- and O-glycopeptide spectra.
The Delta Mod score, which is the drop in Byonic score from the top-scoring
identification to the second-best identification, showed drastically
different distributions for N- and O-glycopeptides (Figure S17). According
to Byonic documentation, Delta Mod scores below 20 indicate dubious
modification site assignments while scores above 40 mean that the
reported identification is significantly better than other candidates.
These distributions further support the relative ease of localizing N-glycopeptides with both sceHCD and EThcD methods (albeit
with different levels of confidence) compared to the challenge of O-glycosite localization.
Despite the evidence of
higher quality spectra for EThcD for both N- and O-glycopeptides, Byonic appears
to under-score EThcD spectra relative to HCD and sceHCD for both classes
(Figure S18). For each glycopeptide identification,
we compared the best scoring scout HCD scan to the best scoring spectrum
from triggered dissociation methods. EThcD spectra had a higher score
than scout HCD spectra for only 45–55% of N-glycopeptide identifications. Comparatively, HCD35 and HCD40 outscore
their scout HCD scans 86 and 93% of the time, and the sceHCD30 spectra
outscore ∼70–90% of their corresponding scout HCD spectra.
The problem is even more pronounced for O-glycopeptides,
where EThcD25 and EThcD35 outscore scout HCD spectra only ∼10
and ∼32% of the time, compared to >75% for most HCD and
sceHCD
methods. This is likely because Byonic was not designed specifically
for glycopeptide spectral analysis, weights high intensity matching
fragments favorably, and rewards the presence of expected fragments
(even b/y-type fragments that have complete glycan loss),[112] which may give an unfair advantage to HCD spectra
over ETD and EThcD. Regardless, this highlights the need to incorporate
spectral features specific to glycopeptide dissociation as search
algorithms continue to progress. Such changes could include weighting
peptide backbone fragments that retain an intact glycan or HexNAc
moiety as the most important matched peaks in a spectrum. This could
allow more nuanced analyses of glycoforms and the presence of multiple
positional isoforms present within the same spectrum. Localization
algorithms that leverage this type of information are widely used
in phosphoproteomics[113] but have remained
largely absent in glycoproteomics. Considering the general lack of
structural information derived from the majority of intact glycopeptide
studies, peptide fragment scores should likely be weighted more heavily
than Y-type ions (and certainly more heavily than oxonium ions, regardless
of their abundance).
Even so, Y-type ions can be useful and are
known features of glycopeptide HCD and sceHCD spectra, especially Y1 ions (peptide+GlcNAc) in N-glycopeptide spectra and Y0 (peptide with no glycan) in O-glycopeptide spectra.[114] Figure S19 shows the percentage of ETD, EThcD,
HCD, and sceHCD spectra that have Y0, Y1, and two different Y2 ions,
peptide+HexNAc(2) versus peptide+HexNAc(1)Hex(1). Note that these
data do not comment on the abundance of Y-type ions, merely their
presence in spectra. As expected, Y1 is seen in the vast majority
of HCD and sceHCD N-glycopeptide spectra, although
higher collision energies (e.g., HCD40) reduce its presence. Y0 is
also expected for N-glycopeptides, although to a
lesser degree,[115] as is observed. Y1 is
present in the majority (>80%) of the EThcD25 and EThcD35 N-glycopeptide spectra as well, while Y0 is only in ∼35
and 60%, respectively. The pattern of Y0 and Y1 ions effectively flips
for O-glycopeptides, where Y0 is more often present,
especially in EThcD25, EThcD35, and sceHCD spectra. Y1 is less reliably
observed in O-glycopeptide spectra, although still
in relatively high proportions (60–80%) for EThcD25, EThcD35,
and sceHCD30 methods. Y2 peptide+HexNAc(2) occurs frequently (>80%)
in N-glycopeptide spectra in lower to middle HCD
energies (20–30 nce) and sceHCD30 methods, while it is less
frequently observed in EThcD and higher energy HCD spectra. This Y2
ion is rarely (<20%) observed in O-glycopeptide
spectra, as it would be a GalNAc-GlcNAc moiety indicative of core-2 O-glycans (which occurs less frequently in the recombinant
mucins used in this study). However, higher HCD energies tend to be
more favorable for generating it in the O-glycopeptide
spectra where it should exist. The more common Y2 species for O-glycopeptides, peptide+HexNAc(1)Hex(1), is not a possibility
for N-glycopeptides but represents the common core-1 O-glycan structure (GalNAc-Gal). sceHCD methods appear to
be the most favorable fragmentation conditions for generating this
Y2 ion, although it was also observed in ∼50% of EThcD25 and
EThcD35 O-glycopeptide spectra. The presence of these
Y-type ions can be useful when designing search strategies best suited
for dissociation type and glycopeptide class.
Pap et al. recently
compared HCD and EThcD for O-glycopeptides and observed
larger oxonium (B-type) glycan fragments
in EThcD spectra.[64] Large oxonium ions
can be valuable in confirming glycan composition and determining structural
aspects of the sugar. One such ion was the HexNAc(1)Hex(1)NeuAc(1)
fragment, 657.2349 m/z, which can
be present in both N- and O-glycopeptide
spectra. We screened spectra that had at least one of the two NeuAc
oxonium ions (274.0921 and/or 292.1027 m/z) for the presence of 657 m/z (Figure S19). EThcD methods, sceHCD methods
except sceHCD35 ± 5, and HCD20-25 all generated the 657 m/z oxonium ion in at least 80% of Neu5Ac-containing N-glycopeptide spectra, with EThcD25 having the highest
percentage at ∼96% of spectra. Higher energy HCD activation,
however, caused a precipitous loss of the 657 m/z peak. For O-glycopeptides, the 657 m/z peak was most often observed in sceHCD30
spectra (∼80%), while only 70–75% of EThcD25 and EThcD35
spectra had the fragment. This is slightly lower than that reported
by Pap et al., but it highlights that EThcD, sceHCD, and lower energy
HCD can generate useful higher mass oxonium ions.
One final
observation compared low-mass oxonium ions (Figure S20). Halim et al. showed that the ratio
of low-mass oxonium ions can indicate the presence of GalNAc (O-glycopeptides) or GlcNAc (N-glycopeptides)
residues, and oxonium ions have since been used to classify glycopeptide
classes and sialylation states.[116−119] We calculated the ratio of 138.055
and 144.0655 m/z oxonium ions for
all scout HCD scans from N- and O-glycopeptide data sets and plotted their distributions in Figure S20a. N- and O-glycopeptides have distinct distributions, as predicted,
with most O-glycopeptides producing a ratio <
3 (median = 1.11) and most N-glycopeptides producing
a ratio > 5 (median = 16.04). A minor number of higher ratio values
for O-glycopeptides likely come from species harboring
core-2 glycans, which contain a GlcNAc residue. Higher energy HCD
and sceHCD triggered scans for N-glycopeptides recapitulated
ratios from scout HCD scans (Figure S20b), while EThcD35 and lower energy HCD methods slightly overestimated
the ratio. Ratios could not be reliably calculated for ETD, EThcD15,
or EThcD25 spectra. For O-glycopeptides, ratios were
detected in EThcD25, although the ratio was slightly underestimated
(Figure S20c). Otherwise, EThcD35 and all
HCD and sceHCD faithfully reported the 138/144 ratios seen in scout
HCD scans. This shows that oxonium ion ratios can be successfully
used in sceHCD methods and in some EThcD scans, depending on the method
parameters and glycopeptide class.
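The two oxonium ion screens described in this section can be sketched in Python as follows. The first function computes the 138/144 intensity ratio for a spectrum, and the second checks for the 657.2349 m/z ion only in spectra that contain a NeuAc oxonium ion. The m/z values come from the text, but the function names, the ppm tolerance, and the example peak lists are hypothetical. The cutoffs of 3 and 5 are illustrative only: they mirror the ranges reported for most O- and N-glycopeptides here and are not a validated classifier.

```python
def peak_intensity(peaks, target_mz, tol_ppm=20.0):
    """Summed intensity of peaks within tol_ppm of target_mz (0.0 if none)."""
    tol = target_mz * tol_ppm / 1e6
    return sum(inten for mz, inten in peaks if abs(mz - target_mz) <= tol)


def ratio_138_144(peaks, tol_ppm=20.0):
    """Intensity ratio of the 138.0550 and 144.0655 oxonium ions, or None."""
    i138 = peak_intensity(peaks, 138.0550, tol_ppm)
    i144 = peak_intensity(peaks, 144.0655, tol_ppm)
    if i138 == 0 or i144 == 0:
        return None  # ratio cannot be calculated for this spectrum
    return i138 / i144


def classify_hexnac(ratio, o_max=3.0, n_min=5.0):
    """Illustrative labels; o_max and n_min are assumed cutoffs."""
    if ratio is None:
        return "undetermined"
    if ratio < o_max:
        return "GalNAc-like (O-glycopeptide)"
    if ratio > n_min:
        return "GlcNAc-like (N-glycopeptide)"
    return "ambiguous"


def has_657_given_neuac(peaks, tol_ppm=20.0):
    """None if no NeuAc oxonium ion is present, else whether 657.2349 is seen."""
    neuac_present = any(
        peak_intensity(peaks, mz, tol_ppm) > 0 for mz in (274.0921, 292.1027)
    )
    if not neuac_present:
        return None
    return peak_intensity(peaks, 657.2349, tol_ppm) > 0


# Hypothetical peak lists of (m/z, intensity).
n_like = [(138.0551, 1600.0), (144.0656, 100.0)]
o_like = [(138.0549, 110.0), (144.0654, 100.0), (274.0921, 50.0), (657.2350, 10.0)]

print(classify_hexnac(ratio_138_144(n_like)))  # GlcNAc-like (N-glycopeptide)
print(classify_hexnac(ratio_138_144(o_like)))  # GalNAc-like (O-glycopeptide)
print(has_657_given_neuac(o_like))             # True
print(has_657_given_neuac(n_like))             # None
```

Returning None when either ion is missing reflects the cases above in which ratios could not be reliably calculated, such as ETD and low-energy EThcD spectra of N-glycopeptides.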
Conclusions
Ideally, N- and O-glycopeptides
would share the same optimal dissociation method so that all classes
could be analyzed with the same approaches. Here, we compared HCD,
sceHCD, ETD, and EThcD methods for mixtures of N-
and O-glycopeptides, determining their identification
rates, spectral quality, and suitability for the different glycopeptide
classes. Results are summarized in Table . Despite the superior spectral quality of
EThcD, HCD and sceHCD methods provide more rapid scan acquisition
rates to improve identifications and have fragmentation quality sufficient
for N-glycopeptide identification. Only 60–80%
of localized N-glycoPSMs from HCD and sceHCD methods
in this study had spectral evidence for the localized N-glycosite. The vast majority of N-glycosites, however,
occur in sequences with only one N-sequon, making
peptide backbone and glycan composition identification acceptable.
sceHCD methods provide a slight boost in spectral quality over standard
HCD and, thus, are the recommended method for N-glycopeptides.
sceHCD30 ± 10 generally performed the best in this study, as
has been reported elsewhere,[78] yet a recent
report argues that a method using stepped collision energies of 20/30/30
may be superior.[86] We saw that sceHCD35
± 15 also generally performs well, indicating that steps that
cover a wide range of energies can be beneficial. On the contrary,
HCD and sceHCD are mostly inadequate for site-specific O-glycopeptide analysis. Instead, EThcD methods are necessary due
to challenges in localizing O-glycosites. EThcD25
gave the best localization rates, but EThcD35 provided slightly more O-glycoPSMs. Notably, proteolysis with the professional
mucinase StcE also improved our ability to characterize O-glycopeptides with EThcD.
Table 1
Summary of Dissociation Method Strengths for N- and O-Glycopeptides^a
^a Performance of each HCD-pd-X method
tested is considered for four figures of merit, i.e., acquisition
speed, quality of peptide backbone fragmentation, quality of glycan
fragmentation, and the ability to localize glycosites. HCD and sceHCD
methods are recommended for N-glycopeptides, and
EThcD methods are recommended for O-glycopeptides.
Although EThcD methods are superior for generating spectral evidence
to support N-glycosite localization, acquisition
speed, balance of peptide and glycan fragmentation, and general presence
of only one N-glycosite per peptide make sceHCD methods
the recommended choice for N-glycopeptides.
Our findings have important implications
for many choices glycoproteomic
researchers must face. First, MS instrument platforms govern access
to dissociation methods, and it is crucial to know if desired experiments
require access to ETD-enabled systems, such as Orbitrap Tribrid or
solariX XR instruments,[99,120−122] or can be successfully completed with HCD-centric systems, e.g.,
time-of-flight instruments and the Q-Exactive or Exploris platforms.[123,124] Looking forward, ion mobility is gaining traction in many proteomic
areas including glycoproteomics,[125,126] and applications
like trapped ion mobility spectrometry on the timsTOF system may prove
valuable.[127] That said, timsTOF instruments
currently rely on collisional dissociation and may not yet be ready
for O-glycoproteomic applications, while the SYNAPT
platforms offer traveling wave ion mobility spectrometry on an ETD-enabled
system.[128] A recently described ECD cell
may bring electron-driven dissociation to a wider breadth of instrument
platforms, too.[129,130]
Dissociation method choice
also affects experimental design. sceHCD
has been shown to benefit reporter ion generation without detrimental
effects on peptide identification in isobaric labeling experiments,[131] yet relatively few studies to date have employed
isobaric labeling strategies for glycoproteomic experiments.[44,50,132−136] Perhaps, adoption of sceHCD methods for N-glycopeptides
will enable more widespread use of isobaric labels, while combinations
of HCD and EThcD would still permit isobaric label-based quantitation
in O-glycoproteomic workflows. Alternatively, the
benefits of data-independent acquisition (DIA), which largely relies
on collisional dissociation, have been shown for N-glycoproteomics.[137−140] That said, DIA methods may be limited in O-glycoproteomic applications because site-specific analysis requires ETD-based
methods. Indeed, Vakhrushev and co-workers recently reported a DIA
method for O-glycopeptides, but site-specific analysis
came from separate ETD-based acquisitions.[141] This perspective will also be critical when mining old data sets for
glycopeptide identifications, as this approach will likely suit
N-glycopeptides better than O-glycopeptides
because HCD methods are more ubiquitous.[142]

In addition to instrumentation and method development, data analysis
software is required to interpret glycopeptide spectra, and the choice
of dissociation method currently dictates which analysis pipelines
are available.[143] A multitude of methods
exist for the interpretation of HCD and sceHCD spectra of N-glycopeptides,[54,78,81,144−148] but many of these do not have ETD functionalities. Several approaches
to interpret HCD and sceHCD spectra of O-glycopeptides
are emerging,[149−151] but strategies to couple these to concomitant
ETD spectral analyses are only beginning to develop.[63] Byonic and Protein Prospector remain the two main pipelines
to analyze ETD and EThcD spectra for glycoproteomics. If O-glycoproteome analysis is to improve, more software suites that
can handle EThcD spectra must emerge, and the existing tools, Byonic
and Protein Prospector, must continue to improve. We show here
that N- and O-glycopeptides produce
fundamentally different spectra, and tools tailored to each are needed.
As such, we hope that this dataset provides a useful resource to benchmark
new software tools for both N- and O-glycoproteomic applications.

In closing, this study is the
first to comprehensively compare
HCD, sceHCD, ETD, and EThcD head-to-head for both N- and O-glycoproteomic analyses. With
these data, we conclude that N-glycoproteomics should
move forward with sceHCD methods while O-glycoproteomics
must continue to rely on ETD and EThcD, a fact that is unlikely to
change unless novel noncollision-based dissociation methods emerge.
This knowledge is not only informative to glycoproteomic methodological
choices made today but is also instructional for future considerations
in method and software development.