Lei Lu, Nicholas M Riley, Michael R Shortreed, Carolyn R Bertozzi, Lloyd M Smith.
Abstract
We report O-Pair Search, an approach to identify O-glycopeptides and localize O-glycosites. Using paired collision- and electron-based dissociation spectra, O-Pair Search identifies O-glycopeptides via an ion-indexed open modification search and localizes O-glycosites using graph theory and probability-based localization. O-Pair Search reduces search times more than 2,000-fold compared to current O-glycopeptide processing software, while defining O-glycosite localization confidence levels and generating more O-glycopeptide identifications. Beyond the mucin-type O-glycopeptides discussed here, O-Pair Search also accepts user-defined glycan databases, making it compatible with many types of O-glycosylation. O-Pair Search is freely available within the open-source MetaMorpheus platform at https://github.com/smith-chem-wisc/MetaMorpheus.
Year: 2020 PMID: 33106676 PMCID: PMC7606753 DOI: 10.1038/s41592-020-00985-5
Source DB: PubMed Journal: Nat Methods ISSN: 1548-7091 Impact factor: 28.547
Figure 1. O-Pair Search through MetaMorpheus for fast and confident identification of O-glycopeptides.
a) The workflow describes the processing steps in the O-Pair Search strategy, which generates a fragment ion index (steps 1 and 2) and O-glycan groups (steps 3 and 4) from user-defined protein and O-glycan databases, respectively. An ultrafast, fragment-index-enabled open modification search (step 5), paired with a match of delta masses to aggregate glycan mass combinations (step 6), enables identification of O-glycopeptide candidates from HCD spectra (step 7). Paired EThcD spectra are then used for graph theory-based localization calculations to rapidly assign modification sites for all glycans comprising the O-glycan group (step 8). Finally, more detailed re-scoring of spectra, localization probability calculations, and false discovery rate corrections are performed before returning identifications to the user (step 9). b) A demonstration of graph theory-based localization using a hypothetical example of an O-glycopeptide, TTGSLEPSSGASGPQVSSVK from human mucin-type O-glycoprotein CD43 (leukosialin), which has 8 potential O-glycosites. Here we consider how graph theory determines O-glycosites using c/z• fragments present in EThcD spectra when two glycans (termed A and B for the sake of demonstration) are present as modifications. c) An example of paired HCD and EThcD spectra for quadruply O-glycosylated TTGSLEPSSGASGPQVSSVK, showing a Level 1 identification where all calculated glycan mass shifts can be confidently localized to discrete residues. Note that no fragments in the HCD spectrum retain any glycan masses. Rather, the thorough peptide backbone fragmentation without glycan retention shows how the sequence was confidently retrieved with a defined mass shift matching a combination of O-glycans. The subsequent EThcD spectrum then enables localization of all 4 O-glycosites (gold) even in the presence of 4 other unmodified potential sites. d) O-Pair Search defines levels of localization for each GlycoPSM. Note that “H”, “N”, and “A” represent hexose, HexNAc, and Neu5Ac, respectively.
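The delta-mass matching described in panel a can be illustrated with a minimal Python sketch. It is not the MetaMorpheus implementation: the four-entry glycan database, the function names, and the fixed 0.01 Da tolerance are assumptions made for illustration, and the brute-force enumeration stands in for whatever precomputed O-glycan groups the software actually builds. Only the monoisotopic residue masses of hexose, HexNAc, and Neu5Ac are standard values.

```python
from itertools import combinations_with_replacement

# Monoisotopic residue masses (Da): H = hexose, N = HexNAc, A = Neu5Ac.
MONO_MASS = {"H": 162.05282, "N": 203.07937, "A": 291.09542}

# Hypothetical O-glycan database (an assumption for this sketch):
# glycan name -> monosaccharide composition.
GLYCANS = {
    "N1": {"N": 1},
    "H1N1": {"H": 1, "N": 1},
    "H1N1A1": {"H": 1, "N": 1, "A": 1},
    "H1N1A2": {"H": 1, "N": 1, "A": 2},
}


def glycan_mass(composition):
    """Sum the residue masses of one glycan composition."""
    return sum(MONO_MASS[sugar] * count for sugar, count in composition.items())


def match_delta_mass(delta_mass, max_glycans=3, tol_da=0.01):
    """Return every glycan combination (up to max_glycans glycans) whose
    summed mass is within tol_da of the observed delta mass."""
    names = sorted(GLYCANS)
    matches = []
    for n in range(1, max_glycans + 1):
        # Order does not matter and a glycan may repeat, so enumerate multisets.
        for combo in combinations_with_replacement(names, n):
            total = sum(glycan_mass(GLYCANS[name]) for name in combo)
            if abs(total - delta_mass) <= tol_da:
                matches.append(combo)
    return matches


if __name__ == "__main__":
    # Delta mass left over after matching the unmodified peptide backbone.
    observed = glycan_mass(GLYCANS["N1"]) + glycan_mass(GLYCANS["H1N1"])
    print(match_delta_mass(observed))
```

Each returned tuple is one candidate O-glycan group. In the workflow above, the sites for its member glycans would then be assigned from the paired EThcD spectrum.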
Figure 2. Performance of O-Pair Search for O-glycopeptide characterization.
Comparison of the number of a) localized and b) total glycopeptide spectral matches (GlycoPSMs) returned from Byonic and from O-Pair Search for HCD-pd-EThcD data collected from StcE digestions of four recombinant mucin standards. Note that only Level 1 and 1b identifications are considered for the localized O-Pair Search data, and 3 glycans per peptide were allowed for both searches. c) The table compares the search times required for Byonic and O-Pair Search when considering 2, 3, and 4 glycans per peptide (DNF = did not finish). The number of localized glycosites identified by the searches is also provided, which corresponds to the GlycoPSMs in panel a and Supplementary Fig. 3. Overlap in GlycoPSMs between O-Pair Search and Byonic is compared for both d) HCD and e) EThcD scans, and f) the majority (~95%) of the shared identified scans mapped to the same glycopeptide. g) O-Pair Search enabled consideration of more glycans per peptide while keeping search times reasonable. h) O-Pair Search also allowed the use of several much larger protein database backgrounds without untenable increases in search time. Here, “Total” indicates all identifications, i.e., the sum of all Localization Level identifications. i) Use of entrapment databases with proteins not present in the sample did not inflate false discovery rates above approximately 1-3%. j) O-Pair Search was used to process files from a published urinary O-glycopeptide study that previously reported Protein Prospector (Prot. Pros.) and Byonic results. k) Protein Prospector reports localized glycosites, which we converted into our Localization Level system and compared with O-Pair Search results. l) Results from several O-Pair searches of Fraction 1 (three files), Fraction 2 (two files), and all ten files available from the urinary O-glycopeptide study. Supplementary Note 4 details the files used for each panel.