Eden P Go, Shijian Zhang, Haitao Ding, John C Kappes, Joseph Sodroski, Heather Desaire.
Abstract
Glycosylation analysis of viral glycoproteins contributes significantly to vaccine design and development. Among other benefits, glycosylation analysis allows vaccine developers to assess the impact of construct design or producer cell line choices for vaccine production, and it is a key measure by which glycoproteins that are produced for use in vaccination can be compared to their native viral forms. Because many viral glycoproteins are multiply glycosylated, glycopeptide analysis is a preferable approach for mapping the glycans, yet the analysis of glycopeptide data can be cumbersome and requires the expertise of an experienced analyst. In recent years, a commercial software product, Byonic, has been used in several instances to facilitate glycopeptide analysis of viral glycoproteins and other glycoproteomics data sets. The purpose of this study is to determine the strengths and limitations of using this software, particularly in cases relevant to vaccine development. The glycopeptides from a recombinantly expressed trimeric S glycoprotein of the SARS-CoV-2 virus were first analyzed using an expert-based analysis strategy; subsequently, analysis of the same data set was completed using Byonic. Careful assessment of instances where the two methods produced different results revealed that the glycopeptide assignments from Byonic contained more false positives than true positives, even when the data were assessed using a 1% false discovery rate. This work provides a roadmap for removing the spurious assignments that Byonic generates, and it provides an assessment of the opportunity cost of relying on automated assignments for glycopeptide data sets from viral glycoproteins.
Keywords: Glycopeptide; Glycoprotein; Mass spectrometry; SARS-CoV-2
Year: 2021 PMID: 34448030 PMCID: PMC8390178 DOI: 10.1007/s00216-021-03621-z
Source DB: PubMed Journal: Anal Bioanal Chem ISSN: 1618-2642 Impact factor: 4.478
Fig. 1. Overview and summary of the glycopeptide analysis. (A) Workflow for collecting LC-MS data on the SARS-CoV-2 S protein glycopeptides and the two analysis strategies compared herein; (B) graphical depiction of the number of glycoforms identified in the expert analysis strategy at each N-linked glycosylation site. (*SARS-CoV-2 S protein PDB ID: 6ZGE)
Examples of misassigned glycopeptides from Byonic based on inaccurate retention times
| Peptide | Glycan | Score | Scan time |
|---|---|---|---|
| DLPQGFSALEPLVDLPIGINITR | HexNAc(2)Hex(7) | 377.97 | 58.9123 |
| DLPQGFSALEPLVDLPIGINITR | HexNAc(2)Hex(8) | 218.4 | 58.9114 |
| DLPQGFSALEPLVDLPIGINITR | HexNAc(3)Hex(4) | 103.27 | 59.2252 |
| DLPQGFSALEPLVDLPIGINITR | HexNAc(3)Hex(6) | 173.61 | 59.0913 |
| DLPQGFSALEPLVDLPIGINITR | HexNAc(5)Hex(3)Fuc(1)NeuAc(1) | 139.97 | 47.2818 |
| DLPQGFSALEPLVDLPIGINITR | HexNAc(5)Hex(6)Fuc(1)NeuAc(2) | 136.14 | 6.3723 |
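
The retention-time check behind this table can be sketched in a few lines. The sketch below assumes that glycoforms of the same peptide elute within a narrow window, and it flags any assignment whose scan time lies far from the median for that peptide. The function name `flag_retention_outliers` and the tolerance of 2.0 (in the table's scan-time units) are illustrative assumptions, not values taken from the study or from Byonic.

```python
from collections import defaultdict
from statistics import median

# Rows copied from the table above: (peptide, glycan, score, scan_time).
ASSIGNMENTS = [
    ("DLPQGFSALEPLVDLPIGINITR", "HexNAc(2)Hex(7)", 377.97, 58.9123),
    ("DLPQGFSALEPLVDLPIGINITR", "HexNAc(2)Hex(8)", 218.4, 58.9114),
    ("DLPQGFSALEPLVDLPIGINITR", "HexNAc(3)Hex(4)", 103.27, 59.2252),
    ("DLPQGFSALEPLVDLPIGINITR", "HexNAc(3)Hex(6)", 173.61, 59.0913),
    ("DLPQGFSALEPLVDLPIGINITR", "HexNAc(5)Hex(3)Fuc(1)NeuAc(1)", 139.97, 47.2818),
    ("DLPQGFSALEPLVDLPIGINITR", "HexNAc(5)Hex(6)Fuc(1)NeuAc(2)", 136.14, 6.3723),
]


def flag_retention_outliers(assignments, tolerance=2.0):
    """Return assignments whose scan time is far from the per-peptide median.

    `tolerance` is a hypothetical cutoff in the same units as the scan time;
    it is an assumption for illustration, not a published threshold.
    """
    by_peptide = defaultdict(list)
    for row in assignments:
        by_peptide[row[0]].append(row)

    flagged = []
    for rows in by_peptide.values():
        center = median(r[3] for r in rows)
        flagged.extend(r for r in rows if abs(r[3] - center) > tolerance)
    return flagged


if __name__ == "__main__":
    for peptide, glycan, score, scan_time in flag_retention_outliers(ASSIGNMENTS):
        print(f"{glycan}: scan time {scan_time} (score {score})")
```

Applied to the six rows above, this sketch flags the two sialylated compositions at scan times 47.2818 and 6.3723, both of which carry scores above 130. A flag of this kind is only a prompt for expert review, since glycan composition can itself shift retention somewhat.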
Fig. 2. Demonstration of a case where a Byonic-assigned glycopeptide was determined to be misassigned on the basis of its high-resolution mass and CID data. (A) CID spectrum for the expert-assigned glycopeptide at m/z 1254, with major product ions assigned; (B) theoretical and experimental high-resolution MS data for the Byonic-assigned glycopeptide of the CID spectrum in A; (C) theoretical and experimental high-resolution MS data for the expert-assigned glycopeptide
Fig. 3. CID data for two very similar glycopeptides, highlighting the weaknesses in the Byonic scoring algorithm when glyco-centric fragmentation dominates. (A) Example of a high-scoring glycopeptide, where peptide-based fragmentation is abundant. (B) Example of a low-scoring glycopeptide, where glycan-based fragmentation dominates the spectrum. In both panels A and B, the same glycan is attached to the same glycosylation site; the only difference is that the top panel includes a missed tryptic cleavage site
Fig. 4. MS/MS data for the same glycopeptide using two different dissociation methods. (A) CID data; (B) EThcD data. The Byonic scoring algorithm for CID is not as effective at assigning the glycopeptide with a high-confidence score
Fig. 5. Results summary of the two different analysis approaches for the SARS-CoV-2 S data set. (A) Tally of the number of correctly identified SARS-CoV-2 S glycopeptides (orange) and false positives (blue), based on the expert-based assignment criteria. (B) Assessment of fucosylation at different glycosylation sites: depending on the approach used to assign the data, researchers would draw different conclusions about the fucosylation profile of the SARS-CoV-2 glycoprotein