
Impact of Expressing Cells on Glycosylation and Glycan of the SARS-CoV-2 Spike Glycoprotein.

Yan Wang¹, Zhen Wu², Wenhua Hu³, Piliang Hao⁴, Shuang Yang³.

Abstract

The spike glycoprotein of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the first point of contact for the virus to recognize and bind to host receptors, is the focus of biomedical research seeking to effectively prevent and treat coronavirus disease (COVID-19). The mass production of spike glycoproteins is usually carried out in different cell systems. Studies have shown that different expression cell systems alter the glycosylation of hemagglutinin and neuraminidase in the influenza virus. However, it is not clear whether the cellular system affects spike protein glycosylation. In this work, we investigated the effect of the expression system on the glycosylation of the spike glycoprotein and its receptor-binding domain. We found significant differences in the glycosylation and in the glycans attached at each glycosite of the spike glycoprotein obtained from different expression cells. Since glycosylation at the binding site and adjacent amino acids affects the interaction between the spike glycoprotein and the host cell receptor, caution should be taken when selecting an expression system to develop inhibitors, antibodies, and vaccines.


Year: 2021 PMID: 34179644 PMCID: PMC8204757 DOI: 10.1021/acsomega.1c01785

Source DB: PubMed Journal: ACS Omega ISSN: 2470-1343

Introduction

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is a novel coronavirus strain that caused the pandemic of coronavirus disease 2019 (COVID-19). SARS-CoV-2 has close genetic similarity to bat coronavirus. After its first appearance in Wuhan, China, in December 2019, SARS-CoV-2 spread globally within a few months.[1] By January 20, 2020, it was confirmed that SARS-CoV-2 can be transmitted from person to person through direct or indirect contact, such as respiratory droplets (coughs or sneezes), airborne particles, fomites, and urine or feces. As of March 2021, SARS-CoV-2 had caused 2.7 million deaths and 123.2 million cases worldwide. Similar to the earlier coronavirus strains that transmit to humans, Middle East respiratory syndrome (MERS)-CoV and SARS-CoV, SARS-CoV-2 consists of four structural proteins, called spike (S), envelope (E), membrane (M), and nucleocapsid (N) proteins. The S, E, and M proteins together form the viral envelope, while the N protein retains the RNA genome.[2,3] The entry of SARS-CoV-2 into cells depends on the binding of the viral S protein to cell receptors and on the priming of the S protein by host cell proteases. Studies have shown that the cell entry process engages angiotensin-converting enzyme 2 (ACE2) to bind the S protein and uses transmembrane protease serine 2 (TMPRSS2) to prime the S protein.[4,5] TMPRSS2 not only cleaves and activates the spike glycoprotein for membrane fusion but also cleaves ACE2 into two fragments to enhance viral infectivity.[6] Because ACE2 and TMPRSS2 are highly present in the respiratory system and digestive tract (for example, human airway epithelium,[7,8] bronchial transient secretory cells,[9] nasal epithelial cells,[10] human ocular surface,[11] and small intestine[12]), various routes of infection may occur when SARS-CoV-2 comes into contact with humans through any of these organs. 
Furin, a protease highly expressed in the lungs, binds to the spike and cleaves the furin cleavage site (FCS) of SARS-CoV-2.[13] The presence of ACE2, TMPRSS2, and furin in these cells and tissues may indicate that there are multiple routes of transmission through infection of the respective tissues. The spike glycoprotein of coronavirus plays a key role in virus infection, mediates virus entry, and is a primary determinant of cell tropism and pathogenesis.[14] The spike S1 is the first possible point of contact for recognition and binding of host receptors (ACE2, furin, or GRP78 via CD147[15]), which subsequently allows conformational changes in S2 and thereby promotes fusion between the viral envelope and the host cell membrane. According to reports, the binding affinity of the SARS-CoV-2 S1 receptor-binding domain (RBD) to ACE2 is considerably higher than that of SARS-CoV,[16,17] leading to severe infection and the wide spread of the SARS-CoV-2 virus. Although the mortality rate is lower for SARS-CoV-2 (1–5%) than for SARS-CoV (∼10%), the number of deaths caused by SARS-CoV-2 is substantially higher than that caused by SARS-CoV, e.g., over 2.7 million for the former and 812 for the latter globally to date. Therefore, it is important to understand the structure of the spike glycoprotein and the mechanism of infection. The spike glycoprotein deploys S1 for attachment to the host cell and S2 for fusion. A high affinity promotes the attachment of S1 to the host cell and increases the spread of the virus. 
The detailed structural comparison of S1 between SARS-CoV and SARS-CoV-2 shows that 10 regions in the S1 domain play critical roles in ACE2 binding; mutations in certain amino acid residues in these regions result in low affinity of S1 to ACE2.[4] In contrast, some amino acid changes in SARS-CoV-2 may help enhance affinity, such as Y442 in SARS-CoV to L457 in SARS-CoV-2, N479 to Q493, and Y484 to Q498.[4] Thus, any mutations in amino acid residues or post-translational modification (PTM) of amino acids may affect the attachment of the spike S1 to the host cell receptors. Because the spike is a glycoprotein, its glycosylated variants have a profound effect on the affinity and infectivity of SARS-CoV-2. Recent studies have identified 22 N-glycosites in each protomer of the trimeric spike, which form a high-density N-glycan mask on the surface of the viral protein, similar to the S1 subunit of MERS-CoV.[18,19] Several studies have also detected trace levels of O-glycosites at T323 and S325 of the spike glycoprotein[19,20] and at T678 near the FCS, occupied by core-1 and core-2 structures.[21] Recent studies have identified 25 O-glycosites in the S1 of the spike glycoprotein expressed from HEK293 cells, of which 16 O-glycosites are located within three amino acids of the N-glycosites.[22] These results are consistent with our predictions using ISOGlyP, indicating that S1 RBD is highly O-glycosylated in SARS-CoV-2. On the other hand, as observed in the influenza viruses, when viral glycoproteins are expressed in different cell systems, their glycosylation can change.[23,24] Yet, it has not been studied whether the expression cell system has an impact on the O-glycosylation of SARS-CoV-2 S1 RBD. In this study, we intend to comprehensively characterize N-linked and O-linked glycosylation of the spike S1 subunit of SARS-CoV and SARS-CoV-2 produced by different expression host cells. We recognize that host expression may alter the glycosylation pattern of spike glycoproteins. 
HEK293 cells and baculovirus-insect system Hi5 cells are used for virus production and recombinant spike glycoprotein production in our work. The effect of host cell lines on viral protein glycosylation has been reported. The influenza A virus glycoprotein can contain structures of paucimannose (Sf9 cells), core-fucosylated bisected N-GlcNAc (embryonated hen egg), or sialylated biantennary glycans (HEK293).[23] Baculovirus-insect cells, already used for influenza and human papillomavirus (HPV), are an ideal baculovirus expression system for the production of recombinant spike glycoproteins and vaccines.[25] Baculovirus-insect cells can synthesize glycans with one or two core fucoses. There is a report of glucuronic acid (GlcA) in the cells,[26] even though other insect cells may have GlcA residues.[27] It should be investigated whether baculovirus-insect cells have GlcA and other glycans in order to analyze the glycosylation of the spike glycoprotein. To resolve these uncertainties, we compared the S1 subunits of the spike expressed in HEK293 cells and baculovirus-insect Hi5 cells (Table 1). The spike S1 was digested with trypsin, and then glycopeptides were enriched using hydrophilic interaction liquid chromatography (HILIC). The enriched glycopeptides were analyzed by liquid chromatography–mass spectrometry (LC-MS/MS) using electron-transfer/higher-energy collision dissociation (EThcD) fragmentation. In another experiment, N-glycans and O-glycans were released from spike S1 and evaluated using a Bruker Autoflex matrix-assisted laser desorption/ionization (MALDI)-MS.

Table 1

Recombinant Spike S1 Expressed in Different Expression Cells

sample	catalog	description	species	expression host	sequence
BIC1	40150-V08B1	spike S1	SARS-CoV	baculovirus-insect	M7-R667
BIC2	40591-V08B1	spike S1	SARS-CoV-2	baculovirus-insect	V16-R685
HEK2	40591-V08H	spike S1	SARS-CoV-2	HEK293	V16-R685

Samples were purchased from Sino Biological.

Results and Discussion

Most Diverse Mutations of Amino Acids Occurred in the S1 Domain of the Spike Glycoprotein

The Global Initiative on Sharing All Influenza Data (GISAID) has updated the SARS-CoV-2 genome and the spike glycoprotein sequence based on data submitted by laboratories and research institutes around the world. As of February 2021, we had downloaded more than 200 000 protein sequences of the spike glycoprotein. After removing redundant and incomplete sequences, we found 98 unique spike glycoproteins, most of which have mutations in the receptor-binding domain (RBD) of spike S1 (Figure 1). The sequences are arranged according to their submission date (the strain list is given in Table S1). Figure 1a illustrates the schematic structure of SARS-CoV-2 and its spike glycoprotein, and Figure 1b compares the alignment of SARS-CoV-2, SARS-CoV, and MERS-CoV. Genetic analysis showed 79% similarity between SARS-CoV and SARS-CoV-2, and the amino acid sequence identity was 76.47%;[28] the sequence alignment between MERS-CoV and SARS-CoV showed significant differences.[29] There were 51 amino acid changes between SARS-CoV and SARS-CoV-2, or 25.8% variation. Importantly, the variation falls in several sites that are critical for binding affinity to the host cell receptors.[4] From 12/2019 to 05/2020, amino acid mutations were observed at 19.3% of the positions within the RBD domain (Figure 1c). This result indicates that the diversity of SARS-CoV-2 is caused by frequent mutation in the spike RBD. Thus, it is essential to clarify the spike RBD domain variation to provide necessary information for the development of inhibitors, antibodies, and vaccines.
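The filtering and variability calculation described above can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: it assumes the sequences are already aligned to the same length without gaps, it treats any sequence of the wrong length or containing the ambiguity code `X` as incomplete, and the function names and toy sequences are hypothetical.

```python
def unique_complete(seqs, expected_len):
    """Drop incomplete sequences (wrong length or ambiguous 'X') and
    exact duplicates, keeping the first occurrence of each sequence."""
    seen = set()
    kept = []
    for s in seqs:
        s = s.strip().upper()
        if len(s) != expected_len or "X" in s:
            continue  # treated as incomplete (assumption of this sketch)
        if s not in seen:
            seen.add(s)
            kept.append(s)
    return kept


def variable_fraction(seqs, start, end):
    """Fraction of positions in [start, end] (1-based, inclusive) at which
    the aligned sequences show more than one residue."""
    variable = sum(
        1 for i in range(start - 1, end) if len({s[i] for s in seqs}) > 1
    )
    return variable / (end - start + 1)


# Toy sequences, not real spike data.
toy = ["MKTAYS", "MKTAYS", "MKSAYS", "MKTXYS", "MKTAY", "MRTAYS"]
unique = unique_complete(toy, expected_len=6)
print(unique)                           # ['MKTAYS', 'MKSAYS', 'MRTAYS']
print(variable_fraction(unique, 2, 5))  # 0.5
```

For real data, the window passed to `variable_fraction` would be the RBD coordinates in the alignment, and the result would correspond to a figure such as the 19.3% of mutated RBD positions reported above.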

Figure 1

Amino acid mutation predominantly occurred on the receptor-binding domain of the SARS-CoV-2 spike glycoprotein from 12/2019 to 05/2020. (a) Domains of SARS-CoV-2 virion include ORF1a&b, spike (S), 3a, 3b, envelope (E), membrane (M), 6, 7a, 7b, 8a, 8b, 9b, and nucleocapsid (N). The spike S1 domain consists of an N-terminal domain (NTD), a receptor-binding domain (RBD), a subdomain 1 (SD1), and a SD2; the other domains are S2, heptad repeat 1 (HR1), central helix (CH), connector domain (CD), HR2, transmembrane (TM), and cytoplasmic tail (CT). S1/S2 is the protease cleavage site, FP is the fusion peptide, and S2′ is the protease cleavage site. (b) Spike RBD sequence alignment between SARS-CoV-2, SARS-CoV, and MERS-CoV. (c) Alignment on RBD of SARS-CoV-2 strains from 12/2019 to 05/2020. The 98 complete and unique sequences are listed, most of which are conserved. The amino acid mutations are highlighted with white bars, while few mutations are observed in other domains of spike glycoproteins.


N-Glycosylation of SARS-CoV-2 Regulated by the Expression System

The purified recombinant S1 proteins expressed in HEK293 cells (HEK2) and baculovirus-insect cells (BIC2 and BIC1) (Table 1) were purchased from Sino Biological. The analysis of each sample was performed in triplicate. Each N-glycosite was plotted using its abundance relative to all N-glycosites. The expression system impacts N-glycosylation and the types of N-glycans at each site. As shown in Figure 2a,b, the spike S1 expressed in HEK293 cells has 12 N-glycosites. When expressed in baculovirus-insect cells, it carries an additional N-glycosite, N603. N-glycans show distinct patterns between the proteins expressed by HEK2 and BIC2. For example, N17 exhibits only complex N-glycans in HEK2, whereas N17 in BIC2 predominantly contains complex N-glycans with 4% high mannose. A similar observation was made for N149 of HEK2. On the other hand, HEK2 N616 has only Man5 (Man5GlcNAc2: Man = mannose, GlcNAc = N-acetylglucosamine), while N616 from BIC2 mainly contains complex N-glycans and a small amount of hybrid and high-mannose N-glycans. These results indicate that the N616 site from HEK2 cells can be accessed by α1,2-mannosidases but is less accessible to GlcNAcT-I.[18] Other sites containing complex and high-mannose N-glycans, such as N61, N74, N331, and N343 in HEK2, or N74, N234, N282, and N331 in BIC2, are good substrates for GlcNAcT-I when forming complex N-glycans. N122, N165, N234, N282, and N657 in HEK2 show hybrid N-glycans; N61, N122, N149, N165, N603, N616, and N657 in BIC2 also have hybrid N-glycans, indicating that N-glycan processing of SARS-CoV-2 depends on the expression system. Moreover, the sialylation distribution of N-glycans is strikingly different between HEK2 and BIC2. Except for N616, all N-glycosites in HEK2 contain large amounts of sialylated N-glycans. 
Further linkage analysis by matrix-assisted laser desorption ionization-MS (MALDI-MS) showed that these sialic acids have α2,3 or α2,6 linkages (Figure 2d and Table S2a),[30] suggesting that these peptide substrates may be processed by sialyltransferases (e.g., ST3Gal4 or ST6Gal1). Our results are consistent with previous studies on SARS-CoV-2 spike proteins recombinantly expressed in HEK293 cells,[20,31] except for the identification of N-glycans at N17 and N603 in our study, even though the number of N-glycans observed at these N-glycosites is limited.
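The per-site breakdown into high-mannose, hybrid, and complex N-glycans can be expressed as a simple normalization. The sketch below is an assumption about how such percentages might be computed from glycopeptide intensities; the record layout, the function name, and the intensity values are hypothetical and were chosen only to mirror the 4% high-mannose share mentioned for N17 in BIC2.

```python
from collections import defaultdict


def class_percentages(records):
    """records: iterable of (site, glycan_class, intensity) tuples.
    Returns {site: {glycan_class: percent of that site's total intensity}}."""
    totals = defaultdict(float)
    by_class = defaultdict(lambda: defaultdict(float))
    for site, glycan_class, intensity in records:
        totals[site] += intensity
        by_class[site][glycan_class] += intensity
    return {
        site: {c: 100.0 * v / totals[site] for c, v in classes.items()}
        for site, classes in by_class.items()
        if totals[site] > 0
    }


# Illustrative intensities in arbitrary units, not measured values.
toy_records = [
    ("N17", "complex", 96.0),
    ("N17", "high-mannose", 4.0),
    ("N616", "high-mannose", 10.0),
]
percentages = class_percentages(toy_records)
print(percentages["N17"]["high-mannose"])   # 4.0
print(percentages["N616"]["high-mannose"])  # 100.0
```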

Figure 2

Site-specific characterization of N-glycosylation of the S1 domain of SARS-CoV and SARS-CoV-2 spike glycoprotein. (a) SARS-CoV-2 virus expressed in HEK293 cells. Twelve N-glycosites in S1 were identified by LC-MS/MS. N-glycans are divided into high-mannose (green), hybrid (light purple), and complex (purple). The N-glycosites N17 and N149 carry complex N-glycans, N616 has only high-mannose (Man5), and the other sites are predominantly complex types. Among these sites, N165, N234, and N657 have more than 10% hybrid N-glycans. (b) SARS-CoV-2 virus expressed in baculovirus-insect cells. In addition to the 12 N-glycosites shared with HEK293 cells, another N-glycosite, N603, was detected. High-mannose and complex N-glycans are present at all N-glycosites, while hybrid N-glycans are present at N61, N122, N149, N165, N343, N616, N657, and N603. (c) SARS-CoV virus expressed in baculovirus-insect cells. There are 14 N-glycosites in SARS-CoV. High-mannose glycans are predominantly present at N65, N227, and N318. Complex N-glycans are highly abundant at N29, N73, N109, N118, N119, N158, N296, N330, N357, N589, and N602. (d) MALDI-MS profiling of N-glycans released from SARS-CoV and SARS-CoV-2. Spike S1 was immobilized on AminoLink plus resins and derivatized by ethyl esterification/ethylenediamine amidation. The most abundant N-glycans are represented, and complete N-glycans for HEK293, CoV-2, and CoV are listed in Table S2. Data are given as mean ± standard deviation.

SARS-CoV expressed in baculovirus-insect cells (BIC1) has 14 N-glycosites, and SARS-CoV-2 expressed in baculovirus-insect cells (BIC2) has 13 N-glycosites. BIC1 and BIC2 produce high-mannose and complex N-glycans, and all of these N-glycans contain fucosylated complex types such as Man3GlcNAc2Fuc1, also known as paucimannose, which is specific to insects. These results demonstrate the synthesis of core fucose in the presence of α1,3-fucosyltransferase in the baculovirus-insect cells.[32] Conversely, almost no sialylated N-glycans were identified in BIC1 or BIC2, although treatment of baculovirus-insect cells with a β-N-acetylglucosaminidase inhibitor may produce terminally sialylated N-glycans.[33] N-glycosites occupied primarily by high-mannose glycans are located at N61, N122, and N234 in BIC2 and at N65, N227, and N318 in BIC1. The subtle difference in N-glycosylation may be attributed to the change in the amino acid sequence between BIC2 and BIC1. Generally, the N-glycan profile is highly conserved between BIC2 and BIC1 (Figure 2d).

Differential Pattern of O-Glycosites of SARS-CoV-2 in Host Cells

Table S3 shows the potential O-glycosylation on the SARS-CoV-2 spike glycoprotein predicted by ISOGlyP.[34,35] In this study, T or S sites marked as “high” were reported in the literature and detected in our work, and our method also detected other O-glycosites marked as “medium” (Table S3). It is worth noting that the detected O-glycosites are mainly in peptide substrate clusters, e.g., T22, T29, S31, and T33 are in the peptide cluster T[22]QLPPAYT[29]NS[31]FT[33]R. This is consistent with the finding that a peptide substrate containing proline (P) makes the T or S residues more accessible to GalNAcTs (UDP-GalNAc:polypeptide N-acetylgalactosaminyltransferases, EC 2.4.1.41).[36,37] The charge state surrounding T or S may be a factor because the polarity of the GalNAcT lectin domain affects glycosylation.[38] According to N-glycopeptide analysis, spike S1 also showed different O-glycosylation when expressed in HEK293 and baculovirus-insect cells. In the HEK293 cells, T323 and S325 are O-glycosylated by GalNAc and GalGalNAc mucin-type O-glycans. S637, T676, and T638 are more abundant in HEK2 than in BIC2. In BIC2, T22, T29, S31, T33, S94, T95, T323, and S325 are the most abundant O-glycosites; T572 and T573 are only present in BIC2. These results may imply that the types of GalNAcTs are different in HEK2 and BIC2 because the glycopeptide substrate preferences of GalNAcTs may cause distinct O-glycosylation.[39] It is expected that for the same peptide substrates, such as T323, S325, T676, and T678, there will be some O-glycosites with similar glycosylation. A comparison of the site-specific O-glycan profiles on these O-glycosites is given in Figure 3b–e. We noticed that T323 carries similar O-glycans in both samples, GalNAc (N1) and GalGalNAc (H1N1). The other three main O-glycosites have divergent O-glycans. 
For example, BIC2 has H1N1 at S325; N1, H1N1, H2N1, and H3F1 (F = Fucose) at S673; and H1, H1N1, H2N1, H1N2, H1N2F2, H3F1, H4N3F2, and H2N4F2 at T676. HEK2 has H1N1, H2N1, and N3F1 at S325; H1N1, H2N1, H3F1, H2N4F2, and S1H3N2F1 (S = NeuAc) at S673; and H1N1, H2N1, H1N2, N3F1, H2N4F2, S1N2N3, and S1H3N2F1 at T676. This demonstrates the combined effect of the availability of branching glycoenzymes and the preference of GalNAcTs for peptide substrates.[40]
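The shorthand used for these compositions (H = hexose, N = N-acetylhexosamine, F = fucose, S = NeuAc) can be parsed mechanically. The following sketch converts a composition code into residue counts and a theoretical monoisotopic mass of the free, underivatized glycan, using standard monosaccharide residue masses. It is only an illustration: the function names are hypothetical, and the masses will not match the spectra in this study directly because the glycans analyzed by MALDI-MS were chemically derivatized (permethylation or esterification/amidation).

```python
import re

# Monoisotopic residue masses in Da (monosaccharide minus one water).
RESIDUE_MASS = {
    "H": 162.0528,  # hexose, e.g., Gal or Man
    "N": 203.0794,  # N-acetylhexosamine, e.g., GalNAc or GlcNAc
    "F": 146.0579,  # fucose (deoxyhexose)
    "S": 291.0954,  # N-acetylneuraminic acid (NeuAc)
}
WATER = 18.0106  # added once for the free reducing-end glycan


def parse_composition(code):
    """Turn a code such as 'S1H3N2F1' into {'S': 1, 'H': 3, 'N': 2, 'F': 1}.
    Repeated letters are summed."""
    tokens = re.findall(r"([HNFS])(\d+)", code)
    if "".join(letter + count for letter, count in tokens) != code:
        raise ValueError(f"unrecognized composition code: {code!r}")
    counts = {}
    for letter, count in tokens:
        counts[letter] = counts.get(letter, 0) + int(count)
    return counts


def glycan_mass(code):
    """Theoretical monoisotopic mass of the underivatized glycan."""
    counts = parse_composition(code)
    return sum(RESIDUE_MASS[k] * n for k, n in counts.items()) + WATER


print(parse_composition("S1H3N2F1"))     # {'S': 1, 'H': 3, 'N': 2, 'F': 1}
print(round(glycan_mass("H1N1"), 4))     # 383.1428
```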

Figure 3

Differential O-glycosylation in spike S1 expressed in baculovirus-insect and HEK293 cells. (a) Relative abundance of O-glycosites identified in the spike S1 domain. The most abundant O-glycosites are labeled in the ring, and the complete list of all O-glycosites is given in the legend. (b) The most abundant O-glycosite, T323, is present in both BIC2 and HEK2. This O-glycosite carries GalNAc (N1) and GalGalNAc (H1N1). (c) S325 in BIC2 is mainly H1N1, while S325 in HEK2 is more diverse. O-glycosites (d) S673 and (e) T676 reveal more diverse O-glycans in HEK2, including several sialylated species.

O-Glycosylation in the BIC2 RBD Domain

The location of the O-glycosites is different between HEK293 and baculovirus-insect cells (Figure 3a). We paid special attention to the RBD domain of SARS-CoV-2 expressed in the two cell lines. The RBD has 197 amino acids, from I332 to K528. When SARS-CoV-2 was expressed in HEK293 cells, nine O-glycosites were found: S366, S371, T430, S438, S443, S477, T478, S494, and T500 (Figure 4). GalNAc, GalGalNAc, and GalGalNAc2 are the main O-glycans at all O-glycosites, and the abundance of S494 and T500 is high (the area inside the circle in HEK293 represents the relative abundance). These two O-glycosites and S438 are the key positions that may affect the binding affinity of the RBD to the ACE2 receptor.[4]
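As a quick consistency check on the coordinates, the short sketch below recomputes the RBD length and counts which of the nine HEK293 O-glycosites fall inside the receptor-binding motif, taking the RBM range S438–Q506 stated elsewhere in this article. The variable and function names are hypothetical, and residue 371 is written as S371, as in the Figure 4 caption.

```python
import re

RBD_RANGE = (332, 528)  # I332 to K528
RBM_RANGE = (438, 506)  # S438 to Q506
HEK2_RBD_SITES = [
    "S366", "S371", "T430", "S438", "S443", "S477", "T478", "S494", "T500",
]


def position(site):
    """Extract the residue number from a label such as 'T500'."""
    match = re.fullmatch(r"[A-Z](\d+)", site)
    if match is None:
        raise ValueError(f"unexpected site label: {site!r}")
    return int(match.group(1))


def sites_in_range(sites, bounds):
    """Return the sites whose residue number lies within bounds (inclusive)."""
    low, high = bounds
    return [s for s in sites if low <= position(s) <= high]


rbd_length = RBD_RANGE[1] - RBD_RANGE[0] + 1
rbm_sites = sites_in_range(HEK2_RBD_SITES, RBM_RANGE)
print(rbd_length)  # 197
print(rbm_sites)   # ['S438', 'S443', 'S477', 'T478', 'S494', 'T500']
```

The six sites returned agree with the six RBM O-glycosites reported for HEK293 in the Summary and Perspectives section.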

Figure 4

Site-specific O-glycan profiling of the SARS-CoV-2 receptor-binding domain expressed in baculovirus-insect and HEK293 cells. The outer ring shows the 16 O-glycosites within the RBD of SARS-CoV-2 expressed in baculovirus-insect cells. In contrast, the RBD expressed in HEK293 cells has nine O-glycosites. The area within the ring denotes the relative abundance of the O-glycosite, while the ring color marks O-glycosites shared between baculovirus-insect and HEK293 cells, e.g., S371 in yellow for both BIC2 and HEK2. Baculovirus-insect cells have fucosylated O-glycans at most O-glycosites, and HEK293 cells produce sialylated O-glycans at several O-glycosites, including T500.

Compared with HEK293 cells, the O-glycosylation of the SARS-CoV-2 S1 RBD domain expressed in baculovirus-insect cells is more diverse and complex. Besides the nine O-glycosites identified in HEK2, six additional O-glycosites were found in BIC2, revealing that the density of O-glycosites is higher in BIC2. Additionally, no sialylated O-glycans were found in BIC2, while HEK2 showed sialic acid at S371, T430, S438, T478, S494, and T500. This is consistent with previous reports that insect cells lack sialyltransferases, rarely produce sialylated glycans, and often require metabolic engineering to make terminal sialic acid.[41,42] It is worth noting that terminal sialic acid plays an important role in viral infection by mediating attachment to the surface of host cells (such as for influenza virus hemagglutinin or as receptor determinants for coronaviruses).[43,44]

Potential Impact of Glycosite Differentiation on the RBD–ACE2 Binding

We further compared how RBD amino acid mutations change RBD glycosylation (Figure 5). Although the lengths of the RBD domains of SARS-CoV-2, SARS-CoV, and MERS-CoV are different, they all contain a receptor-binding motif (RBM) in which 10 sites directly interact with the ACE2 receptor.[4,45] Figure 5a shows the 10 sites (red dotted circles) across the three coronavirus strains. We examined whether the amino acid at, before, or after each binding site is T or S (e.g., S438 before site 1 in SARS-CoV-2, T425 before site 1 in SARS-CoV). The reason is that glycosylation changes at these sites may impact the binding affinity between the spike S1 and ACE2. Figure 5b compares N- and O-glycosites of the spike S1 RBD between SARS-CoV-2 and SARS-CoV. The red bars indicate the relative abundance of N-glycosites, while the cyan bars indicate O-glycosites (note: the purple dotted line is a value equal to 0). There is one N-glycosite located within the RBD domain in SARS-CoV-2 and two N-glycosites in SARS-CoV; however, these N-glycosites are not in the RBM domain. Several O-glycosites are more abundant in SARS-CoV than in SARS-CoV-2, such as S362, T363, T431, S432, and T433. These O-glycosites are in the secondary structures of the SARS-CoV-2 RBD.[45] The O-glycosites at S438, S494, and T500 are ACE2 contact residues or adjacent to them (Figure 5c). The high abundance of these O-glycosites in SARS-CoV-2 may be a determinant of the attachment of spike S1 to ACE2.

Figure 5

O-glycosites in or near key ACE2–RBD binding sites. (a) Ten binding sites that are crucial in the ACE2–RBM interaction. These sites are aligned for SARS-CoV-2, SARS-CoV, and MERS. (b) N- and O-glycosites in the RBD domain of SARS-CoV-2 and SARS-CoV. The red bar is the relative abundance of N-glycosites, and the cyan bar is that of O-glycosites. Each amino acid is aligned based on the sequence described in (a). (c) Three ACE2–RBM binding sites (1, 7, and 9) overlapping with O-glycosites. SARS-CoV-2 has S438, S494, and T500; SARS-CoV has T485 and T486. The receptor-binding motif (RBM) spans S438 to Q506.

Based on PDB 6VW1 (dimer) for SARS-CoV-2 and PDB 3D0H for SARS-CoV, we mapped S1 glycosites using the receptor-binding domain (RBD) in complex with ACE2.[46,47] Figure 6 shows the site-specific glycosylation mapping of SARS-CoV in baculovirus-insect cells (BIC1) (Figure 6a), SARS-CoV-2 in baculovirus-insect cells (BIC2) (Figure 6b), and SARS-CoV-2 in HEK293 cells (HEK2) (Figure 6c). Compared with BIC2, BIC1 has fewer glycosites on the spike S1. BIC2 has an O-glycosite at T500 in the RBM domain, which may affect the affinity between the spike S1 and ACE2. BIC1 retains complex N-glycans and GalGalNAc or Gal O-glycans; in contrast, BIC2 also carries complex O-glycans and a higher number of O-glycosites. HEK2 revealed a similar location of glycosylation but showed different high-mannose N-glycans at N343 and fewer O-glycosites on spike S1. The spike S1 glycosylation in the RBM and secondary structure may interact with ACE2 receptors, whose own glycosylation adds another factor in S1 attachment and virus fusion into host cells.[48,49] Further studies on the stoichiometric structure of RBD and ACE2 can provide valuable insights into the interaction between RBD and ACE2.

Figure 6

Mapping glycosites of the spike S1 RBD domain and its human receptor ACE2. N-glycosites are labeled in red and O-glycosites in cyan. The site mapping color represents different types of glycans: yellow = Gal (H1), GalGalNAc (H1N1) without or with minimal fucosylation or sialylation; light yellow = H1 or H1N1 with fucosylation or sialylation; pink = fucosylation and/or sialylation; green = high-mannose; purple = sialylated complex N-glycans; light purple = other types of complex N-glycans. SARS-CoV is based on 3D0H[47] and SARS-CoV-2 on 6VW1.[46] (a) SARS-CoV spike S1 RBD domain glycosites include T485, T486, and T487 near or within the binding sites between ACE2 and RBD. These sites carry H1 and H1N1. The front and back sides of the S1 are illustrated for glycosites. (b) Glycosites on the SARS-CoV-2 spike S1 RBD domain expressed in baculovirus-insect cells. (c) Glycosites on the SARS-CoV-2 spike S1 RBD domain expressed in HEK293 cells.


Summary and Perspectives

In this study, we investigated the effect of host expression cells on the glycosylation of the SARS-CoV-2 spike S1 protein. SARS-CoV-2 virus particles infect host cells through S1 attachment to cells and S2 fusion. The affinity between S1 and host cell receptors plays a critical role in viral infection and transmission. The receptor-binding domain of spike S1 has a specific receptor-binding motif (RBM), which may directly interact with the receptor through hydrogen bonds and salt bridges.[45] Spanning S438 to Q506, the RBM domain has 10 sites that directly interact with the ACE2 receptor. The binding kinetics between the RBM and the ACE2 receptor may be affected by glycosylation on these two proteins,[50] as has similarly been shown for influenza A virus hemagglutinin[51] and HIV-1, whose encapsulating glycan moieties determine viral propagation.[52] The glycosylation of spike S depends on the host cell line, which can express varying glycoenzymes and transporters, resulting in specificity and heterogeneity.[53] Differential glycosylation not only impacts the infectivity of the virus but also changes the clinical effectiveness of therapeutic products. Thus, we set out to explore how the expression system regulates the glycosylation of the spike S1 RBM and secondary structure and to compare the glycosylation distribution between SARS-CoV and SARS-CoV-2. HEK293 and baculovirus-insect cell expression systems are used for non-mRNA COVID-19 vaccine development.[54,55] Our results show that the expression cell determines the glycosylation of the spike S1 and the type of attached glycans. SARS-CoV-2 S1 derived from baculovirus-insect cells contains high-mannose and fucosylated complex N-glycans and fucosylated mucin-type O-glycans. SARS-CoV-2 S1 from HEK293 cells carries hybrid and sialylated complex N-glycans and sialylated O-glycans. MALDI-MS analysis found that SARS-CoV-2 S1 from HEK293 cells contains α2,3- and α2,6-linked sialic acids. 
These observations are consistent with the glycan biosynthesis of the expression system. The known glycan biosynthetic pathways of insects can form Man3GlcNAc2Fuc through GlcNAcMan5GlcNAc2 with α-mannosidase II, core α1,3-fucosyltransferase, and N-acetylglucosaminidase. Complex glycans are further extended by additional glycoenzymes.[56] HEK293 follows the general mammalian glycosylation pathways, forming biantennary, triantennary, or tetraantennary complex glycans in the presence of sialic acid or fucose residues.[57] As expected, we found that SARS-CoV-2 S1 expressed by HEK293 has bisected, fucosylated, and sialylated N-glycans and fucosylated/sialylated O-glycans. When the same expression host cell was used, similar glycosylation was still detected in BIC2 and BIC1 despite the different strains. Glycosite mapping of spike S1 suggests a potential influence of host cells on the binding affinity to the ACE2 receptor. Eight O-glycosites in the RBM domain were identified in baculovirus-insect cells and six in HEK293 cells. The difference in glycosylation and the three-dimensional (3D) conformation of spike S1 can improve the interaction with the ACE2 receptor. It is very important to systematically study glycosylation, since the RBD (especially the RBM) in the SARS-CoV-2 spike glycoprotein may be the target for the development of virus attachment inhibitors, neutralizing antibodies, and vaccines.[58] Given that SARS-CoV-2 can infect and be transmitted through many routes (lungs, mouth, eyes, intestine, etc.), consideration should be given to selecting suitable host cell lines for diagnostic applications and the development of inhibitors, antibodies, or vaccines.

Methods

Sample Preparation of SARS-CoV and SARS-CoV-2

Recombinant spike S1 was purchased from Sino Biological (HEK2, BIC2, and BIC1) (Table ). The amino acid sequences of HEK2 and BIC2 were from V16 to K685 and that of BIC1 from M1 to R667. The sample preparation followed the procedure described in Figure S1. Each sample was prepared in technical triplicate. First, 40 μg of the protein was denatured in high-performance liquid chromatography (HPLC) water at 90 °C for 10 min. Half of the denatured protein (20 μg) was reduced in 12 mM tris(2-carboxyethyl)phosphine hydrochloride (TCEP) at 37 °C for 1 h and alkylated in 16 mM iodoacetamide (IAA) at room temperature for 1 h. The sample was then digested with trypsin (1:25) (Promega, Madison, WI) at 37 °C overnight. The digest solution was acidified with 30 μL of 100% trifluoroacetic acid (TFA) prior to cleanup on a C18 solid-phase extraction (SPE) cartridge (Waters, Milford, MA). An in-house packed Amide-80 (Tosoh Bioscience LLC, King of Prussia, PA) HILIC SPE column was used to further enrich glycopeptides.[59] The glycopeptides and flow-through peptides after HILIC were analyzed using LC-MS/MS.

The remaining 20 μg of the protein after denaturation was conjugated with an Aminolink plus coupling resin (Thermo Fisher Scientific, Waltham, MA) for glycan analysis. The solid-phase method is called glycoprotein immobilization for glycan extraction (GIG),[60] in which α2,6-linked sialic acid underwent an ethyl esterification reaction (0.5 M N-(3-dimethylaminopropyl)-N′-ethylcarbodiimide hydrochloride (EDC·HCl) and 0.5 M 1-hydroxybenzotriazole hydrate (HOBt), 250 μL) and α2,3-linked sialic acid was derivatized through a carbodiimide coupling (1 M p-toluidine in the presence of EDC (pH 4–6)).[59] First, we used 1 μL of PNGase F (New England BioLabs, Ipswich, MA) in 25 mM ammonium bicarbonate to release N-glycans; the remaining sample on the Aminolink resin was further processed to release O-glycans through β-elimination (0.1 M NaOH) and permethylation. The permethylated O-glycans were purified using a C18 SPE cartridge and eluted with 300 μL of 60% ACN in 0.1% TFA. Glycans were analyzed by MALDI-time-of-flight/TOF-MS (MALDI-TOF/TOF-MS) (Bruker Autoflex).

MALDI-TOF/TOF-MS Identification of Glycans

The eluted glycans in 60% ACN (0.1% TFA) were spotted onto a μFocus MALDI plate (384 circles; Hudson Surface Technology, West New York, NJ), together with 1 μL of 10 mg/mL dihydroxybenzoic acid (DHB) matrix in the presence of 2% N,N-dimethylaniline (DMA) (50% ACN in 0.1 mM NaCl). The plate was dried on top of a 50–60 °C hot plate. Each MALDI-MS test was performed in triplicate for 8000 shots. The mass (m/z) was searched against the glycan database in GlycoWorkBench.[61] For N-glycans, the mass range was set between 900 and 6000 Da, while it was set between 300 and 3000 Da for O-glycans.
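The mass bookkeeping behind this search can be illustrated with a minimal Python sketch. It assumes the linkage-specific mass tags reported in the Data Analysis section (28.031301 per α2,6-linked and 42.058183 per α2,3-linked sialic acid) and the search windows stated above; the function names and the example base mass are hypothetical and are not part of GlycoWorkBench.

```python
# Mass tags per derivatized sialic acid, taken from the Data Analysis section.
TAG_A26 = 28.031301  # added to each alpha2,6-linked sialic acid (ethyl esterification)
TAG_A23 = 42.058183  # added to each alpha2,3-linked sialic acid (p-toluidine coupling)

# MALDI search windows in Da, as stated in the text.
MASS_WINDOWS = {"N": (900.0, 6000.0), "O": (300.0, 3000.0)}


def derivatized_mass(base_mass, n_a26=0, n_a23=0):
    """Return the glycan mass after adding the linkage-specific tags."""
    return base_mass + n_a26 * TAG_A26 + n_a23 * TAG_A23


def in_search_window(mass, glycan_type):
    """Check whether a mass falls inside the window for 'N' or 'O' glycans."""
    low, high = MASS_WINDOWS[glycan_type]
    return low <= mass <= high


# Hypothetical glycan of base mass 2000.0 carrying one sialic acid of each linkage.
example = derivatized_mass(2000.0, n_a26=1, n_a23=1)
print(round(example, 6), in_search_window(example, "N"))
```

Keeping the tags as named constants makes it easy to see which linkage contributes which shift when a candidate composition is compared with an observed peak.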

LC-MS/MS Analysis of Glycopeptides

The samples were analyzed using a Dionex U3000 nanoHPLC system connected to a Thermo Orbitrap Fusion Lumos mass spectrometer (Thermo Fisher Scientific). Glycopeptides (1 μg) were injected and desalted with an Acclaim PepMap C18 Nano trap column (3 μm, 100 Å, 75 μm × 2 cm) at 5 μL/min with 100% solvent A (0.1% formic acid in HPLC water) for 5 min. Then, glycopeptides were separated by an Acclaim PepMap 100 nano column (3 μm, 100 Å, 75 μm × 250 mm) using a linear gradient of 2.5–37.5% solvent B (80% ACN, 0.1% formic acid) over 85 min, with a wash at 90% B for 5 min. The column was equilibrated at 2.5% B for 10 min before the next injection. Data-dependent analysis (DDA) was carried out with a duty cycle of 2 s. Precursor masses were detected in the orbitrap at a resolution (R) of 120 000 (at m/z 200) with internal calibration (Easy IC). Stepped HCD spectra (HCD energy at 15, 25, and 35%) were acquired for precursors with charges between 2 and 8 and intensities over 5.0 × 10⁴ at R = 30 000. Dynamic exclusion was set at 20 s. When at least one glycan oxonium fragment ion (m/z 138.0545, 204.0867, or 366.1396) was observed within the top 20 most abundant fragments and within 15 ppm mass accuracy, an EThcD spectrum was acquired in the orbitrap at R = 30 000. The electron-transfer dissociation (ETD) reagent target was 2.0 × 10⁵, with supplemental collision energy at 15%. The ETD reaction time depended on the precursor charge state: 125 ms for charge 2, 100 ms for 3, 75 ms for 4, and 50 ms for ≥5.
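The oxonium-ion trigger and the charge-dependent ETD reaction time can be expressed as a short Python sketch. This is only an illustration of the decision logic described above, not the instrument method itself: the function names and the peak-list layout, a list of (m/z, intensity) pairs, are assumptions.

```python
OXONIUM_MZ = (138.0545, 204.0867, 366.1396)  # diagnostic glycan oxonium ions
PPM_TOLERANCE = 15.0                         # mass accuracy required for a match
TOP_N = 20                                   # only the most abundant fragments count


def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def triggers_ethcd(peaks):
    """Return True if any top-N HCD fragment matches an oxonium ion within tolerance.

    peaks: list of (mz, intensity) tuples (assumed layout).
    """
    top = sorted(peaks, key=lambda p: p[1], reverse=True)[:TOP_N]
    return any(
        abs(ppm_error(mz, ox)) <= PPM_TOLERANCE
        for mz, _ in top
        for ox in OXONIUM_MZ
    )


def etd_reaction_time_ms(charge):
    """ETD reaction time for a precursor charge state between 2 and 8."""
    if not 2 <= charge <= 8:
        raise ValueError("precursors are selected only for charges 2 to 8")
    return {2: 125, 3: 100, 4: 75}.get(charge, 50)


spectrum = [(204.0870, 1.0e5), (500.2500, 5.0e4)]
print(triggers_ethcd(spectrum), etd_reaction_time_ms(3))
```

In this example the fragment at m/z 204.0870 lies about 1.5 ppm from 204.0867, so the spectrum would qualify for a follow-up EThcD scan.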

Data Analysis

Through precursor and MS/MS fragmentation matching, the glycan composition analysis was performed in GlycoWorkBench, which uses glycan databases from the Consortium for Functional Glycomics (CFG), CarbBank, GlycomeDB, and Glycosciences. The derivatization of the sialic acid linkages added a mass tag to its residues, namely, 28.031301 on α2,6-link or 42.058183 on α2,3-link. The N-glycans and O-glycans identified by MALDI-TOF were used as the glycan database for glycopeptide analysis (Tables S2 and S4). MS/MS spectra were searched using Byos (Protein Metrics, San Carlos, CA) against a spike protein database compiled in-house. Search parameters include precursor mass tolerance (15 ppm), HCD fragment mass tolerance (20 ppm), EThcD fragment mass tolerance (20 ppm), missed cleavage (3), oxidation (+15.994915, variable), carbamidomethyl (+57.021464, fixed), common modification (≤2), rare modification (1), maximum precursor mass (30 000), and protein FDR (2%). The identified glycopeptides were manually verified according to oxonium ions, pep-HexNAc, and y and b ions with fragments surrounding an O-glycosite. An example of glycopeptide tandem MS is shown in Figure S2. For a peptide that has multiple glycosites, such as an N-glycosite and T/S O-glycosites, we used a fragment ion calculator (http://db.systemsbiology.net:8080/proteomicsToolkit/FragIonServlet.html) to check the fragment masses of glycopeptides. The quantification of glycopeptides was performed as follows. After the LC-MS/MS spectra were searched with Byonic, Byologic further analyzed the Byonic output files. The total area under the curve (AUC) of each glycopeptide was extracted from LC-MS/MS by Byologic. The AUC of the same glycopeptide was summed up, and the relative abundance was estimated by dividing the AUC (single glycopeptide) by the total AUC (all glycopeptides). To quantify glycans on each glycosite, we used the AUC of each glycoform divided by the total AUC of all glycoforms.
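The two normalizations described above can be sketched in a few lines of Python. This is a minimal illustration and not the Byologic workflow: the record layout, the identifiers, and the AUC values are hypothetical.

```python
from collections import defaultdict


def relative_abundance(auc_records):
    """Sum AUCs per glycopeptide, then divide by the total AUC of all glycopeptides.

    auc_records: iterable of (glycopeptide_id, auc) pairs; repeated identifiers
    (for example, different charge states) are summed first.
    """
    totals = defaultdict(float)
    for glycopeptide_id, auc in auc_records:
        totals[glycopeptide_id] += auc
    grand_total = sum(totals.values())
    if grand_total == 0:
        raise ValueError("total AUC is zero; nothing to normalize")
    return {gp: auc / grand_total for gp, auc in totals.items()}


def glycoform_fractions(site_auc):
    """Fraction of each glycoform at one glycosite (glycoform AUC / site total)."""
    total = sum(site_auc.values())
    if total == 0:
        raise ValueError("total AUC is zero; nothing to normalize")
    return {glycoform: auc / total for glycoform, auc in site_auc.items()}


# Hypothetical AUC values for illustration only.
records = [("gpA", 30.0), ("gpA", 10.0), ("gpB", 60.0)]
print(relative_abundance(records))
print(glycoform_fractions({"HexNAc2Hex5": 75.0, "HexNAc2Hex3Fuc1": 25.0}))
```

Both functions return fractions that sum to 1, so values from different expression systems can be compared on the same scale.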

61 in total

1. Differential splicing of the lectin domain of an O-glycosyltransferase modulates both peptide and glycopeptide preferences.

Authors: Carolyn May; Suena Ji; Zulfeqhar A Syed; Leslie Revoredo; Earnest James Paul Daniel; Thomas A Gerken; Lawrence A Tabak; Nadine L Samara; Kelly G Ten Hagen
Journal: J Biol Chem Date: 2020-07-15 Impact factor: 5.157

Review 2. Glycoproteins from insect cells: sialylated or not?

Authors: I Marchal; D L Jarvis; R Cacan; A Verbert
Journal: Biol Chem Date: 2001-02 Impact factor: 3.915

3. Sialylation of N-glycans on the recombinant proteins expressed by a baculovirus-insect cell system under beta-N-acetylglucosaminidase inhibition.

Authors: Satoko Watanabe; Takehiro Kokuho; Hitomi Takahashi; Masashi Takahashi; Takayuki Kubota; Shigeki Inumaru
Journal: J Biol Chem Date: 2001-12-06 Impact factor: 5.157

4. N-linked glycosylation of the hemagglutinin protein influences virulence and antigenicity of the 1918 pandemic and seasonal H1N1 influenza A viruses.

Authors: Xiangjie Sun; Akila Jayaraman; Pavithra Maniprasad; Rahul Raman; Katherine V Houser; Claudia Pappas; Hui Zeng; Ram Sasisekharan; Jacqueline M Katz; Terrence M Tumpey
Journal: J Virol Date: 2013-06-05 Impact factor: 5.103

5. GlycoWorkbench: a tool for the computer-assisted annotation of mass spectra of glycans.

Authors: Alessio Ceroni; Kai Maass; Hildegard Geyer; Rudolf Geyer; Anne Dell; Stuart M Haslam
Journal: J Proteome Res Date: 2008-03-01 Impact factor: 4.466

Review 6. Mechanisms of coronavirus cell entry mediated by the viral spike protein.

Authors: Sandrine Belouzard; Jean K Millet; Beth N Licitra; Gary R Whittaker
Journal: Viruses Date: 2012-06-20 Impact factor: 5.048

7. Expression of the SARS-CoV-2 ACE2 Receptor in the Human Airway Epithelium.

Authors: Haijun Zhang; Mahboubeh R Rostami; Philip L Leopold; Jason G Mezey; Sarah L O'Beirne; Yael Strulovici-Barel; Ronald G Crystal
Journal: Am J Respir Crit Care Med Date: 2020-07-15 Impact factor: 21.405

Review 8. The genetic sequence, origin, and diagnosis of SARS-CoV-2.

Authors: Huihui Wang; Xuemei Li; Tao Li; Shubing Zhang; Lianzi Wang; Xian Wu; Jiaqing Liu
Journal: Eur J Clin Microbiol Infect Dis Date: 2020-04-24 Impact factor: 3.267

9. SARS-CoV-2 Cell Entry Depends on ACE2 and TMPRSS2 and Is Blocked by a Clinically Proven Protease Inhibitor.

Authors: Markus Hoffmann; Hannah Kleine-Weber; Simon Schroeder; Nadine Krüger; Tanja Herrler; Sandra Erichsen; Tobias S Schiergens; Georg Herrler; Nai-Huei Wu; Andreas Nitsche; Marcel A Müller; Christian Drosten; Stefan Pöhlmann
Journal: Cell Date: 2020-03-05 Impact factor: 41.582

10. Development and simulation of fully glycosylated molecular models of ACE2-Fc fusion proteins and their interaction with the SARS-CoV-2 spike protein binding domain.

Authors: Austen Bernardi; Yihan Huang; Bradley Harris; Yongao Xiong; Somen Nandi; Karen A McDonald; Roland Faller
Journal: PLoS One Date: 2020-08-05 Impact factor: 3.240
