Jeremy L Praissman, Lance Wells.
Abstract
In late 2019, a virus subsequently named severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) emerged in China and led to a worldwide pandemic of the disease termed coronavirus disease 2019. The global health threat posed by this pandemic led to an extremely rapid and robust mobilization of the scientific and medical communities, as evidenced by the publication of more than 10,000 peer-reviewed articles and thousands of preprints in the first year of the pandemic alone. With the publication of the initial genome sequence of SARS-CoV-2, the proteomics community immediately joined this effort, publishing to date more than 100 peer-reviewed proteomics studies and submitting many more preprints to preprint servers. In this review, we focus on peer-reviewed articles published on the proteome, glycoproteome, and glycome of SARS-CoV-2. At a basic level, proteomic studies provide valuable information on quantitative aspects of viral infection course; information on the identities, sites, and microheterogeneity of post-translational modifications; and information on protein-protein interactions. At a biological systems level, these studies elucidate host cell and tissue responses, characterize antibodies and other immune system factors in infection, suggest biomarkers that may be useful for diagnosis and disease-course monitoring, and help in the development or repurposing of potential therapeutics. Here, we summarize results from selected early studies to provide a perspective on the current rapidly evolving literature.
Keywords: COVID-19; MS; SARS-CoV-2; glycosylation; review
Year: 2021 PMID: 34089862 PMCID: PMC8176883 DOI: 10.1016/j.mcpro.2021.100103
Source DB: PubMed Journal: Mol Cell Proteomics ISSN: 1535-9476 Impact factor: 5.911
Fig. 1. The SARS-CoV-2 proteome and its post-translational modifications (PTMs). The SARS-CoV-2 NCBI reference sequence proteome delineated along its genome (A). The 28 proteins annotated in the NCBI reference sequence are represented as boxes with the starting base corresponding to each protein in the genome listed later along with most protein names (pp1ab and pp1a are labeled inside boxes). Note that the nsp proteins are expressed as parts of large polyproteins (pp1ab and pp1a), which are subsequently cleaved by proteases contained in the polyproteins themselves. A summary of PTMs detected in proteomics studies is listed above each protein except for N and S, which are shown in detail in panels B and C. Numbers in parentheses indicate the residue number in pp1ab as given in the study by Klann et al. (102). The PTMs of S (B). A partial domain structure is shown for orientation with coloring for contrast and start residue numbers. The most abundant N-glycans from the most abundant Oxford class at each site are shown as reported by Zhao et al. (86). The class abundances at each site reported by Watanabe et al. (83) are similar, although the protein they analyzed showed a small but clear tendency toward slightly less processed glycoforms. Articles have reported varying amounts of O-glycosylation on S almost exclusively at T323, with occupancy generally ~10% or less. Note also that Davidson et al. (14) identified 13 sites of phosphorylation on S; however, most were not cytoplasmic. Secretory pathway kinases have been confirmed (e.g., FAM20C), but it is not clear that these sites fit with known specificity determinants. The PTMs of N and ORF9b (C). Domain structure shown with coloring for contrast and start residue numbers. ORF9b is an alternative ORF in the N coding sequence that is not annotated in the NCBI reference sequence.
FP, fusion peptide; HR1, heptad repeat 1; HR2, heptad repeat 2; NCBI, National Center for Biotechnology Information; nsp, nonstructural protein; RBD, receptor binding domain; SARS-CoV-2, severe acute respiratory syndrome coronavirus 2.
Selected peptides for SARS-CoV-2 detection and quantification reported in at least two publications
| Accession | Protein | Peptide sequence | Theoretical unmodified precursor [M + H]+ | Observed precursor charge (z) | Common mods | References |
|---|---|---|---|---|---|---|
| VME1_SARS2 | M | EITVATSR | 876.48 | 2 | None | ( |
| VME1_SARS2 | M | VAGDSGFAAYSR | 1200.57 | 2 | None | ( |
| NCAP_SARS2 | N | ADETQALPQR | 1128.57 | 2 | Deamidation (NQ) | ( |
| NCAP_SARS2 | N | AYNVTQAFGR | 1126.57 | 2 | Deamidation (NQ) | ( |
| NCAP_SARS2 | N | GFYAEGSR | 886.41 | 2 | None | ( |
| NCAP_SARS2 | N | IGMEVTPSGTWLTYTGAIK | 2025.04 | 2, 3 | Oxidation (M) | ( |
| NCAP_SARS2 | N | NPANNAAIVLQLPQGTTLPK | 2060.15 | 2, 3 | Deamidation (NQ) | ( |
| SPIKE_SARS2 | S | FQTLLALHR | 1098.64 | 3 | None | ( |
| SPIKE_SARS2 | S | LQSLQTYVTQQLIR | 1690.95 | 2, 3 | None | ( |
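The theoretical [M + H]+ values in the table can be reproduced from standard monoisotopic residue masses. The following is a minimal sketch for illustration, not the procedure used by the cited studies; the names `RESIDUE_MASS`, `MOD_SHIFT`, `neutral_mass`, and `precursor_mz` are hypothetical, and the mass constants are standard monoisotopic values rounded to five decimals.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # H2O added for the free N- and C-termini
PROTON = 1.00728   # mass of one charge-carrying proton

# Mass shifts (Da) for the modifications named in the table (assumed labels).
MOD_SHIFT = {"Deamidation": 0.98402, "Oxidation": 15.99491}


def neutral_mass(peptide, mods=()):
    """Monoisotopic mass of the neutral peptide, plus any listed modifications."""
    residues = sum(RESIDUE_MASS[aa] for aa in peptide)
    return residues + WATER + sum(MOD_SHIFT[m] for m in mods)


def precursor_mz(peptide, charge=1, mods=()):
    """m/z of the [M + zH]z+ ion; charge=1 gives the tabulated [M + H]+ value."""
    return (neutral_mass(peptide, mods) + charge * PROTON) / charge


if __name__ == "__main__":
    for seq in ("EITVATSR", "GFYAEGSR", "FQTLLALHR"):
        print(seq, f"[M+H]+ = {precursor_mz(seq):.2f}")
    # Doubly charged precursor of the M-protein peptide.
    print("EITVATSR 2+", f"m/z = {precursor_mz('EITVATSR', charge=2):.3f}")
```

Passing `charge=2` or `charge=3` gives the m/z at which a peptide would be expected for the charge states listed in the table, and `mods=("Oxidation",)` or `mods=("Deamidation",)` shifts the mass for the common modifications. Values may differ from the table in the last decimal place because of rounding in the constants.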
Fig. 2. Views of the SARS-CoV-2 spike protein and its glycosylation. Images courtesy of Oliver C. Grant (unused graphics from Zhao et al. (86)). Protein models courtesy of Professor Bing Chen. A, the interface of SARS-CoV-2 S (white) bound to ACE2 (red) showing glycans involved in glycan–peptide and glycan–glycan interactions. B, the postfusion structure of SARS-CoV-2 S showing its distinctive columnar structure and regular spacing of N-glycans. ACE2, angiotensin-converting enzyme 2; SARS-CoV-2, severe acute respiratory syndrome coronavirus 2.
Fig. 3. The SARS-CoV-2 viral life cycle and selected host proteins involved. The viral life cycle is displayed proceeding from host cell entry through new virion synthesis, packaging, and export. Host cell proteins are labeled in green, and SARS-CoV-2 proteins are labeled in blue. Red arrows (→) indicate protease cleavage. The representation of virus shows ribonucleoproteins (RNPs) (consisting of five dimers of N) in the tetrahedral geometry recently reported (Yao et al. (99)). This article reported an average of 26 ± 15 copies of prefusion S per virion and 26 ± RNPs per virion. The life cycle in a given cell begins with host cell entry mediated by ACE2 (the receptor) and TMPRSS2 (or alternatively the cathepsins CatB/L, CTSB/CTSL, as fusion priming enzymes), and proceeds with trafficking through endosomes. Endosomal maturation required for viral–host–cell membrane fusion involves the proteins PIKfyve and TPC2. After fusion and uncoating of the viral RNA, the replication-transcription complex is expressed, and new viral genomic RNAs (gRNAs, + and − sense) and subgenomic RNAs (sgRNAs, + and − sense) are produced. The translation of viral proteins and modulation of host protein translation are affected by protein–protein interactions (Nsp2-EIF4E2/GIGYF2, Nsp9-eIF4H, and N-LARP1 are shown) and signaling. New virion structural protein N is phosphorylated (CK2, PKC, and CDK), forms RNPs, winds gRNAs, and collects at the ERGIC membrane for envelopment. Viral proteins E, M, and S traffic through the secretory pathway for further processing including addition of glycans. Filopodia formation is enhanced (proposed to be CK2 driven by Bouhaddou et al. (101)) and may improve transmission of egressing virus between cells. ACE2, angiotensin-converting enzyme 2; CTSB, cathepsin B; CTSL, cathepsin L; ERGIC, endoplasmic reticulum–Golgi intermediate compartment; SARS-CoV-2, severe acute respiratory syndrome coronavirus 2; TMPRSS2, transmembrane serine protease 2.
Selected host proteins in infection
| Primary process | Gene/complex/family | Protein name | PPI? | Abundance? | Phosphorylation? | Act.? | Function in infection (known and/or hypothesized) | Cell location | Selected SARS-CoV-2 proteomics references |
|---|---|---|---|---|---|---|---|---|---|
| Host cell entry | ACE2 | Angiotensin-converting enzyme 2 | S | +/− | Virus receptor | PM | ( | ||
| TMPRSS2 | Transmembrane protease serine 2 | S | Cleaves S (“priming”), especially at S2' site | PM | ( | ||||
| CTSB | Cathepsin B | S? | Cleaves S, alternative to TMPRSS2 | EN | ( | ||||
| CTSL | Cathepsin L | S | Cleaves S, alternative to TMPRSS2 | EN | ( | ||||
| Endosomal release | PIKFYVE | 1-phosphatidylinositol 3-phosphate 5-kinase | Endosome maturation, with TPC2 | EN | ( | ||||
| Protein expression | NUP98 | Nuclear pore complex protein Nup98 | Orf6 | + | Prevent host nuclear mRNA export | NM | ( | ||
| LARP1 | La-related protein 1 | N | - | Prioritize virus protein expression | CP, NU | ( | |||
| UPF1 | Regulator of nonsense transcripts 1 | N | Binding by N represses NMD? | CP, NU | ( | ||||
| EIF4H | Eukaryotic translation initiation factor 4H | Nsp9 | Cap-dependent mRNA translation | PN | ( | ||||
| EIF4E2 | Eukaryotic translation initiation factor 4E type 2 | Nsp2 | Represses cap-dependent translation | CP | ( | ||||
| Sec61 complex | SEC61 channel-forming translocon complex | Nsp8 | Protein entry into endoplasmic reticulum | CP, ERM | ( | ||||
| BRD4 | Bromodomain-containing protein 4 | E | + | Interference with antiviral response? | NU | ( | |||
| Protein processing | FURIN | Furin | S | Cleaves S (“priming”), especially at S1/S2 site | Golgi | ||||
| Protein degradation | CUL2 | Cullin 2 | Orf10 | + | Increase degradation of restriction factors? | CP, NU | ( | ||
| Cell signaling | CDK | Cyclin-dependent kinase | − | Cell cycle arrest, S/G2 | NU, MT, CP | ( | |||
| MAPK | Mitogen-activated protein kinase | + | + | Viral replication+, stress response | NU, CP, MT | ( | |||
| AKT | RAC-alpha serine/threonine-protein kinase | + | + | −/+ | Viral replication+, cell proliferation & apoptosis regulation | NU, CP (PM) | ( | ||
| Cell structure | PHB complex | Prohibitin complex | nsp2 | Signaling interference, mitochondrial antiviral signaling, apoptosis− | MT, NU, CP, PM | ( | |||
| CK2 complex | Casein kinase II | N | + | Cytoskeleton changes, filopodia+ | CP, NU | ( | |||
| Stress, immunity | HSPA5 | Endoplasmic reticulum chaperone BiP | Unfolded protein response, virus receptor? | CP, PM | ( | ||||
| NKRF | NF-kappaB–repressing factor | (nsp10) | IL-8 induction | NO, NU, CP | ( | ||||
| CFB | Complement factor B | −/+ | + | Alternative complement pathway factor | Secreted | ( | |||
| CFD | Complement factor D | Activates complement-dependent killing | Secreted | ( | |||||
| CFI | Complement factor I | + | + | Prevented from modulating complement | Secreted | ( | |||
| CFH | Complement factor H | +/− | + | Prevented from modulating complement | Secreted | ( |
Abbreviations: Cell location—CP, cytoplasm; EN, endosome; ER, endoplasmic reticulum; ERM, ER membrane; LY, lysosome; MT, mitochondria; NM, nuclear membrane; NO, nucleolus; NU, nucleus; PM, plasma membrane; PN, perinuclear; other—HS, heparan sulfate; MAVS, mitochondrial antiviral signaling; NMD, nonsense mediated decay.
See supplemental Table S1 for more information on these proteins, complexes, and families. The NKRF-nsp10 PPI may not be direct.