
SARS-CoV-2-Encoded Proteome and Human Genetics: From Interaction-Based to Ribosomal Biology Impact on Disease and Risk Processes.

Olivia Sirpilla, Jacob Bauss, Ruchir Gupta, Adam Underwood, Dinah Qutob, Tom Freeland, Caleb Bupp, Joseph Carcillo, Nicholas Hartog, Surender Rajasekaran, Jeremy W Prokop.

Abstract

SARS-CoV-2 (COVID-19) has infected millions of people worldwide, with lethality in hundreds of thousands. The rapid publication of information, both regarding the clinical course and the viral biology, has yielded incredible knowledge of the virus. In this review, we address the insights gained for the SARS-CoV-2 proteome, which we have integrated into the Viral Integrated Structural Evolution Dynamic Database, a publicly available resource. Integrating evolutionary, structural, and interaction data with human proteins, we present how the SARS-CoV-2 proteome interacts with human disorders and risk factors ranging from cytokine storm, hyperferritinemic septic, coagulopathic, cardiac, immune, and rare disease-based genetics. The most noteworthy human genetic potential of SARS-CoV-2 is that of the nucleocapsid protein, where it is known to contribute to the inhibition of the biological process known as nonsense-mediated decay. This inhibition has the potential to not only regulate about 10% of all biological transcripts through altered ribosomal biology but also associate with viral-induced genetics, where suppressed human variants are activated to drive dominant, negative outcomes within cells. As we understand more of the dynamic and complex biological pathways that the proteome of SARS-CoV-2 utilizes for entry into cells, for replication, and for release from human cells, we can understand more risk factors for severe/lethal outcomes in patients and novel pharmaceutical interventions that may mitigate future pandemics.

Keywords:  COVID-19; SARS-CoV-2; host interactions; nonsense-mediated decay; nucleocapsid; proteomics; risk factors; transcriptomics; viral-induced genetics

Year:  2020        PMID: 32686937      PMCID: PMC7418564          DOI: 10.1021/acs.jproteome.0c00421

Source DB:  PubMed          Journal:  J Proteome Res        ISSN: 1535-3893            Impact factor:   4.466


Introduction

The SARS-CoV-2 (COVID-19) pandemic has impacted every component of life, including research and medicine. In just a few months from the onset of infections to the writing of this review, 10573 papers/objects have been published on SARS-CoV-2 (Figure 1). This body of literature primarily focuses on infectious diseases, the respiratory system, public environmental occupational health, biochemistry molecular biology, virology, immunology, pharmacology, microbiology, and healthcare science services, to name a few fields (Figure 1A). Title extraction of these papers reveals mainly clinically connected terms (Figure 1B). The extensive infectious disease and clinical base of this literature has yielded knowledge of viral entry, replication, immune response, and transmission. Biochemical and molecular biology insights into SARS-CoV-2, which take more time for data generation than clinical descriptions, account for a smaller but growing body of literature (1267 of the 10573 items).
Figure 1

Extraction of Web of Science papers mentioning SARS-CoV-2 on July 11, 2020. (A) Extraction of publication research areas in Web of Science connected to “SARS-CoV-2”. (B) Extraction of titles of articles from panel A run through a word cloud. (C) Extraction of “biochemistry molecular biology” papers in panel A for document types on Web of Science. (D) The number of mentions from abstracts or titles in panel C for proteins/genes of human or SARS-CoV-2.

Of these 1267 biochemistry/molecular biology items, 934 are primary articles (Figure 1C). Title and abstract word extraction from these biochemistry/molecular biology items, followed by counting mentions of all human (20368) or SARS-CoV-2 proteins, shows a heavy focus on the ACE2 and spike (S) proteins (Figure 1D). The virus primarily enters cells through the interaction of the SARS-CoV-2 surface glycoprotein, Spike (S), with human-encoded ACE2, similar to the SARS virus.[1,2] From the title/abstract terms, we identified 51/346 usages of ACE2 and 76/295 of Spike. Other human proteins with repeated mentions include TMPRSS2 (13 titles/49 abstracts), ACE (1/32), FURIN (2/14), DPP4 (2/11), and C3 (2/2). Additional SARS-CoV-2 proteins with mentions include nsp12 (RNA-directed RNA polymerase, 20/71), nucleocapsid (N, 17/71), membrane (M, 5/48), envelope (E, 4/31), nsp5 (3CLPro/Mpro, 7/26), nsp8 (3/19), nsp16 (2′-O-methyltransferase, 3/14), ORF8 (1/10), nsp10 (3/9), nsp14 (guanine-N7 methyltransferase, 1/8), nsp3 (papain-like protease, 16/6), and nsp15 (uridylate-specific endoribonuclease, 16/4). Only nsp6 and nsp11 of SARS-CoV-2 have no mentions within any of these titles or abstracts for biochemistry-linked papers on SARS-CoV-2. Overall, this suggests that only a few papers specifically related to SARS-CoV-2 proteins have been published; however, a large body of literature exists for the original SARS and other coronaviruses that can aid interpretation of the diverse functions performed by the virally encoded proteins and how they interact with human biology.
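The title/abstract counting described above can be illustrated with a minimal Python sketch. This is not the pipeline used to build Figure 1: the record fields (title, abstract), the function name count_mentions, and the three toy records are hypothetical, and a real analysis would start from a Web of Science export and a complete list of gene/protein names and aliases. Terms are matched as whole tokens so that ACE is not counted inside ACE2.

```python
import re
from collections import Counter


def count_mentions(records, terms):
    """Return {term: (n_titles, n_abstracts)} for records mentioning each term."""
    # Whole-token, case-insensitive patterns: "ACE" must not match inside "ACE2".
    patterns = {
        term: re.compile(
            rf"(?<![A-Za-z0-9]){re.escape(term)}(?![A-Za-z0-9])", re.IGNORECASE
        )
        for term in terms
    }
    title_hits, abstract_hits = Counter(), Counter()
    for record in records:
        for term, pattern in patterns.items():
            # Each record is counted at most once per term and per field.
            if pattern.search(record.get("title", "")):
                title_hits[term] += 1
            if pattern.search(record.get("abstract", "")):
                abstract_hits[term] += 1
    return {term: (title_hits[term], abstract_hits[term]) for term in terms}


# Hypothetical toy records, for illustration only (not real publications).
toy_records = [
    {"title": "ACE2 expression in airway cells",
     "abstract": "We profile ACE2 and TMPRSS2 across tissues."},
    {"title": "Spike glycoprotein structure",
     "abstract": "The spike protein binds ACE2; ACE inhibitors are discussed."},
    {"title": "Protease inhibitors for nsp5",
     "abstract": "Nsp5 (Mpro) cleaves the replicase polyprotein."},
]
toy_terms = ["ACE2", "ACE", "Spike", "TMPRSS2", "nsp5"]

mention_counts = count_mentions(toy_records, toy_terms)
for term, (n_titles, n_abstracts) in mention_counts.items():
    print(f"{term}: {n_titles}/{n_abstracts}")  # titles/abstracts, as in the text
```

The output uses the same titles/abstracts notation as the counts reported above. Alias handling (for example, S versus Spike, or 3CLPro versus Mpro versus nsp5) is deliberately left out of this sketch and would need a curated synonym table.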

SARS-CoV-2 Proteome

Knowledge of the SARS-CoV-2 proteome has advanced more slowly than clinical insights because it depends on experimental work that is inherently slow and is being hampered by social isolation. The 29903-nucleotide single-stranded RNA genome of SARS-CoV-2 (NCBI NC_045512.2) has a 265-nucleotide 5′ UTR, multiple protein-coding segments, and a 228-nucleotide 3′ UTR. SARS-CoV-2 has a 79% genomic similarity with SARS-CoV, a known human pathogen, and both enter cells through the binding of human ACE2.[3,4] In addition to SARS-CoV and SARS-CoV-2, five other coronaviruses are capable of human-to-human transmission and infection (HKU1, NL63, OC43, 229E, and MERS-CoV).[5] Hundreds of Coronaviridae family member genomes have been sequenced from human and other vertebrate hosts,[6,7] and many structures have been solved for Coronaviridae species proteins, allowing for systematic assessments of the knowledge base. Our group implemented a sequence-to-structure-to-function analysis[8,9] to understand SARS-CoV-2 proteins, developing a robust understanding of protein conservation, structure, and molecular dynamics.[10] The data generated for each protein were then developed into the Viral Integrated Structural Evolution Dynamic Database (VIStEDD), a publicly released database of multiple tools for the virus. The database can be accessed at https://prokoplab.com/vistedd/. These tools consist of educational resources for the proteins encoded by SARS-CoV-2 (molecular videos, 3D protein model prints, amino acid details of conservation, and dynamics), the mapping of critical sites to each protein, and insights into how SARS-CoV-2 interacts with human proteins. Generating this database has given our team a diverse understanding of SARS-CoV-2, particularly for host protein interactions of each of the viral proteins.

SARS-CoV-2 Human Protein Responses

Multiple studies have begun building systemic insights into SARS-CoV-2 infections. Several groups have performed systematic assessments of ACE2 expression and protein staining, suggesting the physiological cell types that can be targeted by the virus. They have shown expression in many human tissues, with expression within the lung found on the apical surface of polarized bronchial secretory epithelial cells.[11−14] Once the virus enters cells, it alters broad biological pathways, including translation, splicing, protein homeostasis, and nucleic acid metabolism.[15] Epithelial organoid cultures exposed to the virus produce a robust change in RNA expression patterns for cytokine and interferon intracellular immune responses that give rise to tissue signals.[16] Single-cell profiling within the lungs of patients shows that the intracellular cytokine/interferon response results in the recruitment of macrophages in severe cases and T-cells in moderate cases, with a high potential for therapeutic intervention.[17,18] Overactivation of the cytokine/interferon response is connected to poor outcomes in patients, correlating with macrophage activation syndrome.[19] Activation of apoptosis within lymphocytes has also been observed and may contribute to the noted lymphopenia.[20] Proteomics and metabolomics of patient sera show the same macrophage dysfunction, while also elucidating platelet and complement dysregulation with the identification of severity classifiers.[21−23] In totality, the physiological response to the virus is likely mediated by a combination of immune system activation and direct human interaction partners altering cellular processes. An understanding of these detailed biological interactions can shed light on potential therapeutic opportunities while building a fundamental knowledge of viral biology.

SARS-CoV-2 Human Protein Interactions

To date, few studies have systematically mapped how the SARS or other coronavirus proteins physically interact with human proteins. Structural-level insights for coronavirus proteins are surprisingly lacking in human interaction partners.[10] A few of these proteins have been targeted for interaction assessments, such as the nucleocapsid protein[24,25] (discussed below). It has been speculated that the understanding of virus–host interactions represents a major untapped potential for viral inhibitors.[26] A 2018 review highlights the literature of viral–host interactions for coronaviruses, focused on synergizing the knowledge of independent experiments for virus receptors, translation, membrane dynamics, immune regulation, cell cycle control, and replication.[27] The more recent work by Gordon et al.,[28] covering the systematic affinity purification of 26 different SARS-CoV-2 proteins within human cells, has elucidated many mechanisms and drug compounds for the regulation of viral processes. Bringing these data together with our VIStEDD tools, we provide a current snapshot of SARS-CoV-2 viral proteins (Figure 2).
Figure 2

SARS-CoV-2 protein insights from evolution, structural biology, and host protein interactions. Shown for each protein is the conservation mapped onto viral proteins and the STRING network of human interacting proteins, identifying enriched ontologies of the protein–protein interactions to denote human pathways of each viral protein’s function.


Rep (ORF1ab)

ORF1ab is a large protein that is proteolytically cleaved to produce 16 different proteins, many involved in RNA replication.

Nsp1

The NMR structure of nsp1 (2GDT) has been solved,[29] and 250 sequences have been identified by Basic Local Alignment Search Tool (BLAST). Nsp1 interacts with proteins of the alpha DNA polymerase (Figure 2) and is involved in regulating endonucleolytic RNA cleavage of mRNA, allowing the virus to enrich viral RNA within a cell.[30,31] Nsp1 has been shown to interact with ribosomal subunits, resulting in the inhibition of translation, 5′ mRNA capping changes, and mRNA destabilization.[32−34] From a SARS-CoV yeast two-hybrid screen, nsp1 was identified to interact with immunophilins, showing that it alters the intracellular immune response.[35] Expression of nsp1 drives changes in interferon signaling.[36] These processes make nsp1 a potential virulence factor.[36−38] See prokoplab.com/nsp1 for additional information.

Nsp2

Nsp2 has no solved protein structure; ITASSER-generated predictions[39] are mostly (67%) coiled, and 246 sequences have been identified by BLAST. All of the protein interaction partners are acetylated (Figure 2). The protein has been suggested to be dispensable for viral replication but does impact rates of replication.[40] See prokoplab.com/nsp2 for additional information.

Papain-Like Proteinase/Nsp3

The protein has hundreds of solved X-ray crystal structures with a C4 zinc finger, and 3180 sequences have been identified by BLAST. The papain-like proteinase cleaves the first four nsp proteins,[41] and its inhibition can block viral replication.[42] The proteinase can cleave proteins and has been shown to have deubiquitinase activity.[43−45] This deubiquitinase function has been linked to the regulation of the immune system cytokine response,[46,47] specifically the type-I interferon signaling pathway,[48] and has a connection to virulence.[49] See prokoplab.com/papain_like_proteinase for additional information.

Nsp4

Nsp4 has no solved protein structure; ITASSER-generated predictions[39] are mostly (58%) coiled, and 3325 sequences have been identified by BLAST. Nsp4 interacts with several proteins involved in mitochondrial import for inner membrane insertion (Figure 2). Nsp4 and nsp3 interact and form a complex within the membrane and are involved in transcription complex assembly anchoring.[50,51] The complex is involved in double membrane secretory vesicle formation[52,53] in the endoplasmic reticulum,[54] consistent with its human protein interaction partners.[28] See prokoplab.com/nsp4 for additional information.

3C-Like Proteinase/Nsp5

Nsp5 has hundreds of solved X-ray crystal structures, with the protein found in a dimer form with a cysteine protease function,[55,56] and 3397 sequences have been identified by BLAST. The enzyme cleaves most of the proteins of the larger Rep protein with a highly conserved specificity, where inhibition is one of the most studied interventions.[57−59] See prokoplab.com/3c-like_proteinase for additional information.

Nsp6

Nsp6 has no solved protein structure; ITASSER-generated predictions[39] are mostly (63%) coiled, and 2558 sequences have been identified by BLAST. Nsp6 interacts with multiple proteins involved in ATP hydrolysis-coupled cation transmembrane transport (Figure 2). The protein is likely transmembrane-localized, along with nsp3/nsp4,[60] and is involved in autophagosome formation.[61−63] The few papers discussing nsp6 suggest a major future area of understanding and pharmaceutical intervention potential. See prokoplab.com/nsp6 for additional information.

Nsp7

There are several solved structures of nsp7 interacting with nsp12/nsp8 (6NUR, 2AHM, and 3UB0),[64−66] and 3256 sequences have been identified by BLAST. The nsp7 protein interacts with multiple small GTPases of the Ras complex, many of which are prenylation-regulated (Figure 2). The nsp7/nsp8/nsp12 complex is a viral RNA-directed RNA polymerase unit, where nsp12 activity is enhanced through the binding of nsp7/nsp8.[67] See prokoplab.com/nsp7 for additional information.

Nsp8

There are several solved structures of nsp8 interacting with nsp12/nsp7 (6NUR and 3UB0),[64−66] and 3339 sequences have been identified by BLAST. The nsp8 protein interacts with proteins involved in translation, snRNA 3′-end processing, 7S RNA binding, and ribonucleoproteins (Figure 2). In addition to the information provided for nsp7, nsp8 has been suggested to also interact with the ORF6 protein.[68] See prokoplab.com/nsp8 for additional information.

Nsp9

Nsp9 has many known protein structures, with the protein requiring dimerization to function,[69] and 3386 sequences have been identified by BLAST. Nsp9 interacts with multiple proteins that are structural constituents of the nuclear pore (Figure 2). Nsp9 and nsp10 interact with the nuclear factor-κB repressing factor (NKRF) and may cause an interleukin (IL)-8/IL-6-mediated chemotaxis of neutrophils and an overexuberant host inflammatory response.[70] Nsp9, which likely evolved from a protease, is involved in viral RNA synthesis and RNA binding.[71,72] See prokoplab.com/nsp9 for additional information.

Nsp10

Nsp10 has many known protein structures, including those interacting with nsp14 and nsp16, and contains two zinc binding motifs;[73] 3344 sequences have been identified by BLAST. Nsp10 stimulates nsp14 3′–5′ exoribonuclease/mismatch excision[74,75] and nsp16 2′-O-methyltransferase activities.[76,77] The interface of interaction with nsp14 and nsp16 overlaps, suggesting a dynamic regulation process[78] that may involve the linkage of functions through a spherical dodecameric structure.[79] A peptide-based inhibition of the nsp10 interaction has been proposed as a potential viral regulator.[80] See prokoplab.com/nsp10 for additional information.

Nsp11

Nsp11 is a little-known small 1.3 kDa peptide with few interaction partners.[28]

RNA-Directed RNA Polymerase (RdRp)/Nsp12

Nsp12 has multiple known protein structures with a zinc active site and a structure that interacts with nsp7/nsp8,[64−66] and 5086 sequences have been identified by BLAST. Nsp12 is involved in the replication of plus-strand RNA through complement strand synthesis and then viral RNA synthesis.[81] The enzyme is highly targeted for therapeutic inhibition of viruses. It is also known as RdRp and is the target of the drug remdesivir. See prokoplab.com/rna-directed_rna_polymerase for additional information.

Helicase/Nsp13

Nsp13 has multiple known protein structures with a zinc active site, and 5598 sequences have been identified by BLAST. Nsp13 interacts with multiple proteins involved in the centrosome–Golgi apparatus and centrosome (Figure 2) and has an RNA and a DNA duplex-unwinding ability to separate strands with 5′ to 3′ polarity.[82−84] See prokoplab.com/helicase for additional information.

Guanine-N7 Methyltransferase/Nsp14

Nsp14 has multiple known protein structures with a zinc active site, and 2794 sequences have been identified by BLAST. Nsp14 has an S-adenosyl-l-methionine (SAM)-binding pocket and an exoribonuclease function that is involved in RNA capping.[85−87] It interacts with nsp10 and is known to interact with the human DDX1 RNA helicase to enhance viral replication.[88] See prokoplab.com/guanine-n7_methyltransferase for additional information.

Uridylate-Specific Endoribonuclease/Nsp15

Nsp15 has multiple known protein structures, and 2489 sequences have been identified by BLAST. Nsp15, also known as NendoU, is a Mn2+-dependent enzyme that transitions from a monomer to a toric hexamer and is involved in uridylate-specific cleavage,[89,90] which may be regulated by nsp7/nsp8.[91] It also interacts with the retinoblastoma protein to impact the cell cycle.[92] See prokoplab.com/uridylate-specific_endoribonuclease for additional information.

2′-O-Methyltransferase/Nsp16

Nsp16 has multiple known protein structures with Na, Mg, and S-adenosyl-l-methionine (SAM), and 2495 sequences have been identified by BLAST. Nsp16 is an SAM-based enzyme for the methylation of ribose 2′-OH in viral RNA capping[93,94] and interacts with nsp10.[77] The protein is a critical component in the inhibition of the host type-I interferon response[95] and is also known as 2′-O-MTase. See prokoplab.com/2-o-methyltransferase for additional information.

Spike (S) Surface Glycoprotein

The spike surface glycoprotein has multiple known protein structures that are heavily glycosylated and form a trimer complex,[96,97] and a structure of its interaction with the dimer of heterodimers ACE2/SLC6A19 is known (6M17);[3] 6612 sequences have been identified by BLAST. S is a class-I viral fusion protein[98] and drives the specificity of cell targets through the interaction with ACE2 to enter human cells.[99,100] Following binding to the receptor, S undergoes a conformational change to allow viral entry.[101] For the protein to function correctly, it must be proteolytically cleaved, as by trypsin; upon cell binding, proteases such as TMPRSS2 elevate entry, mediating tropism.[102] S is of interest for the development of immunizations and rapid detection of coronaviruses, as it is exposed on the viral surface.[103,104] See prokoplab.com/spike for additional information.

ORF3a

ORF3a has no solved protein structure, with ITASSER-generated predictions,[39] and 65 sequences have been identified by BLAST. ORF3a is a three transmembrane helix protein where the extracellular component localizes the protein to the Golgi apparatus with a caveolin-1 binding potential[105] and is involved in the formation of viral particles.[106,107] ORF3a has been shown to impact the cell cycle.[108] See prokoplab.com/3a for additional information.

Envelope (E)

The envelope protein (E) has no solved protein structure, with ITASSER-generated predictions,[39] and 94 sequences have been identified by BLAST. E is required for viral particle formation,[109] with transmembrane helices forming pentameric α-helical bundles with channel activity[110] that can contribute to membrane permeability.[111] See prokoplab.com/e for additional information.

Membrane (M)

The membrane protein (M) has no solved protein structure, with ITASSER-generated predictions,[39] and 1507 sequences have been identified by BLAST. M has human interaction partners that are involved in the mitochondrial matrix (Figure 2) and is a critical component of viral membranes that is involved in viral budding.[112] See prokoplab.com/m for additional information.

ORF6

ORF6 has no solved protein structure, with ITASSER-generated predictions,[39] and 31 sequences have been identified by BLAST. Two of the interaction partners are involved in the transcription-dependent tethering of RNA polymerase (Figure 2). ORF6 can contribute to the inhibition of beta interferons[113] through the regulation of the signal transducer and activator of transcription 1 (STAT1)[114] and endoplasmic reticulum (ER) stress[115] and can interact with nsp8.[68] See prokoplab.com/orf6 for additional information.

ORF7a

ORF7a has multiple known protein structures, and 42 sequences have been identified by BLAST. ORF7a has protein interaction partners involved in ribosomal large subunit biogenesis (Figure 2) and localizes to the ER and Golgi network.[116] It can regulate the cell cycle in G0/G1 progression.[117] See prokoplab.com/7a for additional information.

ORF8

ORF8 has no solved protein structure, with ITASSER-generated predictions,[39] and 35 sequences have been identified by BLAST. ORF8 has multiple interaction partners involved in the ER lumen (Figure 2) and is a protein shared with SARSr-BatCoV, with a high positive selection.[118] See prokoplab.com/orf8 for additional information.

Nucleocapsid (N)

The nucleocapsid protein (N) has multiple known protein structures, and 2261 sequences have been identified by BLAST. N has protein interaction partners involved in mRNA binding, the ribonucleoprotein complex, and the mRNA surveillance pathway (Figure 2) and is critical for viral replication[119] in multiple processes, including viral RNA stability, replication, and packaging.[120] The protein is modified within the cell, including by phosphorylation and ADP-ribosylation.[121,122] The protein consists of three domains: the N-terminal domain, involved in RNA binding; the internal dynamic multimer structured unit; and the C-terminal domain, an acidic dimerization region.[123−125] The protein can interact with RNA by serving as an RNA chaperone[126] while also interacting with the M protein and human hnRNPA1 through the internal multimerization domain.[127,128] See prokoplab.com/n for additional information.

ORF10

ORF10 has no solved protein structure, with ITASSER-generated predictions,[39] and is unique to SARS-CoV-2. Very little is known of its molecular function or cellular expression. See prokoplab.com/orf10 for additional information.

SARS-CoV-2 Risk Factors and Genetics Based on Human Protein Interactions

SARS-CoV-2 infection exhibits more adverse effects and outcomes in those with comorbidities, including hypertension, diabetes mellitus, and coronary heart disease. Other risk factors for mortality include older age, elevated D-dimer levels, and a higher Sequential Organ Failure Assessment (SOFA) score.[129] The mortality associated with SARS-CoV-2 infection is tied to the patient’s progression to multiorgan dysfunction. The elderly are particularly susceptible to severe SARS-CoV-2 infection, most likely due to the immunosuppression and underlying comorbidities associated with advanced age. Advanced age has been shown to have a depressive effect on both the innate and the adaptive immune system, known as immunosenescence. This is associated with decreased phagocytosis and bactericidal effects of neutrophils[130] and with the downregulation of cytokine signaling[131,132] and innate immune receptors.[133] With SARS-CoV-2 infection, emphasis is placed on the adaptive immune system to aid in clearing virally infected cells. The elderly population has been shown to have a shift toward inhibitory pathways, particularly in CD8+ T cells and to a lesser degree in CD4+ T cells,[134] which may play a role in allowing disseminated viral spread. This reduction of T cell activity is accompanied by the involution of the thymus with age, leading to less naïve T cell output,[135] which further depresses immune function. These cumulative effects on the immune system render the elderly population particularly susceptible to disseminated viral infection at baseline, which may ultimately result in viral sepsis. Given the immunosenescence and increased prevalence of comorbidities associated with older age, it makes sense that this population is being hit the hardest by SARS-CoV-2; however, many younger adults who lack this immunosenescence have also died from the infection, some of whom had no prior medical history. This points to the idea that genetics may play a role in determining the severity of SARS-CoV-2 infection.

Cytokine Storm and Hyperferritinemic Sepsis

The immune response to SARS-CoV-2 infection in severe cases characteristically induces lymphopenia, particularly of CD8+ T cells, and increases IL-2, IL-6, IL-10, and interferon (IFN)-γ levels.[136] This work is backed by multiple proteomic studies identifying biomarkers of severity that connect to the immune system.[21,22] The cytokine storm induced by SARS-CoV-2 is not a new phenomenon and has been demonstrated in the pathogenesis of other novel human coronaviruses, including MERS and SARS-CoV-1.[137] Similar consequences in severe coronavirus infections appear to stem from the cytokine storm of proinflammatory chemokines and cytokines, eventually resulting in acute respiratory distress syndrome (ARDS) and multiorgan dysfunction.[138] A previous study on sepsis and cytokine storm indicates the presence of genetic variants in multiple pathways that have a polygenic contribution.[139] Hyperferritinemic sepsis is often identified in patients with severe SARS-CoV-2 infection. Fever developed at day 1, sepsis developed at day 10, admission to the intensive care unit occurred at day 12 (for ARDS), and death occurred at day 19. Critically ill patients, defined as those with septic shock, multiple organ dysfunction/failure, and/or respiratory failure, accounted for approximately 5% of the study population, yet this group displayed a case fatality rate of 49.0% in early reports from Wuhan, China.[129] Hyperferritinemia on day 4 and day 7 predicts mortality long before the development of sepsis and intensive care unit admission. Hyperferritinemia has been suggested to have genetic associations through pathogenic variants in genes targetable by IL1RAP and anti-C5 antibodies.[140] Type-1 interferonopathies, like heterozygous null variants in IRF7, have been shown to result in severe manifestations of seasonal influenza virus.[141] Similar monogenic variants likely exist that lead to the individual risk of severe disease onset from SARS-CoV-2 in previously healthy patients. Much of the genetics around the immune activation leading to a cytokine storm and hyperferritinemic sepsis remains poorly defined and requires future initiatives and cohorts to define these genetic contributions adequately.

Clinical Coagulopathy of SARS-CoV-2 Infection

Initial SARS-CoV-2 infection is commonly associated with fever, cough, malaise, and fatigue.[142] In more severe cases, disseminated intravascular coagulation has been noted with elevated D-dimer levels in the serum of severe COVID-19 patients, placing them at thromboembolic risk.[143] Recent recommendations have been made to utilize thromboprophylaxis or full-anticoagulation therapy for patients in the thromboembolic risk category.[144] A specific protein–protein interaction was discovered between SARS-CoV-2’s ORF8 and the host tissue plasminogen activator (tPA) protein.[28] tPA, which is encoded by the PLAT gene, plays a crucial role in thrombolysis by catalyzing the conversion of plasminogen to plasmin, the major enzyme involved in the lysis of blood clots. Increased activity of tPA can lead to excessive bleeding, whereas decreased activity is associated with thromboembolus formation,[145] increasing the chances of pulmonary embolism, stroke, and myocardial infarction. The extent to which ORF8 interacts with tPA is not well understood, but its involvement may place a patient at risk for thromboembolism, as has been seen in the clinical setting. In a study by Ladenvall et al., eight single nucleotide polymorphisms and the Alu insertion polymorphism at the PLAT locus were found not to be significant contributors to plasma tPA levels.[146] This finding indicates that inherited variants of the PLAT gene may not be directly involved in the coagulopathy of SARS-CoV-2 patients; however, the polymorphisms may render the host’s tPA protein susceptible to tighter binding by ORF8, yielding greater repression during infection and placing the patient at higher risk for thromboembolism. It has also been shown that sepsis involves upregulation of platelet adhesion molecules and increased circulation of platelet–leukocyte aggregates.[147] This may point toward more of an immune-system-catalyzed coagulopathy, resulting in the presentation of strokes,[148] pulmonary embolisms,[149] myocardial infarctions,[150] and microvascular injury,[151] which impact severe SARS-CoV-2 patients. As coagulopathy has mainly been investigated in both viral and bacterial sepsis, there may be a dual effect of both the immune-mediated response and the protein–protein interaction of ORF8 with the host tPA in cases of SARS-CoV-2 infection. Further investigation is warranted to determine the extent of the interaction of ORF8 with the host tPA and whether it plays into the pathogenesis of SARS-CoV-2-related coagulopathy.

Cardiac Involvement in SARS-CoV-2 Infection

SARS-CoV-2 has been associated with cardiac dysfunction, including myocardial infarction and heart failure. The underlying mechanisms for cardiac injury currently being hypothesized are indirect cardiac injury from the cytokine storm and inflammatory response,[152] severe hypoxia as a result of ARDS,[153] and direct viral invasion of cardiomyocytes.[154] Interestingly, ACE2, the host receptor for SARS-CoV-2, is expressed in the heart,[155] indicating direct viral invasion could be a potential cause of myocardial dysfunction. SARS-CoV-2’s nonstructural protein 9 (nsp9) was found to interact with the E3-ubiquitin ligase mindbomb homologue 1 (MIB1).[28] This ubiquitin ligase is a positive regulator of the Delta-mediated Notch signaling pathway, which is involved in multiple processes during cardiac development.[156] Mutations in the MIB1 have been associated with left ventricular noncompaction (LVNC) characterized by left ventricular trabeculations and reductions in cardiac systolic function. LVNC can range from being asymptomatic to presenting heart failure, depending on the extent the mutation has on the Notch pathway.[157] The prevalence of LVNC in the general population is estimated to be around 1/5000 to 1/30000. Patients with the asymptomatic form of LVNC may be at higher risk for exacerbation of cardiac dysfunction following SARS-CoV-2 due to involvement of this pathway, especially if they are unaware that they have this mutation. This may play a role in the cardiac dysfunction seen in younger SARS-CoV-2 patients who lack underlying comorbidities. 
Aside from cardiac development, the Notch pathway has also been implicated in cardiac repair, which was demonstrated in rat models where the Notch 3 and Notch 4 pathways were upregulated, thereby reducing postmyocardial infarctions in the setting of heart failure.[158] The mechanism behind the repair process is still under investigation; however, the disruption of the Notch signaling pathway by the interaction of nsp9 with MIB1 may prove to play a role in the cardiac involvement of SARS-CoV-2 infection. Furthermore, although vertical transmission of SARS-CoV-2 has not been seen,[159] neonates who have tested positive for the infection may need to have their cardiac function assessed over time due to MIB1’s role in cardiogenesis and repair.

Disease Connection of SARS-CoV-2 Interaction Partners

While we present a detailed assessment of two interaction partners’ connections to pathology and risk factors, many more likely exist. We postulate that if the function of any protein diverges from normal biology to contribute to SARS-CoV-2 biology, it could result in a disease state within the cell similar to that of a loss-of-function or deleterious genetic mutation. Thus, to understand the SARS-CoV-2-connected diseases through the human protein interactions, we assessed ClinVar, a database of clinically identified variants. A query of the 332 SARS-CoV-2 human interaction partners through ClinVar reveals 8311 protein-based variants within the list (Figure 3A). In total, 188 of the queried 332 genes have a ClinVar submission. Of these ClinVar-connected genes, a total of 111 have a clinical annotation of pathogenic (pathogenic or likely pathogenic), with a total of 2386 different variants (Figure 3B). The gene with the most pathogenic variants is FBN1, which is known to interact with SARS-CoV-2 nsp9 and is involved in autosomal dominant familial thoracic aortic aneurysms and aortic dissections and Marfan syndrome. A further analysis of clinical disorders connected with those genes with 10 or more pathogenic-associated variants, excluding FBN1 (PKP2, ACADM, PPT1, WFS1, COL6A1, PCNT, FBN2, BCS1L, NGLY1, CYB5R3, ACAD9, NEU1, GNB1, NARS2, TCF12, NPC2, PIGO, CDK5RAP2, CENPF, GGCX, FKBP10, TBK1, FBLN5, EXOSC3, POR, GPAA1, and RHOA), reveals a high connection to cardiac, neurological, diabetic, and syndromic biology. The SARS-CoV-2 ORF8 has the most interaction partners with pathogenic ClinVar returns among the queried genes, followed by protein M, nsp13, nsp7, and ORF9c (Figure 3C).
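The tallies described above (variants per gene, pathogenic variants per gene, and the number of pathogenic-annotated partners per viral protein) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the variant rows and their counts are invented toy data, the names `variants`, `interactions`, and `tally` are hypothetical, and only the gene-to-viral-protein pairings (FBN1 with nsp9; COL6A1, NGLY1, and NEU1 with ORF8) are taken from the text.

```python
from collections import Counter

# Toy records: (human gene, ClinVar clinical significance).
# These rows are invented for illustration; they are NOT real ClinVar counts.
variants = [
    ("FBN1", "Pathogenic"),
    ("FBN1", "Likely pathogenic"),
    ("FBN1", "Benign"),
    ("COL6A1", "Pathogenic"),
    ("COL6A1", "Uncertain significance"),
    ("NGLY1", "Likely pathogenic"),
    ("NEU1", "Benign"),
]

# Viral protein -> human interaction partners (small subset named in the text).
interactions = {
    "nsp9": {"FBN1"},
    "ORF8": {"COL6A1", "NGLY1", "NEU1"},
}

# "Pathogenic" here includes "likely pathogenic", as in the text.
PATHOGENIC = {"pathogenic", "likely pathogenic"}


def tally(variants, interactions):
    """Return (variants per gene, pathogenic variants per gene,
    number of partners with >=1 pathogenic variant per viral protein)."""
    total = Counter(gene for gene, _ in variants)
    pathogenic = Counter(
        gene for gene, sig in variants if sig.lower() in PATHOGENIC
    )
    genes_with_pathogenic = set(pathogenic)
    per_viral = {
        viral: len(partners & genes_with_pathogenic)
        for viral, partners in interactions.items()
    }
    return total, pathogenic, per_viral


total, pathogenic, per_viral = tally(variants, interactions)
for gene, n in pathogenic.most_common():
    print(f"{gene}: {n} pathogenic of {total[gene]} variants")
print(per_viral)
```

With a real ClinVar export in place of the toy rows, the three return values would correspond to panels A, B, and C of the figure, respectively.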
ORF8 is connected to 18 genes associated with human genetic diseases (COL6A1, NGLY1, NEU1, NPC2, FKBP10, DNMT1, PLOD2, SMOC1, IL17RA, ADAM9, SIL1, LOX, POFUT1, TOR1A, HYOU1, EDEM3, EMC1, and HS6ST2), with significant enrichment of these genes for protein folding (false discovery rate (FDR) = 0.00095) and endoplasmic reticulum lumen (FDR = 2.99 × 10⁻⁵). While only associated with 3 pathogenic interaction partners, the nucleocapsid (N) protein has interesting disease genetics based on previous observations of a process known as viral-induced genetics.[160]
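The text does not state which tool produced the enrichment FDR values for the ORF8 partners. One common way such values are obtained is a hypergeometric over-representation test per annotation term, followed by Benjamini-Hochberg correction across terms; the sketch below assumes that approach. The background size, per-term annotation counts, and overlaps are all hypothetical placeholders (only the list size of 18 comes from the text), so the printed numbers will not reproduce the reported FDRs.

```python
from math import comb


def hypergeom_sf(k, M, n, N):
    """P(X >= k): chance of seeing at least k annotated genes when N genes
    are drawn from a background of M genes, n of which carry the annotation."""
    upper = min(n, N)
    hits = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1))
    return hits / comb(M, N)


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values (FDR), in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical inputs: background size, and per term
# (genes annotated in background, overlap with the 18-gene list).
BACKGROUND = 20000
LIST_SIZE = 18  # number of ORF8-connected disease genes given in the text
terms = {
    "protein folding": (200, 4),
    "endoplasmic reticulum lumen": (300, 6),
    "unrelated term": (1000, 1),
}

pvals = [hypergeom_sf(k, BACKGROUND, n, LIST_SIZE) for n, k in terms.values()]
for name, p, fdr in zip(terms, pvals, benjamini_hochberg(pvals)):
    print(f"{name}: p = {p:.3g}, FDR = {fdr:.3g}")
```

The FDR, rather than the raw p-value, is the relevant quantity because many annotation terms are tested at once against the same gene list.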
Figure 3

SARS-CoV-2 interaction partners and disease connections. (A) Extraction of all ClinVar variants for the 332 interaction partners shown as a percent of variants for different proteins, with the top 8 labeled. (B) Filtering of ClinVar returns in panel A for all variants annotated as pathogenic, including likely pathogenic, with the top 8 labeled. (C) For all genes in panel B, the number connected to each of the SARS-CoV-2 proteins.

Nucleocapsid Protein and NMD-Regulated Genetics

The nucleocapsid (N) protein has the potential to impact and change cellular landscapes through the direct regulation of ribosomal biology.[161] The protein–protein interaction map by Gordon et al.[28] supports the hypothesis that SARS-CoV-2 N proteins interact with multiple mRNA-binding proteins and ribonucleoprotein complex proteins (Figure ). In multiple viruses, proteins have been shown to interact with these complexes to regulate a process known as nonsense-mediated decay (NMD).[162] NMD is a cellular process involved in the removal of mRNA that does not conform to the bulk of cellular mRNA, where proteins accumulate on the transcript and direct the cellular degradation of the mRNA.[163] The process is primarily used within cells to degrade mRNA molecules with nonsense and frameshift genetic variants and those with improper splicing, preventing the cell from producing truncated proteins that can drive dominant-negative or deleterious gain-of-function outcomes.[164] Viral RNA is usually suppressed and degraded within cells through NMD, which acts as a cellular immune system process.[165−167] Thus, an evolutionary arms race has arisen in which a virus can propagate more efficiently if it has a protein that can suppress NMD, keeping its RNA levels elevated.[168−171] Multiple lines of evidence for both SARS-CoV-2 and SARS-CoV suggest that the N protein is used to suppress NMD and evade cellular immune processes. Nearly all genomes of the coronaviruses and the larger Nidovirales order contain the N protein, which has been shown to interact with multiple ribosomal proteins, including crucial NMD factors.[28,162,172−174] Coronavirus RNA is directly inhibited by NMD, and N protein expression blocks this inhibition.[162] Positive-sense single-stranded RNA viruses, including coronaviruses, are likely targets of NMD due to their many overlapping reading frames, retained introns, and long 3′ UTRs present within the cytoplasm of human cells.
The N protein interacts with three proteins annotated to the mRNA surveillance pathway (UPF1, PABPC1, and PABPC4) and several proteins involved in mRNA binding and the ribonucleoprotein complex that are all known to have cellular interactions (Figure ). While the fine details of the N protein's interaction with these factors are poorly understood, the three mRNA surveillance genes are well-connected to NMD biology. PABPC1 is known to be critical for NMD, with its removal suppressing NMD.[175,176] From plants to humans, UPF1 is considered a key regulator of NMD through its recruitment of multiple proteins to RNA.[177−180] UPF1 is known to be activated through phosphorylation at various sites, which allows its protein interactions,[181−183] while the N protein has been shown in multiple viral species to be phosphorylated[184−188] and is likely dynamic in its modifications throughout RNA replication and the viral lifecycle.[122,189] These phosphorylation switches and the interaction of N with NMD proteins are potential sites of pharmaceutical or biological regulation that have been undervalued to this point. Other notable interaction partners of the N protein are Ras GTPase-activating protein-binding protein homologues (G3BP1/2) and casein kinase 2 alpha (CSNK2A2),[28] suggesting the regulation of stress granule formation. NMD is found at the intersection of a variety of cellular pathways beyond mRNA surveillance and viral control.
Notably, it is closely associated with the integrated stress response, requiring translation initiation factor EIF2S1 for function.[190] Cellular stresses such as hypoxia and ER stress lead to the inhibition of NMD via phosphorylation of EIF2S1.[190] This phosphorylation typically induces stress granule formation as well, which has been reported to aid viral replication in some cases and weaken it in others.[191] When G3BP1 is depleted within cells, there is a significant impairment of replication for coronaviruses and respiratory syncytial viruses (RSVs).[191,192] Multiple viruses have been shown to regulate phosphorylation of EIF2S1 at varying time points of infection, with a connection to NMD regulation.[193] Stress granule formation can enhance NMD inhibition, as when hypoxic conditions modulate UPF1 and EIF2S1.[194,195] The interaction of the SARS-CoV-2 N protein with its human interactors promotes the inhibition of NMD and the enhancement of both viral replication and truncated host polypeptides that can enhance viral pathogenicity (Figure 4).
Figure 4

Visual representation of N protein NMD inhibition increasing viral pathogenicity.

The regulation of NMD by viral proteins is crucial for allowing the viral RNA to survive, but NMD processes within cells also regulate multiple endogenous transcripts, some as part of normal biology and some in the context of genetic disease. Many genes, including isoforms with early truncation (frameshifts and nonsense codons) and genes involved in amino acid homeostasis, tumorigenesis, and cell cycle control, are activated when NMD is inhibited within a cell.[196−199] In total, this amounts to about 10% of transcripts within a cell that are regulated by NMD processes and could be altered within the cell by SARS-CoV-2.[196] On top of this, most individuals carry at least one gene in which a nonsense or frameshift variant, either inherited or somatic, is being suppressed. Assessments of human genomes reveal that every person has at least one variant regulated by NMD.[200] Recently, our group has shown a complex involvement of this regulation with rare human variants, driving adverse outcomes through a process we have termed viral-induced genetics (VIG).[160] A patient with an Epstein–Barr virus (EBV) infection had an adverse immune response of classical hyperferritinemic sepsis like that of severe SARS-CoV-2 patients. This individual had whole-exome sequencing as well as multiple blood-based RNaseq experiments performed throughout their clinical course. Sequencing revealed a heterozygous splicing variant in the gene RNASEH2B, which is associated with recessive Aicardi–Goutieres syndrome[201] and has been connected to Type-I IFN-mediated autoimmune disease.[202,203] RNaseq of the patient, when healthy, showed that the splicing variant was present at very low levels, suggesting that the copy was being inhibited through NMD.
Although the patient was healthy for 16 years of life, RNaseq showed that the EBV infection inhibited NMD, allowing expression of the splice variant, which produced a dominant-negative RNASEH2B protein that drove cell dysfunction. This suggests that many human variants within genes connected to the immune system and viral response, which are usually suppressed by NMD and result in no cellular dysfunction, are activated by the virus through the inhibition of NMD and can give rise to severe viral outcomes. The situation resembles a computer virus that targets the computer's antivirus software. If other code on the system carries a risk of system failure that the antivirus normally holds in check, shutting down the antivirus exposes those vulnerabilities, which often contribute to the failure of the computer. The full extent of these variants and the disease process remains to be elucidated but is a promising avenue for exploration of viral-induced outcomes in SARS-CoV-2.

Conclusions

The SARS-CoV-2 pandemic represents a unique challenge to scientists. Unlike previous pandemics, our knowledge of the genome and its coded proteins was gleaned within weeks of the outbreak, with thousands of sequences now available within a short window. This level of detail has allowed a pivot to a more robust understanding of the viral proteome and how it interacts with host proteins. The advancement of protein-based bioinformatics and previous coronavirus research studies have proved useful in defining the function of each protein coded by the virus. Here, we show how many of these viral proteins interact with human proteins connected to biological pathways and diseases, including numerous risk factors from immune to cardiovascular systems. Most notably, we highlight literature on the role of the viral nucleocapsid (N) protein in NMD regulation, where the inhibition of NMD allows for viral RNA stability while simultaneously activating genetics of cellular processes and viral-induced genetics. While we have seen thousands of publications on SARS-CoV-2 and other coronaviruses, the limited detail of the proteome-wide knowledge base for SARS-CoV-2-coded proteins restricts our ability to realize the incredible potential of preventing and mitigating the current pandemic and future pandemics with a larger therapeutic toolset.
  200 in total

1.  Nonstructural proteins 7 and 8 of feline coronavirus form a 2:1 heterotrimer that exhibits primer-independent RNA polymerase activity.

Authors:  Yibei Xiao; Qingjun Ma; Tobias Restle; Weifeng Shang; Dmitri I Svergun; Rajesh Ponnusamy; Georg Sczakiel; Rolf Hilgenfeld
Journal:  J Virol       Date:  2012-02-08       Impact factor: 5.103

2.  Binding of the Methyl Donor S-Adenosyl-l-Methionine to Middle East Respiratory Syndrome Coronavirus 2'-O-Methyltransferase nsp16 Promotes Recruitment of the Allosteric Activator nsp10.

Authors:  Wahiba Aouadi; Alexandre Blanjoie; Jean-Jacques Vasseur; Françoise Debart; Bruno Canard; Etienne Decroly
Journal:  J Virol       Date:  2017-02-14       Impact factor: 5.103

3.  3C-like proteinase from SARS coronavirus catalyzes substrate hydrolysis by a general base mechanism.

Authors:  Changkang Huang; Ping Wei; Keqiang Fan; Ying Liu; Luhua Lai
Journal:  Biochemistry       Date:  2004-04-20       Impact factor: 3.162

4.  A novel RNASEH2B splice site mutation responsible for Aicardi-Goutieres syndrome in the Faroe Islands.

Authors:  Elsebet Ostergaard; Frodi Joensen; Karin Sundberg; Morten Duno; Flemming J Hansen; Mustafa Batbayli; Nicolina Sørensen; Alfred Peter Born
Journal:  Acta Paediatr       Date:  2012-09-05       Impact factor: 2.299

5.  The host nonsense-mediated mRNA decay pathway restricts Mammalian RNA virus replication.

Authors:  Giuseppe Balistreri; Peter Horvath; Christoph Schweingruber; David Zünd; Gerald McInerney; Andres Merits; Oliver Mühlemann; Claus Azzalin; Ari Helenius
Journal:  Cell Host Microbe       Date:  2014-09-10       Impact factor: 21.023

6.  UPF1 is required for nonsense-mediated mRNA decay (NMD) and RNAi in Arabidopsis.

Authors:  Luis Arciga-Reyes; Lucie Wootton; Martin Kieffer; Brendan Davies
Journal:  Plant J       Date:  2006-06-30       Impact factor: 6.417

Review 7.  Strategies for Success. Viral Infections and Membraneless Organelles.

Authors:  Aracelly Gaete-Argel; Chantal L Márquez; Gonzalo P Barriga; Ricardo Soto-Rifo; Fernando Valiente-Echeverría
Journal:  Front Cell Infect Microbiol       Date:  2019-10-11       Impact factor: 5.293

8.  Cryo-EM structure of the 2019-nCoV spike in the prefusion conformation.

Authors:  Daniel Wrapp; Nianshuang Wang; Kizzmekia S Corbett; Jory A Goldsmith; Ching-Lin Hsieh; Olubukola Abiona; Barney S Graham; Jason S McLellan
Journal:  Science       Date:  2020-02-19       Impact factor: 47.728

9.  Potent binding of 2019 novel coronavirus spike protein by a SARS coronavirus-specific human monoclonal antibody.

Authors:  Xiaolong Tian; Cheng Li; Ailing Huang; Shuai Xia; Sicong Lu; Zhengli Shi; Lu Lu; Shibo Jiang; Zhenlin Yang; Yanling Wu; Tianlei Ying
Journal:  Emerg Microbes Infect       Date:  2020-02-17       Impact factor: 7.163

Review 10.  The SARS-CoV nucleocapsid protein: a protein with multifarious activities.

Authors:  Milan Surjit; Sunil K Lal
Journal:  Infect Genet Evol       Date:  2007-07-20       Impact factor: 3.342

