Cheorl-Ho Kim1,2. 1. Molecular and Cellular Glycobiology Unit, Department of Biological Sciences, Sungkyunkwan University, Suwon 16419, Korea. 2. Samsung Advanced Institute for Health Sciences & Technology (SAIHST), Sungkyunkwan University, Seoul 06351, Korea.
Abstract
The recently emerged SARS-CoV-2 is the cause of the global health crisis of the coronavirus disease 2019 (COVID-19) pandemic. No evidence is yet available for CoV infection into hosts upon zoonotic disease outbreak, although the CoV epidemy resembles influenza viruses, which use sialic acid (SA). Currently, information on SARS-CoV-2 and its receptors is limited. O-acetylated SAs interact with the lectin-like spike glycoprotein of SARS CoV-2 for the initial attachment of viruses to enter into the host cells. SARS-CoV-2 hemagglutinin-esterase (HE) acts as the classical glycan-binding lectin and receptor-degrading enzyme. Most β-CoVs recognize 9-O-acetyl-SAs but switched to recognizing the 4-O-acetyl-SA form during evolution of CoVs. Type I HE is specific for the 9-O-Ac-SAs and type II HE is specific for 4-O-Ac-SAs. The SA-binding shift proceeds through quasi-synchronous adaptations of the SA-recognition sites of the lectin and esterase domains. The molecular switching of HE acquisition of 4-O-acetyl binding from 9-O-acetyl SA binding is caused by protein-carbohydrate interaction (PCI) or lectin-carbohydrate interaction (LCI). The HE gene was transmitted to a β-CoV lineage A progenitor by horizontal gene transfer from a 9-O-Ac-SA-specific HEF, as in influenza virus C/D. HE acquisition, and expansion takes place by cross-species transmission over HE evolution. This reflects viral evolutionary adaptation to host SA-containing glycans. Therefore, CoV HE receptor switching precedes virus evolution driven by the SA-glycan diversity of the hosts. The PCI or LCI stereochemistry potentiates the SA-ligand switch by a simple conformational shift of the lectin and esterase domains. Therefore, examination of new emerging viruses can lead to better understanding of virus evolution toward transitional host tropism. A clear example of HE gene transfer is found in the BCoV HE, which prefers 7,9-di-O-Ac-SAs, which is also known to be a target of the bovine torovirus HE. A more exciting case of such a switching event occurs in the murine CoVs, with the example of the β-CoV lineage A type binding with two different subtypes of the typical 9-O-Ac-SA (type I) and the exclusive 4-O-Ac-SA (type II) attachment factors. The protein structure data for type II HE also imply the virus switching to binding 4-O acetyl SA from 9-O acetyl SA. Principles of the protein-glycan interaction and PCI stereochemistry potentiate the SA-ligand switch via simple conformational shifts of the lectin and esterase domains. Thus, our understanding of natural adaptation can be specified to how carbohydrate/glycan-recognizing proteins/molecules contribute to virus evolution toward host tropism. Under the current circumstances where reliable antiviral therapeutics or vaccination tools are lacking, several trials are underway to examine viral agents. As expected, structural and non-structural proteins of SARS-CoV-2 are currently being targeted for viral therapeutic designation and development. However, the modern global society needs SARS-CoV-2 preventive and therapeutic drugs for infected patients. In this review, the structure and sialobiology of SARS-CoV-2 are discussed in order to encourage and activate public research on glycan-specific interaction-based drug creation in the near future.
The recently en class="Gene">merged SARS-CoV-2 is the cause of the global health crisis of thecoronavirus disease 2019 (COVID-19) pandemic. No evidence is yet available for CoV infection into hosts upon zoonotic disease outbreak, although theCoV epidemy resembles influenza viruses, which use sialic acid (SA). Currently, information on SARS-CoV-2 and its receptors is limited. O-acetylated SAs interact with the lectin-like spikeglycoprotein of SARS CoV-2 for the initial attachment of viruses to enter into the host cells. SARS-CoV-2hemagglutinin-esterase (HE) acts as the classical glycan-binding lectin and receptor-degrading enzyme. Most β-CoVs recognize 9-O-acetyl-SAs but switched to recognizing the4-O-acetyl-SA form during evolution of CoVs. Type I HE is specific for the9-O-Ac-SAs and type II HE is specific for 4-O-Ac-SAs. TheSA-binding shift proceeds through quasi-synchronous adaptations of theSA-recognition sites of the lectin and esterase domains. Themolecular switching of HE acquisition of 4-O-acetyl binding from9-O-acetyl SA binding is caused by protein-carbohydrate interaction (PCI) or lectin-carbohydrate interaction (LCI). TheHE gene was transmitted to a β-CoV lineage A progenitor by horizontal gene transfer from a 9-O-Ac-SA-specific HEF, as in influenza virus C/D. HE acquisition, and expansion takes place by cross-species transmission over HE evolution. This reflects viral evolutionary adaptation to host SA-containing glycans. Therefore, CoVHE receptor switching precedes virus evolution driven by theSA-glycan diversity of the hosts. The PCI or LCI stereochemistry potentiates theSA-ligand switch by a simple conformational shift of the lectin and esterase domains. Therefore, examination of new emerging viruses can lead to better understanding of virus evolution toward transitional host tropism. A clear example of HE gene transfer is found in the BCoVHE, which prefers 7,9-di-O-Ac-SAs, which is also known to be a target of thebovine torovirusHE. A more exciting case of such a switching event occurs in themurineCoVs, with the example of the β-CoV lineage A type binding with two different subtypes of the typical 9-O-Ac-SA (type I) and the exclusive 4-O-Ac-SA (type II) attachment factors. Theprotein structure data for type II HE also imply the virus switching to binding 4-O acetyl SA from 9-O acetyl SA. Principles of theprotein-glycan interaction and PCI stereochemistry potentiate theSA-ligand switch via simple conformational shifts of the lectin and esterase domains. Thus, our understanding of natural adaptation can be specified to how carbohydrate/glycan-recognizing proteins/molecules contribute to virus evolution toward host tropism. Under the current circumstances where reliable antiviral therapeutics or vaccination tools are lacking, several trials are underway to examine viral agents. As expected, structural and non-structural proteins of SARS-CoV-2 are currently being targeted for viral therapeutic designation and development. However, themodern global society needs SARS-CoV-2 preventive and therapeutic drugs for infectedpatients. In this review, the structure and sialobiology of SARS-CoV-2 are discussed in order to encourage and activate public research on glycan-specific interaction-based drug creation in the near future.
The recent n class="Species">coronavirus pandemic crisis is due to viral infection of severe acute respiratory syndrome-related coronavirus-2 (SARS-CoV-2), causing uncontrolled inflammatory conditions in thehuman lung. Soon after the first transmission emergence of SARS-CoV from animals to humans in China in 2003 [1], a genetically evolved beta-coronavirus genus similar to human viruses was discovered in Chinese horseshoe bats (Rhinolophus sinicus) [2]. To date, pneumonia is epidemiologically caused by diverse viruses. For example, adenovirus, influenza virus, Middle East respiratory syndrome virus (MERS-V), parainfluenza virus, respiratory syncytial virus (RSV), SARS-CoV and enteric enveloped CoV can cause pneumonia in human hosts.
The World n class="Gene">Health Organization (WHO) reported the official terminology of the2019-Novel Coronavirus (2019-nCoV) on 13 January 2020, and on February 11th, WHO edited the name of the disease caused by 2019-nCoV to Coronavirus Disease-2019 (COVID-19). In academia, the International Committee on Taxonomy of Viruses (ICTV) provided official nomenclature to the virus as SARS-CoV-2 due to the similarity between thenovel coronavirus and SARS-CoV [3]. SARS-CoV-2 is spreading and causing a global health-threatening emergency [4]. Researchers have been in a race to develop anti-viral drugs against SARS-CoV-2 even before the WHO declared a worldwide pandemic threatening human lives. Preventive and therapeutic drugs for patientsinfected with SARS-CoV-2 are yet to be discovered.
Then class="Species">CoVs as enveloped forms can also infect the gastrointestinal track (GIT), although most other enteric viruses are naked in morphology [5,6]. CoVs can also rarely infect neural cells [7]. There is, unfortunately, no solid information on how thecoronaviruses infect humans and animals with reciprocal infectivity and cause a zoonotic viral outbreak. This is in contrast to influenza viruses, which are known to selectively utilize sialic acid (SA) linkages [8]. Currently, only limited information is available on β-CoVs, such as SARS-CoV and its receptor usage and infectible cell types from different species. Host cell surfaceO-acetylated SAs are recognized by the lectin-like spikeproteins of SARS CoV-2 for the first step of attachment to host cells. Infectious virus interaction with the host cell surface is mediated by sialoglycans as themost important phenomenon in eukaryote-parasite co-evolution. O-GlcNAc, a minor glycan source, is mainly found in the nucleus and cytosol (Figure 1). Apart from the general roles of glycans, CoVs recognize host cells and attach to host cell surfacemolecules to enter the host cells. For example, activity of thehemagglutinin-esterase (HE) enzyme relies on the typical carbohydrate-binding lectin and receptor-destroying enzyme (RDE) domains. Most β-CoVs target 9-O-acetylated SAs, but certain species have switched to recognizing 4-O-acetyl SA instead [8,9]. Crystallographic data for themolecular structure of type II HEprovides an explanation for the switching mechanism to acquire 4-O acetyl SA binding. This event follows the orthodox ligand–receptor interaction (LRI), lectin–carbohydrate interaction (LCI), lectin–glycan interaction (LGI), lectin–sphingolipid interaction (LSI), protein–glycan interaction (PGI), protein–carbohydrate interaction (PCI), and also protein–protein interaction (PPI). Recently, 332 protein candidates were suggested to be SARS-CoV-2-humanprotein interacting proteins through PPIs. Among these, 66 humanproteins, as druggable host factors, were further characterized as possible FDA-approvable drugs [10]. If PPI is involved, however, carbohydrates or glycansmay serve as co-receptors or co-determinants. Previous reports suggest thatcarbohydrates act as receptor determinants in most cases. The general principles of PCI stereochemistry potentiate theSA–ligand switch by way of simple conformational shifts for the lectin and esterase domain. This indicates that our examination of natural adaptation should be directed to how carbohydrate-binding proteins measure and observe carbohydrates, leading to virus evolution toward transitional host tropism.
Figure 1
Cellular glycans are recognized by infectious agents including viruses and bacteria through protein–carbohydrate interaction (PCI) or lectin–carbohydrate interaction (LCI). The carbohydrates are uses as cellular adhesion sites in eukaryotic cells. Host cell surfaced and cytosolic glycans include glycoproteins, glycolipids and proteoglycans with minor glycan species of O-GlcNAc present in nucleus and cytosols.
The 9-n class="Chemical">carbon SAs are mainly animal-specific with anionic sugars attached to terminal sugars. SAs exist in two forms, NeuGc and NeuAc. NeuGc is a differentially modified form of the parental SA form, Neu5Ac. SAs are structurally diverse. For example, several modified SA forms are known for their structures including neuraminic acid (NeuC), N-acetyl neuraminic acid (NeuAc), N-glycolyl neuraminic acid (NeuGc), N,O-diacetyl neuraminic acid (occurs in horses), N,O-diacetyl neuraminic acid (occurs in bovines) and N-acetyl O-diacetyl neuraminic acid (occurs in bovines) (Figure 2). Sialyltransferases (STs) biosynthesize different SA linkages. SA linkage diversity occurs at the α2-3, α2-6, α2-8 or α2-9 to theSA or Gal residues (Figure 3). For example, in formation of α2,3 SA or α2,6 SA structures, α2,3-ST and α2,6-ST utilize substrates such as Galβ-1,4-GlcNAc (Figure 4). Themost frequent modification of SAs is O-acetylation at positions of C4, C7, C8 and C9 of SA (Figure 5).
Figure 2
Diverse structures of sialic acids (SA). (A) Neuraminic acid; (NeuC); (B) N-acetyl neuraminic acid (NeuAc); (C) N-glycolyl neuraminic acid (NeuGc); (D) N, O-diacetyl neuraminic acid (occurs in horse); (E) N, O-diacetyl neuraminic acid (occurs in bovine); (F) N-acetyl O-diacetyl neuraminic acid (occurs in bovine).
Figure 3
SA linkages of α2-3, α2-6, α2-8 or α2-9 to the SA or Gal residues.
Figure 4
Formation of α2,3 ST or α2,6 SA structures by α 2,3- and 2,6-sialyltransferase (ST) using substrates such as Galβ-1,4-GlcNAc.
Figure 5
Action sites of viral SA-O-acetylesterases (C4, C7, C8 and C9) specific for 4-O-SA-, 7-O-SA-, 8-O-SA and 9-O-SA and neuraminidases [6].
2. Classification of CoVs
Mammal- and avian-infectible n class="Species">CoVs consist of the broad-ranged subfamily of Coronavirinae. The ICTV recommended classification is as follows: Riboviria (Realm)—Nidovirales (Order)— Cornidovirineae (Suborder)—coronavirida (Family)—orthocoronavirinae—Coronavirus genus. The four CoV genera [11] are the α-CoV, β-CoV, γ-CoV and δ-CoV. The2019-nCoV or SARS-CoV-2 belong to the β-CoV genus and are zoonotic and cause mammalianinfection, causing respiratory disease in thehuman lung. The α-CoV and β-CoV genus target mammal hosts while the δ-CoV and γ-CoV genus target avians and certain mammals. The β-CoV genus has A, B, C and D lineages. Among these, lineage B includes SARS-CoV and SARS-CoV-2. Lineage C includes MERS-CoV. The B lineage SARS-CoV and C lineage MERS-CoV, which are classified as β-CoVs, exhibit lethal rates of 10% and 35% in humans, respectively.
CoVs in n class="Species">humans are associated with respiratory infections such as colds with clinical importance, as experienced for the previous outbreak in 2003 of SARS-humanCoV (HCoV), HCoV-HKU1 and HCoV-NL63 [12,13]. Humaninfectious HCoV includes seven species, including α-CoV (HCoV-NL63 and HCoV-229E) and β-CoV (SARS-CoV, SARS-CoV-2, HCoV-OC43, HCoV-HKU1 and MERS-CoV). CoV RNA sequences mutate at a high frequency. Among the known RNA viruses, CoVs bear the longest genome sizes of 26 to 32 kb length RNA. Nucleotide sequences of CoVssRNA genomes isolated fromCOVID-19patients in Wuhan show a high homology of 89% with the nucleotide sequence of the previously known batSARS-like CoV-ZXC-21 strain and 89% with the previous SARS-CoV. The initial Wuhan CoV isolates belong to the β-CoV genus and were therefore termed SARS-CoV-2 or 2019-nCoV [14]. SARS-CoV-2 infects humanrespiratory tracts and causes outbreaks of pneumonia. SARS-CoV-2 is a novel CoV and originates from the Wuhan district in China. The genome sequence of SARS-CoV-2 exhibits 79% sequence homology with theSARS-CoV RNA sequence and 50% with theMERS-CoV sequence [15].
3. Structure, Components and Life Cycle of CoVs
CoVs are 60–140 nn class="Gene">m in size and are enveloped (+) ssRNA viruses, which feature an RNA genome, directly available to function as mRNA and thus result in rapid infection. CoVs exhibit RNA genomes of 28–32 kb, comprised of two large overlapping open reading frames (ORFs), which encode the virus replicase (transcriptase) and structural proteins. TheSARS-CoV-2 genome is 29,891 bp in size, which encodes 9860 amino acids. ThessRNA are capped and tailed with a 5′-capping structure and 3′-poly A tail at the termini. The genome is thesame sense as virus mRNA indicating that the viral RNA is translated through its own (+) RNA to synthesize RNA dependent RNA polymerase (RdRp; PDB: 6M71).
Generally, vin class="Disease">ral families are determined by the genome structure and virion morphologies of an envelope or naked capsid. A virus with a naked capsid has a coat of nucleocapsid protein (N) coating the viral genome. Viruses with an envelope have lipidenvelopes further surrounding the outmost protein layer. The2019-nCoV (SARS-CoV-2) contains a spike (S) glycoprotein, E, dimeric HE enzyme, a membrane matrix glycoprotein (M), N and RNA [16]. The structural proteins are the S, N, M and E proteins, while the non-structural proteins are proteases such as Nsp3 and Nsp5 and RdRp such as Nsp12. Among theN, M and S glycoproteins, the S glycoprotein is a fusion protein that recognizes the host receptor and enters the host cells [17]. The S, M and E proteins anchored into the endoplasmic reticulum (ER) membrane are trafficked to the endoplasmic reticulum–Golgi intermediate compartment (ERGIC). The RNA genome linked with nucleoprotein buds into the ERGIC to form virus particles. Assembled virions transported to the vesicular surface are released to the extracellular milieu via exocytosis.
The RNA generates the replicase as two polyproteins, pp1a and pp1ab. The replicase-encoded viral proteases generate up to 16 nonstructural proteins (Nsps) in the cytosol to produce replicase enzyme and the replicase–transcriptase complex (RTC). These enzymes including RTC synthesize RNAs for replication and transcription to generate viral RNA genome. CoV genomes bear two or three protease genes and the coding enzymes cleave the replicases. Together with the replicases, nonstructural proteins, termed Nsps, assemble into the RTC complex. Nsp1 to Nsp16 are known to have multiple enzyme regions. For example, Nsp1 degrades cellular mRNAs and, consequently blocks protein translation in host cells and innate immune responses. Nsp2 recognizes the specific protein called prohibitin. Nsp3 is a multi-domain transmembrane (TM) protein with diverse activities. Ubiquitin-like 1 and acidic domains bind to Nprotein and ADP-ribose-1′-phosphatase (ADRP) activity induces cytokine expression. The papain-like protease (PLpro)(PDB:6WX4)/ deubiquitinase domain cleaves virus-produced polyprotein. Nsp4 is a TM scaffold protein for double-membrane vesicle structure. Nsp5 has a main protease domain which also cleaves virus-produced polyprotein and Nsp6 acts as a TM scaffold protein. TheNsp7 and Nsp8 proteins form theNsp7-Nsp8 hexadecameric complex, which functions as an RNA polymerase-specific clamp and a primase enzyme. Nsp9 is an RNA-binding protein that activates ExoN and 2-O-methyltrnasferase (MTase) enzyme activity. Nsp10 binds to Nsp16 and Nsp14 to form a heterodimeric complex. Nsp12 is the RdRp and Nsps13 is the RNA helicase and 5′ triphosphatase. Nsps14 is a N7 MTase and 3′-5′ exoribonuclease. ExoN of Nsap14 acts as an N7 MTase and attaches the 5′ cap to viral RNAs. Viral exoribonuclease enzyme proofreads the viral RNA genome, where Nsp15 is a viral endoribonuclease (PDB: 6VWW). Nsp16 has 2-O-MTase enzyme activity, which shields viral RNA frommelanoma differentiation associated protein-5 recognition.
3.1. Spike (S) Transmembrane Glycoprotein
In RNA viruses, tn class="Gene">he S glycoprotein (PDB: 6VSB) is the biggest protein, heavily glycosylated and its N-terminal domain (NTD) sequence binds to the host receptor to enter the ER of host cells. SARS-CoV-2 S-glycoprotein bears 22 N-glycan sequons in each protomer. Therefore, the trimeric S glycoprotein surface is dominated by 66 N-glycans. The S glycoprotein mediates direct and indirect interaction of virus with host cells in theinfection cycle. All CoVs exhibit a surface S glycoprotein, which bears the receptor-binding domain (RBD). The S glycoprotein has a distinct spike structure. When S glycoprotein binds to its host receptor, a host furin-like protease cleaves the S glycoprotein, which liberates thespike fusion peptides, allowing entry of the virus into the host cell [18]. Thefurin-like protease-generated S1 and S2 exist as a S1/S2 complex, where S1 in a homotrimeric form interacts with the host cell membrane and S2 penetrates the cytosolic area. For SARS-CoV and MERS-CoV, the S1 C-terminal domains (CTDs) have a dual role in virus entry via attachment and fusion. The S1 NTD binds to carbohydrate receptors because the S1 domains act as the RBD. TheCTD of S1 recognizes protein receptors via RBDs.
3.2. Nucleocapsid (N) Protein
In RNA viruses, tn class="Gene">he Nprotein recognizes the viral RNA genome. TheNprotein (PDB: 6M3M) binds to the RNA genome via theNTD and CTD. TheNprotein tethers to the viral RNA and replicase–transcriptase complex (RTC). NSP3 (PDB: 6VXS) of CoV blocks the innate immune responses of hosts. After entrance into the host cells, for CoV transcription and particle release, RNA chaperones such as nonspecific nucleic acid binding proteins potentiate ssRNA conformation shifts. Representatively, theNprotein is known as the RNA chaperone protein. For example, glycogen synthase kinase 3 (GSK3) phosphorylates theSARS-CoVN-protein and thus, GSK3 inhibition contributes to reduced replication activity of SARS-CoV [19]. In addition, heterogeneous nuclear ribonucleoprotein A1 (hnRNP A1) regulates the preformed mRNA splicing in the nucleus and continuous translation. hnRNPA1 interacts with SARS-CoVNprotein to form a replication and transcription complex during ssRNA genome biosynthesis [20].
3.3. Envelope (E) Protein
Eprotein is tn class="Gene">he most abundant structural protein needed to assemble virus particles in the cytosol. As a TMprotein, E protein is the smallest structural protein with a MW of 12 kDa. The E protein has an NTD in the extracellular region and a CTD in the cytosolic region. The E protein bears an ectodomain in theNTD and an endodomain in theCTD. It has also an ion channel domain. E protein is present in the cytosolic region of infected cells and only a limited amount is incorporated into theenvelope of virions [21]. Most E proteins assemble and bud in new virus particles.
3.4. Membrane Glycoprotein (M)
Then class="Gene">M protein contains three TM domains and is an abundant structural protein with a MW of 30 kDa. It consists of a small glycosylated NTD in the extracellular region and CTD in the cytoplasmic region. TheMprotein forms a scaffold for virus assembly in the cytosol via binding to S glycoprotein and Nprotein [22]. For example, E protein and Nprotein are co-expressed with Mprotein to form virus-like particles (VLPs) that are released from the cells, as theM and E protein are involved in CoV assembly. Then, CoVs bud into the ERGIC, trafficking by membrane vesicles and transported via the exocytosis-secretory pathway [23,24]. The dimeric Mprotein binds to the nucleocapsid.-Mprotein binds to S glycoprotein for S glycoprotein retention in the ERGIC/Golgi complex. TheM-Nprotein complex keeps theNprotein–RNA complex stable, for nucleocapsid and the viral assembly.-M and E proteins constitute the virus envelope for successful release of virus-like particles.
3.5. Hemagglutinin-Esterase (HE) Dimeric Protein
HE hemagglutinates and destroys receptors. As RNA viruses, CoVs bear RDEs, which are used in effective attachment to hosts and also reversely in detachment from the hosts. For example, enveloped RNA viruses evade the hosts via their RDEs. Currently, RDE-related functional enzymes such as neuraminidase (NA) and SA-O-acetyl-esterase are known. SA-O-acetyl-esterase was originally identified in influenza C virus and in nidoviruses (CoV and torovirus) as well as in salmon anemia virus (teleost orthomyxovirus). The origin and evolution of CoVSA-O-acetylesterases are correlated to other viruses. The fusion event of S glycoprotein and HE is specific for HCoV attachment to SA-associated receptors in the host [25]. TheHE has acetylesterase activity [26]. In early SA-related biology, influenza A/B viruses were found to recognize chicken erythrocytes in 1942 [27]. They caused hemagglutination through clumping by virus-borne hemagglutinin. These phenomena were widely found in influenza viruses, paramyxoviruses, Newcastle disease (NDV) and mumps virus.
4. Relationship between C4-O- and C9-O-Acetyl SA Preferences of CoVs in Host Cell Recognition
4.1. General O-Acetylation of SA
SAs have various derivatives of n class="Gene">more than 50 chemically different structures formed from the basic N-acetylneuraminic acid (Neu5Ac) on themain ring of pyranose and theglycerol side chain. SA are modified by acetyl-, lactyl-, methyl- and sulfo-groups individually or in multiple combinations [28]. Multiple enzymes are involved in themodifications [29]. Historically, the first discovered SA was crystallized by Gunner Blix via a hot mild acid extraction of bovine submaxillary mucin in 1936. It consisted of two acetyl groups. Among these, only one acetyl group was attached to nitrogen [30]. Blix isolated a 9-O-acetyl SA of the common SAN-acetylneuraminic acid (Neu5Ac), chemically described as 9-O-acetyl-N-acetylneuraminic acid (Neu5,9Ac2). Neu5,9Ac2, Neu5Ac and Neu5Gc are naturally occurring SA species in mammals. A common modification is O-acetylation. In fact, O-acetylation of SAs is common in organisms.
Then class="Chemical">O-acetyl modification occurs in single positions of C-4, C-7, C-8 and C-9 of SA as well as in combined C-positions to yield Neu4,5Ac2, Neu5,9Ac2 and Neu5,7,9Ac3 SAs. Neu5Ac9NAc is a chemical and biologic mimic of Neu5,9Ac2 in theSA-glycans. The C-7 and/or C-9O-acetylations are catalyzed by theSAO-acetyltransferase enzyme, Cas1 domain containing 1 (CasD1) (Figure 6). CasD1 catalyzes the addition of acetyl groups to theSA C-7 at the late Golgi apparatus compartment [31]. Thereafter, an enzyme termed “migrase” transfers the additional acetyl-group from C-7 to C-9, although this enzyme has not been identified [29]. CasD1 uses acetyl-coenzyme A as a donor substrate and CMP-Neu5Ac as an acceptor substrate, but with weak activity on CMP-Neu5Gc.
Figure 6
CasD1, SA O-acetyltransferase, transfers acetyl groups to C7 position of SA (Neu5Ac), from which it migrates to the C9 position (Neu5,9Ac2). The additional acetyl group is added to C7 of SA (Neu5,7,9Ac3) by the same CasD1. The SA O-acetylesterase cleaves of the acetyl groups.
Biologically, then class="Gene">SA O-acetylation event confers merits to hosts such as protection from pathogenic invasion and maintenance of systemic self-homeostasis. TheO-acetylation event of SAsprotects theSA-containing glycans from neuraminidase (NA)/sialidase action, because O-acetyl-groups inhibit microbial NA activity. The chemical structure of theO-acetyl group is quite unstable and susceptible to esterase enzymes. Sialic acid cleavage of the di-acetylated Neu5,7,9Ac3 by bacterial NAs decreases two-fold, when compared to mono-O-acetylated Neu5Ac. TheO-acetylated glycanmodification invites interaction with viruses, antibodies and mammalian lectins [32]. Therefore, theSAO-acetylation modification confers specific functions to organisms.
4.2. Evolutionary Acquisition of C4-O-Acetyl and C9-O-Acetyl SA Recognition by HE Enzymes
4.2.1. C4-O-Acetyl Modification
For example, in tn class="Gene">he horse, C4-O-acetylmodification of Neu5Ac (SA) occupies more than 50% of the total SA content. TheC4-O-acetylated Neu5Ac, Neu4,5Ac2, inhibits theinfluenza A2 virus HA. De-acetylation reagents such as NaOH or NaIO4 treatment completely hemagglutinateNeu4,5Ac2 by elimination of theC4-O-acetyl group [33]. TheC4-O-acetyl Neu5Ac species are found in various sources such as equine erythrocyte GM3, starfish A. rubens and fish [34,35,36,37,38]. C4-O-acetylated Neu5Ac facilitates the initial attachment of viruses to target cells. Like theinfluenza C virus, infectious salmon anemia virus (ISAV), a member of the Orthomyxoviridae family, contains HE and HEF proteins to mediate virus entry and exit. C4-O-Ac Neu5Ac is themajor receptor determinant of ISAV in receptor binding and destruction [38], while theinfluenza C virus recognizes C9-O-Ac Neu5Ac. Theacetylesterase RDE of ISAV cleaves C4-O-Ac via 4-SA-O-acetylesterase with a short turnover time, whereas C9-O-Ac Neu5Ac is cleaved by 9-SA-O-acetylesterase with a long turnover time [34].
The position of n class="Gene">SA O-acetylation is linked to functions including substrate differentiation of enzymes such as NAs and esterase by C4O-Ac. Previous development of O-Ac site-selective NA inhibitors were based on the conceptual consideration of different O-Ac positions. TheO-Ac of SAs is site-specific, as C4 of Neu5Ac is considered to be a potential position for modification. Historically, inhibitors of influenza A and B viruses-sialidases were designed by Von-Itzstein in 1993 [39]. The Ac group-based C4 substitution interacts with amino acid Glu-119 present in the active site of sialidase. Guanidine-attached C4 of C2–C3 unsaturated SA (Neu5Ac2en) inhibits activity of sialidases isolated frominfluenza A virus (Singapore/1/57) and B virus (Victoria/102/85). Thesame scenario was applied for sialidase inhibition of thehuman parainfluenza virus type 3, which has HN and fusion proteins [40]. TheC4 of Neu5Ac2en was substituted by alkyl groups such as the O-ethyl group. For example, Zanamivir has a substitution with a 4-guanidino group with an IC50 of 25 μM. Thus, sialidase inhibition is important for C4modification of Neu5Ac2en. Later, oseltamivir with the tradename Tamiflu (Basel, Switzerland) and zanamivir with the tradename Relenza (London, UK) were established [41]. These drugs exhibit some adverse side effects that restrict clinical use.
4.2.2. SA C9-O-Acetyl Modification
Then class="Gene">SA 9-O-acetylation in hosts allows hosts to evade influenza A virushemagglutinin (HA) recognition and some lectins of factor H (FH), CD22/Siglec-2 and sialoadhesin/Siglec-1. Instead, theinfluenza C virus HA recognizes the hosts. β-elimination and permethylation eliminate the 9-O-acetyl group fromSAs. Chemical modification of theC-9 position of Neu5,9Ac2 generates a 9-N-acetyl analog, 9-acetamido-9-deoxy-N-acetylneuraminic acid (Neu5Ac9NAc), a mimic of Neu5,9Ac2 with influenza C virus-binding capacity, which is not cleaved by theHE [42]. SAO-acetylesterase regulates the presence of 7,9-O-Ac and 9-O-Ac. SAO-acetylation and deacetylation are involved in development, cancer and immunology. SAO-acetylation alters host lectin bindings such as siglecs [29]. The presence of 9-O-Ac can also reduce the activity of NAs [43]. SAmodifications regulate pathogen binding or pathogen NAs. Influenza A/B/C/D viruses use SA as their entry receptors. Influenza A and B subtypes bind to SAs via HA and NA to allow endocytosis of the virus and fusion of the viral envelope with endosomes. In contrast, influenza C and D subtypes bear only one coated glycoprotein, termed theHE fusion protein (HEF). TheHEF acts as the HA and NA. HEF recognizes 9-O-acetyl SA for entry into cells, while the esterase domain removes 9-O-acetyl-groups and liberates the virus frommucus and mis-assembled virus aggregates after budding. The9-O-Ac on cells prevents theNA activity and HA binding of theinfluenza A type virus [44].
5. HE of CoVs
5.1. Evolutionary Origin and Classification of the CoV HE
Certain viruses use glycon class="Chemical">proteins such as HA, HE, S and HEF for host receptor binding or destruction. Coronaviridae, Orthomyxoviridae, Paramyxoviridae and Adenoviridae utilize SAs as binding molecules for attachment and entry. However, only limited human pathogens recognize O-AcSA.
5.1.1. Influenza Virus A and B Spike Proteins of HA and NA
Influenza A and B viruses bear two n class="Gene">spikes of receptor-binding HA and NA [45].
5.1.2. Influenza C virus HA-HEF
HEF is indeed an ancient type of n class="Gene">SA-O-acetylesterase. In contrast to A/B, theinfluenza C virus bears one spike with triple functions of HEF as a homotrimer [46]. Each HEF subunit bears two Neu5,9Ac2-binding sites and binds to the 9-O-acetyl group. In parallel, another modification of O-acetylation is found. Indeed, influenza C virus bears SA-O-acetylesterase [47], which converts 5-N-acetyl-9-O-NeuAc (Neu5,9Ac2) to 5-Neu5Ac. The9-O-acetyl SA is a unique determinant for theinfluenza C virus receptor and Neu5,9Ac2 is crucial for receptor activity, but notNeu5Gc or Neu5Ac [48]. Neu5,9Ac2 is an essential determinant for influenza virus C type-specific host cell tropism. NAs cleave the α-ketosidic linkages to theD-Gal or GalNAc. SA-O-acetylesterases cleave different O-acetyl linkages (Figure 5). The OH-group of Tyr224 and theguanidino group of Arg236 interact with theCH3CO-carbonyl oxygen [49]. HEF SA-O-acetylesterase is found in several enveloped (+) ssRNA viruses of influenza C virus and also in certain CoVs and toroviruses [47]. TheCoVs are different from the orthomyxoviruses, which hold a segmented (−) ssRNA genome and are instead evolutionary linked to the family Coronaviridae, order Nidovirales [45].
5.1.3. CoV SA-O-Acetylesterase HE
CoVs and toroviruses of tn class="Gene">he Coronaviridae family are specific for theO-AcSA receptors. Their S and HEglycoproteins are similar to influenza C virusHEF. CoVs and all toroviruses bear HE gene form class I envelopemembrane proteins of about 400 amino acid residues which bear 7 to 12 N-glycosylation sites [50]. HEmultimer forms enter virions. BovineCoV (BCoV) and HCoV-OC43, similar to influenza C virus, recognize Neu5,9Ac2 and bear SA-9-O-acetylesterase [8]. CoVHEs are all O-acetylesterases. TheHE enzymes found in torovirus, CoV and influenza C virus are evolutionarily interspecies-mutated with about 30% homology by heterologous RNA recombination [51] and horizontal gene transfer. Therefore, viral HEs are diverse and widespread over evolution.
5.2. Substrate Diversity of the CoV HEs
HEs as n class="Gene">envelope proteins are found in CoVs, orthomyxoviruses and toroviruses. CoronaviralHEs are involved in virus attachment to SA species. HEprotein in β-CoVs binds to Neu5,9Ac2 formSA and agglutinates the red blood cells (RBCs) of rodents [52]. As with SA-O-acetylesterase, HE potentiates viral entry with the S protein and spreading via themucosal glycans. It contains a carbohydrate-recognizing domain (CRD) known in lectin. TheHEglycan-binding domain (GBD) mediates virus attachment to SAs on host cells. HE is the only HA. This indicates that compared to the S glycoprotein, HE is only minor a HA and the S glycoprotein mainly attaches to the cell surface. TheHEprotein of murine hepatitis virus (MHV), an enveloped CoV, binds to SA-4-acetylester or SA-9-O-acetylester of the carcinoembryonic antigen cell adhesion molecule 1a (CEACAM; known as CD66a) as the key receptor [53]. MurineCoVsHEs acquired by horizontal gene transfer, bind to C9-O-Ac Neu5Ac. However, some murineCoVHEs cannot bind to C4-O-Ac Neu5Ac. The original mouseMHVHE binds to C9-O-Ac Neu5Ac, while theMHV S-strainHE evolutionarily acquired the ability to bind to C4-O-Ac Neu5Ac [12,53,54]. In terms of structure, the C5 N- and C9 O-AcNeu5Ac-accomodating hydrophobic pocket was shifted to a C5 N- and C4-O-Ac Neu5Ac-accomodating pocket [55].
Type I HE is specific for tn class="Gene">he 9-O-acetylated SAs (9-O-Ac-SAs). Type II HE is specific for 4-O-Ac-SAs. TheSA-binding shift indicates quasi-synchronous adaptations of theSA-recognition sites of the lectin and esterase domains. Type I HEmonomers of β-CoV lineage A have a bimodular enzyme–lectin domain similar to cellular glycan/carbohydrate-modifying proteins. Originally, HE homologs are found in various viruses including toroviruses and orthomyxoviruses such as theinfluenza virus C/D and isavirus, as well as the exceptional case of β-CoV lineage A among CoVs. TheHE gene was transmitted to a β-CoV lineage A progenitor via horizontal gene transfer from a 9-O-Ac-Sia–recognizing HEF, as shown in influenza virus C/D. HE acquisition and expansion occurred by cross-species transmission over HE evolution and this phenomenon reflects viral evolutionary adaptation to host SA-containing glycans. Therefore, CoVHE receptor switching precedes virus evolution driven by SA-containing glycan diversity of hosts. For instance, the BcoVHE prefers 7,9-di-O-Ac-SAs, which is also a target of thebovine torovirusHE. For a more outstanding case, such a switching event occurred in themurineCoVs for the β-CoV lineage A type switch toward O-Ac-SA recognition. In theHE specificity of murineCoVs, two different murineCoV subtypes of virus group exist with one subtype possessing the typical 9-O-Ac-SA (type I) attachment factor and the other exclusively 4-O-Ac-SA (type II) attachment virus group [56].
The first n class="Species">coronaviral HEproteins identified were from theporcine hemagglutinating encephalomyelitis virus (PHEV), BCoV and HCoV-OC43, which bear SA-9-O-acetylesterases similar to HEF [8]. RatCoV (RCoV) has SA-4-O-acetylesterases, converting Neu4,5Ac2 to Neu5Ac [53,57,58]. Some murineCoVs prefer 4-O-Ac-SAs and others 9-O-Ac-SAs. HCoV-OC43 and BCoV prefer α2-6-SA 9-O-acetylation by their SA-O-acetyleseterases. The S glycoproteins of BCoV and HCoV-OC43 are Neu5,9Ac2-recognizing lectins and agglutinate murine, rat and chicken erythrocytes due to the enriched 9-O-Ac-SA species [52]. BCoV and HCoV-OC43 adapted to SA receptor determinants of 9-O-Ac-SA receptors [59]. For a second receptor, the binding of S glycoprotein to Neu5,9Ac2 receptor is essential for entry into cells. BCoV-infection is prevented by prior treatment of cells with NA enzyme or with viral SA-O-acetylesterases, blocking the roles of HE and S glycoprotein in SA-dependent entry to host cells.
6. CoVs Infection of Human Hosts
6.1. CoVs Utilize SAs and SA Linkages as Attachment and Entry Sites to Human Host Cells
Several β-n class="Species">CoV genera such as BCoV bind to O-acetylated SAs and bear an acetylesterase enzyme to act as a host cell RDE. Certain α-CoV and γ-CoV are deficient for the comparable acetylesterase enzyme but have a preference to NeuAc or NeuGc type SA species. Infectious bronchitis virus (IBV) and transmissible gastroenteritis virus are such examples. Additionally, both α-CoV and γ-CoV also include sub-members deficient of any SA-recognizing activity. During evolution, some subtypes of SARS-CoV and HCoV-229E acquired SA-binding capacity. TheSA-binding activities of BCoV, transmissible gastroenteritis coronavirus (TGEV) and IBV are well known [60].
6.1.1. α-Coronavirus
In α-CoVs such as n class="Species">TGEV, HA-activity is attributed to theSA-recognizing activity to α2,3-NeuGc [61,62]. TheSA-binding site is present on theN-terminal region of the S-glycoprotein of TGEV. TGEV has two types with enteric and respiratory tropism. TherespiratoryTGEV has theporcineaminopeptidase N (pAPN)-binding domain and SA-binding domain. Nucleotide 655 of the S gene is essential for enteric tropism and theS219Amutation of the S glycoprotein confers the enteric to respiratory tropism shift. In addition, a 6-nucleotide insertional mutation at nucleotide 1124, which yields theY374-T375insND shift of the S glycoprotein, causes enhanced enteric tract tropism. TGEV interacts with SA species on mucin-like glycoprotein (MGP), a highly glycosylated protein, in an SA-dependent manner, on mucin-secreting goblet cells [6]. MGP SA-binding allows virus entry via themucus layer to the intestinal enterocytes. Different fromTGEV, the S glycoprotein of porcineCoV has no hemagglutination activity due to deletion of theSA-binding site of the S glycoprotein [61]. The loss of SA-binding activity is correlated to the non-enteropathogenicity. SAs function as HA-mediated entry determinants for TGEV, causing the enteropathogenic outcome of the virus, and SA-recognition activity is also responsible for virus amplification in cells. SA-binding activity-deficient TGEV can propagate in cells through pAPN, known as CD13, as a receptor [62,63]. TheSA-binding activity potentiates infection and is crucial for intestinal infection.
6.1.2. β-Coronavirus
In β-CoV, n class="Gene">HE mediates viral attachment to O-Ac-SAs and its function relies on the combined CBD and RDE domains. Most β-CoVs target 9-O-Ac-SAs (type I), but certain strains switched to alternatively targeting 4-O-Ac-SAs (type II). For example, theSA-acetylesterase enzyme in BCoVs and HCoV-OC43 is known to have hemagglutinizing activities as a type of SA-9-O-acetylesterase [8]. TheSA-acetylesterase is theHE surfaceglycoprotein in BCoV. The three-dimensional structure of BCoVHE is similar to other viral esterases [9]. TheHE gene is found only in the β-CoV genus. Theacetylesterase of murineCoVs differs in its substrate binding specificity from that of BCoV and HCoV-OC43, which is specific for O-acetyl residue release fromSA C-9. MurineCoVs prefer to esterize 4-O-acetyl-NeuAc [64]. The β-CoVacetylesterase destroys the receptors and this specificity is similar to that of influenza viruses. Acetylesterase activity can be inhibited by diisopropyl fluorophosphate and this agent decreases viral infection levels [65]. As deduced from theSAacetylesterase of HCoV-OC43 [8], the9-O-Ac-SA species is a receptor binding determinant for erythrocytes and entry into cells [59]. The BCoVHEprotein has dual activity of acetylesterase and HA [9]. BCoV widely agglutinates erythrocytes and purified HE only agglutinatesNeu5,9Ac2-enriched erythrocytes of rats and mice. BCoV and HCoV-OC43 can agglutinate chicken erythrocytes, while purified HE cannot. In contrast to theHEprotein, purified S glycoprotein can agglutinate chicken erythrocytes [52], indicating that themajor HA is the S protein which acts as themajor SA-binding protein. However, the role of O-Ac-SAs is not certain to be essential in receptors, and SA-binding activity may be essential only to theHEprotein, but not to the S glycoprotein [54].
6.1.3. γ-Coronavirus
In γ-CoVs, n class="Species">IBV strains, known as poultry respiratoryinfectious pathogens, can agglutinate erythrocytes. IBV prefers to recognize α2,3-NeuAc and theSA functions as a host entry receptor for infection [66]. Glycosylation of IBVM41 S1 protein RBD is crucial for interaction with chicken trachea tissue and RBD N-glycosylation confers receptor specificity and enables virus replication. Theheavy glycosylated M41 RBD has 10 glycosylation sites. N-glycosylation of IBV determines receptor specificity. However, the host receptor has not yet been found. NA treatment reduces the binding of soluble S to kidney and tracheal epithelial cells. TheIBV S protein recognizes epithelial cells in a SA-dependent manner. TheSA-binding ability of IBV is necessary for infection of tracheal epithelial cells and lung respiratory epithelial cells [67]. TheSA-binding site is located on S1 of theIBV S protein, although theIBV-specific protein receptor is not known. In contrast to BCoV or HCoV-OC43, IBV lacks an RDE. SA binding of IBV is likely more essential than in other viruses such as TGEV.
6.1.4. Torovirus
In torovirus, which belongs to the fan class="Gene">mily Coronaviridae, the toroviruses are grouped into the Torovirinae subfamily and the Torovirus genus. The known toroviruses can infect four species of hosts, constituting bovine, equine, porcine and human toroviruses. They mildly infect swine and cattle through theHEprotein, which is similar to the β-CoVHEprotein [68]. TheHEprotein is a class I membrane glycoprotein which forms homodimers with a MW of 65 kDa. The RDE protein HE reversibly binds to glycans [15] through binding to SAs. Theacetyl-esterase activity disrupts SA binding. HE hemagglutinatesmouse erythrocytes and cleaves theacetyl-ester linkage of glycans and acetylated synthetic substrate p-nitrophenyl acetate (pNPA) [69]. Similar to CoV, torovirus HE is an acetylesterase type, which cleaves theO-acetyl group from theSA C-9 position using Neu5,9Ac2 and N-acetyl-7(8),9-O-NeuAc [64]. However, torovirus HE exhibits a restricted specificity for theNeu5,9Ac2 substrate, but not for theNeu5,7(8),9Ac3 substrate, with a unique SA-binding site generated by a single amino acid difference in porcineThr73 and bovineSer64 for each HE [70].
6.2. SARS-CoV-2 Recognizes 9-O-Acetyl-SAs and MERS-CoV Recognizes α2,3-SAs as Attachment Receptors
The S n class="Chemical">glycoprotein SARS-CoV-2 initiates infection of the host cells. Themolecular basis of CoV attachment to sugar/glycan receptors is an important issue, as demonstrated by recent cryo-EM defining the structure of theCoV-OC43 S glycoprotein trimer complexed with a 9-O-acetylated SA [56]. Cryo-EM structures of the trimeric ectodomain of S glycoprotein were observed using forms complexed with Neu5Ac, Neu5Gc, sialyl–LewisX (SLeX), α2,3-sialyl-N-acetyl-lactosamine (α2,3-SLacNAc) and α2,6-SLacNAc, respectively. The receptor-binding site is commonly conserved in all CoV S glycoproteins, which attach to 9-O-Ac-SA species with similar ligand-binding pockets to theCoVHEs and influenza virus C/D HEF glycoproteins, indicating conserved recognizing structures [25]. The S glycoprotein-9-O-acetyl-SA interaction resembles the ligand-binding pockets of CoVHEs and influenza virus C/D HE fusion glycoproteins. HCoV-OC43 and BCoV recognize 9-O-Ac-SA. S glycoproteins engage 9-O-acetyl-SAs. The9-O-acetyl SAs are the binding site for HCoV-OC43 S glycoprotein and related β-1 CoV S glycoproteins, however SA-binding sites on the 9-O-acetyl sialyl receptors of MERS-CoV S glycoprotein and HCoV-OC43 S glycoprotein are different [71]. Thus, CoVs use two different entry and attachment receptors. Therefore, S glycoproteins of CoVs are distinct frominfluenza virus A HAs, which bind to theNeu5Ac species by conserved binding sites. The ligand-binding sites of BCoVHE enzyme, influenzaHEF enzyme and CoV S glycoprotein have evolved 9-O-Ac-SA binding through hydrogen bonding with the 9-O-acetylcarbonyl group and hydrophobic pocket formation with the 9-O-acetylmethyl group [71,72]. However, influenza HA cannot bind to 9-O-acetyl-SAs but can bind to NeuGcs [73]. TheHCoV-OC43 S glycoprotein, HCoV-HKU1 S glycoprotein, BCoV S glycoprotein and PHEV S glycoprotein, therefore, share the ligand-binding specificity of influenza C/D HEF enzyme, although they are functionally more similar to influenza virus A/B HA, whereas CoVHE or influenza virus A/B NA have RDE activities
CoVn class="Chemical">HEs are functionally similar to influenza virus C/D HEF glycoproteins. In CoV, the S glycoprotein recognizes the9-O-Ac-SA sugar, while theHE acts as the RDE enzyme with SA-O-acetyl-esterase activity to release virions frominfected host cells. For example, HCoV-OC43 also has a similar HE as an RDE [71]. In influenza C and D viruses, HEF glycoproteins act similarly to theCoVHE [74]. In influenza A virus, RDE NA releases virions from host cells. However, MERS-CoV does not have a similar enzyme and thus MER-CoV binding to SA receptors is mediated by energetically reversible interactions of thelipidrafts with increased SA receptors [75], thus enhancing dipeptidyl peptidase 4 (DPP4) or carcinoembryonic antigen-related cell adhesion molecule 5 (CEACAM5) recognition power and viral entry [76] and membrane-associated 78-kDa glucose-regulated protein (GRP78) [77].
MERS-CoV S n class="Chemical">glycoprotein can hemagglutinatehuman erythrocytes and mediates virus entry into humanrespiratory epithelial cells. MERS-CoV S glycoprotein attachment is not observed for 9-O-acetylated or 5-N-glycolyl SAs, but is observed for α2,3-SA linkage over α2,6-SA linkages. SA-binding sites of MERS-CoV S glycoprotein and HCoV-OC43 S glycoprotein are not conserved [78], although they engage α2,3-SAs on the avian host cell surface [79]. MERS-CoV recognizes α2,3-SA and to a lesser extent the α2,6-SAs and sulfated SLeX for binding preference. Thus, S glycoproteins may have independently evolved SA recognition. The acquisition of SA-binding ability of MERS-CoV S seems to be an evolutionarily recent event, because HKU4 S1 and HKU5 S1 cannothemagglutinatehuman erythrocytes [75], indicating flexible evolutionary exchange allowing cross-species transmission towards host cell tropism of CoVs. In conclusion, CoV recognition of 9-O-Ac-SAs for infection is based on a conserved sequence for engagement of SA-related carbohydrate ligands across CoVs and orthomyxoviruses.
6.3. Host Receptors of CoVs
CoV S n class="Gene">spikes recognize diverse surfacemolecules as the attachment or entry site. Animal and human coronaviruses evolve to acquire thesame host receptors and attachment factors and overcome the interspecies barrier from animals to human. Specifically, S glycoprotein interaction with its binding receptor determines host tropism, pathogenicity and therapeutic clues [80]. CoVs recognize multiple host receptors via distinct S domains. The host receptors for β-CoVSARS-CoV includes angiotensin-converting enzyme 2 (ACE2). As a lineage C β-CoV, theMERS-CoV S glycoprotein binds to DPP4 [81,82,83]. MERS-CoV S glycoprotein recognizes α2,3-SA over α2,6-SA-bearing receptors. TheN-terminal subunits of the S1/S1A/S1B/S1D complex of MERS-CoV recognize DPP4. MERS-CoV recognizes CEACAM5 as the attachment factor for entry [78]. Among the six HCoVs, the α-CoVHCoV-229E S protein recognizes humanAPN (hAPN) [84]. α-CoVHCoV-NL63 and the lineage B β-CoVSARS-CoV S glycoproteins bind to ACE2. Meanwhile theprotein receptors specific for lineage A β-CoVs such as HCoV-HKU1 and HCoV-OC43 are not known yet.
BCoV, n class="Species">HCoV-OC43, HCoV-HKU1 and TGEV recognize O-acetyl-SAs as attachment molecules. In addition to O-acetyl-SA, HCoV-HKU1spikes additionally bind to major histocompatibility complex class I (MHC-I) C as attachment sites [85]. SARS-CoV uses dendritic cell (DC)-specific intercellular adhesion molecule (ICAM)-3–grabbing nonintegrin (DC-SIGN) for attachment [86]. For glycan interaction, HCoV-NL63 and mouse hepatitis virus utilize heparan sulfate (HS) proteoglycans as attachment enhancers [87]. In general, ACE2, APN, heat shock protein A5 (HSPA5), furin, heparan sulfateproteoglycans (HSPGs) and O-acetyl-SA are CoVs-recognizing candidates.
6.3.1. Angiotensin-Converting Enzyme 2 (ACE2) as the SARS-CoV Host Receptor
Structure and Role of the Host SARS-CoV Receptor ACE2
SARS-CoV-2 needs n class="Gene">ACE2 for entry. Host proteases such as humanACE2help viral entry through removement of a barrier to enter human cells through unknown receptors. HumanACE2 is known for its role as theSARS-CoV-2 entry receptor and theSARS-CoV receptor. The enzyme ACE-2 in the renin-angiotensin system (RAS) is associated with CoV entry into lungs. ACE2mediates SARS-2002 entry into host cells via S glycoprotein interaction with theACE2 receptor. TheACE2 levels on the plasma membrane correlate with virus infectivity. ACE2 expression is present in most tissues such as the lung epithelium. It is highly expressed by respiratory epithelial cells and type I/II lung alveolar epithelial cells [88]. The host receptor is not linked to the classification of CoVs. MERS-CoV, a β-CoV, does not recognize theACE2 receptor. In contrast, the α-CoVHCoV-NL63 recognizes theACE2 receptor. ACE2 is a membrane-anchored carboxypeptidase with 805 amino acid residues and is captopril-insensitive. It contains 17 amino acid residues as a signal peptide in theN-terminal region, a type I membrane-anchored domain in the C-terminal region, an extracellular N-terminal domain with heavy N-glycans, a N-terminal SARS-CoV-binding and carboxypeptidase site and a short C-terminal cytoplasmic tail. TheACE2 gene is located on chromosome Xp22. Two ACE2 forms are known, a membrane-bound form and a soluble form.
ACE cleaves angiotensin I (Ang I) substn class="Species">rate to Ang II. Ang II recognizes the Ang II receptor type 1 (AT1R), contributing to systemic and local vasoconstriction, fibrosis and salt retention in vascular organs. ACE2 has the opposite function of ACE. ACE2 is a close homolog to humanACE. ACE2 activity on Ang II is about 400-fold higher than that on Ang I. Ang-1 to Ang-7 recognize the G protein-coupled receptor (GPCR) Mas to activate vasorelaxation, cardioprotection, antioxidative action, antiinflammation and anti-Ang II-signaling. Therefore, theACE2-Ang-1 to Ang-7 axis is a target candidate for cardiovascular diseases. ACE2 shows similar binding structures between nCoV and SARS-CoV. The three proteins of ACE, Ang II and AT1R contribute to progression of lung injury in humans. ACE2 removes a single amino acid residue from Ang II to yield the vasodilator, named Ang 1-Ang 7. ACE2 cleaves Ang-I to Ang 1–Ang 9 and Ang II to Ang-1 to Ang-7. The biggest difference between ACE2 and ACE is thatACE2 has a non-inhibitory property by ACE inhibitors.
Pulmonary n class="Gene">ACE2 is potentially a candidate target in CoV-involved inflammatory pathogenesis. If ACE inhibitors and Ang II-AT1 blockers are dosed, ACE2 expression is increased. However, currently we have no conclusive evidence that the inhibitors help SARS-CoV or SARS-CoV-2 entry. Rather, SARS-CoV infection reduces ACE2 expression. Therefore, SARS-CoV-2 host tropism is not related to ACE2 expression. ACE2 levels and ANG II/ANG 1–7 levels regulate the pathogenic progression. ACE2 expression is upregulated by gene polymorphisms and ACE inhibitors or Angiotensin II receptor blockers such as sartans.
Host Cell ADAM17 and TMPRSS2 Competitively Cleave ACE2
A disintegrin and metallopeptidase domain (ADAM) family of Zn-metalloproteinases belongs to membrane proteins. The well-known ADAM17 is a TNF-α-converting enzyme (TACE), called the sheddase for TNF-α. Other ADAM sheddase family members include ADAM9, ADAM10 and ADAM12. ADAM17mediates ACE2 shedding. SARS-CoV S glycoprotein activates cellular TACE and consequently facilitates virus entry. Soluble ACE2 as theN-terminal carboxypeptidase domain form is derived from the original ACE2 form by an ADAM17metalloprotease in themembrane [89]. ADAM17 is indeed an enzyme that can convert membrane type pro-TNF-α to soluble TNF-α, a functional proinflammatory cytokine. Therefore, ADAM17 inhibition indicates an anti-inflammatory response and ADAM17 inhibitors are promising candidates for TNF-α-induced inflammatory diseases. The short C-terminal domain of ACE2 is removed by ADAM17 and TMPRSS2. However, TMPRSS2 cleaves ACE2 competitively with theADAM17metalloprotease. SARS-S protein-ACE2 binding leads to ADAM17/TNF-α-converting enzyme (TACE)-cleavage of ACE2, facilitating extracellular ACE2 shedding and consequent SARS-CoV entry into host cells [90,91]. Only TMPRSS2 cleavage allows SARS-CoV entry into host cells through endocytosis and fusion. Soluble ACE2 also recognizes the virus and prevents SARS-CoV-2 infection. SARS-CoV-2 infection requires membrane ACE2 and TMPRSS2. TheACE2–B0AT1 complex binds to the S glycoprotein of SARS-CoV-2. Intestinal membrane ACE2 and lung TMPRSS2-shedded ACE2 can act as alternative entry sites for SARS-CoV-2. SARS-CoV-2 infects the lungs and intestine via TMPRSS2-cleaved ACE2. If TMPRSS2 is engaged in SARS-CoV-2 entry and ACE2 downregulation, TMPRSS2 inhibition would lead to COVID-19 prevention. Although ACE2 is expressed both in type I and type II lung alveolar epithelial cells, SARS-CoV and SARS-CoV-2 target only type II epithelial cells due to theACE2–TMPRSS2 interaction. Therefore, supplementation of ACE2 (soluble ACE2) or Ang-1 to Ang-7 should be a way to reduce SARS-CoV-2-related symptoms.TMPRSS2-cleaved n class="Gene">ACE2 is involved in SARS-CoV and MERS-CoV infections. SARS-CoV-2 uses ACE2 for cell entry through TMPRSS2 priming of the S glycoprotein (Figure 7). Infection of theH7N9influenza and H1N1influenza A subtype viruses are also mediated by TMPRSS2-cleaved ACE2. This implies thatTMPRSS2 can be targeted as a strategic antiviral therapy [92]. Transmembrane protease serine 2, termed TMPRSS2, a type II TMSerprotease (TTSP), also cleaves ACE2. ThehumanTMPRSS2 gene, located on chromosome 21, comprises androgen receptor elements (AREs) in the upstream 5′-flanking region [93]. TMPRSS2 expression is regulated in an androgen-dependent manner. TheTMPRSS2 gene encodes 492 amino acids. The original form is cleaved into themajor membrane form and theminor soluble form. TMPRSS2 activates protease activated receptor 2 (PAR-2) and activated PAR-2 upregulates matrix metalloproteinase-2 (MMP-2) and MMP-9. TMPRSS2-activated hepatocyte growth factor (HGF) induces c-Met receptor signaling. TMPRSS2 activates SARS-CoV and MERS-CoV. TheSARS-CoV S glycoprotein is cleaved by host-borne TMPRSS2, human airway trypsin-like protease (HAT), TMprotease, serine 13 (MSPL), serine proteaseDESC1 (DESC1), furin, factor Xa and endosomal cathepsin L/B. SARS-CoV can enter cells upon cleavage by protease TMPRSS2 or endosomal cathepsin L/B [90]. Virus S protein precursor is cleaved by host proteases. Thespikes are cleaved by endosomal cathepsin and by Golgi or plasma membrane TMPRSS2 in the step of assembly or attachment and release. Theserine protease inhibitor camostat effectively blocks lethal SARS-CoV infection to mice. However, serine protease and cathepsin inhibitors are not effective. Thus, TMPRSS2 is suggested to be an acting protease for SARS-CoV entry into host cells, but not by cathepsin. Cis-cleavage liberates SARS-CoV S glycoprotein fragments into the extracellular supernatant. Trans-cleavage activates theSARS-CoV S glycoprotein on the target cells, potentiating efficient SARS-CoV S glycoprotein-driven viral fusion. TMPRSS2-activated SARS-CoV facilitates enveloped virus entry into cells. TMPRSS2 is important for SARS-CoV entry and infection [81,94,95,96].
Figure 7
SARS-CoV-2 uses angiotensin-converting enzyme 2 (ACE2) for cell entry through TMPRSS2 priming of S glycoprotein.
The fact tn class="Gene">hat SARS- and MERS-CoV infections are potentiated by TMPRSS2 indicates thatTMPRSS2 is a promising target for therapeutic agents. For example, several Serprotease inhibitors such as camostat mesylate inhibit TMPRSS2–ACE2-involved SARS-CoV-2 entry. camostat, a serine protease inhibitor, reduces influenza virus titers in cell culture. camostat-treated TMPRSS2 inhibition in Calu-3 cells greatly reduces SARS-CoV viral titers and improves survival rate in SARS-CoV infectedmice. A treatment of 10-μM camostat blocks MERS-CoV entry to African green monkey kidney (Vero)-TMPRSS2 cells and blocks viral RNA synthesis in Calu-3 cells upon MERS-CoV infection. Aprotinin is a polypeptide with 58 amino acid residues that was isolated frombovine lungs. Another serine protease inhibitor, nafamostat, inhibits MERS-CoV entry and infection by TMPRSS2 inhibition [93]. nafamostat mesylate blocks theTMPRSS2–ACE2-involved SARS-CoV-2envelope–PM fusion and prevents SARS-CoV-2 entry [95]. nafamostat mesylate inhibits viral entry and thrombosis in COVID-19patients. Similarly, an FDA-approved mucolytic cough suppressant, Bromhexine hydrochloride (BHH), inhibits TMPRSS2 (IC50 0.75 μM) and hence blocks infection of CoV and influenza virus. MPRSS2 as a host factor plays a pivotal role in SARS-CoV and MERS-CoV infections. FDA-approved TMPRSS2 inhibitors are yet under development. Because TMPRSS2mediates efficient viral entry and replication, it should be a promising target for new therapeutics against CoV infection.
6.3.2. Dipeptidyl peptidase-4 (DPP4) as MERS-CoV Receptor
Then class="Chemical">Ser exopeptidase DPP-4/humanCD26 (PDB: 4L72), a type II TM ectopeptidase, functions as a host cell receptor for MERS-CoV. The RBD structure was characterized by crystallography approaches of theMERS-CoV S glycoprotein–DPP4 complex. DPP4 is a single type II TMglycoprotein with a small cytoplasmic tail in theN-terminal region and is present as a homodimeric form. DPP4 cleaves X-prolinedipeptides from theN-terminal region. S glycoprotein recognizes SA species and DPP44 as the attachment and entry receptors, respectively. TheMERS-CoV S1 N-terminal domain attaches to DPP4 as the host receptor [81]. The S2 C-terminal domain of MERS-CoV anchors to cellular PM to enter. MERS-CoV S glycoprotein is cleaved at a sequence between the S1 and S2 domains [96]. Another cleavage site S2′ is present in the S2 domain. MERS CoV S glycoprotein sialyl receptors are expressed in thecamel nasal respiratory epithelial cells and thehuman lung alveolar epithelial cells, which express DPP4. Binding capacities are hindered by theSA 9-O-acetyl group or SA 5-N-glycolyl group [75].
6.3.3. CEACAM Receptor
Entry of host cells needs binding of S glycon class="Chemical">proteins to theCEACAM receptor, forming S-protein-mediated membrane fusion. The trimeric S glycoprotein bears three S1 receptor heads. The three S1 heads of the virus bind to three receptor molecules on the host cell. Cholesterol is indirectly involved in membrane fusion through CEACAM engagement into “lipidraft” microdomains, increasing multiple S protein interaction with the receptors and triggering membrane fusion [97]. Theenveloped CoV, MHV, binds to CEACAMs on cholesterol-depleted cells in BHK cell cultures. TheNTD of S1 recognizes CEACAM1. For MERS-CoV, another CEACAM5 isoform is the attachment factor for virus entry [75]. TheCoV S1 NTD has a similar tertiary structure to humangalactose-recognizing galectins. MHV S1 NTD binds murineCEACAM1a and BCoV S1 NTD binds sugar [98,99,100]. CEACAM1a is a cell adhesion protein (CAM) and its mRNA is alternatively spliced. The cryo-EM structure of MHV S complexed with CEACAM1a was elucidated [101]. Thus, HCoVs evolutionarily combined thegalectin gene of hosts into their S1 glycoprotein gene, while BCoV S1 protein is present without such gene recombination but contains thesugar-recognizing lectin capacity. MHV S1 protein also evolutionarily acquired murineCEACAM1a-recognizing activity [102]. Therefore, CoVs are under evolution to adapt their host receptor interaction to infect cross-species hosts [80,103]. On the host side, to escape the lethal pressure fromCoV infections, hosts have also evolved to acquire SA-binding proteins such as siglecs to inhibit or activate the innate immune cells.
Both raft and non-n class="Disease">raft CEACAMs are involved in the virus–cell membrane fusion event. Formation of CEACAM-associated MHV particles or CEACAM-induced MHV fusion is possible by GPI-anchored CEACAMs through the binding between CEACAM and S proteins. However, MHV can bind to both GPI- and TM-anchored CEACAMs. In addition, soluble CEACAMs also mediate S glycoprotein-driven fusion [104]. This implies thatmembrane anchors are not intrinsically necessary. In fact, CEACAMs are present in different tissue-specific isoforms [105]. Nevertheless, GPI-anchored CEACAMs are more effective for MHVinfection than TM-anchored CEACAMs. Soluble CEACAM receptors can bind to viral S glycoproteins and induce conformational shifts to acceptable S glycoprotein-involved membrane fusions [106]. For example, soluble CEACAM forms interacts with S1 fragments [107] and alters the S1–S2 association stability [108] and S1 oxidation confirmation [109]. S proteins are structurally shifted prior to membrane fusion. For the cross-linking of viruses and cells, integral hydrophobic peptides of the S2 chain are embedded into membranes via membrane hydrophobic cholesterols.
6.3.4. Membrane-Associated 78-kDa Glucose-Regulated Protein (GRP78) or HSPA5
MERS-CoV S n class="Chemical">glycoprotein also recognizes a 78-kDa glucose–regulated protein (GRP78) or heat shock 70 kDa protein 5 (HSPA5), known as binding immunoglobulin protein (BiP) or Byun1, which is encoded by theHSPA5 gene in humans. HSP5A is a ER-resident unfolded protein response (UPR) protein. Stressed cell status such as viral infection increase expression and translocation of HSPA5 to the PM to form a membrane protein complex. GRP78modulates MERS-CoV entry in the presence of theDPP4 as a host cell receptor. Additionally, lineage D β-CoV and bat CoV HKU9 (bCoV-HKU9) also bind to GRP78 [76]. A cell surface receptor, GRP78, was predicted to be another COVID-19 receptor as an S glycoprotein binding site [110]. The prediction was made using the combined technology of molecular modeling docking with structural bioinformatics. GRP78 or BiP is a chaperone protein located in the ER lumen [111]. Known ER-bound enzymes include activating transcription factor 6 (ATF6), inositol-requiring enzyme 1 (IRE1) and protein kinase RNA (PKR)-like ER kinase (PERK) [112]. Depending on threshold of unfolded protein accumulation, GRP78 releases IRE1, ATF6 and PERK, and is activated, resulting in translation inhibition and refolding. Stress-overexpressed GRP78 can avoid ER retention and is translocated to themembrane. GRP78 translocated to the cell PM can recognize viruses by its substrate-binding domain (SBD) for virus entry into the cell (Figure 8). In sequence and structural alignments and protein–protein docking, RBD of theCoVspikeprotein recognizes theGRP78 SBDβ as the host cell receptor. The predicted region III (C391–C525) and region IV (C480–C488) of the S glycoprotein and GRP78 are highly potential binding sites. Region IV is theGRP78 binding-driving force. These nine amino acid residues are being molecularly targeted for the designation and simulation of COVID-19-specific drugs. This process is themechanism underlying the cell surfaceHSPA5 (GRP78) exposure and this is exploited to be used for pathogen entry. Such pathogenic entry into host cells has been observed in multiple infections including pathogenic human viruses such as human papillomavirus, Ebola virus, Zika virus and HcoVs—as well as fungalRhizopus oryzae [113,114,115,116]. Therefore, natural products can inhibit cell-surfaceHSPA5 recognition of the viral S glycoprotein.
Figure 8
Endoplasmic-reticulum (ER) stress responses and COVID-19 receptor (S glycoprotein binding site). Balance of ER stress and unfolded protein response (UPR) are regulated. ER-stress sensor proteins are IRE1 (inositol requiring 1), ATF6 (activating TF 6) and PERK (PKR-like ER kinase).
6.3.5. Aminopeptidase N (APN) is a Receptor of α-CoV HCoV-229E
Among tn class="Gene">he six HCoVs, the α-CoVHCoV-229E S protein recognizes hAPN known as CD13 or membrane alanyl aminopeptidase (EC 3.4.11.2). Porcine epidemic diarrhea coronavirus virus (PEDV) binds to protein receptor APN of human- and pigNeuAc species as its co-receptor. Apart fromhAPN, TGEV and PEDV bind to SA species [117], although SA recognition by TGEV is not essential in the first step of entry cycle. HCoV-229E recognizes hAPN known as CD13 for its entry receptor. hAPN (PDB: 4FYQ) or CD13 (EC 3.4.11.2), which is a Zn-dependent metalloprotease, has a MW 150 kDa with 967 amino acids. CD13 is a type II TMprotein with a short cytoplasmic domain in theN-terminal region and long extracellular region in theCTD. TheCTD has a pentapeptide sequence specific for the Zinc–MMPs. TheAPN binding domain is located on theCTD of PEDV S1 (amino acid 477–629 residues), while theSA-binding domain is found in theN-terminal region of PEDV S1 (amino acid 1–320 residues) [118]. CD13 is also a receptor for HCoV-229E, human cytomegalovirus, porcineCoVTGEV, feline infectious peritonitis virus (FIPV), feline enteric virus (FeCV) and canine-infectiousCoVs [119,120,121,122]. Homodimeric CD13 digests luminal peptides. ThehAPN-encoding ANPEP gene is a dominant component in proximal tubular epithelial cells, small intestinal cells, macrophages, granulocytes and synaptic membranes. If this gene is defective, leukemia or lymphoma are transformed [123]. Porcine and humanAPN exhibit about 80% protein identity. FIPV and FeCV are in thesame group as HCoV-229E and TGEV. Thus, porcineAPN is also an attachment site for pigTGEV with an additional second receptor. HCoV-229E first binds to CD13 and consequently clusters CD13 in caveolae-associated lipidrafts [120].
6.3.6. Heparan Sulfate (HS) is the HCoV-NL63 Attachment Site
For glycan inten class="Disease">raction, HCoV-NL63 and MHV utilize heparan sulfateproteoglycans (HSPGs) as attachment enhancers [87,124]. Viruses recognize HSPGs as attachment molecules. In thespike (S) protein-deficient virions, theMprotein recognizes HSPG. The S proteins generally bind to the viral cellular receptor. However, theMprotein also acts as a receptor in the early step of HCoV-NL63infection. TheMmembrane protein of HCoV-NL63 recognizes the attachment site of HSPGs. HCoV-NL63Mprotein binds to HSPG for the initial attachment of virus to host cells and thereafter, theM and S proteins cooperate for virus entrance into the host cells [125]. HSPGs are glycosaminoglycan (GAG)-carrying proteins frequently used as a secondary receptor for viral entry. HSPGs are composed of covalent-bonded HS chains as a GAG form. TheHSGAG linkage structure of tetrasaccharide exhibits GluAβ1,3GlcNAcα1,4Galβ1,3Galβ1,4Xylβ-O-serine. Glycosyltransferases involved in HSGAG synthesis include GlcAT-II (glucuronosyltransferase) and GlcNAcT-II (N-acetylglucosaminyltransferase II) for heparan sulfate synthesis (Figure 9). GAG is used as docking sites for virus interaction with the host cell surface. GAGs contain negatively charged N- and O-sulfated sugars [126]. The biosynthetic pathway and biologic roles in early embryogenic morphogenesis and vulval morphogenesis of HS and chondroitin sulfateGAG have been elucidated in Caenorhabditis elegans [127]. The negative charges mediate the interaction of GAGs and their ligands through electrostatic forces. Interaction of HSPG with ligands potentiates many virus infectious cycles. For examples, adeno-associated virus, human T cell lymphotropic virus type 1, human papilloma virus 16, herpes viruses, hepatitis B and C viruses, Kaposi’s sarcoma-associated herpesvirus, human papilloma viruses and Merkel cell polyoma virus recognize theHSPGs [128,129]. HSPGs increase virulence upon interaction with viral factors required for viral attachment and replication.
Figure 9
Structure of glycosaminoglycan (GAG)-linkage tetrasaccharide, GluAβ1,3Galβ1,3Galβ1,4Xylβ-O-serine and HS; (A) HS structure; (B) Sulfation of sugar residues; (C) Synthesis and localization of HS in host cells. Glycosyltransferases involved in GAG synthesis include (i) GlcAT-II (glucuronosyltransferase) and (ii) GlcNAcT-II (N-acetylglucosaminyltransferase II) for heparan sulfate synthesis. β-D-glucuronic acid (GlcA), α-L-iduronic acid (IdoA) and 2-O-sulfo-α-L-iduronic acid (IdoA(2S) are composed [124].
6.3.7. Major Histocompatibility Complex Class I (MHC-I) C is an Attachment Site for HCoV-HKU1
Although HCoV-HKU1 utilizes n class="Chemical">O-acetyl-SAs as attachment sites, theHCoV-HKU1 S protein also interacts with MHC-I C (HLA-C) as an additional attachment molecule [85].
6.3.8. DC-SIGN (CD209) is a Binding Candidate for SARS-CoV Entry
SARS-CoV uses tn class="Gene">he C-type lectins of DC-SIGN and DC-L-SIGN as additional or secondary receptors. Glycans on the S glycoprotein are recognized by DC/L-SIGN for virus attachment and entry. Seven glycosylation sites of the S glycoprotein have been found to be essential for DC/L-SIGN-driven virus entry [86,130].
6.3.9. Tetraspanin CD9 is a Surface factor for MERS-CoV Entry Via Scaffold Cell Receptors and Proteases
Tetraspanin n class="Gene">CD9, but not tetraspanin CD81, associates with DPP4 and the type II TMserine protease (TTSP) member TMPRSS2, a CoV-activating protease, to form a cell surface complex [131]. This CD9–DPP4–TMPRSS2 complex permits MERS-CoV pseudovirus entrance into the host cells. The tetraspanins have four TM spanning regions linked by one large and one small loop in the extracellular region. Tetraspanins form virus entry baselines and open CoV entry routes. To help viral entry into host cells, MERS-CoV S interacts with DPP4 receptors via the RBD. Receptor involvement causes cleavage using proteases such as the previously described TMPRSS2. Association of tetraspanin CD9 with theDPP4–TMPRSS2 complex triggers the S glycoprotein. MERS-CoVs enter the cells via endocytosis and cathepsins cleave the S proteins [132].
6.4. Effects of Receptor and Ligand S Glycosylation on Virus–Host Interaction
SAs are predon class="Gene">minant surface determinants for pathogen attachment, adherence and entry to host cells. Eleven representative vertebrate virus families utilize SAs as initial entry receptors or as attachment factors. Interaction of virus with SA-containing glycans is complex because virus SA-binding lectins are inherently of very low affinity. Viruses acquire enzymes to catalyze virion elution by regional depletion of binding receptors [56]. TM S glycoprotein recognizes oligosaccharide receptors. Using cryo-EM technology and observed structures of S glycoprotein trimers of CoVOC43 complexed with 9-O-acetylated SA, S glycoprotein was demonstrated to mediate virus adhesion and entry to host cells. All CoV S proteins show conservation in binding to 9-O-acetyl-SAs. MERS-CoV also recognizes 9-carbon sugar SA species. MERS-CoV S-1A binds to SA species. For example, SAα2,3- over SAα2,6-linkages expressed in human erythrocytes and mucins are preferentially targeted by MERS-CoV S-1A. Binding is hence blocked by SAmodification to 5-N-NeuGc and 7, 9-O-NeuAc species [73]. For example, impairment of ACE2 receptor glycosylation does not influence S-glycoprotein-ACE2 interaction, however, SARS-CoV-2 virus entry into respiratory epithelial host cells was downregulated [133]. Changes in ACE2N-glycans do not apparently influence interaction with theSARS-CoV S glycoprotein, but instead, impair viral S glycoprotein-mediated membrane fusion. The receptor glycan structures decide the entry of some human viruses. Changes in ACE2 receptor sialylation influences interaction affinity between virus ligands and host receptor. Inter-species or individual genetic variations such as drift and mutation may occur in SARS-CoVs. This explains currently emerging differences in CoV responses within thesame population such as humans.
On the otn class="Gene">her hand, from the aspect of virus ligand, the S glycoprotein decorates viral surfaces and is, therefore, the target for vaccination design. Virus internalization requires potential glycosylation of viral glycoproteins. Among the three viral envelope components, S and M are themajor glycoproteins and E is nascent and notglycosylated. TheMglycoprotein consists of a short glycosylated ectodomain in theN-terminal region. The S glycoprotein expressed in hemagglutinating encephalomyelitis virus is an HA that recognizes N-acetyl-9-O-NeuAc as a binding receptor expressed on erythrocyte surfaces [134]. For example, BCoVs attach to the surface receptor of N-acetyl-9-O-NeuAc (9-O-acetylated SAs) on host cells. TGEV and PEDV are currently known as a similar class of such CoVs. PEDV infects multiple hosts including bat, pig, human and monkey, where bats are considered to be the evolutionary origin for PEDV. The S glycoprotein of SARS-CoV-2 utilizes different glycosylation patterns to recognize its receptors. Theglycosylation sites in minimal RBD exhibits similar sites to other CoVs. The trimeric SARS-CoV-2 S glycoprotein is also highly glycosylated with 66 N-glycans, but a few O-glycans [135]. Glycosylation of S glycoproteins leads to immune evasion. In theMERS-CoV and thebat-specific CoV-HKU4, glycosylation is linked to zoonotic infection for fusion-based entry [136].
7. Pharmacology of Glycan-Related Anti-SARS-CoV-2 Agents
The en class="Gene">merging CoV-pandemic requires therapeutic agents to block the recognition, binding, replication, amplification and propagation of theCoV in humans. Protease inhibitors, RNA synthase inhibitors and S2 inhibitors are potential targets, and several agents are currently being evaluated. Efficient therapeutic drugs are themost reliable option for patients. The first attachment step of the viral amplification cycle is initiated on therespiratory cell surfaces, driven by the viral S protein. This is a potential therapeutic target. Soon after theSARS-CoV-2 outbreak initiated, theCoV S glycoprotein was demonstrated to recognize ACE2 as a binding receptor on human cells. HumanTMPRSS2 enzyme influences theCoV S glycoprotein activation, to facilitate virus infection. ACE2 binding and TMPRSS2 activation facilitate theCoVs to attach to human host cells. Mouse, nonhuman primate and human cells have been analyzed using single-cell RNA new generation sequencing (NGS). For example, for humaninfection, CoVs can enter nasal goblet secretory cells, because these cells express theproteins required for SARS-CoV-2 infection. In the lungs, theproteins are stored in the alveoli like air sacs of type II pneumocytes. In the intestine, the two proteins are expressed in entero-epithelial cells. ACE2 gene expression correlates with the IFN-related genes [137]. ACE2helps lung cells to tolerate cellular damage. Therefore, CoVsmay evolutionally take advantage of the defense mechanisms of host cells, hijacking such host-borne proteins.
In SARS-CoV-2, tn class="Gene">he ACE2 receptor is an attachment, entry and infection receptor into the cell, when the S glycoprotein is cleaved by a specific serine protease. SARS-CoV-2 infection is regulated by glycosylated SARS-CoV-2 viral particles and glycosylated ACE2 in the lung epithelial cells. RBD of theCoV S glycoprotein recognizes ACE2 [82]. Amino acid residues 442, 472, 479, 480 and 487 located on the receptor-binding motif (RBM) of the S glycoprotein RBD recognize humanACE2 [138]. Trimeric viral S glycoprotein is glycosylated and cleaved by a protease, furin, into two subunits, S1 and S2. Subunit S1 is further cleaved into theSA and SB domains and the SB domain recognizes humanACE2. TheN-glycosylated S2 subunit is involved in virus-ACE2 complex formation [139]. Therefore, theglycosylated ACE2 receptor is a key molecule for virus binding and fusion.
Plasma n class="Chemical">sera prepared frominfectedpatients is an alternative medication. The WHO has suggested this trial since the 2014 Ebola epidemic and 2015 MERS-CoV outbreak. In addition, Mab therapy is another option. For example, LCA50 Mab mimics produced by modification of plasma antibodies isolated fromMERS-CoVpatients was valuable [140]. Low molecular molecules are being examined for anti-virus activities fromalkaloids, glycan derivatives and terpenoids. Recently, anti-CoV drugs are being approached using molecular modeling, docking and simulation methods. Computation-assisted drugs via molecular modeling and docking toward drug targets are applied as anti-viral compounds against CoVs. They target humanACE2, PLpro (PDB: 3e9s), theCoVmain proteinase (PDB: 6Y84), 3-chymotrypsin-like (3C-like protease; 3CLpro), RdRp, helicase, N7 methyltransferase, human DDP4, RBD, protease cathepsin L, type II TMSerprotease or TMPRSS2. CoV 3CLpro (PDB: 6WX4) and the PLpro cleave the polyproteins to assemble virus proteins. For newborn RNA genomes, RdRp is used as a replicase for the complementary RNA strand synthesis, which uses the virus RNA template.
7.1. N-Glycosylation Inhibition by Chloroquine (CLQ) and Hydroxychloroquine (CLQ-OH)
CLQ and n class="Chemical">CLQ-OH are under investigation worldwide to treat COVID-19 (Figure 10). CLQ and its derivative CLQ-OH block CoV replication, amplification and spread in in vitro culture via inhibition of ACE2 receptor glycosylation. In HCoVs, interaction of the S glycoprotein with gangliosides initially occur as the first entry step during the replication cycle of the virus. CLQ and CLQ-OH have been alternative drugs for RA and several autoimmune diseases for 70 years, although they are anti-malariaprophylaxis drugs. CLQ-OH is an aminoquinoline with less toxicity than CLQ. CLQ-OH bears an N-hydroxyethyl side chain, which increases its solubility compared to CLQ [141]. CLQ-OHmodulates activated immune cells via downregulation of TLR signaling and IL-6 production [142]. Clinical trials are also under consideration for the efficacy and safety of these drugs. Regarding the action mechanism(s), CLQ and CLQ-OH-mediated inhibition of ACE2 terminal glycosylation was considered. In in vitro Vero E6 cells, CLQ significantly inhibits SARS-CoV spread by interfering with ACE2 function, acting at the entry and post-entry steps of SARS-CoV-2 replication and infection. The binding affinity of ACE2 to S glycoprotein is simulated to be lowered by treatment with CLQ-OH or CLQ. CLQmay modify the binding affinity between ACE2 and S glycoprotein by alterations in ACE2glycosylation or modification. CLQ-OH (EC50 0.72 μM) and CLQ (EC50, 5.47 μM) inhibit SARS-CoV-2 [143].
Figure 10
Structures of (A) CLQ and (B) CLQ-related CLQ-OH as predicted UDP-GlcNAc 2-epimerase inhibitors.
Using computer sin class="Gene">mulation techniques, CLQ and CLQ-OH have been suggested to recognize the enzymatic active site of theUDP-GlcNAc 2-epimerase, known as an essential enzyme in SA biosynthesis [144], blocking the sialylation of host cells. Themechanism underlying theglycosylation inhibition may support the antiviral properties of CLQ and CLQ-OH through interactions of CLQ or CLQ-OH with NDP-saccharide mutases or glycosyltransferases [145]. CLQ was reported to inhibit quinone reductase 2 [146], known as a catalytic mimetic or structural neighbor of UDP-GlcNAc 2-epimerases [147,148]. If CLQ or CLQ-OH inhibits SA synthesis, the inhibitory properties may support the antiviral activity of CLQ or CLQ-OH against SARS-CoVs because theSARS-CoV receptor ACE2 contains SA species. In fact, CLQ exhibits in vitro anti-SARS-CoV-1 activity via defective glycosylation of viral ACE2 in Vero cells [149]. In addition. the interference of CLQ or CLQ-OH with SA synthesis may broadly be applicable as an antiviral because theHcoVs or other orthomyxoviruses also utilize SAs as entry molecules [150]. However, the detailed mechanisms should be further elucidated. TheCLQ treatment efficacy in Covid-19patients has, however, not been conclusively determined.
7.2. Interaction of Membrane Gangliosides in Lipid Rafts with CLQ and CLQ-OH
Lipidn class="Disease">rafts are also viral attachment sites. Viruses such as IBV, dengue virus, Ebola virus, hepatitis C virus, HIV, humanherpes virus 6, measles virus, Newcastle disease virus, poliovirus, West Nile virus, foot-and-mouth disease virus, simian virus 40, rotavirus, influenza virus and Marburg virus also use lipidrafts for virus entry [151,152,153,154,155,156]. In avian CoVIBV, structural proteins of theIBV virus are co-localized with PMlipidrafts embedded with theganglioside GM1. HCoV-229E entry is prevented by cholesterol depleted conditions because HCoV-229E clusters in caveolae-associated lipidrafts [157]. Caveolae of caveolin-1, -2 and -3 are cross-linked [158] and control themolecular distribution between rafts and caveolae in a regulatory mechanism. S protein-CD13 cross-linking occurs via CD13-caveolin-1 sequestering. HCoV-229E particles similarly exhibit a longitudinal distribution property. HCoV-229E-colocalized caveolin-1 undergoes the next step of virus infection. Caveolin-1 knockdown inhibited HCoV-229E endocytosis and entry and thus caveolin-1 is essential for HCoV-229E infection. TGEV also endocytoses by a clathrin-mediated mechanism in MDCK cells [159]. Other viruses including HCoV-OC43 also use an entry receptor sequestered to cross-linked caveolae [160]. In SARS-CoV, the first entry step to host cells needs ACE2 in intact lipidrafts by the S glycoprotein [151]. ACE2 is associated with caveolin-1 and GM1 in membrane rafts depending on its cell-type specific localization [161]. Raft integrity with cholesterol and ACE2 is necessary for SARS-CoV pseudovirus entry into Vero E6 cells and for SARS-CoV-microdomain-based entry. C-type lectin, CD209 L (L-SIGN), can also formlipidrafts and acts as a SARS-CoV receptor [162]. Information of theCoV entry pathways is important for therapeutic designation of SARS-CoV-targeting drugs, for example, if agents disrupt lipid-raft localization of theACE2 receptor.
CLQ binds tn class="Gene">he SAs and gangliosides in lipidrafts with a high affinity. Therefore, CLQ or CLQ-OH prevents the S glycoprotein–ganglioside binding. CLQ (or CLQ-OH) binding to SA consequently prevents S glycoprotein binding to host receptors. TheN-terminal region of SARS-CoV-2 S glycoprotein interacts with gangliosides. A ganglioside-binding site (GBS) or ganglioside-binding domain (GBD) is present in theNTD of the S glycoprotein of SARS-CoV-2. Using molecular modeling and simulation technology, CLQ has been suggested to recognize theSAs and gangliosides. Human type Neu5Ac binds to CLQ and CLQ-OH. Thus, SAs are binding targets of CLQ and CLQ-OH. CLQ and CLQ-OH have two specific recognition sites in the polar sugar residues of ganglioside GM1. The first site is found at the tip of thesugar residues of GM1 with an interaction energy of −47 kJ/mol. TheCLQ rings face theGalNAc residue of GM1, while the second site is in a large region of thesugar-ceramide junction and thesugar residues. Several amino acid residues of the S protein NTD, which are Phe-135, Asn-137 and Arg-158, recognize theganglioside GM1. The S glycoprotein NTD-GM1 complex is suggested to form a trimolecular complex with two molecules of ganglioside GM1 anchored to theNTD of S protein [163]. TheACE2-binding RBD is suggested to be a potential GBS located on a differential site of the S glycoprotein NTD. Theprotein sequence interfacing surface of theNTD is the consensus GBDs [164]. The amino acids Gly, Pro and/or Ser residues found in GBDmotifs are in thesame 111–158 amino acids of theNTD as theganglioside-attachment interface. TheGBD is conserved throughout viral isolates from worldwide COVID-19patients. TheGBD potentially increases viral attachment ability to PMlipidrafts and contact between host ACE-2 and S protein [165]. The interaction between CLQ-OH and 9-O-acetyl-NeuAc is also similar to the9-O-acetyl-NeuAc-CLQ interaction. TheCLQ-OH OH group enhances the interaction of CLQ with SA via a hydrogen bond [163]. In conditions with CLQ or CLQ-OH derivative treatment, the S glycoprotein cannot bind to gangliosides in in silico studies, which are used to uncover the action mechanism. CLQ and CLQ-OH prevent the binding of S glycoprotein to gangliosides. TheCLQ-SA complex is formed in a mixed surface and balls by the positioning of the negative charged COOH group of Neu5Ac and one of the two cationic charges of CLQ [163]. CoVs preferentially bind to 9-O-acetyl-NeuAc [60], differentiating with other viral properties.
As CLQ inten class="Disease">racts with theGM1sugar part, theN-terminal domain of the S protein loses viral attachment capacity to the cell receptors [166]. The S protein NTD and theCLQ/CLQ-OHmaintain thesame position during GM1 binding, consequently preventing GM1 binding to the S protein and the drug at thesame time, because theNTD and theCLQ/CLQ-OH simultaneously recognize GM1. Asn-167 forms a hydrogen bond with theGalNAc residue, whereas an aromatic Phe-135 stacks to the Glc residue of GM1. Therefore, the antiviral activities of CLQ and CLQ-OH is to block the interaction between theSARS-CoV-2 S glycoprotein and gangliosides on host cell surfaces. Thelipid composition of host cell PM can also be a potential target for preventive and therapeutic drugs against such viruses.
8. Conclusions
Then class="Species">SARS-CoV-2-caused COVID-19 pandemic is a global public health issue in the 21st century. In order to coordinate efforts against and mitigate the public health consequences of the spreading and outbreak of the disease, the international community has been exchanging independent information and knowledge. The scientists and epidemiologists exchange COVID-19 information to highlight interdisciplinary approaches. The pandemic outbreak issue has raised interest in the pathology and epidemiology of the disease. The current COVID-19 pandemic resulted in establishment of theCOVID Action Platform of the World Economic Forum (WEF) to perform evidence-based cutting-edge research and analyze the fast-evolving pandemic. The Virus Outbreak Data Network (VODAN) of the GoFair Data Alliance also pursues to apply the best remedy against the pandemic infection. In the aspect of basic biology, most β-CoVs recognize 9-O-acetyl SAs, but certain viruses have switched to binding 4-O-acetyl SA. Originally, HE is found in other viruses such as toroviruses and orthomyxoviruses (influenza C/D and isavirus). The exceptional β-CoV lineage A is the only one to bear HE among theCoVs. Virus entry and replication inhibitors need to be urgently developed. Computation-fused artificial intelligence accelerates therapeutic agent development. TheSARS-CoV-2 genome shows 80% similarity to the previous SARS-CoV and SARS-targeting agents can be commonly treated to the related SARS-CoV-2patients. For better understanding of the entry pathway of SARS-CoV-2, the importance of thecarbohydrates including SAs on cell and virus surfaces is again emphasized.
Authors: Mark J G Bakkers; Qinghong Zeng; Louris J Feitsma; Ruben J G Hulswit; Zeshi Li; Aniek Westerbeke; Frank J M van Kuppeveld; Geert-Jan Boons; Martijn A Langereis; Eric G Huizinga; Raoul J de Groot Journal: Proc Natl Acad Sci U S A Date: 2016-05-16 Impact factor: 11.205
Authors: Antonina Naskalska; Agnieszka Dabrowska; Artur Szczepanski; Aleksandra Milewska; Krzysztof Piotr Jasik; Krzysztof Pyrc Journal: J Virol Date: 2019-09-12 Impact factor: 5.103
Authors: Eman Humaid Alketbi; Rania Hamdy; Abdalla El-Kabalawy; Viktorija Juric; Marc Pignitter; Kareem A Mosa; Ahmed M Almehdi; Ali A El-Keblawy; Sameh S M Soliman Journal: Rev Med Virol Date: 2021-01-13 Impact factor: 11.043
Authors: Sean X Gu; Tarun Tyagi; Kanika Jain; Vivian W Gu; Seung Hee Lee; Jonathan M Hwa; Jennifer M Kwan; Diane S Krause; Alfred I Lee; Stephanie Halene; Kathleen A Martin; Hyung J Chun; John Hwa Journal: Nat Rev Cardiol Date: 2020-11-19 Impact factor: 32.419
Authors: Jairo Lumpuy-Castillo; Ana Lorenzo-Almorós; Ana María Pello-Lázaro; Carlos Sánchez-Ferrer; Jesús Egido; José Tuñón; Concepción Peiró; Óscar Lorenzo Journal: Int J Mol Sci Date: 2020-09-04 Impact factor: 5.923