
Exploring the genomic and proteomic variations of SARS-CoV-2 spike glycoprotein: A computational biology approach.

Syed Mohammad Lokman1, Md Rasheduzzaman1, Asma Salauddin1, Rocktim Barua1, Afsana Yeasmin Tanzina1, Meheadi Hasan Rumi1, Md Imran Hossain1, A M A M Zonaed Siddiki2, Adnan Mannan3, Md Mahbub Hasan4.   

Abstract

The newly identified SARS-CoV-2 has now been reported from around 185 countries, with more than a million confirmed human cases including more than 120,000 deaths. The genomes of SARS-CoV-2 strains isolated from different parts of the world are now available, and the unique features of their constituent genes and proteins need to be explored to understand the biology of the virus. Spike glycoprotein is one of the major targets to be explored because of its role during the entry of coronaviruses into host cells. We analyzed 320 whole-genome sequences and 320 spike protein sequences of SARS-CoV-2 using multiple sequence alignment. In this study, 483 unique variations were identified among the genomes of SARS-CoV-2, including 25 nonsynonymous mutations and one deletion in the spike (S) protein. Among the 26 variations detected in S, 12 were located in the N-terminal domain (NTD) and 6 in the receptor-binding domain (RBD), which might alter the interaction of S protein with the host receptor angiotensin-converting enzyme 2 (ACE2). In addition, 22 amino acid insertions were identified in the spike protein of SARS-CoV-2 in comparison with that of SARS-CoV. Phylogenetic analyses of spike protein revealed that bat coronaviruses have a close evolutionary relationship with circulating SARS-CoV-2. The genetic variation data presented in this study can support a better understanding of SARS-CoV-2 pathogenesis. Based on the results reported herein, potential inhibitors against S protein can be designed by considering these variations and their impact on protein structure.
Copyright © 2020 Elsevier B.V. All rights reserved.

Keywords:  COVID-19; Genomic variants; SARS-CoV-2; Sequence analysis; Spike protein

Year:  2020        PMID: 32502733      PMCID: PMC7266584          DOI: 10.1016/j.meegid.2020.104389

Source DB:  PubMed          Journal:  Infect Genet Evol        ISSN: 1567-1348            Impact factor:   3.342


Introduction

Coronavirus disease (COVID-19) is a pandemic respiratory illness first reported in Wuhan, Hubei province, China, in December 2019. The death toll had risen to more than 68,000 among 1,250,000 confirmed cases around the globe as of April 4, 2020 (WHO (World Health Organization), 2020). The virus causing COVID-19 is named severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Based on phylogenetic studies, SARS-CoV-2 is categorized as a member of the genus Betacoronavirus, the same lineage that includes SARS coronavirus (SARS-CoV) (Gorbalenya et al., 2020), which caused SARS (Severe Acute Respiratory Syndrome) in China during 2002 (Ksiazek et al., 2003). Recent studies showed that SARS-CoV-2 has a close relationship with bat SARS-like CoVs (Li et al., 2005; Zhou et al., 2020), though the intermediate hosts for zoonotic transmission of SARS-CoV-2 from bats to humans remain undiscovered. SARS-CoV-2 has been identified as an enveloped, single-stranded positive-sense RNA virus with a genome size of approximately 29.9 kb encoding 27 proteins from 14 ORFs, including 15 non-structural, 8 accessory, and 4 major structural proteins. Two-thirds of the viral RNA is occupied by the first ORF (ORF1ab), which is translated into polyprotein 1a (pp1a) and polyprotein 1ab (pp1ab); these later undergo proteolytic cleavage to form the 15 non-structural proteins. Spike glycoprotein (S), membrane (M), envelope (E) and nucleocapsid (N) are the four major structural proteins of SARS-CoV-2 (Wu et al., 2020a; Wu et al., 2020b). S glycoprotein is characterized as the critical determinant of viral entry into host cells and consists of two functional subunits, S1 and S2. The S1 subunit recognizes and binds to the host receptor through the receptor-binding domain (RBD), whereas S2 is responsible for fusion with the host cell membrane (Wrapp et al., 2020; Walls et al., 2020; Chen et al., 2020).
MERS-CoV uses dipeptidyl peptidase-4 (DPP4) as an entry receptor (Raj et al., 2013), whereas SARS-CoV and SARS-CoV-2 utilize ACE2 (angiotensin-converting enzyme 2) (Li et al., 2003), which is abundant in lung alveolar epithelial cells and enterocytes, suggesting S glycoprotein as a potential drug target to halt the entry of SARS-CoV-2 (Letko et al., 2020). According to recent reports, neutralizing antibodies are generated in response to the entry and fusion of surface-exposed S protein (mainly the RBD), which is predicted to be an important target for vaccine candidates (Chen et al., 2020; Hoffmann et al., 2020; Tian et al., 2020). However, SARS-CoV-2 has emerged with remarkable properties such as a glutamine-rich, 42 aa long exclusive molecular signature (DSQQTVGQQDGSEDNQTTTIQTIVEVQPQLEMELTPVVQTIE) at position 983–1024 of polyprotein 1ab (pp1ab) (Cárdenas-Conejo et al., 2020), a diversified receptor-binding domain (RBD), and a unique furin cleavage site (PRRAR↓SV) at the S1/S2 boundary in S glycoprotein, which could play roles in viral pathogenesis, diagnosis, and treatment (Coutard et al., 2020). To date, few genomic variations of SARS-CoV-2 have been reported (Ceraolo and Giorgi, 2020; Lu et al., 2020). There is growing evidence that spike protein, a 1273 amino acid long glycoprotein with multiple domains, plays a major role in SARS-CoV-2 pathogenesis. Viral entry into the host cell is initiated by the receptor-binding domain (RBD) of the S1 head. Upon receptor binding, proteolytic cleavage occurs at the S1/S2 cleavage site, and the two heptad repeats (HR) of the S2 stalk form a six-helix bundle structure, triggering the release of the fusion peptide. As the fusion peptide comes into close proximity to the transmembrane anchor (TM), the TM domain facilitates the membrane destabilization required for fusion between the viral and host membranes (Bosch et al., 2003; Liu et al., 2004).
Insights into the sequence variations of S glycoprotein among available genomes are key to understanding the biology of SARS-CoV-2 infection and to developing antiviral treatments and vaccines. In this study, we analyzed 320 genomic sequences of SARS-CoV-2 to identify mutations among the available genomes, followed by the amino acid variations in glycoprotein S, to foresee their impact on viral entry into the host cell from a structural biology viewpoint.

Methods and materials

Dataset

All available sequences (320 whole genome and surface glycoprotein amino acid sequences of SARS-CoV-2) related to the COVID-19 pandemic were retrieved from NCBI Virus Variation Resource repository (https://www.ncbi.nlm.nih.gov/labs/virus/) (Hatcher et al., 2017). Among the protein sequences, 11 were discarded due to incomplete sequence coverage. In addition, all 40 S glycoprotein sequences from different coronavirus families were retrieved for phylogenetic analysis. The NCBI reference sequence of SARS-CoV-2 S glycoprotein, accession number YP_009724390 was used as the canonical sequence for the analyses of spike protein variants.

Phylogenetic analysis

Variant analyses of SARS-CoV-2 genomes were performed in the Genome Detective Coronavirus Typing Tool Version 1.13, which is specially designed for this virus (https://www.genomedetective.com/app/typingtool/cov/) (Cleemput et al., 2020). For multiple sequence alignment (MSA), the Genome Detective Coronavirus Typing Tool uses a reference dataset of 431 whole genome sequences (WGS), of which 386 were from nine known coronavirus species. The dataset was then aligned with MUSCLE (Edgar, 2004). An entropy (H(x)) plot of nucleotide variations in the SARS-CoV-2 genome was constructed using BioEdit (Hall et al., 1999). MEGA X (version 10.1.7) was used to construct the MSAs and the phylogenetic tree using pairwise alignment and neighbor-joining methods in ClustalW (Kumar et al., 2018; Saitou and Nei, 1987). The tree structure was validated with 1000 bootstrap replications (Efron et al., 1996), and the evolutionary distances were calculated using the Poisson correction method (Zuckerkandl and Pauling, 1965).
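The two quantities named above, positional entropy H(x) and the Poisson-corrected distance, can be illustrated with a minimal Python sketch. It is not the BioEdit or MEGA X implementation: the function names are hypothetical, the natural logarithm and the skipping of gapped sites are assumptions, and the toy sequences are invented for illustration only.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy of one alignment column, H = sum(p * ln(1/p))."""
    counts = Counter(column)
    total = len(column)
    return sum((n / total) * math.log(total / n) for n in counts.values())


def alignment_entropy(sequences):
    """Entropy at every position of a set of equal-length aligned sequences."""
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("aligned sequences must have equal length")
    return [column_entropy([seq[i] for seq in sequences]) for i in range(length)]


def poisson_distance(seq_a, seq_b):
    """Poisson-corrected distance d = -ln(1 - p).

    p is the proportion of differing sites; sites with a gap ('-') in
    either sequence are skipped (an assumption of this sketch).
    """
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    p = sum(a != b for a, b in pairs) / len(pairs)
    if p >= 1.0:
        raise ValueError("distance is undefined when every site differs")
    return -math.log(1.0 - p)


# Toy alignment (invented, not taken from the study's data).
toy_alignment = ["ACGT", "ACGT", "ATGT", "ATGC"]
toy_entropy = alignment_entropy(toy_alignment)
print([round(h, 3) for h in toy_entropy])
print(round(poisson_distance("ACDE", "ACDF"), 3))
```

A fully conserved column gives an entropy of zero, and a column split evenly between two residues gives ln 2, so peaks in a plot of these values mark the hypervariable positions.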

Homology modeling of S glycoprotein

Variant sequences of SARS-CoV-2 were modeled in Swiss-Model (Andrew et al., 2018) using the cryo-EM spike protein structure of SARS-CoV-2 (PDB ID 6VSB; Wrapp et al., 2020) as a template. The overall quality of the models was assessed in the RAMPAGE server (Prisant et al., 2003) by generating Ramachandran plots (Supplementary Table 1). PyMol and BIOVIA Discovery Studio were used for structure visualization and superposition (DeLano et al., 2002).

Results

Genomic variations of SARS-CoV-2

Multiple sequence alignment of the available 320 genomes of SARS-CoV-2 was performed, and 483 variations were found throughout the 29,903 bp long SARS-CoV-2 genome, with in total 115 variations in the UTR regions, 130 synonymous variations that cause no amino acid alteration, 228 non-synonymous variations causing a change in amino acid residue, 16 INDELs, and 2 variations in non-coding regions (Supplementary Table 2). Among the 483 variations, 40 (14 synonymous mutations, 25 non-synonymous mutations and one deletion) were observed in ORF S, which encodes the S glycoprotein responsible for viral fusion and entry into the host cell (Li, 2015). Notably, most of the SARS-CoV-2 genome sequences were deposited from the USA (250) and China (50) (Supplementary Fig. 1). Positional variability of the SARS-CoV-2 genomes was calculated from the MSA of the 320 whole genomes as an entropy value (H(x)) (Manaresi et al., 2017). Excluding the 5′ and 3′ UTRs, ten hotspots of hypervariable positions were identified, of which seven were located in ORF1ab (1059C > T, 3037C > T, 8782C > T, 14408C > T, 17747C > T, 17858A > G, 18060C > T) and one each in ORF S (23403A > G), ORF3a (25563G > T), and ORF8 (28144T > C). Variability was highest at positions 8782 and 28144 (Fig. 1).
Fig. 1

Nucleotide sequence variation among 320 SARS-CoV-2 whole genomes. A. Positional organization of the major structural protein-encoding genes in orange (S = Spike protein, E = Envelope protein, M = Membrane protein, N = Nucleocapsid protein) and accessory protein ORFs in blue. B. Variability within 320 SARS-CoV-2 genomic sequences represented by entropy (H(x)) value across genomic location. The two highest frequencies of alteration were found at position 8785 of ORF1a and position 28,144 of ORF8. C. The respective alignment view of each highly variable region. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)


Phylogenetic analysis of S glycoprotein

A phylogenetic analysis of a total of 66 sequences (26 unique SARS-CoV-2 and 40 different coronavirus S glycoprotein sequences) was performed. The evolutionary distances showed that all the SARS-CoV-2 spike proteins cluster in the same node of the phylogenetic tree, confirming that the sequences were similar to Refseq YP_009724390 (Fig. 2). Bat coronaviruses have a close evolutionary relationship with SARS-CoV-2, as different strains were found in the nearest outgroups and clades (Bat coronavirus BM48–31, Bat hp-beta coronavirus, Bat coronavirus HKU9), suggesting that coronavirus has a vast geographical spread and that the bat is the most prevalent host (Fig. 2). In other clades, the clusters were distributed across different hosts, which may reflect evolutionary changes of the surface glycoprotein due to cross-species transmission. Viral hosts reported from different locations at different times are indicative of possible recombination.
Fig. 2

Sequence phylogeny of SARS-CoV-2 spike glycoprotein variants and other coronavirus spike proteins based on amino acid sequences retrieved from the NCBI database. The tree was built using neighbor-joining methods in ClustalW, and its structure was validated with 1000 bootstrap replications. The branch length is indicated by the scale bar. The accession number YP_009724390 represents the SARS-CoV-2 spike protein sequences that are identical to it.


SARS-CoV-2 spike protein variation analysis

The S glycoprotein sequences of SARS-CoV-2 were retrieved from the NCBI Virus Variation Resource repository and aligned using ClustalW. The relative positions of the SARS-CoV-2 spike protein domains were determined by aligning with the SARS-CoV spike protein (Fig. 3) (Yuan et al., 2017; Gui et al., 2017). From the sequence identity matrix, 26 unique variants were identified among the SARS-CoV-2 spike glycoprotein sequences, comprising 25 substitutions and one deletion (Fig. 4A and Supplementary Table 3). 215 sequences were identical to the SARS-CoV-2 S protein reference sequence (YP_009724390), while 64 sequences were identical to one another and carried the same variation, D614G (Supplementary Table 4, 5). Among all the variations, twelve (Y28N, T29I, H49Y, L54F, N74K, E96D, D111N, Y145Del, F157L, G181V, S221W, and S247R) were located in the N-terminal domain (NTD). Another 6 variations (A348T, R408I, G476S, V483A, H519Q, A520S) were found in the receptor-binding domain (RBD), while only two variations (A930V and D936Y) were found in the heptad repeat 1 (HR1) domain. Single variations were found in the signal peptide domain (L5F), sub-domain-2 (D614G), sub-domain-3 (A1078V), the heptad repeat 2 domain (D1168H), and the cytoplasmic tail domain (D1259H). Notably, the substitution of Phenylalanine by Cysteine (F797C) was observed 19 amino acids upstream of the fusion peptide domain (Fig. 4A). The mutation of Aspartic acid to Glycine at position 614 was observed 71 times, with an entropy value over 0.5, among the available 320 SARS-CoV-2 spike protein sequences (Supplementary Fig. 2).
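As a cross-check on the counts in the paragraph above, the listed variants can be tallied by domain. The Python sketch below is illustrative only: the dictionary simply transcribes the variant labels and domain names given in the text (F797C is taken from the Fig. 4 legend), and the names `VARIANTS_BY_DOMAIN` and `parse_variant` are hypothetical helpers, not part of any tool used in the study.

```python
import re

# Variant labels transcribed from the text, grouped by the domain named there.
VARIANTS_BY_DOMAIN = {
    "signal peptide": ["L5F"],
    "NTD": ["Y28N", "T29I", "H49Y", "L54F", "N74K", "E96D", "D111N",
            "Y145Del", "F157L", "G181V", "S221W", "S247R"],
    "RBD": ["A348T", "R408I", "G476S", "V483A", "H519Q", "A520S"],
    "sub-domain-2": ["D614G"],
    "upstream of fusion peptide": ["F797C"],
    "HR1": ["A930V", "D936Y"],
    "sub-domain-3": ["A1078V"],
    "HR2": ["D1168H"],
    "cytoplasmic tail": ["D1259H"],
}

# Reference residue, 1-based position, then a new residue or 'Del'.
VARIANT_PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z]|Del)$")


def parse_variant(label):
    """Split a label such as 'D614G' into (reference, position, change)."""
    match = VARIANT_PATTERN.match(label)
    if match is None:
        raise ValueError(f"unrecognized variant label: {label}")
    ref, pos, alt = match.groups()
    return ref, int(pos), alt


parsed = [parse_variant(v) for group in VARIANTS_BY_DOMAIN.values() for v in group]
n_total = len(parsed)
n_deletions = sum(1 for _, _, alt in parsed if alt == "Del")
n_substitutions = n_total - n_deletions

for domain, labels in VARIANTS_BY_DOMAIN.items():
    print(f"{domain}: {len(labels)}")
print(f"total={n_total}, substitutions={n_substitutions}, deletions={n_deletions}")
```

The tally reproduces the 26 variants (25 substitutions and one deletion) reported in the text, and every position falls within the 1273-residue length of the spike protein.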
Fig. 3

Overall architecture of the SARS-CoV-2 S glycoprotein. A. Schematic diagram of the SARS-CoV S glycoprotein showing domain organization (Reconstructed from Y. Yuan et al., 2017 and M. Gui et al., 2017). B. Schematic domain organization diagram of the SARS-CoV-2 S glycoprotein constructed by aligning with SARS-CoV S protein domain. C. Homology model of SARS-CoV-2 S protein reference sequence YP_009724390 with PDB:6VSB. S protein trimer with two protomers surface shadowed (left). Ribbon diagram of SARS-CoV-2 S glycoprotein monomer from B. Here, NTD: N-terminal domain; RBD: receptor-binding domain; SD: subdomain; CR: connecting region; HR: heptad repeat; CH: central helix; BH: β-hairpin; FP: fusion peptide; TM: transmembrane domain; CT: cytoplasmic tail.

Fig. 4

Variability within 320 SARS-CoV-2 S protein sequences. A. Schematic representation of mutations across the spike protein domain organization. Blue, red, and black colors represent the charge of the amino acid residue as positive, negative, and neutral, respectively. B–P. Superposed structures of SARS-CoV-2 spike protein variants with the cryo-EM structure of SARS-CoV-2 spike protein (PDB: 6VSB). Template residues are shown in green and variant residues in red. Here, B: Y28N, C: T29I, D: H49Y, E: L54F, F: D111N, G: S221W, H: A348T, I: R408I, J: H519Q, K: A520S, L: D614G, M: F797C, N: A930V, O: D936Y, and P: A1078V. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Alterations of amino acid residue charge from positive to neutral (H49Y, R408I, H519Q), negative to neutral (D111N, D614G, D936Y), negative to positive (D1168H, D1259H), and neutral to positive (N74K, S247R) were seen in variants QHW06059, QHS34546, QIS61422, QIS61338, QIK50427, QIS30615, QIS60978, QIS60582, QIO04367, and QHR84449, respectively, due to the substitution of amino acids that differ in charge.
The remaining 15 variants carried substitutions by amino acids of similar charge (Fig. 4A). The SARS-CoV-2 spike protein variants were superposed with the cryo-electron microscopy structure of SARS-CoV-2 spike protein (Wrapp et al., 2020). The L5F, N74K, E96D, F157L, G181V, S247R, G476S, V483A, D1168H, and D1259H variants were excluded from superposition because the respective residues are absent from the 3D structure of the template (PDB: 6VSB). The superposition showed that most of the residue changes incorporated bulkier amino acid residues (T29I, H49Y, L54F, S221W, A348T, H519Q, A520S, A930V, D936Y, and A1078V) in place of smaller ones, the exceptions being Y28N, D111N, R408I, D614G, and F797C (Fig. 4B–P). The sequence comparison of spike glycoprotein between SARS-CoV-2 variants and SARS-CoV (Uniprot ID: P59594) revealed nearly 77.46% similarity and identified an additional 22 amino acids in the SARS-CoV-2 spike protein variants, resulting from a total of 5 insertions (Supplementary Fig. 3). Among these, the major insertion of 7 amino acids (GTNGTKR) at position 72–78, followed by 4 amino acids (NKSW) at position 149–152 and 6 amino acids (SYLTPG) at position 247–252, occurred in the N-terminal domain. An insertion of Glycine at position 482 was found in the receptor-binding domain, preceding another insertion of 4 amino acids (NSPR) at position 679–682, just upstream of the S1/S2 cleavage site, which leads to the formation of a furin-like cleavage site (PRRARS) in the S protein variants of SARS-CoV-2 (Supplementary Fig. 3). The S2 subunit of the spike protein, especially the heptad repeat region 2, fusion peptide domain, transmembrane domain, and cytoplasmic tail, was found to be highly conserved between SARS-CoV and the SARS-CoV-2 variants, while the S1 subunit was more diverse, specifically the N-terminal domain (NTD) and receptor-binding domain (RBD).
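The charge classification of the 25 substitutions described above can be reproduced with a short Python sketch. It assumes a simple side-chain charge scheme in which K, R and H count as positive (histidine is treated as positive here to match the classification in the text), D and E count as negative, and every other residue is neutral; the helper names are hypothetical.

```python
from collections import Counter

# Assumed charge scheme; histidine is counted as positive to follow the text.
POSITIVE = set("KRH")
NEGATIVE = set("DE")

# The 25 substitutions listed in the text (the Y145 deletion is omitted).
SUBSTITUTIONS = [
    "L5F", "Y28N", "T29I", "H49Y", "L54F", "N74K", "E96D", "D111N",
    "F157L", "G181V", "S221W", "S247R", "A348T", "R408I", "G476S",
    "V483A", "H519Q", "A520S", "D614G", "F797C", "A930V", "D936Y",
    "A1078V", "D1168H", "D1259H",
]


def residue_charge(residue):
    """Return 'positive', 'negative' or 'neutral' for a one-letter residue."""
    if residue in POSITIVE:
        return "positive"
    if residue in NEGATIVE:
        return "negative"
    return "neutral"


def charge_change(label):
    """Charge of the reference and substituted residue for a label like 'R408I'."""
    return residue_charge(label[0]), residue_charge(label[-1])


def summarize(labels):
    """Count substitutions per (before, after) charge pair, ignoring unchanged ones."""
    changes = Counter()
    for label in labels:
        before, after = charge_change(label)
        if before != after:
            changes[(before, after)] += 1
    return changes


summary = summarize(SUBSTITUTIONS)
n_altered = sum(summary.values())
n_preserved = len(SUBSTITUTIONS) - n_altered

for (before, after), count in sorted(summary.items()):
    print(f"{before} -> {after}: {count}")
print(f"charge altered: {n_altered}, charge preserved: {n_preserved}")
```

Under this scheme, ten substitutions alter the residue charge and fifteen preserve it, in agreement with the grouping given in the text.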
The spatial distribution of S protein sequences having different variations over time reveals that most of the variants (17 out of 240 S glycoprotein sequences) were reported from the US, followed by 3 out of 2 sequences (including the Y145 deletion) from India and 2 out of 50 sequences from China (Fig. 5). One variant each was found in the single sequence available in the repository from Sweden, Australia, South Korea and Peru. Interestingly, each variation was unique to the country from which it was reported, except D614G, which was found in both the US and Peru (Fig. 5). We also analyzed sequences from Brazil, Italy, Nepal, Pakistan, Spain, Taiwan, and Vietnam, but no variation in the S glycoprotein sequences was found when compared to the Refseq YP_009724390.
Fig. 5

The spatial distribution of variations found in S glycoprotein over time. The surface plot illustrates the frequency distribution of each variation over time. The geographic location of each sample is indicated with a flag, and the frequency of each variation (if more than one from a single country) is shown below the respective flag.


Discussion

COVID-19 is one of the most contagious pandemics the world has ever seen, with 1,250,000 confirmed cases to date (April 4, 2020), and case numbers have increased as much as fivefold in less than a month (WHO (World Health Organization), 2020). Phylogenetic analysis showed that SARS-CoV-2 is a unique coronavirus presumably related to bat coronaviruses (BM48–31, Hp-betacoronavirus). During this study, we investigated the available genomes of SARS-CoV-2 and found variations at 483 positions, resulting in 130 synonymous and 228 non-synonymous variants. Of these, 25 non-synonymous variants were observed in the spike protein of SARS-CoV-2. Viral spike protein is thought to have a crucial role in drug and vaccine development, as reported previously for managing viruses such as SARS-CoV and MERS-CoV (Tian et al., 2020; Kapadia et al., 2005; Elshabrawy et al., 2012; Du et al., 2013). Likewise, a number of studies targeting SARS-CoV-2 spike protein have been undertaken to develop therapeutic measures (Xia et al., 2020), but the unique structural and functional details of SARS-CoV-2 spike protein are still under scrutiny. We also found a variant (R408I) in the receptor-binding domain (RBD) in which a positively charged Arginine residue mutated to a neutral and smaller Isoleucine residue (Fig. 4I). This change might alter the interaction of the viral RBD with the host receptor, because the R408 residue of SARS-CoV-2 is known to interact with the ACE2 receptor for viral entry (Ortega et al., 2020). Similarly, the other alterations in the RBD (G476S, V483A, H519Q, and A520S) could also affect the interaction of SARS-CoV-2 spike protein with other molecules, which requires further investigation. The QIA98583 and QIS30615 variants were found to have an alteration of Alanine to Valine (A930V) and of Aspartic acid to Tyrosine (D936Y), respectively, in the alpha helix of the HR1 domain.
Previous reports have indicated that the HR1 domain plays a significant role in viral fusion and entry by forming helical bundles with HR2, and that mutations including the substitution of alanine by valine (A1168V) in the HR1 region are predominantly responsible for conferring resistance in mouse hepatitis coronaviruses against HR2-derived peptide entry inhibitors (Bosch et al., 2008). This study hypothesizes that the A930V mutation found in HR1 of SARS-CoV-2 might also have a role in the emergence of drug-resistant virus strains. In addition, the mutation (D1168H) found in the heptad repeat 2 (HR2) of SARS-CoV-2 could play a vital role in viral pathogenesis. Moreover, we found that 20 of the 26 variants, including the one deletion, were located within S1, especially within the NTD and RBD regions of glycoprotein S (Fig. 4A), the region responsible for the initial interaction with the host cell receptor ACE2. This indicates that the NTD and RBD are very prone to mutations. However, the NTD and RBD harbor potential epitopes that might serve as peptide vaccine candidates against SARS-CoV-2, as reported in different studies (Bhattacharya et al., 2020; Rasheed et al., 2020; Rahman et al., 2020). Sequences from the NTD and RBD of the S protein are chosen as antigenic determinants because they are situated on the outer surface of the virus and could therefore be more accessible to the immune system (Fig. 3C). The variations reported herein within the outer domains of S glycoprotein could thus help in designing effective epitope-based vaccines or antivirals. The SARS-CoV-2 S protein contains an additional furin protease cleavage site, PRRARS, in the S1/S2 domain, which is conserved among all 320 sequences as revealed during this study (Supplementary Fig. 3). This unique signature is thought to make SARS-CoV-2 more virulent than SARS-CoV and is regarded as a novel feature of the viral pathogenesis (Walls et al., 2020).
According to previous reports, the more host cell proteases are able to process the coronavirus S protein, the more viral tropism is enhanced (Walls et al., 2020; Klenk and Garten, 1994; Steinhauer, 1999; Millet and Whittaker, 2015). Apart from that, this could also allow viruses to escape antiviral therapies targeting the transmembrane protease TMPRSS2 (ClinicalTrials.gov, NCT04321096), a well-reported protease that cleaves at S1/S2 of S glycoprotein (Fehr and Perlman, 2015). Comparative analyses between the SARS-CoV and SARS-CoV-2 spike glycoproteins showed 77% similarity between them, with the most diverse regions being the N-terminal domain and the receptor-binding domain. The insertion of 17 additional amino acid residues in the N-terminal domain of SARS-CoV-2, together with its high sequence diversity, suggests that it may have a role in binding to other cell receptors in humans. This is because the N-terminal domain can function as the receptor-binding domain in various coronaviruses (Li, 2015; Yuan et al., 2017). A similar phenomenon has been observed in mouse hepatitis coronavirus (MHV) and porcine transmissible gastroenteritis coronavirus (TGEV), where the NTD is reported to attach to the host entry receptor (Williams et al., 1991; Delmas et al., 1992). The spatial distribution of variations in the amino acid sequences of S glycoprotein of SARS-CoV-2 showed that 17 variations out of 26 were found among the sequences deposited from the US. However, 7 unique variations were found among the sequences from 5 countries (from 4 continents, namely Australia, Europe, Latin America, and Asia) out of 6 sequences available during the study period (Fig. 5). Although the number of sequences analyzed herein is too small to infer the exact trend of S glycoprotein evolution, these variants might play a vital role in adaptation to a new geospatial environment.
The amino acid variation analyses indicated the structural features of different domains of the SARS-CoV-2 spike proteins. The variations have multiple effects on the structure, resulting in changes in stability, favoring various interactions, and conformational diversity. Augmented infection kinetics and viral spreading may be associated with the structural changes and residue composition of the viral spike protein. However, to identify the actual role of S glycoprotein, a larger genomic and proteomic dataset of SARS-CoV-2 is required, as this protein is vital to understanding viral pathogenicity, evolution and the development of therapeutics. Further analyses of all the S glycoprotein sequences and SARS-CoV-2 genomes across different epidemiological aspects are warranted to gain a better understanding of the pathogenesis of SARS-CoV-2.

Credit author statement

Syed Mohammad Lokman: Visualization; Formal analysis; Data curation; Writing – draft. Md. Rasheduzzaman: Investigation; Formal analysis; Writing – draft. Asma Salauddin: Investigation; Formal analysis. Rocktim Barua: Investigation; Visualization. Afsana Yeasmin Tanzina: Formal analysis; Writing – draft. Meheadi Hasan Rumi: Investigation; Visualization. Md. Imran Hossain: Investigation; Formal analysis. AMAM Zonaed Siddiki: Supervision; Writing - review & editing. Adnan Mannan: Conceptualization; Project administration; Writing - review & editing; Supervision. Md. Mahbub Hasan: Formal analysis; Validation; Visualization; Methodology; Data curation; Resources; Writing - review & editing; Supervision.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could affect the work reported in this paper.