
Annotating Spike Protein Polymorphic Amino Acids of Variants of SARS-CoV-2, Including Omicron.

Gusti Ngurah Mahardika, Nyoman B Mahendra, Bayu K Mahardika, Ida B K Suardana, Made Pharmawati.

Abstract

The prolonged global spread and community transmission of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) have led to the emergence of variants and raised questions regarding disease severity and vaccine effectiveness. We conducted a simple bioinformatic analysis of the spike gene of a representative of each variant. The data show that most polymorphic amino acids are located on the amino-terminal side of the S1/S2 cleavage site. The Omicron variant diverges from the others, with the highest number of amino acid substitutions, including in the receptor-binding site (RBS), epitopes, S1/S2 cleavage site, fusion peptide, and heptad repeat 1. The current sharp global increase in the frequency of the Omicron genome constitutes evidence of its high community transmissibility. In conclusion, the proposed guideline could give immediate insight into the probable biological nature of any variant of SARS-CoV-2. As Omicron has diverged the farthest from the original pandemic strain, Wuhan-Hu-1, we expect different epidemiological and clinical patterns of Omicron cases. Regarding vaccine efficacy, slight changes in some epitopes while others are conserved should not lead to a significant reduction in the effectiveness of an approved vaccine.
Copyright © 2022 Gusti Ngurah Mahardika et al.


Year:  2022        PMID: 35450296      PMCID: PMC9017565          DOI: 10.1155/2022/2164749

Source DB:  PubMed          Journal:  Biochem Res Int


1. Introduction

The emergence of variants of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has led to questions about disease severity and vaccine effectiveness. As of 19 December 2021, over 273 million cases and over 5.3 million deaths had been reported globally (https://www.who.int). The World Health Organization classifies SARS-CoV-2 variants as variants of concern (VOCs), variants of interest (VOIs), and variants under monitoring (VUMs). The VOCs listed in the GISAID database, which was accessed on December 26, 2021, are Omicron, Delta, Alpha, Beta, and Gamma; Lambda and Mu are VOIs; and GH/490R is a VUM. The impact of amino acid substitutions in the variants must depend on their genetic make-up. SARS-CoV-2 belongs to lineage B (sarbecovirus) of the β-CoVs in the Coronaviridae family and is an enveloped virus with a single-stranded positive-sense RNA genome [1]. The genome of SARS-CoV-2 is more than 29 kb and encodes the structural proteins phosphorylated nucleocapsid (N) protein, spike glycoprotein (S), hemagglutinin-esterase (HE), membrane (M) protein, and envelope (E) protein, as well as the nonstructural proteins ORF1a, ORF1b, ORF3a, ORF3b, ORF6, ORF7a, ORF7b, ORF8, ORF9a, ORF9b, and ORF10 [1, 2]. ORF1a and 1b are also translated following -1 ribosomal frameshifting to produce the ORF1ab protein [3]. Proteins translated from ORF1a, 1b, and 1ab form the viral polymerase complex [4], while the proteins of the other ORFs are known as accessory proteins [5]. However, it is generally believed that the spike protein is a major determinant of coronavirus pathogenicity. This surface protein possesses major immunogenic domains, and most gene-based vaccines target only the spike gene of SARS-CoV-2. The protein is highly glycosylated and cysteine rich, with two cleavage sites: S1/S2 and S2′ [6]. The glycosylation pattern of the SARS-CoV-2 spike involves N-linked and O-linked glycosylation [7]. 
These two protease cleavage sites are critical for virus activation and replication. Protease cleavage of the spike has been established as a critical determinant of coronavirus tropism and pathogenesis [8]. The spike protein appears to carry both linear and conformational epitopes. One group [9] mapped nine linear epitopes along the spike protein, designated IdA-IdI; another group [10] identified 16 epitopes. Some epitopes overlap. Conformational epitopes arising from folding and trimerization of the SARS-CoV-2 spike protein have also been predicted [11]. The biological consequences of each variant are poorly understood. In this study, we performed functional annotation of amino acid changes in the spike protein of variants based on existing knowledge of coronaviruses as well as rapidly accumulating knowledge about SARS-CoV-2.

2. Materials and Methods

The sequence of the original SARS-CoV-2 strain of Wuhan-Hu-1 (Accession Number NC_045512) was downloaded from GenBank. The open reading frame (ORF) of the spike protein was selected as determined in the database. Ten complete sequences of each annotated variant were selected randomly from GISAID and downloaded. Using the spike gene of Wuhan-Hu-1, the first 15 nucleotides of the 5′-terminus were searched, and the sequence prior to the marked sequence was deleted. The last 15 nucleotides of Wuhan-Hu-1 were used to mark the 3′-end of the sequence, and all nucleotides after that marked sequence were deleted. We then manually selected the sequence data without any undefined or any “N” residue. When there was no “clean” sequence data for each variant, we evaluated another set of variant data. The selected sequences were translated into amino acid sequences and aligned using MEGA-X software [12]. Using the same software, the data were exported in Mega format and analyzed further for polymorphic or variable amino acids. The evolutionary history of variants was inferred using the neighbor-joining method [13]. Evolutionary distances were computed using the Kimura 2-parameter method [14]. The probable biological function of each residue was annotated using the guidelines shown in Supplementary Material 1.
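The trimming, filtering, translation, and polymorphic-site steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the authors worked manually and in MEGA-X, the function names (`trim_to_orf`, `is_clean`, `translate`, `polymorphic_columns`) are hypothetical, and the sketch assumes the 15-nucleotide marker sequences are unmutated in the variant genome.

```python
# Hypothetical helpers illustrating the Methods; not the authors' actual workflow.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built from the conventional TCAG ordering.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def trim_to_orf(genome, ref_orf, flank=15):
    """Keep the region bounded by the first and last `flank` nt of the reference ORF.

    Returns None when either marker is missing (e.g. mutated in the variant).
    """
    genome, ref_orf = genome.upper(), ref_orf.upper()
    start = genome.find(ref_orf[:flank])
    if start == -1:
        return None
    end = genome.find(ref_orf[-flank:], start + flank)
    if end == -1:
        return None
    return genome[start:end + flank]


def is_clean(seq):
    """True when the sequence has no 'N' or other undefined residue."""
    return set(seq.upper()) <= set("ACGT")


def translate(orf):
    """Translate complete codons; unknown codons become 'X', stops become '*'."""
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3
    return "".join(CODON_TABLE.get(orf[i:i + 3], "X") for i in range(0, usable, 3))


def polymorphic_columns(aligned):
    """Return 1-based alignment columns where the sequences are not all identical."""
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("aligned sequences must have equal length")
    return [i + 1 for i, column in enumerate(zip(*aligned)) if len(set(column)) > 1]
```

With a toy 36-nt reference ORF and a genome carrying one codon change, `trim_to_orf` recovers the variant ORF, `translate` yields the two protein strings, and `polymorphic_columns` reports the single differing column. Real data would additionally need a proper alignment step to handle insertions and deletions, which this sketch leaves to the alignment software.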

3. Results

The dataset containing the representative of each variant used is available in Supplementary Material 2. Polymorphic amino acid residues of the spike protein of SARS-CoV-2 Wuhan-Hu-1 and all variants are presented in Table 1. The data show that most polymorphic amino acids are located in the S1 domain, on the N-terminal side of the S1/S2 cleavage site. The number of polymorphic amino acids in this region is 73; the S2 domain has only 15. Glycosylation motif loss (GML) occurs once in Delta due to the T19R substitution and once in Lambda due to the T76I substitution. Additional glycosylation motif (AGM) gain occurred twice in the Gamma variant, i.e., T20N and R191S, and once in Lambda, i.e., R249N. Cysteine residue loss (CRL) occurred once in the GH/490R variant due to the deletion of C136. Ten residues of the receptor-binding site (RBS) are polymorphic, with only a single residue difference from Wuhan-Hu-1 in the Alpha and GH/490R variants; the other variants, except for Omicron, carry two differences, and Omicron shows nine amino acid differences from Wuhan-Hu-1. The number of amino acid substitutions in the various linear epitopes is 19; that in probable conformational epitopes is 16. The number of amino acid changes from Wuhan-Hu-1 at the mapped linear epitopes is 2, 6, 4, 2, 3, 3, 10, and 3 for Alpha, Beta, Gamma, Delta, Lambda, Mu, Omicron, and GH/490R, respectively. A single amino acid difference from Wuhan-Hu-1 at probable conformational epitopes occurred in Beta, Gamma, Delta, Mu, and GH/490R; three changes occurred in Alpha, and seven each in Lambda and Omicron. At the S1/S2 cleavage site, the Alpha, Beta, Delta, Lambda, and GH/490R variants differ from Wuhan-Hu-1 in one residue, and Omicron differs in two. All substitutions at this site are from the nonbasic amino acids Q/N/P to the basic amino acids H/R/K. At the fusion peptide site, only a single amino acid substitution occurred, in Omicron. 
In heptad repeat 1 (HR1), a single amino acid alteration occurred in the Alpha, Gamma, and Mu variants, with Omicron displaying three alterations. In heptad repeat 2 (HR2) and the transmembrane domain, a single amino acid change occurred in the Gamma variant only.
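The glycosylation motif loss and gain calls reported above can be illustrated with a short Python sketch. It assumes the standard N-linked sequon definition (N-X-S/T, where X is any residue except proline); the annotation guideline in Supplementary Material 1 is not reproduced here and may differ, and O-linked sites are not covered. The function names are hypothetical, and the 25-residue fragment is assumed to match the N-terminus of the Wuhan-Hu-1 spike and should be checked against the GenBank record before reuse.

```python
# Assumed N-terminal fragment of the Wuhan-Hu-1 spike (residues 1-25); verify against NC_045512.
N_TERM = "MFVFLVLLPLVSSQCVNLTTRTQLP"


def n_glyc_sequons(protein):
    """Return 1-based positions of Asn residues in N-X-S/T motifs (X is not Pro)."""
    hits = []
    for i in range(len(protein) - 2):
        if protein[i] == "N" and protein[i + 1] != "P" and protein[i + 2] in "ST":
            hits.append(i + 1)
    return hits


def substitute(protein, change):
    """Apply a substitution written like 'T19R', checking the reference residue first."""
    ref, pos, alt = change[0], int(change[1:-1]), change[-1]
    if protein[pos - 1] != ref:
        raise ValueError(f"expected {ref} at position {pos}, found {protein[pos - 1]}")
    return protein[:pos - 1] + alt + protein[pos:]


wild_type = n_glyc_sequons(N_TERM)                      # motif at N17 (N-L-T)
delta_like = n_glyc_sequons(substitute(N_TERM, "T19R"))  # T19R removes the N17 motif
gamma_like = n_glyc_sequons(substitute(N_TERM, "T20N"))  # T20N adds a motif at N20 (N-R-T)
```

Under these assumptions, T19R removes the only motif in the fragment (a GML event), while T20N keeps the N17 motif and creates a second one at position 20 (an AGM event), which is consistent with the annotations given for Delta and Gamma.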
Table 1

Polymorphic amino acid residues of the spike protein of SARS-CoV-2 Wuhan-Hu-1 and all variants, with possible biological function.

Amino acid position | SARS-CoV-2 variant | Known function/probable biological impact∗∗
Variant columns: Wuhan Hu-1 | Alpha | Beta | Gamma | Delta | Lambda | Mu | Omicron | GH/490R
9PLSP
18LFFNTD
19TRNTD; GML
20TNNTD; AGM
26PSNTD; IdA; PCE
67AVNTD
69HDelDelNTD
70VDelDelNTD; PCE
75GVNTD
76TINTD; GML
80DANTD
95TIIINTD
136CDelNTD; CRL
137NDelNTD
138DYDelNTD
139PDelNTD
140FDelNTD
141LDelNTD
142GDDDelNTD
143VDelDelDelNTD
144DelDelDelDelDelDelTDelDelNTD
145YVSDelDelNTD
146YNDelNTD
154MTNTD
157EGNTD
158FDelNTD
159RDelNTD
191RSNTD; AGM
212NINTD; PCE
213LVNTD; PCE
214VRNTD; PCE
215RENTD
216DelDelDelDelDelDelDelPDelNTD
217DelDelDelDelDelDelDelEDelNTD
218DGGNTD
244LDelNTD; IdB
245LDelNTD; IdB
246ADelNTD; IdB
248HPNTD; IdB
249RNNTD; AGM
250SDelNTD; PCE
251YDelNTD; PCE
252LDelNTD; PCE
253TDelNTD; PCE
254PDelNTD; PCE
255GDelNTD; PCE
256DDelNTD; PCE
342GDRBD; IdD
349RKRBD; IdD
374SLRBD
376SPRBD
378SFRBD; IdE/He4
420KNTNRBD; He5
443NKRBD; RBS
449GSRBD; RBS
455LRQRBD; RBS; IdF
480SNRBD; RBS
481TKKRBD; RBS
487EKKKAKRBD; RBS; IdG
493FSRBD; RBS; IdG
496QRRBD; RBS; IdG
499GSRBD; RBS; IdG
501QRRBD; RBS; IdG
504NYYYYYRBD; RBS
508YHRBD
550TKIdH/He6-7
573ADIdH/He6-7
617DGGGGGGGGIdH/He6-7
658HYY
678QHS1/S2-CS
682NKS1/S2-CS; PCE
684PHRHHRS1/S2-CS; PCE
693QHS1/S2-CS; PCE
704AV
719TIGML
767NKHe9-11
799DYFP
858NK
862TN
953DNNHR1
957QHHR1
972NKHR1
984LFHR1; PCE
985SAHR1; PCE
1023AS
1030TI
1121DH
1179VFHR2; TM

The positions were determined after alignment of all variants, as available in the supplementary material. Positions 1–143 correspond to residues 1–143 of Wuhan-Hu-1; positions 144–215 equal the Wuhan-Hu-1 number plus 1; positions above 215 equal the Wuhan-Hu-1 number plus 3. SP: signal peptide; NTD: N-terminal domain of S1; S1/S2 CS: S1/S2 cleavage site; RBD: receptor-binding domain; RBS: receptor-binding site; FP: fusion peptide; HR1 or HR2: heptad repeat 1 or 2; TM: transmembrane; IdA, IdB, IdC, IdD, IdE/He4, IdF, IdG, IdH/He6-7, IdI/He12-13, He1, He2-3, He5, He8, He9-11, He14, He15, and He16: corresponding linear epitopes as described in Supplementary Material 1; PCE: probable conformational epitopes; GML: glycosylation motif loss; AGM: additional glycosylation motif; CRL: cysteine residue loss.
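Because Table 1 uses alignment positions while the Discussion uses Wuhan-Hu-1 numbering (for example, P681H), a small conversion helper may be useful. The Python sketch below follows the offsets stated in the footnote; treating alignment columns 144, 216, and 217 as insertion columns with no Wuhan-Hu-1 counterpart is an interpretation based on the "Del" entries for Wuhan-Hu-1 in those rows, and the function name is hypothetical.

```python
def alignment_to_wuhan(pos):
    """Convert a Table 1 alignment position to Wuhan-Hu-1 spike numbering.

    Returns None for columns assumed to be gaps in Wuhan-Hu-1 (144, 216, 217).
    """
    if pos < 1:
        raise ValueError("positions are 1-based")
    if pos <= 143:
        return pos          # identical numbering
    if pos == 144:
        return None         # assumed insertion column relative to Wuhan-Hu-1
    if pos <= 215:
        return pos - 1      # alignment number is Wuhan-Hu-1 plus 1
    if pos <= 217:
        return None         # assumed insertion columns relative to Wuhan-Hu-1
    return pos - 3          # alignment number is Wuhan-Hu-1 plus 3
```

Under this reading, table rows 504, 617, and 684 correspond to Wuhan-Hu-1 residues 501, 614, and 681, respectively.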

The phylogenetic analysis presented in Figure 1 shows two clusters of variants, with good bootstrap support of 88%. The Alpha, Delta, Mu, and Omicron variants form one cluster, and GH/490R, Beta, and Gamma form another. Lambda appears to have emerged directly from Wuhan-Hu-1. In the first cluster, Omicron forms a long branch from the other members of the group.
Figure 1

Evolutionary relationships of variants of SARS-CoV-2. The evolutionary history was inferred using the neighbor-joining method [13]. The evolutionary distances were computed using the Kimura 2-parameter method [14]. Evolutionary analyses were conducted in MEGA X [12]. The tree was rooted to Wuhan-Hu-1.
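For readers unfamiliar with the Kimura 2-parameter distance [14], the following Python sketch computes it for one pair of aligned nucleotide sequences using the published formula d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitions and transversions. It is an illustration only: the analyses in this paper were run in MEGA X, the function name is hypothetical, and skipping columns with gaps or ambiguity codes is an assumption about how such sites are handled.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned nucleotide sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    compared = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases (an assumption of this sketch)
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError when the sequences are too divergent (saturation).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

A matrix of such pairwise distances is what a neighbor-joining implementation takes as input.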

4. Discussion

The volume of whole-genome sequence data for SARS-CoV-2 submitted to international databases poses a major computational challenge to obtaining an indication of the possible impact of each strain before clinical and experimental data are available, especially in resource-limited countries. At the time of writing, approximately 6.5 million whole-genome SARS-CoV-2 sequences had been submitted. Here, we offer a simple bioinformatic protocol for gene mining and predicting the possible biological meaning of genetic changes in strains. The GISAID initiative has enabled data mining by providing a variant tag for each submitted sequence. We propose a guideline to the probable biological function of each residue in the spike protein of SARS-CoV-2 based on current knowledge, which can be adjusted as new research reports appear. It is generally believed that the phenotype of viruses is mostly polygenic, whereby the entire genetic composition of a strain determines the biology of the virus. This should also be true for coronaviruses, including SARS-CoV-2. The pathogenesis of Middle East respiratory syndrome coronavirus (MERS-CoV) mainly involves interaction between spike and cellular receptors, the papain-like protease PLpro, and accessory proteins such as p4a and the membrane (M) protein [15]. For SARS-CoV-2, various nonstructural proteins, such as PLpro [16], and various accessory proteins [5] have been described as contributing to virus biology and pathogenesis. However, focusing on the spike gene is also important, as a body of literature provides evidence of the key functions of this protein. The S1 domain, which carries major antigenic determinants, mediates receptor recognition and viral attachment to initiate host cell entry [17]. The NTD contributes to the host range [18]. 
Binding of the receptor-binding domain to receptors initiates infection [19], and the S2 domain mediates membrane fusion [17, 20, 21]. The polymorphic amino acids are located mostly in the S1 domain, upstream (on the N-terminal side) of the S1/S2 cleavage site. The number of polymorphic amino acids in this region is 73, with only 15 in the S2 domain. We believe that the S1 domain, being located on the surface of the virion, tolerates many substitutions, whereas the S2 domain must be conserved to preserve virus integrity. Glycosylation and cysteine residues are also crucial for maintaining virus integrity. Our data show that glycosylation motif loss occurred once each in the Delta and Lambda variants and that additional glycosylation motif gain occurred twice in Gamma and once in Lambda. Cysteine residue loss (CRL) occurred only once, in the GH/490R variant. The newest variant, Omicron, exhibits nine amino acid changes in the receptor-binding site (RBS), whereas the other variants are more conserved, with only one or two residue differences from Wuhan-Hu-1. Therefore, it is plausible to expect biological changes in Omicron that differ from those of other variants as well as the original strain. Tracking variant occurrence on the GISAID website revealed a sharp global increase in Omicron from under 1% on November 29, 2021, to 50% of circulating strains on December 29, 2021. This rise indicates higher transmissibility. The Omicron linear and conformational epitopes diverge most from the original COVID-19 strain Wuhan-Hu-1, with ten and seven amino acid substitutions, respectively. Seven amino acid substitutions at conformational epitopes also occur in the Lambda variant. The other variants have only 2–6 amino acid changes in linear epitopes and 1–3 substitutions in conformational epitopes of spike. The identified SARS-CoV-2 spike epitopes consist of 10–20 or more residues. 
As B- and T-cell epitopes presented by MHC-I are limited to 9–11 residues and those presented by MHC-II are limited to 9–22 [22], the identified epitopes of the spike protein of SARS-CoV-2 need to be refined. Moreover, because spike carries multiple epitopes (more than 20), a slight change at some epitopes while others are conserved should not lead to a significant reduction in the effectiveness of existing vaccines. Reports of reduced vaccine efficacy against variants based on in vitro experiments [23] should not cause concern, as in vitro findings might not translate into a significant reduction in vaccine efficacy in vivo. Indeed, the immune system consists of an array of components, such as cytokines, complement activation, and macrophage opsonization [24-26], as already reported in SARS-CoV-2 [27-30]. Moreover, cell-mediated immunity must play a critical role in complete immune protection against SARS-CoV-2 [31], and it is not captured by in vitro neutralization testing. The S1/S2 cleavage site changes from nonbasic Q/N/P to basic H/R/K amino acids might be critical to the biology of the virus, especially Omicron. Although other variants show a single amino acid change, Omicron exhibits two. Moreover, all variants carry more basic S1/S2 cleavage sites, and Omicron has the most basic S1/S2 cleavage site. It is well documented for influenza viruses that a polybasic cleavage site allows activation by ubiquitous cellular proteases, enabling the virus to initiate infection [32-34]. The cleavage site of the original SARS-CoV-2 strain is indeed polybasic, with an NH2-PRRAR-COOH motif. A P681H change occurred in Alpha, Mu, and Omicron, with P681R in Delta and GH/490R. A more basic cleavage site was acquired in Omicron due to the N679K substitution. For the Delta variant, the more basic cleavage site might have contributed to its transmissibility and clinical outcomes [35, 36]. Therefore, it is plausible to expect higher transmissibility of Omicron, as its cleavage site is more basic than that of the Delta variant. 
However, its clinical consequences among nonimmune people, who are unvaccinated or have not experienced natural infection, are expected to be reported soon. Despite these changes, Omicron has a relatively conserved S2 domain. Omicron shows amino acid alterations in the fusion peptide and HR1, with one and three substitutions, respectively, which might have an impact on fusion capability. As described previously, these domains mediate membrane fusion in infected cells [17, 20, 21]. Phylogenetic analysis showed that Omicron forms a long branch from the other members of its cluster. Long branches are most frequently observed when a lineage is isolated and has no known close relatives [37]. The ancestor of this variant might have been circulating without notice, or the number of genome sequences from the area of circulation was limited. Another explanation is that dramatic changes might have occurred shortly before its identification. We expect some biological changes due to amino acid substitutions in the spike protein of SARS-CoV-2, especially for Omicron. The largest number of amino acid substitutions occurred in Omicron, with divergence in the RBS, epitopes, S1/S2 cleavage site, fusion peptide, and HR1. Its sharp global increase to 50% of circulating viruses is evidence of its high community transmissibility. As multiple epitopes of more than 20 residues exist on spike, a slight change in some epitopes while others are conserved should not lead to a significant reduction in existing vaccine effectiveness. The emergence of Omicron with dramatic changes and an unknown recent ancestor should draw attention from the global community for resource mobilization. In conclusion, the proposed guideline could give immediate insight into the probable biological nature of any variant of SARS-CoV-2. 
Omicron has diverged the farthest from the original pandemic strain, Wuhan-Hu-1, with divergence in the RBS, epitopes, S1/S2 cleavage site, fusion peptide, and HR1. Therefore, we expect different epidemiological and clinical patterns of Omicron cases. Regarding vaccine efficacy, slight changes in some epitopes while others are conserved should not lead to a significant reduction in existing vaccine effectiveness.
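The argument about cleavage-site basicity can be made concrete with a simple count of basic residues. The Python sketch below applies the substitutions named in the Discussion (Wuhan-Hu-1 numbering) to the S1/S2 region and counts H, R, and K, following the paper's own grouping of these three as basic. The 11-residue window (residues 675–685) and its sequence are assumptions that should be verified against the reference record, and the function names are hypothetical.

```python
SITE_START = 675
WUHAN_SITE = "QTQTNSPRRAR"   # assumed Wuhan-Hu-1 residues 675-685, ending in PRRAR
BASIC = set("HRK")           # residues treated as basic in this paper


def apply_changes(site, changes, start=SITE_START):
    """Apply substitutions such as 'P681H' (Wuhan-Hu-1 numbering) to the window."""
    residues = list(site)
    for change in changes:
        ref, pos, alt = change[0], int(change[1:-1]), change[-1]
        index = pos - start
        if not 0 <= index < len(residues):
            raise ValueError(f"{change} lies outside the window")
        if residues[index] != ref:
            raise ValueError(f"{change}: expected {ref}, found {residues[index]}")
        residues[index] = alt
    return "".join(residues)


def basic_count(site):
    """Number of H/R/K residues in the window."""
    return sum(residue in BASIC for residue in site)


counts = {
    "Wuhan-Hu-1": basic_count(WUHAN_SITE),
    "Delta": basic_count(apply_changes(WUHAN_SITE, ["P681R"])),
    "Omicron": basic_count(apply_changes(WUHAN_SITE, ["N679K", "P681H"])),
}
```

Under these assumptions the window holds three basic residues in Wuhan-Hu-1, four in Delta, and five in Omicron, matching the ordering described in the text. A residue count is only a crude proxy: it says nothing about protease specificity or actual cleavage efficiency.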
  36 in total

Review 1.  The molecular biology of coronaviruses.

Authors:  Paul S Masters
Journal:  Adv Virus Res       Date:  2006       Impact factor: 9.937

2.  A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences.

Authors:  M Kimura
Journal:  J Mol Evol       Date:  1980-12       Impact factor: 2.395

3.  Effectiveness of the CoronaVac vaccine in older adults during a gamma variant associated epidemic of covid-19 in Brazil: test negative case-control study.

Authors:  Otavio T Ranzani; Matt D T Hitchings; Murilo Dorion; Tatiana Lang D'Agostini; Regiane Cardoso de Paula; Olivia Ferreira Pereira de Paula; Edlaine Faria de Moura Villela; Mario Sergio Scaramuzzini Torres; Silvano Barbosa de Oliveira; Wade Schulz; Maria Almiron; Rodrigo Said; Roberto Dias de Oliveira; Patricia Vieira da Silva; Wildo Navegantes de Araújo; Jean Carlo Gorinchteyn; Jason R Andrews; Derek A T Cummings; Albert I Ko; Julio Croda
Journal:  BMJ       Date:  2021-08-20

Review 4.  Recent Aspects on the Pathogenesis Mechanism, Animal Models and Novel Therapeutic Interventions for Middle East Respiratory Syndrome Coronavirus Infections.

Authors:  Sinosh Skariyachan; Sneha Basavaraj Challapilli; Swathi Packirisamy; Supreetha Toplar Kumargowda; Vaishnavi Sneha Sridhar
Journal:  Front Microbiol       Date:  2019-03-26       Impact factor: 5.640

5.  Variable impact of the hemagglutinin polybasic cleavage site on virulence and pathogenesis of avian influenza H7N7 virus in chickens, turkeys and ducks.

Authors:  David Scheibner; Reiner Ulrich; Olanrewaju I Fatola; Annika Graaf; Marcel Gischke; Ahmed H Salaheldin; Timm C Harder; Jutta Veits; Thomas C Mettenleiter; Elsayed M Abdelwhab
Journal:  Sci Rep       Date:  2019-08-09       Impact factor: 4.379

6.  Efficacy of NVX-CoV2373 Covid-19 Vaccine against the B.1.351 Variant.

Authors:  Vivek Shinde; Sutika Bhikha; Zaheer Hoosain; Moherndran Archary; Qasim Bhorat; Lee Fairlie; Umesh Lalloo; Mduduzi S L Masilela; Dhayendre Moodley; Sherika Hanley; Leon Fouche; Cheryl Louw; Michele Tameris; Nishanta Singh; Ameena Goga; Keertan Dheda; Coert Grobbelaar; Gertruida Kruger; Nazira Carrim-Ganey; Vicky Baillie; Tulio de Oliveira; Anthonet Lombard Koen; Johan J Lombaard; Rosie Mngqibisa; As'ad E Bhorat; Gabriella Benadé; Natasha Lalloo; Annah Pitsi; Pieter-Louis Vollgraaff; Angelique Luabeya; Aliasgar Esmail; Friedrich G Petrick; Aylin Oommen-Jose; Sharne Foulkes; Khatija Ahmed; Asha Thombrayil; Lou Fries; Shane Cloney-Clark; Mingzhu Zhu; Chijioke Bennett; Gary Albert; Emmanuel Faust; Joyce S Plested; Andreana Robertson; Susan Neal; Iksung Cho; Greg M Glenn; Filip Dubovsky; Shabir A Madhi
Journal:  N Engl J Med       Date:  2021-05-05       Impact factor: 91.245

7.  Clinical Features of Adult COVID-19 Patients without Risk Factors before and after the Nationwide SARS-CoV-2 B.1.617.2 (Delta)-variant Outbreak in Korea: Experience from Gyeongsangnam-do.

Authors:  Byung-Han Ryu; Sun In Hong; Su Jin Lim; Younghwa Cho; Cheolgu Hwang; Hyungseok Kang; Si-Ho Kim; Yu Mi Wi; Kyung-Wook Hong; In-Gyu Bae; Oh-Hyun Cho
Journal:  J Korean Med Sci       Date:  2021-12-20       Impact factor: 2.153

Review 8.  SARS-CoV-2 variants, spike mutations and immune escape.

Authors:  William T Harvey; Alessandro M Carabelli; Ben Jackson; Ravindra K Gupta; Emma C Thomson; Ewan M Harrison; Catherine Ludden; Richard Reeve; Andrew Rambaut; Sharon J Peacock; David L Robertson
Journal:  Nat Rev Microbiol       Date:  2021-06-01       Impact factor: 78.297

9.  Responsiveness to a pandemic alert: use of reverse genetics for rapid development of influenza vaccines.

Authors:  R J Webby; D R Perez; J S Coleman; Y Guan; J H Knight; E A Govorkova; L R McClain-Moss; J S Peiris; J E Rehg; E I Tuomanen; R G Webster
Journal:  Lancet       Date:  2004-04-03       Impact factor: 79.321

10.  Characterization of the receptor-binding domain (RBD) of 2019 novel coronavirus: implication for development of RBD protein as a viral attachment inhibitor and vaccine.

Authors:  Wanbo Tai; Lei He; Xiujuan Zhang; Jing Pu; Denis Voronin; Shibo Jiang; Yusen Zhou; Lanying Du
Journal:  Cell Mol Immunol       Date:  2020-03-19       Impact factor: 11.530
