Aanchal Mathur1, Sibi Raj1, Niraj Kumar Jha2, Saurabh Kumar Jha2, Brijesh Rathi3, Dhruv Kumar1.
Abstract
The novel coronavirus SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2) has caused a global crisis, infecting millions of people across the globe and causing many deaths. The key player in infection is the spike protein, which mediates entry of the virus into the host system. The S1 subunit is the most essential in this process, as it enables SARS-CoV-2 to infect the host by binding to human angiotensin converting enzyme 2 (hACE2) through the receptor binding domain located in the S1 subunit. Studies also hypothesize that the S glycoproteins of the virus interact with different hosts in different ways, which might be due to mutations accumulating in the viral genome over time. This work aims to decipher the similarities and differences among spike protein sequences from SARS-CoV-2 samples acquired from infected individuals in different countries, using in silico methods such as multiple sequence alignment and phylogenetic analysis. It also aims to understand the differential infection rates among the affected countries by studying the amino acid composition and interactions of the virus with the host. © King Abdulaziz City for Science and Technology 2021.
Keywords: COVID-19; Glycoproteins; Mutational heterogeneity; SARS-CoV-2; Spike proteins
Year: 2021 PMID: 33936927 PMCID: PMC8070983 DOI: 10.1007/s13205-021-02791-y
Source DB: PubMed Journal: 3 Biotech ISSN: 2190-5738 Impact factor: 2.406
Fig. 1 Schematic representation of the binding of the S1 subunit of the SARS-CoV-2 spike protein to ACE2 on a human cell. The receptor binding domain identifies and binds to ACE2 in the host organism
Fig. 2 Number of SARS-CoV-2 cases in different countries from January 2020 to 31 May 2020
Fig. 3 MSA of 1247 sequences showed that 63 sequences carried the D614G mutation
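
The kind of comparison summarized in Fig. 3 can be illustrated with a minimal sketch: once sequences are aligned, each sample row is compared column by column against the reference row, and mismatches are reported at reference-numbered positions. The function name `call_substitutions` and the toy fragments are hypothetical, chosen only for illustration; they are not the authors' MEGA X workflow and not real spike sequence.

```python
def call_substitutions(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """Compare two rows of an alignment and list residue substitutions.

    Positions are 1-based and counted along the ungapped reference, so a gap
    in the reference row (an insertion in the sample) does not advance the
    counter. Gaps in the sample row (deletions) are skipped, not reported.
    """
    if len(reference) != len(sample):
        raise ValueError("aligned sequences must have equal length")

    substitutions = []
    ref_pos = 0
    for ref_res, smp_res in zip(reference, sample):
        if ref_res == "-":
            continue  # insertion relative to the reference: no coordinate
        ref_pos += 1
        if smp_res != "-" and smp_res != ref_res:
            substitutions.append((ref_pos, ref_res, smp_res))
    return substitutions


# Toy aligned fragments (hypothetical, for illustration only).
ref_row = "MKT-AYDLQ"
smp_row = "MKTGAYGLQ"
print(call_substitutions(ref_row, smp_row))  # [(6, 'D', 'G')]
```

Numbering along the ungapped reference keeps positions comparable across samples even when the alignment introduces gap columns.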
Mutations deciphered after multiple sequence alignment using MEGA X
| S. no. | Name of sequence | Accession number | Country | Sequence length (amino acids) | Amino acid substitutions |
|---|---|---|---|---|---|
| 1 | Surface glycoprotein Severe acute respiratory syndrome coronavirus 2 (RefSeq) | YP_009724390 | China | 1273 | 614(G->D) |
| 2 | Surface glycoprotein Severe acute respiratory syndrome coronavirus 2 | QIU81825 | China | 1273 | 614(G->D) |
| 3 | Surface glycoprotein (SARS-CoV-2) | QJQ84088 | China | 1273 | 614(G->D) |
| 4 | Surface glycoprotein (SARS-CoV-2) | QIE07471 | China | 1273 | 614(G->D) |
| 5 | Surface glycoprotein (SARS-CoV-2) | QHZ00358 | China | 1273 | 614(G->D) |
| 6 | Surface glycoprotein (SARS-CoV-2) | QIS30006 | China | 1273 | 614(G->D) |
| 7 | S protein (SARS-CoV-2) | QII57161 | China | 1273 | 614(G->D) |
| 8 | Surface glycoprotein (SARS-CoV-2) | QHN73795 | China | 1273 | 614(G->D) |
| 9 | Surface glycoprotein (SARS-CoV-2) | QIA20044 | China | 1273 | 24(Y->N) 614(G->D) |
| 10 | Surface glycoprotein Severe acute respiratory syndrome coronavirus 2 | QIQ68554 | China | 1273 | 614(G->D) |
| 11 | Surface glycoprotein (SARS-CoV-2) | QJD07628 | Hong Kong | 1273 | 614(G->D) |
| 12 | Surface glycoprotein (SARS-CoV-2) | QJD07640 | Hong Kong | 1273 | 614(G->D) |
| 13 | Surface glycoprotein (SARS-CoV-2) | QJD07652 | Hong Kong | 1273 | 614(G->D) |
| 14 | Surface glycoprotein (SARS-CoV-2) | QJD07664 | Hong Kong | 1273 | 614(G->D) |
| 15 | Surface glycoprotein (SARS-CoV-2) | QJD07676 | Hong Kong | 1273 | 614(G->D) |
| 16 | Surface glycoprotein (SARS-CoV-2) | QIT07011 | Hong Kong | 1273 | 8(L->V) 614(G->D) |
| 17 | Surface glycoprotein (SARS-CoV-2) | QIT08268 | Hong Kong | 1273 | 8(L->V) 614(G->D) |
| 18 | Surface glycoprotein (SARS-CoV-2) | QIT08280 | Hong Kong | 1273 | 8(L->V) 614(G->D) |
| 19 | Surface glycoprotein (SARS-CoV-2) | QIT08304 | Hong Kong | 1273 | – |
| 20 | Surface glycoprotein (SARS-CoV-2) | QIK02132 | Hong Kong | 1273 | 614(G->D) |
| 21 | S glycoprotein (SARS-CoV-2) | QJR84345 | India | 1273 | 614(G->D) |
| 22 | Surface glycoprotein (SARS-CoV-2) | QJC19491 | India | 1273 | – |
| 23 | Surface glycoprotein (SARS-CoV-2) | QJQ28429 | India | 1273 | – |
| 24 | S glycoprotein (SARS-CoV-2) | QHS34546 | India | 1273 | 408(R->I) 614(G->D) |
| 25 | Surface glycoprotein (SARS-CoV-2) | QJS39639 | India | 1273 | – |
| 26 | Surface glycoprotein (SARS-CoV-2) | QJQ28417 | India | 1273 | – |
| 27 | S-glycoprotein (SARS-CoV-2) | QJR84453 | India | 1273 | – |
| 28 | S-glycoprotein (SARS-CoV-2) | QJQ28393 | India | 1273 | – |
| 29 | Surface glycoprotein (SARS-CoV-2) | QJF77846 | India | 1273 | 28(Y->H) 614(G->D) |
| 30 | Surface glycoprotein (SARS-CoV-2) | QJF77870 | India | 1273 | 614(G->D) |
| 31 | Surface glycoprotein (SARS-CoV-2) | QJS53338 | Greece | 1273 | – |
| 32 | Surface glycoprotein (SARS-CoV-2) | QJS53350 | Greece | 1273 | – |
| 33 | S-glycoprotein (SARS-CoV-2) | QJS53362 | Greece | 1273 | – |
| 34 | Surface glycoprotein (SARS-CoV-2) | QJS53374 | Greece | 1273 | – |
| 35 | Surface glycoprotein Severe acute respiratory syndrome coronavirus 2 | QJS53386 | Greece | 1273 | 789(Y->D) |
| 36 | Surface glycoprotein (SARS-CoV-2) | QJS53398 | Greece | 1273 | 614(G->D) 1122(V->L) |
| 37 | Surface glycoprotein (SARS-CoV-2) | QJS53410 | Greece | 1273 | 188(N->D) |
| 38 | Surface glycoprotein (SARS-CoV-2) | QJS53422 | Greece | 1273 | – |
| 39 | Surface glycoprotein (SARS-CoV-2) | QJS53434 | Greece | 1273 | – |
| 40 | Surface glycoprotein (SARS-CoV-2) | QJS53446 | Greece | 1273 | – |
| 41 | S-glycoprotein (SARS-CoV-2) | QJT72086 | France | 1273 | 153(M->I) 614(G->D) 845(A->S) |
| 42 | Surface glycoprotein (SARS-CoV-2) | QJT72098 | France | 1273 | – |
| 43 | Surface glycoprotein Severe acute respiratory syndrome coronavirus 2 | QJT72110 | France | 1273 | – |
| 44 | Surface glycoprotein (SARS-CoV-2) | QJT72122 | France | 1273 | – |
| 45 | Surface glycoprotein (SARS-CoV-2) | QJT72134 | France | 1273 | 5(L->F) 614(G->D) |
| 46 | Surface glycoprotein (SARS-CoV-2) | QJT72146 | France | 1273 | – |
| 47 | Surface glycoprotein (SARS-CoV-2) | QJT72158 | France | 1273 | – |
| 48 | Surface glycoprotein (SARS-CoV-2) | QJT72170 | France | 1273 | – |
| 49 | Surface glycoprotein (SARS-CoV-2) | QJT72182 | France | 1273 | 614(G->D) 845(A->S) |
| 50 | Surface glycoprotein (SARS-CoV-2) | QJT72194 | France | 1273 | – |
| 51 | Surface glycoprotein (SARS-CoV-2) | QJQ84568 | Thailand | 1273 | 614(G->D) |
| 52 | S-glycoprotein (SARS-CoV-2) | QJQ84580 | Thailand | 1273 | 614(G->D) |
| 53 | Surface glycoprotein (SARS-CoV-2) | QJQ84592 | Thailand | 1273 | 614(G->D) |
| 54 | Surface glycoprotein (SARS-CoV-2) | QJQ84604 | Thailand | 1273 | 614(G->D) |
| 55 | Surface glycoprotein (SARS-CoV-2) | QJQ84616 | Thailand | 1273 | 614(G->D) |
| 56 | Surface glycoprotein (SARS-CoV-2) | QJQ84628 | Thailand | 1273 | 614(G->D) |
| 57 | S-glycoprotein (SARS-CoV-2) | QJQ84652 | Thailand | 1273 | 614(G->D) |
| 58 | Surface glycoprotein (SARS-CoV-2) | QJQ84664 | Thailand | 1273 | 614(G->D) |
| 59 | Surface glycoprotein (SARS-CoV-2) | QJQ84676 | Thailand | 1273 | 614(G->D) 829(A->T) |
| 60 | Surface glycoprotein (SARS-CoV-2) | QJQ84700 | Thailand | 1273 | 614(G->D) 829(A->T) |
| 61 | S-glycoprotein (SARS-CoV-2) | QJD47718 | Taiwan | 1273 | 49(H->Y) 614(G->D) 884(S->F) |
| 62 | Surface glycoprotein (SARS-CoV-2) | QJD47728 | Taiwan | 1273 | 614(G->D) 791(T->I) |
| 63 | Surface glycoprotein (SARS-CoV-2) | QJD47740 | Taiwan | 1273 | 614(G->D) 791(T->I) |
| 64 | Surface glycoprotein (SARS-CoV-2) | QJD47752 | Taiwan | 1273 | 614(G->D) 791(T->I) |
| 65 | Surface glycoprotein (SARS-CoV-2) | QJD47764 | Taiwan | 1273 | 614(G->D) |
| 66 | S-glycoprotein (SARS-CoV-2) | QJD47776 | Taiwan | 1273 | 614(G->D) |
| 67 | Surface glycoprotein (SARS-CoV-2) | QJD47788 | Taiwan | 1273 | 614(G->D) |
| 68 | Surface glycoprotein Severe acute respiratory syndrome coronavirus 2 | QJD47800 | Taiwan | 1273 | 765(R->L) |
| 69 | Surface glycoprotein (SARS-CoV-2) | QJD47812 | Taiwan | 1273 | – |
| 70 | Surface glycoprotein (SARS-CoV-2) | QJD47824 | Taiwan | 1273 | – |
| 71 | Surface glycoprotein (SARS-CoV-2) | QJR85233 | Australia | 1273 | 614(G->D) |
| 72 | Surface glycoprotein (SARS-CoV-2) | QJR85269 | Australia | 1273 | 614(G->D) |
| 73 | Surface glycoprotein (SARS-CoV-2) | QJR85281 | Australia | 1273 | 614(G->D) |
| 74 | Surface glycoprotein (SARS-CoV-2) | QJR85305 | Australia | 1273 | 614(G->D) |
| 75 | Surface glycoprotein (SARS-CoV-2) | QJR85341 | Australia | 1273 | 614(G->D) |
| 76 | Surface glycoprotein (SARS-CoV-2) | QJR85353 | Australia | 1273 | 614(G->D) |
| 77 | Surface glycoprotein (SARS-CoV-2) | QJR85365 | Australia | 1273 | 614(G->D) |
| 78 | Surface glycoprotein (SARS-CoV-2) | QJR85377 | Australia | 1273 | – |
| 79 | Surface glycoprotein (SARS-CoV-2) | QJR85401 | Australia | 1273 | – |
| 80 | S-glycoprotein (SARS-CoV-2) | QJR85425 | Australia | 1273 | 614(G->D) |
| 81 | Surface glycoprotein (SARS-CoV-2) | QJU11421 | USA | 1273 | – |
| 82 | Surface glycoprotein (SARS-CoV-2) | QJU11433 | USA | 1273 | – |
| 83 | Surface glycoprotein (SARS-CoV-2) | QJU11445 | USA | 1273 | 614(G->D) |
| 84 | Surface glycoprotein (SARS-CoV-2) | QJU11457 | USA | 1273 | – |
| 85 | Surface glycoprotein (SARS-CoV-2) | QJU11469 | USA | 1273 | – |
| 86 | Surface glycoprotein (SARS-CoV-2) | QJU11481 | USA | 1273 | 258(W->L) |
| 87 | Surface glycoprotein (SARS-CoV-2) | QJU11493 | USA | 1273 | – |
| 88 | Surface glycoprotein (SARS-CoV-2) | QJU11505 | USA | 1273 | 614(G->D) |
| 89 | Surface glycoprotein (SARS-CoV-2) | QJT43404 | USA | 1273 | – |
| 90 | Surface glycoprotein (SARS-CoV-2) | QJS54526 | USA | 1273 | – |
| 91 | Surface glycoprotein (SARS-CoV-2) | QJC21005 | Spain | 1273 | – |
| 92 | Surface glycoprotein Severe acute respiratory syndrome coronavirus 2 | QJC21017 | Spain | 1273 | – |
| 93 | S-glycoprotein (SARS-CoV-2) | QIU78707 | Spain | 1273 | – |
| 94 | Surface glycoprotein (SARS-CoV-2) | QIU78719 | Spain | 1273 | – |
| 95 | Surface glycoprotein (SARS-CoV-2) | QIU78731 | Spain | 1273 | 614(G->D) |
| 96 | Surface glycoprotein (SARS-CoV-2) | QIU78743 | Spain | 1273 | 614(G->D) |
| 97 | S-glycoprotein (SARS-CoV-2) | QIU78755 | Spain | 1273 | 614(G->D) |
| 98 | Surface glycoprotein (SARS-CoV-2) | QIU78767 | Spain | 1273 | 614(G->D) |
| 99 | Surface glycoprotein (SARS-CoV-2) | QIU78779 | Spain | 1273 | – |
| 100 | S-glycoprotein (SARS-CoV-2) | QIQ08790 | Spain | 1273 | 614(G->D) |
| 101 | Surface glycoprotein (SARS-CoV-2) | QJC19419 | Germany | 1273 | 271(Q->R) 614(G->D) |
| 102 | Surface glycoprotein (SARS-CoV-2) | QJC19431 | Germany | 1273 | – |
| 103 | Surface glycoprotein (SARS-CoV-2) | QJC19443 | Germany | 1273 | – |
| 104 | Surface glycoprotein (SARS-CoV-2) | QJC19455 | Germany | 1273 | 558(K->R) 614(G->D) |
| 105 | Surface glycoprotein (SARS-CoV-2) | QJC19467 | Germany | 1273 | – |
| 106 | Surface glycoprotein (SARS-CoV-2) | QJC19479 | Germany | 1273 | – |
| 107 | Surface glycoprotein (SARS-CoV-2) | QJD23141 | Czech Republic | 1273 | 115(Q->R) |
| 108 | Surface glycoprotein (SARS-CoV-2) | QJD23153 | Czech Republic | 1273 | 1229(M->I) |
| 109 | Surface glycoprotein (SARS-CoV-2) | QJD23165 | Czech Republic | 1273 | – |
| 110 | Surface glycoprotein (SARS-CoV-2) | QJD23177 | Czech Republic | 1273 | – |
| 111 | Surface glycoprotein (SARS-CoV-2) | QJD23189 | Czech Republic | 1273 | – |
| 112 | Surface glycoprotein (SARS-CoV-2) | QJD23201 | Czech Republic | 1273 | – |
| 113 | Surface glycoprotein (SARS-CoV-2) | QJD23213 | Czech Republic | 1273 | – |
| 114 | Surface glycoprotein (SARS-CoV-2) | QJI53859 | Puerto Rico | 1273 | 614(G->D) |
| 115 | Surface glycoprotein (SARS-CoV-2) | QJI53883 | Puerto Rico | 1273 | 614(G->D) |
| 116 | Surface glycoprotein (SARS-CoV-2) | QJI53907 | Puerto Rico | 1273 | – |
| 117 | Surface glycoprotein Severe acute respiratory syndrome coronavirus 2 | QJI53919 | Puerto Rico | 1273 | – |
| 118 | Surface glycoprotein (SARS-CoV-2) | QJI53931 | Puerto Rico | 1273 | 614(G->D) |
| 119 | Surface glycoprotein (SARS-CoV-2) | QJI53955 | Puerto Rico | 1273 | 239(Q->R) |
| 120 | Surface glycoprotein (SARS-CoV-2) | QJI53979 | Puerto Rico | 1273 | – |
| 121 | S-glycoprotein (SARS-CoV-2) | QJD20837 | Sri Lanka | 1273 | 614(G->D) |
| 122 | Surface glycoprotein (SARS-CoV-2) | QJD20849 | Sri Lanka | 1273 | – |
| 123 | Surface glycoprotein (SARS-CoV-2) | QJD20861 | Sri Lanka | 1273 | – |
| 124 | Surface glycoprotein (SARS-CoV-2) | QJD20873 | Sri Lanka | 1273 | 614(G->D) |
| 125 | Surface glycoprotein (SARS-CoV-2) | QIZ15537 | South Africa | 1273 | – |
| 126 | Surface glycoprotein (SARS-CoV-2) | QJQ84843 | Iran | 1273 | 22(T->I) 614(G->D) |
| 127 | Surface glycoprotein (SARS-CoV-2) | QIX12195 | Iran | 1273 | 614(G->D) |
| 128 | S-glycoprotein (SARS-CoV-2) | QIT06987 | Israel | 1273 | 614(G->D) |
| 129 | Surface glycoprotein (SARS-CoV-2) | QIT06999 | Israel | 1273 | – |
| 130 | Surface glycoprotein (SARS-CoV-2) | QJQ04445 | Kazakhstan | 1273 | 614(G->D) |
| 131 | Surface glycoprotein (SARS-CoV-2) | QJQ04457 | Kazakhstan | 1273 | – |
| 132 | Surface glycoprotein (SARS-CoV-2) | QJQ04469 | Kazakhstan | 1273 | – |
| 133 | Surface glycoprotein Severe acute respiratory syndrome coronavirus 2 | QJQ04481 | Kazakhstan | 1273 | 614(G->D) |
| 134 | Surface glycoprotein Severe acute respiratory syndrome coronavirus 2 | QJD23225 | Malaysia | 1273 | 614(G->D) |
| 135 | Surface glycoprotein (SARS-CoV-2) | QJD23237 | Malaysia | 1273 | 614(G->D) |
| 136 | Surface glycoprotein (SARS-CoV-2) | QJD23249 | Malaysia | 1273 | 292(A->V) 293(L->M) 294(D->I) 295(P->H) 296(L->F) 297(S->W) 491(P->L) 519(H->Q) 614(G->D) |
| 137 | Surface glycoprotein Severe acute respiratory syndrome coronavirus 2 | QIB84673 | Nepal | 1273 | 614(G->D) |
| 138 | Surface glycoprotein (SARS-CoV-2) | QIS60276 | Pakistan | 1273 | 614(G->D) |
| 139 | Surface glycoprotein (SARS-CoV-2) | QIQ22760 | Pakistan | 1273 | 614(G->D) |
| 140 | S-glycoprotein (SARS-CoV-2) | QIV14984 | South Korea | 1273 | 614(G->D) |
| 141 | Surface glycoprotein (SARS-CoV-2) | QIV14996 | South Korea | 1273 | 614(G->D) |
| 142 | Surface glycoprotein (SARS-CoV-2) | QIV15008 | South Korea | 1273 | 614(G->D) |
| 143 | Surface glycoprotein (SARS-CoV-2) | QHZ00379 | South Korea | 1273 | 221(S->W) 614(G->D) |
| 144 | Surface glycoprotein (SARS-CoV-2) | QIC50498 | Italy | 1273 | 614(G->D) |
| 145 | Surface glycoprotein (SARS-CoV-2) | QIA98554 | Italy | 1273 | 614(G->D) |
| 146 | Surface glycoprotein (SARS-CoV-2) | QJA41641 | Brazil | 1273 | 74(N->K) 614(G->D) |
| 147 | Surface glycoprotein Severe acute respiratory syndrome coronavirus 2 | QIG55994 | Brazil | 1273 | 614(G->D) |
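
To tally the substitutions listed in the table above, the entries in the last column can be parsed into (position, residue, residue) tuples. The sketch below is illustrative only: `parse_substitutions` and the pattern `SUB_RE` are hypothetical names, the four rows are copied from the table, and each entry is read exactly as printed, with no assumption about which residue is the reference. The pattern tolerates stray spaces around the arrow.

```python
import re
from collections import Counter

# Matches entries such as "614(G->D)" or "614(G- > D)"; spacing is optional.
SUB_RE = re.compile(r"(\d+)\(\s*([A-Za-z])\s*-\s*>\s*([A-Za-z])\s*\)")


def parse_substitutions(cell: str) -> list[tuple[int, str, str]]:
    """Turn one table cell into a list of (position, first, second) tuples.

    A cell containing only a dash (no substitutions) yields an empty list.
    """
    return [(int(pos), a.upper(), b.upper()) for pos, a, b in SUB_RE.findall(cell)]


# A few rows from the table: (accession number, country, substitutions).
rows = [
    ("QIA20044", "China", "24(Y->N) 614(G->D)"),
    ("QJS53398", "Greece", "614(G->D) 1122(V->L)"),
    ("QJT72098", "France", "–"),
    ("QJD47728", "Taiwan", "614(G->D) 791(T->I)"),
]

# Count how many of the sampled rows list a substitution at each position.
position_counts = Counter(
    pos for _, _, cell in rows for pos, _, _ in parse_substitutions(cell)
)
print(position_counts.most_common(1))  # [(614, 3)]
```

Grouping the same tuples by the country field instead of by position would give a per-country count of sequences that list a given substitution.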
Fig. 4 Condensed circular phylogenetic tree of predominantly related samples from Puerto Rico, USA, China, Hong Kong and Australia
Fig. 5 Wild-type (WT) and mutated sequences of the S2 domain of the spike glycoprotein of SARS-CoV-2. Aspartic acid is replaced by glycine at position 614 of the S2 domain of the spike glycoprotein
Fig. 6 Ribbon-style 3D structural representation of the WT (614: D, aspartic acid) and mutated (614: G, glycine) spike glycoprotein of SARS-CoV-2