Sayantan Laha1, Joyeeta Chakraborty1, Shantanab Das1, Soumen Kanti Manna2, Sampa Biswas3, Raghunath Chatterjee4.
Abstract
The recent pandemic of SARS-CoV-2 infection has affected more than 3.0 million
Keywords: Frequent mutation; Hot-spot mutations; Protein stability; SARS-CoV-2; Spike glycoprotein
Year: 2020 PMID: 32615316 PMCID: PMC7324922 DOI: 10.1016/j.meegid.2020.104445
Source DB: PubMed Journal: Infect Genet Evol ISSN: 1567-1348 Impact factor: 3.342
Nucleotide and protein alignment of SARS-CoV-2 genes.
| Name | Total No of Samples | No. of nucleotides | Variable sites | Variable Sites per 100 bp | Synonymous | Non Synonymous | Singleton Informative Site per 100 bp | Parsimony Informative Site per 100 bp | No. of Amino acids | Variable Site | Variable Sites per 100 aa | Singleton Informative Site per 100 aa | Parsimony Informative Site per 100 aa |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Envelope Protein | 664 | 228 | 4 | 1.75 | 2 | 2 | 1.32 | 0.44 | 76 | 2 | 0.88 | 0.88 | 0.00 |
| Membrane Protein | 649 | 669 | 7 | 1.20 | 2 | 5 | 0.60 | 0.60 | 223 | 5 | 0.75 | 0.45 | 0.30 |
| Nucleocapsid Protein | 664 | 1260 | 50 | 3.97 | 24 | 26 | 2.54 | 1.43 | 420 | 26 | 2.06 | 1.43 | 0.63 |
| ORF10 | 660 | 117 | 4 | 3.42 | 2 | 2 | 2.56 | 0.85 | 39 | 2 | 1.71 | 1.71 | 0.00 |
| ORF1ab | 639 | 21,291 | 383 | 1.80 | 148 | 235 | 1.34 | 0.46 | 7097 | 235 | 1.10 | 0.83 | 0.28 |
| ORF3a | 660 | 828 | 28 | 3.38 | 8 | 20 | 2.05 | 1.33 | 276 | 20 | 2.42 | 1.33 | 1.09 |
| ORF6 | 657 | 186 | 6 | 3.23 | 3 | 3 | 1.61 | 1.61 | 62 | 3 | 1.61 | 1.08 | 0.54 |
| ORF7a | 658 | 366 | 9 | 2.73 | 3 | 6 | 1.91 | 0.82 | 122 | 6 | 1.64 | 1.09 | 0.55 |
| ORF7b | 446 | 132 | 2 | 1.51 | 2 | 0 | 0.76 | 0.76 | 44 | 0 | 0 | 0 | 0 |
| ORF8 | 661 | 366 | 9 | 2.46 | 1 | 8 | 1.37 | 1.09 | 122 | 8 | 2.19 | 1.09 | 1.09 |
| Spike Protein | 643 | 3822 | 74 | 1.96 | 30 | 44 | 1.52 | 0.44 | 1274 | 44 | 1.13 | 0.92 | 0.21 |
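The "per 100 bp" columns appear to be simple normalisations of a site count by gene length. The sketch below assumes the formula `100 * count / length` rounded to two decimals (the formula is inferred from the column names, not stated in the source) and reproduces the tabulated values for three of the rows. The function name `per_100` is illustrative.

```python
def per_100(count: int, length: int) -> float:
    """Sites per 100 positions, rounded to two decimals (assumed formula)."""
    return round(100 * count / length, 2)

# (variable sites, number of nucleotides) copied from the table above
rows = {
    "Envelope Protein": (4, 228),
    "Nucleocapsid Protein": (50, 1260),
    "ORF3a": (28, 828),
}

rates = {name: per_100(sites, length) for name, (sites, length) in rows.items()}

for name, rate in rates.items():
    print(f"{name}: {rate} variable sites per 100 bp")
```

Normalising by length makes short genes such as ORF10 comparable with long ones such as ORF1ab.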
Fig. 1. Frequently mutated residues, i.e. those observed in a minimum of 10 SARS-CoV-2 isolates, are plotted on the respective proteins of the SARS-CoV-2 genome. The tabular presentation depicts the number of occurrences (#) of wild-type and mutant residues among the sequences of SARS-CoV-2 ORFs.
Strains with co-occurring major mutations in N, S, ORF3a, ORF8 and ORF1ab proteins.
| Strain | N 203 | N 204 | S 614 | ORF3a 57 | ORF3a 251 | ORF8 24 | ORF8 62 | ORF8 84 | ORF1ab 75 | ORF1ab 265 | ORF1ab 971 | ORF1ab 3606 | ORF1ab 4715 | ORF1ab 5550 | ORF1ab 5828 | ORF1ab 5865 | ORF1ab 6158 | # of isolates | Freq. (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SARS-CoV-2 | R | G | D | Q | G | S | V | S⁎ | D | T | P | L | P | V | L⁎ | C⁎ | F | 186 | 30.6 |
| SARS-CoV-2 | R | G | G⁎ | H⁎ | G | S | V | L | D | I⁎ | P | L | L⁎ | V | P | Y | F | 124 | 20.4 |
| SARS-CoV-2 | R | G | D | Q | G | S | V | L | D | T | P | L | P | V | P | Y | F | 81 | 13.3 |
| SARS-CoV-2 | R | G | G⁎ | H⁎ | G | S | V | L | D | T | P | L | L⁎ | V | P | Y | F | 44 | 7.2 |
| SARS-CoV-2 | R | G | D | Q | G | S | V | S⁎ | D | T | P | L | P | V | P | Y | F | 28 | 4.6 |
| SARS-CoV-2 | R | G | G⁎ | Q | G | S | V | L | D | T | P | L | L⁎ | V | P | Y | F | 28 | 4.6 |
| SARS-CoV-2 | K⁎ | R⁎ | G⁎ | Q | G | S | V | L | D | T | P | L | L⁎ | V | P | Y | F | 23 | 3.8 |
| SARS-CoV-2 | R | G | D | Q | V⁎ | S | V | L | D | T | P | F⁎ | P | V | P | Y | F | 20 | 3.3 |
| SARS-CoV-2 | R | G | G⁎ | H⁎ | G | L⁎ | V | L | D | I⁎ | P | L | L⁎ | V | P | Y | F | 20 | 3.3 |
| SARS-CoV-2 | R | G | D | Q | G | S | L⁎ | S⁎ | E⁎ | T | L⁎ | L | P | V | P | Y | L⁎ | 17 | 2.8 |
| SARS-CoV-2 | R | G | D | Q | G | S | V | L | D | T | P | F⁎ | P | V | P | Y | F | 11 | 1.8 |
| SARS-CoV-2 | R | G | G⁎ | H⁎ | G | S | V | L | D | I⁎ | P | L | L⁎ | L⁎ | P | Y | F | 8 | 1.3 |
| SARS-CoV-2 | R | G | D | Q | V⁎ | S | V | L | D | T | P | L | P | V | P | Y | F | 5 | 0.8 |
| SARS-CoV-2 | R | G | D | Q | G | S | V | S⁎ | D | T | P | F⁎ | P | V | L⁎ | C⁎ | F | 4 | 0.7 |
| SARS-CoV-2 | R | G | D | Q | G | S | V | S⁎ | D | T | P | F⁎ | P | V | P | Y | F | 3 | 0.5 |
| SARS-CoV-2 | R | G | D | Q | G | S | L⁎ | S⁎ | E⁎ | T | L⁎ | L | P | V | P | Y | F | 1 | 0.2 |
| SARS-CoV-2 | R | G | D | Q | G | S | L⁎ | S⁎ | D | T | P | L | P | V | P | Y | F | 1 | 0.2 |
| SARS-CoV-2 | R | G | G⁎ | H⁎ | G | S | V | L | D | I⁎ | P | F⁎ | L⁎ | V | P | Y | F | 1 | 0.2 |
| SARS-CoV-2 | K⁎ | G | G⁎ | H⁎ | G | S | V | L | D | I⁎ | P | L | L⁎ | V | P | Y | F | 1 | 0.2 |
| SARS-CoV-2 | R | G | G⁎ | H⁎ | G | S | V | S⁎ | D | I⁎ | P | L | L⁎ | V | P | Y | F | 1 | 0.2 |
| Bat coronavirus RaTG13 | R | G | D | Q | G | S | V | S | D | T | P | V | P | V | P | Y | F | | |
| Pangolin coronavirus | R | G | D | Q | G | S | V | S | D | N | Q | V | P | V | P | Y | F | | |
| SARS coronavirus TW11 | R | G | D | Q | G | E | F | Y | D | T | V | V | P | V | P | Y | F | | |
| SARS coronavirus GD01 | R | G | D | Q | G | E | F | Y | D | T | V | V | P | V | P | Y | F | | |
⁎ Mutated residues.
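The "Freq. (%)" column of the co-occurrence table can be reproduced from the isolate counts alone. The sketch below assumes that each frequency is the isolate count divided by the sum of all listed counts (607), rounded to one decimal; that denominator is inferred from the table rather than stated in the source.

```python
# Isolate counts copied from the "# of isolates" column, top to bottom
counts = [186, 124, 81, 44, 28, 28, 23, 20, 20, 17,
          11, 8, 5, 4, 3, 1, 1, 1, 1, 1]

total = sum(counts)  # assumed denominator: all isolates listed in the table

# Percentage of isolates carrying each residue combination
freqs = [round(100 * c / total, 1) for c in counts]

for c, f in zip(counts, freqs):
    print(f"{c:>4} isolates -> {f:.1f} %")
```

Under this assumption the three most common combinations together account for roughly two thirds of the listed isolates.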
Fig. 2. Phylogeny (generated from the GISAID Next hCoV-19 App) with colour coding for the wild-type and substituted residues of A. Spike glycoprotein (S) at amino acid position 614, B. ORF1b at positions 314, 1427 and 1464, C. Nucleocapsid (N) at positions 203 and 204, D. ORF1a at positions 75 and 265, E. ORF3a at positions 57 and 251, and F. ORF8 at positions 24, 62 and 84. Each possible combination of residues at the frequently mutated sites is represented by a distinct colour for each ORF/protein, and the strains are colour-coded according to the combination of residues at the reference sites. The horizontal axis shows the dates around which the isolates were sequenced and submitted.
The following are the supplementary data related to this article. Supplementary Fig. 1: Phylogeny (generated from the GISAID Next hCoV-19 App) with colour coding for the wild-type and all substituted residues of ORF1a and ORF1b.
Potential implications of SARS-CoV-2 mutations on viral protein structure and stability.
| Protein | Mutant | Relative abundance of the mutation (%) | Expected change in charge state | Mean chemical difference index | Difference in Kyte-Doolittle hydropathy index | Apparent partition energy (parent residue) | Apparent partition energy (mutant residue) | Difference in apparent partition energy | Potential difference in number of H-bonds | Potential difference in stability due to H-bond | Presumptive difference due to salt bridge interactions |
|---|---|---|---|---|---|---|---|---|---|---|---|
| S | D614G | 69.8 | 1 | 94 | 3.1 | 0.41 | 0.31 | 0.1 | −2 | 1–3 | 3 |
| S | V483A | 0.9 | 0 | 64 | −2.4 | −0.46 | 0.05 | −0.51 | 0 | 0 | 0 |
| N | R203K | 4.4 | 0 | 26 | 0.6 | 0.12 | 0.57 | −0.45 | −2 | 1–3 | 0 |
| N | G204R | 4.1 | −1 | 125 | −4.1 | 0.31 | 0.12 | 0.19 | 4 | −(2–6) | −3 |
| ORF1ab | D75E | 2.9 | 0 | 45 | 0 | 0.41 | 0.46 | −0.05 | 0 | 0 | 0 |
| ORF1ab | T265I | 34.3 | 0 | 89 | 5.2 | 0.38 | −0.69 | 1.07 | −1 | 0.5–1.5 | 0 |
| ORF1ab | P971L | 2.9 | 0 | 98 | 5.4 | 0.46 | −2.67 | 3.13 | 0 | 0 | 0 |
| ORF1ab | L3606F | 6.7 | 0 | 22 | −1 | −2.67 | −1.03 | −1.64 | 0 | 0 | 0 |
| ORF1ab | P4715L | 68.8 | 0 | 98 | 5.4 | 0.46 | −2.67 | 3.13 | 0 | 0 | 0 |
| ORF1ab | V5550L | 1.6 | 0 | 32 | −0.4 | −0.46 | −2.67 | 2.21 | 0 | 0 | 0 |
| ORF1ab | P5828L | 43.7 | 0 | 98 | 5.4 | −2.67 | 0.46 | 3.13 | 0 | 0 | 0 |
| ORF1ab | Y5865C | 43.7 | 0 | 194 | 3.8 | −0.84 | −0.25 | 0.59 | −1 | 0.5–1.5 | 0 |
| ORF1ab | F6158L | 2.9 | 0 | 22 | 1 | −1.03 | −2.67 | 1.64 | 0 | 0 | 0 |
| ORF3a | Q57H | 49.7 | −1 | 24 | 0.3 | 0.38 | −0.41 | 0.79 | 1 | −(0.5–1.5) | −3 |
| ORF3a | G251V | 4.1 | 0 | 109 | 4.6 | 0.31 | −0.46 | 0.77 | 0 | 0 | 0 |
| ORF8 | S24L | 3.9 | 0 | 145 | 4.6 | 0.12 | −2.67 | 2.79 | −1 | 0.5–1.5 | 0 |
| ORF8 | V62L | 3.0 | 0 | 32 | −0.4 | −0.46 | −2.67 | 2.21 | 0 | 0 | 0 |
| ORF8 | L84S | 67.5 | 0 | 145 | −4.6 | −2.67 | 0.12 | −2.79 | 1 | −(0.5–1.5) | 0 |
In kcal/mol. More positive values indicate that the mutant residue is relatively less favourable to expose to water; negative values indicate the opposite.
In kcal/mol. Negative values indicate that the mutation may contribute to stability through H-bonding; positive values indicate the opposite.
In kcal/mol. Negative values indicate that the mutation may contribute to stability through a salt bridge; positive values indicate the opposite. Note that a residue cannot engage in all probable H-bond and salt-bridge interactions at the same time.
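The "Difference in Kyte-Doolittle hydropathy index" column can be checked against the standard Kyte-Doolittle scale. The sketch below assumes the difference is taken as mutant minus parent residue (a convention inferred from the signs in the table, not stated in the source). The helper name `hydropathy_shift` and the `D614G`-style input format are illustrative.

```python
# Standard Kyte-Doolittle hydropathy scale (one-letter amino acid codes)
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydropathy_shift(mutation: str) -> float:
    """Return KD(mutant) - KD(parent) for a label such as 'D614G'."""
    parent, mutant = mutation[0], mutation[-1]
    return round(KD[mutant] - KD[parent], 1)

shifts = {m: hydropathy_shift(m) for m in ("D614G", "T265I", "L84S", "D75E")}

for mutation, shift in shifts.items():
    print(f"{mutation}: {shift:+.1f}")
```

A positive shift means the mutant residue is more hydrophobic than the parent on this scale, as for D614G.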
Total free energy and solvation polar energy changes of Spike glycoprotein mutants (Protein Stability analysis using FoldX).
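| Substitution in S protein | Change in total energy (ΔΔG) in kcal/mol, PDB ID: | Change in total energy (ΔΔG) in kcal/mol, PDB ID: | Change in total energy (ΔΔG) in kcal/mol, PDB ID: | Change in solvation polar energy, PDB ID: | Change in solvation polar energy, PDB ID: | Change in solvation polar energy, PDB ID: |
|---|---|---|---|---|---|---|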
| A27V | 1.86108 | 1.25673 | NA | 2.56145 | 2.29815 | NA |
| Y28N | 3.65923 | 3.87214 | 2.27761 | −3.09106 | −1.99917 | −2.83006 |
| T29I | 5.84563 | 1.66361 | −3.83215 | 1.39819 | 0.408085 | −0.427407 |
| 1.31686 | 1.07831 | 1.53239 | ||||
| D111N | −1.06165 | 2.71599 | −1.01473 | 0.267207 | 0.175352 | −1.32065 |
| S221W | 2.40534 | 2.09354 | 31. | 4.01364 | 3.33787 | 9.05937 |
| −1.21194 | −1.08835 | −0.641467 | ||||
| A348T | 2.07674 | 9.89322 | 1.00384 | 4.89654 | 4.78981 | 3.54288 |
| R408I | 1.12775 | 3.10298 | 1.50848 | −0.474226 | −2.09272 | −0.452398 |
| H519Q | −2.43624 | −0.545803 | −4.62908 | −0.642438 | −2.29236 | −0.688364 |
| 1.80759 | 1.36561 | 1.62258 | ||||
| A570V | 3.62668 | 6.71797 | 8.03373 | 4.26631 | 1.79105 | 2.23977 |
| 8.89E-01 | 0.867421 | 0.992304 | ||||
| F797C | 15. | 10.927 | 12. | −3.28608 | −5.36719 | −2.97664 |
| A930V | 8.81798 | 9.24991 | 3.19247 | 3.04437 | 2.96865 | 3.47189 |
| A1078V | 3.60391 | 2.84238 | 2.42529 | 2.56687 | 2.67595 | 3.10323 |
Mutants that show reduction in total free energy in all three conformations are presented in bold.
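The footnote's criterion can be expressed as a simple test over the three ΔΔG values of a substitution. The sketch below treats "reduction in total free energy" as ΔΔG < 0 in every conformation; that threshold is an assumption based on the footnote wording, and the function name `stabilizing_in_all` is illustrative. Only rows with three complete values are used.

```python
# ΔΔG values (kcal/mol) copied from the table above, one per S-protein structure
ddg = {
    "Y28N": (3.65923, 3.87214, 2.27761),
    "D111N": (-1.06165, 2.71599, -1.01473),
    "H519Q": (-2.43624, -0.545803, -4.62908),
}

def stabilizing_in_all(values) -> bool:
    """True when ΔΔG is negative in every conformation (assumed criterion)."""
    return all(v < 0 for v in values)

flags = {name: stabilizing_in_all(values) for name, values in ddg.items()}

for name, flag in flags.items():
    print(f"{name}: {'reduced in all three' if flag else 'not reduced in all three'}")
```

D111N lowers the total energy in two structures but not the third, so it does not meet the all-three criterion under this reading.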
Fig. 3. Ribbon diagram of the Spike protein (PDB ID 6VXX). A. Two views; colour code: NTD blue, RBD green, CTD2 light blue, CTD3 orange, S1/S2 linker red and S2 sky blue. B. Superposition of the RBD of the DOWN/closed conformation (6VXX) with the UP/open conformation (6VYB) and the pre-fusion state (6VSB), shown as faded green ribbons.
Fig. 4. Location of mutations. A. Ribbon diagram of the trimeric S protein with the colour code described in Fig. 3. The amino acids that have undergone mutation are shown as van der Waals spheres, purple for destabilizing/neutral mutation points and grey for stabilizing ones. B. The monomeric unit of the S protein, with labels for the same amino acids as in A. C. A semi-transparent electrostatic surface representation of the S protein with glycans in van der Waals representation. D. Mutations in the monomeric unit of the S protein, where ribbon size is proportional to the average isotropic displacement of the amino acid residue. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
Fig. 5. The implication of the D614G mutation. A. Wild-type (closed conformation; PDB ID 6VXX) and D614G mutant structures represented as electrostatic potential surfaces. The position of the mutation is highlighted. B. A close view near the position of the mutation. C. Superposition of a monomer of the DOWN/closed wild-type (6VXX) and the UP/open (6VYB) S protein, showing a hinge-bending motion of the RBD (green) around the NTD linker (blue) and CTD2 (light blue). The location of the D614 residue in CTD3 (orange) is indicated and shown in van der Waals representation. The neighbouring glycans are represented by a stick model. Free energy changes due to the D614G single mutant and D614G-containing double mutants are given. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
Geographical and temporal distribution of Spike protein mutants.
| Mutations in spike protein | No. of samples | Country | First collection date | Last collection date | |
|---|---|---|---|---|---|
| A27V | 1 | USA | 23.03.2020 | NA | |
| Y28N | 1 | Australia | 28.02.2020 | NA | |
| T29I | 3 | Australia, USA, Netherlands | 21.03.2020 | 28.03.2020 | |
| R408I | 1 | India | 27.01.2020 | NA | |
| H519Q | 1 | Belgium | 29.02.2020 | NA | |
| | 2 | USA | 13.03.2020 | 03.04.2020 | |
| A570V | 1 | China | 29.01.2020 | NA | |
| V772I | 1 | Turkey | 17.03.2020 | NA | |
| F797C | 1 | Sweden | 07.02.2020 | NA | |
| A930V | 1 | India | 31.01.2020 | NA | |
| D/L | Innumerable | Across the globe | 24.12.2019 | 15.04.2020 | |
| G/L | Innumerable | Across the globe | 24.01.2020 | 20.04.2020 | |
| D/F | 1 | Netherlands | 31.03.2020 | NA | |
| D/D | Innumerable | Across the globe | 24.12.2019 | 15.04.2020 | |
| G/D | Innumerable | Across the globe | 24.01.2020 | 20.04.2020 | |
| D/H | 1 | Singapore | 05.01.2020 | NA | |
| D/S | Innumerable | Across the globe | 24.12.2019 | 15.04.2020 | |
| G/S | Innumerable | Across the globe | 24.01.2020 | 20.04.2020 | |
| D/F | 1 | Switzerland | 26.02.2020 | NA | |
Mutants with reduction in total free energies are presented in bold.