Jayanta Kumar Das, Antara Sengupta, Pabitra Pal Choudhury, Swarup Roy.
Abstract
SARS-CoV-2 is mutating and creating divergent variants by altering the composition of essential constituent proteins. Pharmacologically, it is crucial to understand the diverse mechanisms of mutation for stable vaccine or anti-viral drug design. Our current study concentrates on all the constituent proteins of 469
Keywords: COVID-19; Codon position; Deleterious substitutions; Functional domain; Non-synonymous mutations; Protein stability
Year: 2021 PMID: 33623833 PMCID: PMC7893251 DOI: 10.1016/j.genrep.2021.101044
Source DB: PubMed Journal: Gene Rep ISSN: 2452-0144
Topological structure of all SARS-CoV-2 proteins, shown by their respective genomic locations. For each protein, the CDS and amino acid positions used in this paper are numbered from 1 to the length of the nucleotide or protein sequence.
| Gene/protein | Genome location (nucleotide) | Protein length (aa) | Nucleotide location used | Amino acid location used |
|---|---|---|---|---|
| ORF1ab | 266–21,555 | 7096 | 1–21,288 | 1–7096 |
| S | 21,563–25,384 | 1273 | 1–3819 | 1–1273 |
| ORF3a | 25,393–26,220 | 275 | 1–825 | 1–275 |
| E | 26,245–26,472 | 75 | 1–225 | 1–75 |
| M | 26,523–27,191 | 222 | 1–666 | 1–222 |
| ORF6 | 27,202–27,387 | 61 | 1–183 | 1–61 |
| ORF7a | 27,394–27,759 | 121 | 1–363 | 1–121 |
| ORF7b | 27,756–27,887 | 43 | 1–129 | 1–43 |
| ORF8 | 27,894–28,259 | 121 | 1–363 | 1–121 |
| N | 28,274–29,533 | 419 | 1–1257 | 1–419 |
| ORF10 | 29,558–29,674 | 38 | 1–114 | 1–38 |
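The renumbering described in the caption can be sketched as a simple offset from the gene's genomic start. The snippet below is a minimal illustration, not the authors' code: the function names are hypothetical, the coordinates are copied from three rows of the table, and it assumes the "nucleotide location used" column is the genomic span minus a three-base stop codon (which matches the S, E and N rows). ORF1ab is left out because its genomic span (21,290 nt) is not a multiple of three, so the plain offset rule would not hold along its whole length.

```python
# Genome coordinates (1-based, inclusive) copied from the table above.
GENOME_LOCATION = {
    "S": (21563, 25384),
    "E": (26245, 26472),
    "N": (28274, 29533),
}


def used_nucleotide_length(gene):
    """Length of the numbered CDS, assuming the last codon is a stop codon."""
    start, end = GENOME_LOCATION[gene]
    return (end - start + 1) - 3


def to_gene_coordinates(gene, genome_pos):
    """Map a genome position to (nucleotide index, codon number, codon position)."""
    start, end = GENOME_LOCATION[gene]
    if not start <= genome_pos <= end:
        raise ValueError(f"{genome_pos} lies outside {gene}")
    nt_index = genome_pos - start + 1        # 1-based index within the gene
    codon_number = (nt_index - 1) // 3 + 1   # 1-based amino acid index
    codon_position = (nt_index - 1) % 3 + 1  # 1, 2 or 3
    return nt_index, codon_number, codon_position


print(used_nucleotide_length("S"))       # 3819, i.e. 1273 codons
print(to_gene_coordinates("S", 23403))   # (1841, 614, 2)
```

The codon position returned here is the quantity that the later codon-position tables are grouped by.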
Number of samples collected from Indian isolates, number of unique variants, and sample-to-variant ratio for each SARS-CoV-2 protein.
| Protein | # collected samples | # noise free samples | # unique variant | Sample to variant ratio |
|---|---|---|---|---|
| ORF1ab | 462 | 400 | 262 | 1.52 |
| E | 460 | 445 | 3 | 148.33 |
| M | 460 | 457 | 18 | 25.38 |
| N | 463 | 455 | 53 | 8.58 |
| S | 462 | 436 | 90 | 4.84 |
| ORF3a | 459 | 445 | 33 | 13.48 |
| ORF6 | 460 | 459 | 3 | 153.0 |
| ORF7a | 460 | 454 | 11 | 41.27 |
| ORF7b | 456 | 455 | 3 | 151.66 |
| ORF8 | 461 | 451 | 11 | 41.00 |
| ORF10 | 460 | 460 | 3 | 153.33 |
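As a quick check on the last column, the ratio appears to be the number of noise-free samples divided by the number of unique variants. The sketch below reproduces a few rows under that assumption; the function name is hypothetical, and the truncation to two decimals is inferred from the tabulated values (for example 151.66 for ORF7b), not stated in the source.

```python
# (noise-free samples, unique variants) copied from the table above.
COUNTS = {
    "ORF1ab": (400, 262),
    "E": (445, 3),
    "M": (457, 18),
    "S": (436, 90),
    "ORF7b": (455, 3),
}


def sample_to_variant_ratio(noise_free, unique_variants):
    """Ratio truncated (not rounded) to two decimals via integer arithmetic."""
    return (noise_free * 100 // unique_variants) / 100


for protein, (noise_free, unique_variants) in COUNTS.items():
    print(protein, sample_to_variant_ratio(noise_free, unique_variants))
```

A high ratio (E, ORF6, ORF7b, ORF10) means many samples share very few distinct variants, while a ratio near 1 (ORF1ab) means most samples carry a distinct variant.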
Fig. 1. Comparison of unique variants among ten countries. (A) Percentage of unique variants in each SARS-CoV-2 protein. The number at the top of each bar is the number of noise-free collected samples; (B) number of unique variants shared between India and each other country.
Fig. 2. Distribution of the observed number of mutations (x-axis) against the relative frequency of variants (y-axis) for each SARS-CoV-2 protein. Five proteins (ORF1ab, ORF3a, S, N and M) carry multiple mutations in some variants, whereas six proteins (ORF6, ORF7a, ORF7b, ORF8, ORF10 and E) carry exactly one mutation in each variant.
Fig. 3. Average number of mutations per variant. Proteins are ranked by average mutation count, from highest (left) to lowest (right).
Number of mutated positions (or locations) in each SARS-CoV-2 protein.
| Gene | Gene length | # mutated position | Mutated position (%) |
|---|---|---|---|
| ORF1ab | 21,291 | 328 | 1.541 |
| E | 228 | 2 | 0.877 |
| M | 669 | 15 | 2.242 |
| N | 1260 | 49 | 3.889 |
| S | 3822 | 83 | 2.172 |
| ORF3a | 828 | 33 | 3.986 |
| ORF6 | 186 | 2 | 1.075 |
| ORF7a | 336 | 10 | 2.976 |
| ORF7b | 132 | 2 | 1.515 |
| ORF8 | 366 | 10 | 2.732 |
| ORF10 | 117 | 2 | 1.709 |
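The percentage column follows from the other two as 100 × (mutated positions) / (gene length). A minimal sketch, with a hypothetical function name and four rows copied from the table, assuming the values are rounded to three decimals:

```python
# (gene length in nucleotides, mutated positions) copied from the table above.
GENES = {
    "ORF1ab": (21291, 328),
    "E": (228, 2),
    "N": (1260, 49),
    "S": (3822, 83),
}


def mutated_position_percent(gene_length, mutated_positions):
    """Share of nucleotide positions that carry at least one mutation."""
    return round(100 * mutated_positions / gene_length, 3)


for gene, (length, mutated) in GENES.items():
    print(gene, mutated_position_percent(length, mutated))
```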
Fig. 4. Quantification of synonymous and non-synonymous mutations. (A) Percentage of synonymous vs. non-synonymous mutations at the three codon positions, taking all proteins together; (B) percentage of non-synonymous and synonymous mutations in each SARS-CoV-2 protein.
Fig. 5. Percentage of mutations in SARS-CoV-2 proteins at each codon position (1st, 2nd and 3rd). (A) Protein-wise at each codon position; (B) aggregated over all proteins at the three codon positions; (C) overall percentage of synonymous vs. non-synonymous mutations over all codon positions and proteins.
Percentage of synonymous (syn) and non-synonymous (non-syn) mutation in three different codon positions (CP)-1st/2nd/3rd in each of the SARS-CoV-2 protein.
| Protein | CP | Type | Percentage |
|---|---|---|---|
| E | 1st | Non-syn | 100.00 |
| M | 1st | Non-syn | 26.67 |
| M | 1st | Syn | 6.67 |
| N | 1st | Non-syn | 33.96 |
| N | 1st | Syn | 1.89 |
| ORF10 | 1st | Non-syn | 50.00 |
| ORF1ab | 1st | Non-syn | 22.80 |
| ORF1ab | 1st | Syn | 2.74 |
| ORF3a | 1st | Non-syn | 33.33 |
| ORF7a | 1st | Non-syn | 40.00 |
| ORF7b | 1st | Non-syn | 50.00 |
| ORF8 | 1st | Non-syn | 30.00 |
| S | 1st | Non-syn | 20.48 |
| S | 1st | Syn | 1.20 |
| M | 2nd | Non-syn | 20.00 |
| N | 2nd | Non-syn | 26.42 |
| ORF1ab | 2nd | Non-syn | 26.44 |
| ORF3a | 2nd | Non-syn | 30.30 |
| ORF7a | 2nd | Non-syn | 20.00 |
| ORF7b | 2nd | Non-syn | 50.00 |
| ORF8 | 2nd | Non-syn | 60.00 |
| S | 2nd | Non-syn | 28.92 |
| M | 3rd | Syn | 46.67 |
| N | 3rd | Non-syn | 9.43 |
| N | 3rd | Syn | 28.30 |
| ORF10 | 3rd | Syn | 50.00 |
| ORF1ab | 3rd | Non-syn | 8.51 |
| ORF1ab | 3rd | Syn | 39.51 |
| ORF3a | 3rd | Non-syn | 9.09 |
| ORF3a | 3rd | Syn | 27.27 |
| ORF6 | 3rd | Non-syn | 50.00 |
| ORF6 | 3rd | Syn | 50.00 |
| ORF7a | 3rd | Syn | 40.00 |
| ORF8 | 3rd | Syn | 10.00 |
| S | 3rd | Non-syn | 15.66 |
| S | 3rd | Syn | 33.73 |
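The split between synonymous and non-synonymous changes at each codon position can be illustrated with the standard genetic code: a single-base change is synonymous when the mutated codon still encodes the same amino acid. The sketch below is an illustration under that definition, not the authors' pipeline; the function name and the example codons are assumptions chosen for demonstration and are not taken from the data set.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ('*' = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_substitution(codon, codon_position, new_base):
    """Return ('syn' or 'non-syn', reference aa, altered aa) for one base change."""
    mutated = list(codon)
    mutated[codon_position - 1] = new_base  # codon_position is 1, 2 or 3
    ref_aa = CODON_TABLE[codon]
    alt_aa = CODON_TABLE["".join(mutated)]
    kind = "syn" if ref_aa == alt_aa else "non-syn"
    return kind, ref_aa, alt_aa


# Illustrative codons (hypothetical inputs):
print(classify_substitution("GAT", 2, "G"))  # ('non-syn', 'D', 'G')
print(classify_substitution("GAT", 3, "C"))  # ('syn', 'D', 'D')
print(classify_substitution("TTA", 1, "C"))  # ('syn', 'L', 'L')
```

Because most of the redundancy of the genetic code sits at the third base, synonymous changes concentrate at the 3rd codon position, while every 2nd-position change between sense codons alters the amino acid. This agrees with the table, which lists no synonymous entries at the 2nd position.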
Fig. 6. (A) Number of distinct non-synonymous mutation types in each protein at each codon position; (B) number of distinct non-synonymous mutation types aggregated over all codon positions.
Fig. 7. Quantification of nucleotide mutation types in the non-synonymous category. (A) Percentage of each nucleotide mutation type; (B) mutation types by the number of SARS-CoV-2 proteins in which they occur.
The percentage of nucleotide mutation types for all non-synonymous cases, shown for the three codon positions (CP) independently and ordered from highest to lowest percentage. Mut-type: mutation type.
| Mut-type (CP 1st) | Percentage (CP 1st) | Mut-type (CP 2nd) | Percentage (CP 2nd) | Mut-type (CP 3rd) | Percentage (CP 3rd) |
|---|---|---|---|---|---|
| G>T | 33.09 | C>T | 48.98 | G>T | 68 |
| C>T | 25.00 | G>T | 16.33 | A>C | 6 |
| G>A | 16.91 | A>G | 12.24 | G>A | 6 |
| A>G | 6.62 | T>C | 7.48 | G>C | 6 |
| A>C | 5.88 | G>A | 6.12 | C>A | 4 |
| T>C | 3.68 | C>A | 2.72 | T>A | 4 |
| G>C | 2.94 | A>C | 1.36 | T>G | 4 |
| C>A | 2.21 | G>C | 1.36 | A>T | 2 |
| C>G | 1.47 | T>G | 1.36 | ||
| A>T | 0.74 | A>T | 0.68 | ||
| T>A | 0.74 | C>G | 0.68 | ||
| T>G | 0.74 | T>A | 0.68 | ||
Percentage of nucleotide mutation type for all non-synonymous cases shown by three codon positions for all proteins. CP: codon position; Mut-type: mutation type.
| Protein | CP | Mut-type | Percentage |
|---|---|---|---|
| ORF1ab | 2 | C>T | 23.16 |
| ORF1ab | 3 | G>T | 11.58 |
| ORF1ab | 1 | C>T | 10.53 |
| ORF1ab | 1 | G>T | 10.53 |
| ORF1ab | 2 | A>G | 8.42 |
| ORF1ab | 1 | G>A | 7.37 |
| ORF1ab | 2 | T>C | 4.74 |
| ORF1ab | 1 | A>G | 4.21 |
| ORF1ab | 2 | G>A | 2.63 |
| ORF1ab | 2 | G>T | 2.63 |
| ORF1ab | 1 | A>C | 2.11 |
| ORF1ab | 1 | T>C | 2.11 |
| ORF1ab | 2 | C>A | 1.58 |
| ORF1ab | 2 | A>C | 1.05 |
| ORF1ab | 3 | C>A | 1.05 |
| ORF1ab | 3 | G>C | 1.05 |
| ORF1ab | 3 | A>C | 0.53 |
| ORF1ab | 2 | A>T | 0.53 |
| ORF1ab | 1 | C>A | 0.53 |
| ORF1ab | 1 | C>G | 0.53 |
| ORF1ab | 3 | G>A | 0.53 |
| ORF1ab | 1 | G>C | 0.53 |
| ORF1ab | 1 | T>A | 0.53 |
| ORF1ab | 2 | T>A | 0.53 |
| ORF1ab | 1 | T>G | 0.53 |
| ORF1ab | 2 | T>G | 0.53 |
| E | 1 | G>T | 100.00 |
| M | 1 | C>T | 28.57 |
| M | 2 | C>T | 28.57 |
| M | 1 | G>T | 28.57 |
| M | 2 | G>T | 14.29 |
| N | 2 | C>T | 18.92 |
| N | 1 | G>T | 18.92 |
| N | 1 | G>A | 10.81 |
| N | 2 | G>T | 10.81 |
| N | 1 | C>T | 8.11 |
| N | 2 | G>A | 5.41 |
| N | 1 | G>C | 5.41 |
| N | 3 | G>T | 5.41 |
| N | 3 | A>C | 2.70 |
| N | 1 | A>G | 2.70 |
| N | 1 | C>A | 2.70 |
| N | 3 | G>A | 2.70 |
| N | 2 | G>C | 2.70 |
| N | 3 | G>C | 2.70 |
| S | 2 | C>T | 18.52 |
| S | 2 | G>T | 18.52 |
| S | 3 | G>T | 14.81 |
| S | 1 | G>T | 12.96 |
| S | 1 | C>T | 5.56 |
| S | 1 | A>C | 3.70 |
| S | 2 | A>G | 3.70 |
| S | 3 | T>A | 3.70 |
| S | 3 | A>C | 1.85 |
| S | 1 | A>T | 1.85 |
| S | 1 | C>A | 1.85 |
| S | 1 | G>A | 1.85 |
| S | 2 | G>A | 1.85 |
| S | 3 | G>A | 1.85 |
| S | 1 | G>C | 1.85 |
| S | 2 | G>C | 1.85 |
| S | 1 | T>C | 1.85 |
| S | 3 | T>G | 1.85 |
| ORF3a | 1 | G>T | 20.83 |
| ORF3a | 1 | C>T | 16.67 |
| ORF3a | 2 | C>T | 16.67 |
| ORF3a | 2 | G>T | 12.50 |
| ORF3a | 1 | A>C | 4.17 |
| ORF3a | 3 | A>T | 4.17 |
| ORF3a | 2 | C>G | 4.17 |
| ORF3a | 1 | G>A | 4.17 |
| ORF3a | 3 | G>T | 4.17 |
| ORF3a | 2 | T>C | 4.17 |
| ORF3a | 2 | T>G | 4.17 |
| ORF3a | 3 | T>G | 4.17 |
| ORF6 | 3 | G>T | 100.00 |
| ORF7a | 1 | G>A | 33.33 |
| ORF7a | 1 | C>G | 16.67 |
| ORF7a | 1 | C>T | 16.67 |
| ORF7a | 2 | C>T | 16.67 |
| ORF7a | 2 | G>T | 16.67 |
| ORF8 | 2 | C>T | 33.33 |
| ORF8 | 1 | G>T | 22.22 |
| ORF8 | 1 | A>C | 11.11 |
| ORF8 | 2 | C>A | 11.11 |
| ORF8 | 2 | G>A | 11.11 |
| ORF8 | 2 | T>C | 11.11 |
| ORF7b | 2 | C>T | 50.00 |
| ORF7b | 1 | G>A | 50.00 |
| ORF10 | 1 | C>T | 100.00 |
Amino acid substitution type by associated protein and number of mutated locations in that protein.
| Substitution type | Protein (#mutated position) |
|---|---|
| A>D | ORF1ab-(2) |
| A>S | M-(1),N-(2),ORF1ab-(4),ORF3a-(3),ORF8-(1),S-(3) |
| A>T | ORF1ab-(2),ORF7b-(1) |
| A>V | M-(2),N-(2),ORF1ab-(16),ORF3a-(1),ORF8-(2),S-(4) |
| C>F | ORF1ab-(1),S-(3) |
| C>Y | ORF1ab-(1) |
| D>E | ORF1ab-(1) |
| D>G | ORF1ab-(4) |
| D>N | N-(1),ORF1ab-(2) |
| D>Y | N-(3),ORF1ab-(6),ORF3a-(1),S-(3) |
| E>D | ORF1ab-(5),ORF6-(1),S-(2) |
| E>G | ORF1ab-(1) |
| E>K | ORF1ab-(3),ORF7a-(1) |
| E>Q | N-(1),ORF1ab-(1),S-(1) |
| F>L | ORF1ab-(1),S-(1) |
| G>A | S-(1) |
| G>C | N-(1),ORF1ab-(3) |
| G>D | ORF1ab-(3),S-(1) |
| G>E | ORF8-(1) |
| G>R | N-(2),ORF1ab-(1) |
| G>S | N-(1),ORF1ab-(2),S-(1) |
| G>T | N-(2) |
| G>V | ORF1ab-(2),ORF3a-(1),ORF7a-(1),S-(1) |
| G>W | N-(1) |
| H>Q | ORF3a-(1),S-(1) |
| H>R | ORF1ab-(2) |
| H>Y | M-(1),N-(1),ORF1ab-(5),ORF3a-(1),S-(1) |
| I>K | ORF1ab-(1) |
| I>L | ORF1ab-(1),ORF8-(1) |
| I>T | ORF1ab-(4),ORF3a-(1) |
| K>E | ORF1ab-(1) |
| K>N | ORF1ab-(7),ORF3a-(1) |
| K>Q | ORF3a-(1),S-(2) |
| K>R | ORF1ab-(7),S-(1) |
| K>T | ORF1ab-(1) |
| L>F | M-(1),N-(1),ORF10-(1),ORF1ab-(10),ORF3a-(3),ORF7a-(1),S-(2) |
| L>I | ORF1ab-(1) |
| L>P | ORF1ab-(2) |
| L>S | ORF8-(1) |
| L>V | ORF1ab-(1) |
| L>W | ORF3a-(1) |
| M>I | N-(2),ORF1ab-(10),S-(3) |
| N>D | ORF1ab-(2) |
| N>H | ORF1ab-(1) |
| N>K | S-(1) |
| N>L | ORF1ab-(2) |
| N>Y | S-(1) |
| P>A | ORF1ab-(1) |
| P>L | N-(1),ORF1ab-(7),ORF7a-(1),ORF8-(1) |
| P>R | ORF3a-(1) |
| P>S | N-(2),ORF1ab-(5),S-(1) |
| P>T | N-(1) |
| Q>E | ORF7a-(1) |
| Q>H | ORF1ab-(2),S-(4) |
| Q>K | S-(1) |
| Q>P | ORF1ab-(1) |
| Q>R | ORF1ab-(2),S-(1) |
| R>C | ORF1ab-(2) |
| R>G | N-(1) |
| R>I | ORF3a-(1) |
| R>K | N-(2) |
| R>L | M-(1),N-(1),ORF1ab-(1),ORF3a-(1) |
| R>M | S-(1) |
| R>Q | ORF1ab-(1) |
| R>S | N-(1) |
| S>F | ORF1ab-(3),S-(2) |
| S>G | ORF1ab-(2) |
| S>I | N-(3),ORF1ab-(1),S-(3) |
| S>L | N-(1),ORF1ab-(2),ORF3a-(1),ORF7b-(1) |
| S>N | N-(1) |
| S>P | ORF1ab-(2) |
| S>R | ORF1ab-(2) |
| S>T | ORF1ab-(1) |
| T>A | ORF1ab-(3) |
| T>I | N-(3),ORF1ab-(15),ORF3a-(2),S-(4) |
| T>K | ORF1ab-(1) |
| T>M | ORF1ab-(1) |
| T>N | ORF8-(1) |
| V>A | ORF1ab-(3) |
| V>F | E-(2),M-(1),ORF1ab-(5),ORF3a-(1) |
| V>G | ORF1ab-(1) |
| V>I | ORF1ab-(4),ORF3a-(1),ORF7a-(1) |
| V>L | ORF1ab-(2),ORF8-(1),S-(1) |
| W>C | ORF3a-(1) |
| W>L | S-(2) |
| Y>H | ORF1ab-(1),S-(1) |
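The second column of this table packs protein names and position counts into one string, so it has to be parsed before it can be aggregated. The sketch below shows one way to do that for a handful of rows copied from the table; the regular expression and function names are illustrative assumptions, not part of the original analysis.

```python
import re

# A few rows copied verbatim from the table above.
ROWS = {
    "A>S": "M-(1),N-(2),ORF1ab-(4),ORF3a-(3),ORF8-(1),S-(3)",
    "A>V": "M-(2),N-(2),ORF1ab-(16),ORF3a-(1),ORF8-(2),S-(4)",
    "L>F": "M-(1),N-(1),ORF10-(1),ORF1ab-(10),ORF3a-(3),ORF7a-(1),S-(2)",
    "T>I": "N-(3),ORF1ab-(15),ORF3a-(2),S-(4)",
    "C>Y": "ORF1ab-(1)",
}

# Matches entries such as "ORF1ab-(16)": protein name, then count in parentheses.
ENTRY = re.compile(r"([A-Za-z0-9]+)-\((\d+)\)")


def parse_cell(cell):
    """Turn 'M-(1),N-(2)' into {'M': 1, 'N': 2}."""
    return {protein: int(count) for protein, count in ENTRY.findall(cell)}


def summarize(rows):
    """Map substitution type to (number of proteins, total mutated positions)."""
    summary = {}
    for substitution, cell in rows.items():
        per_protein = parse_cell(cell)
        summary[substitution] = (len(per_protein), sum(per_protein.values()))
    return summary


for substitution, (n_proteins, n_positions) in summarize(ROWS).items():
    print(substitution, n_proteins, n_positions)
```

Filtering this summary on "more than two positions" or "more than two proteins" gives the kind of selection that the panels of Fig. 8 describe.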
Fig. 8. (A) Amino acid substitution types observed at more than two mutated positions in the SARS-CoV-2 genome. (B) Amino acid substitution types found in more than two SARS-CoV-2 proteins.
Fig. 9. Number of non-synonymous amino acid substitution types in each SARS-CoV-2 protein.
Fig. 10. Non-synonymous amino acid substitutions categorized by the percentage of deleterious and neutral mutations, as predicted by PROVEAN score. (A) Percentages for each SARS-CoV-2 protein, taking all codon positions together; (B) percentages at the three codon positions in each SARS-CoV-2 protein.
The non-synonymous amino acid substitutions in ORF1ab protein with the predicted PROVEAN score and ΔΔG prediction value.
| Substitution | PROVEAN score | Type | ΔΔG | RI | Freq. |
|---|---|---|---|---|---|
| −0.673 | Neutral | 1 | |||
| −0.733 | Neutral | 2 | |||
| −0.553 | Neutral | 1 | |||
| −1.223 | Neutral | 1 | |||
| D147E | −1.123 | Neutral | 0.01 | 4 | 1 |
| 0.027 | Neutral | 1 | |||
| −1.198 | Neutral | 1 | |||
| 0.327 | Neutral | 2 | |||
| S212L | 0.097 | Neutral | 0.24 | 1 | 1 |
| −0.693 | Neutral | 1 | |||
| −0.088 | Neutral | 2 | |||
| −0.135 | Neutral | 1 | |||
| 0.518 | Neutral | 1 | |||
| −1.072 | Neutral | 1 | |||
| K338R | −0.685 | Neutral | 0.05 | 2 | 1 |
| A339V | −0.465 | Neutral | 0.07 | 1 | 1 |
| E347D | −0.548 | Neutral | −0.39 | 6 | 1 |
| H417Y | 0.379 | Neutral | 0.26 | 7 | 1 |
| S443P | −0.678 | Neutral | −0.23 | 2 | 1 |
| −0.633 | Neutral | 4 | |||
| Q575R | −0.331 | Neutral | −0.49 | 6 | 3 |
| E633D | −0.233 | Neutral | −0.37 | 7 | 5 |
| E658K | −0.707 | Neutral | −0.44 | 7 | 1 |
| G662R | −1.425 | Neutral | −0.35 | 7 | 1 |
| −0.531 | Neutral | 69 | |||
| 0.035 | Neutral | 1 | |||
| T882I | −0.691 | Neutral | −0.1 | 2 | 1 |
| 0.996 | Neutral | 1 | |||
| E940D | 0.515 | Neutral | −0.41 | 4 | 1 |
| G989V | 0.35 | Neutral | −0.36 | 5 | 1 |
| −0.887 | Neutral | 1 | |||
| P1054L | −1.268 | Neutral | −0.44 | 0 | 2 |
| T1055I | −0.496 | Neutral | −0.38 | 2 | 3 |
| −0.535 | Neutral | 1 | |||
| −0.909 | Neutral | 5 | |||
| H1160Y | 0.734 | Neutral | 0.11 | 4 | 3 |
| −0.667 | Neutral | 1 | |||
| −0.511 | Neutral | 1 | |||
| 0.092 | Neutral | 2 | |||
| A1283V | −0.232 | Neutral | −0.15 | 2 | 2 |
| A1298V | 0.22 | Neutral | −0.01 | 1 | 1 |
| T1429I | 0.457 | Neutral | −0.5 | 5 | 1 |
| A1432V | 0.864 | Neutral | 0.07 | 2 | 1 |
| S1534I | 0.319 | Neutral | 0.36 | 1 | 7 |
| −0.057 | Neutral | 1 | |||
| −2.402 | Neutral | 1 | |||
| M1588I | −0.746 | Neutral | −0.08 | 0 | 2 |
| D1625Y | −2.143 | Neutral | 0.01 | 2 | 2 |
| −1.551 | Neutral | 2 | |||
| M1769I | −0.349 | Neutral | −0.11 | 5 | 3 |
| −0.753 | Neutral | 6 | |||
| T1822I | −0.406 | Neutral | 0.1 | 0 | 1 |
| −0.808 | Neutral | 1 | |||
| −0.326 | Neutral | 1 | |||
| T1854I | −0.193 | Neutral | −0.25 | 3 | 1 |
| T1874I | −1.364 | Neutral | −0.16 | 3 | 1 |
| −0.936 | Neutral | 1 | |||
| D1940Y | −0.872 | Neutral | −0.16 | 2 | 1 |
| −0.464 | Neutral | 1 | |||
| K1973R | −0.294 | Neutral | −0.37 | 1 | 1 |
| S2015R | −0.501 | Neutral | −0.17 | 0 | 5 |
| −0.166 | Neutral | 10 | |||
| K2029E | −0.63 | Neutral | −0.5 | 6 | 1 |
| −0.431 | Neutral | 1 | |||
| −1.038 | Neutral | 3 | |||
| T2093I | 0.565 | Neutral | 0.1 | 4 | 1 |
| S2103F | −0.372 | Neutral | 0.24 | 6 | 1 |
| −1.386 | Neutral | 1 | |||
| S2242P | 0.105 | Neutral | −0.06 | 3 | 1 |
| −0.03 | Neutral | 1 | |||
| −0.361 | Neutral | 1 | |||
| H2357Y | 0.301 | Neutral | 0.38 | 7 | 3 |
| S2488F | 2.899 | Neutral | −0.05 | 2 | 1 |
| K2511N | −0.966 | Neutral | −0.38 | 0 | 2 |
| H2520R | −0.243 | Neutral | −0.29 | 3 | 2 |
| A2593V | −1.178 | Neutral | −0.24 | 0 | 1 |
| 3 | |||||
| −1.595 | Neutral | 1 | |||
| H2831Y | 3.17 | Neutral | 0.27 | 6 | 1 |
| A2891V | −0.835 | Neutral | 0 | 3 | 1 |
| 0.071 | Neutral | 1 | |||
| A2994V | −1.769 | Neutral | −0.05 | 0 | 3 |
| T3058I | 1.463 | Neutral | −0.48 | 4 | 7 |
| 2 | |||||
| 0.614 | Neutral | 3 | |||
| T3150I | 0.112 | Neutral | −0.43 | 1 | 2 |
| −0.785 | Neutral | 1 | |||
| 2 | |||||
| K3353R | −1.343 | Neutral | −0.13 | 1 | 1 |
| 1 | |||||
| Q3390R | −0.324 | Neutral | −0.33 | 4 | 1 |
| −0.05 | 0 | 5 | |||
| −1.913 | Neutral | 1 | |||
| −0.882 | Neutral | 1 | |||
| −2.291 | Neutral | 1 | |||
| K3499R | −0.421 | Neutral | −0.25 | 2 | 5 |
| −1.432 | Neutral | 14 | |||
| −1.397 | Neutral | 1 | |||
| 0.174 | Neutral | 1 | |||
| −0.466 | Neutral | 1 | |||
| −0.348 | Neutral | 2 | |||
| −0.744 | Neutral | 2 | |||
| −1.765 | Neutral | 1 | |||
| 1 | |||||
| E3962K | −0.041 | Neutral | −0.34 | 3 | 1 |
| −0.34 | 7 | 2 | |||
| 1 | |||||
| −0.3 | 7 | 1 | |||
| K4069T | −2.268 | Neutral | −0.48 | 6 | 1 |
| −0.106 | Neutral | 1 | |||
| K4081R | −0.921 | Neutral | −0.33 | 7 | 1 |
| −0.46 | Neutral | 1 | |||
| K4176N | 0.651 | Neutral | −0.36 | 3 | 1 |
| −0.046 | Neutral | 1 | |||
| −0.25 | 1 | 1 | |||
| −0.23 | 1 | 1 | |||
| −0.49 | Neutral | 2 | |||
| K4483N | −1.326 | Neutral | −0.42 | 4 | 1 |
| A4487V | 0.357 | Neutral | −0.24 | 2 | 1 |
| A4489V | −2.346 | Neutral | −0.31 | 1 | 10 |
| 1 | |||||
| A4577V | −1.878 | Neutral | −0.17 | 4 | 1 |
| −1.074 | Neutral | 1 | |||
| 0.213 | Neutral | 1 | |||
| −0.609 | Neutral | 2 | |||
| −0.446 | Neutral | 359 | |||
| −1.085 | Neutral | 7 | |||
| 1 | |||||
| −1.728 | Neutral | 3 | |||
| C4856F | −0.483 | Neutral | −0.21 | 3 | 1 |
| 3 | |||||
| T5035I | −0.622 | Neutral | −0.43 | 2 | 1 |
| T5036M | −1.529 | Neutral | −0.29 | 2 | 1 |
| −0.117 | Neutral | 1 | |||
| −1.821 | Neutral | 2 | |||
| 1.016 | Neutral | 2 | |||
| V5272I | −0.551 | Neutral | −0.17 | 2 | 18 |
| D5285Y | −1.381 | Neutral | 0.27 | 3 | 2 |
| T5300I | −0.542 | Neutral | −0.42 | 4 | 1 |
| S5305L | −2.332 | Neutral | 0.22 | 4 | 1 |
| −0.897 | Neutral | 2 | |||
| H5488Y | 0.534 | Neutral | 0.19 | 7 | 1 |
| −2.053 | Neutral | 1 | |||
| 4 | |||||
| H5569R | 1.004 | Neutral | −0.1 | 5 | 1 |
| −1.592 | Neutral | 2 | |||
| −0.845 | Neutral | 2 | |||
| −0.858 | Neutral | 1 | |||
| 2 | |||||
| 0.366 | Neutral | 1 | |||
| 1 | |||||
| 0.351 | Neutral | 1 | |||
| −0.711 | Neutral | 1 | |||
| K5957R | −0.861 | Neutral | −0.09 | 0 | 1 |
| −1.93 | Neutral | 1 | |||
| −0.985 | Neutral | 1 | |||
| −0.14 | 1 | 1 | |||
| A6044V | 2.2 | Neutral | 0.14 | 2 | 1 |
| 0.176 | Neutral | 2 | |||
| −0.771 | Neutral | 1 | |||
| 3 | |||||
| −1.397 | Neutral | 67 | |||
| S6180R | −1.897 | Neutral | 0.16 | 4 | 1 |
| −2.053 | Neutral | 1 | |||
| D6249Y | 0.823 | Neutral | −0.16 | 3 | 1 |
| K6274N | −0.353 | Neutral | −0.18 | 3 | 3 |
| −0.448 | Neutral | 45 | |||
| 1 | |||||
| 1 | |||||
| −0.789 | Neutral | 3 | |||
| K6464N | −1.404 | Neutral | −0.42 | 2 | 1 |
| T6500I | −1.557 | Neutral | −0.42 | 5 | 1 |
| A6533V | −0.465 | Neutral | −0.35 | 5 | 3 |
| −0.868 | Neutral | 1 | |||
| −2.423 | Neutral | 2 | |||
| A6589V | −0.154 | Neutral | −0.17 | 3 | 1 |
| −2.262 | Neutral | 4 | |||
| −1.53 | Neutral | 1 | |||
| 0.442 | Neutral | 1 | |||
| V6688I | −0.141 | Neutral | −0.49 | 4 | 1 |
| −0.049 | Neutral | 1 | |||
| C6742Y | −0.068 | Neutral | −0.18 | 0 | 1 |
| −0.41 | 1 | 1 | |||
| −0.541 | Neutral | 1 | |||
| −0.017 | Neutral | 2 | |||
| A6914V | −0.428 | Neutral | −0.03 | 1 | 1 |
| K6958R | −0.492 | Neutral | −0.17 | 5 | 2 |
| 0.713 | Neutral | 1 | |||
| N7083D | −1.153 | Neutral | −0.42 | 2 | 1 |
Substitutions with a PROVEAN score below −2.5 (type: deleterious), a large decrease in stability (ΔΔG < −0.5), or both are shown in bold.
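The two cutoffs quoted in this note (PROVEAN score < −2.5 and ΔΔG < −0.5) amount to a simple rule for deciding which rows are highlighted. The sketch below applies that rule; the function and field names are hypothetical, the first two inputs are rows from the ORF1ab table, and the third is an invented value pair used only to exercise both cutoffs.

```python
PROVEAN_CUTOFF = -2.5  # scores below this are labelled deleterious
DDG_CUTOFF = -0.5      # ΔΔG values below this are flagged as a large stability change


def assess(provean_score, ddg):
    """Apply both cutoffs and report whether the row would be highlighted."""
    deleterious = provean_score < PROVEAN_CUTOFF
    large_ddg = ddg < DDG_CUTOFF
    return {
        "type": "Deleterious" if deleterious else "Neutral",
        "large_stability_change": large_ddg,
        "highlight": deleterious or large_ddg,
    }


print(assess(-1.123, 0.01))   # D147E row of the ORF1ab table
print(assess(-0.548, -0.39))  # E347D row of the ORF1ab table
print(assess(-3.1, -0.8))     # hypothetical values crossing both cutoffs
```

Both comparisons are strict, matching the "<" signs in the note, so a value exactly at a cutoff is not flagged.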
Functional assessment of non-synonymous amino acid substitutions in the four structural SARS-CoV-2 proteins (E, M, N, S). The functional effect of each mutation is predicted using two measures (PROVEAN score and stability value).
| Protein | Substitution | PROVEAN score | Type | ΔΔG | RI | Freq. |
|---|---|---|---|---|---|---|
| E | 0 | Neutral | 1 | |||
| −1.414 | Neutral | 1 | ||||
| M | A142V | 0.18 | Neutral | 0.25 | 5 | 2 |
| −1.646 | Neutral | 1 | ||||
| A63V | −1.937 | Neutral | 0.14 | 0 | 1 | |
| −1.991 | Neutral | 1 | ||||
| −1.365 | Neutral | −1.66 | 1 | |||
| −0.33 | 4 | 1 | ||||
| H125Y | 0.799 | Neutral | 0.06 | 5 | 1 | |
| N | −0.38 | 2 | 158 | |||
| P13L | −1.23 | Neutral | −0.48 | 3 | 23 | |
| −0.404 | Neutral | −0.78 | 20 | |||
| −1.604 | Neutral | −0.93 | 14 | |||
| −1.604 | Neutral | −0.93 | 14 | |||
| −1.656 | Neutral | −0.52 | 14 | |||
| −1.562 | Neutral | −0.53 | 6 | |||
| −0.36 | 2 | 4 | ||||
| −1.98 | Neutral | −1.33 | 3 | |||
| −0.457 | Neutral | −0.83 | 3 | |||
| −0.223 | Neutral | −1.1 | 2 | |||
| S33I | −1.372 | Neutral | 0.27 | 6 | 2 | |
| −0.14 | 3 | 2 | ||||
| M234I | 0.044 | Neutral | −0.03 | 1 | 2 | |
| −0.541 | Neutral | −1.4 | 1 | |||
| −0.541 | Neutral | −1.4 | 1 | |||
| 0.054 | Neutral | −0.7 | 1 | |||
| G34W | −1.609 | Neutral | −0.13 | 2 | 1 | |
| 1 | ||||||
| G120R | −0.733 | Neutral | −0.29 | 1 | 1 | |
| −0.12 | 2 | 1 | ||||
| L139F | −0.697 | Neutral | −0.85 | 8 | 1 | |
| D144Y | −1.764 | Neutral | 0.2 | 1 | 1 | |
| A152S | 1.463 | Neutral | −0.92 | 9 | 1 | |
| −0.58 | 1 | |||||
| −1.6 | 1 | |||||
| −1.604 | Neutral | −0.93 | 1 | |||
| −1.656 | Neutral | −0.52 | 1 | |||
| −1.76 | Neutral | −0.96 | 1 | |||
| A218V | 0.171 | Neutral | 0.21 | 1 | 1 | |
| M234I | 0.044 | Neutral | −0.03 | 1 | 1 | |
| G236C | −2.269 | Neutral | −0.27 | 5 | 1 | |
| H300Y | −1.577 | Neutral | 0.46 | 5 | 1 | |
| 1 | ||||||
| 1 | ||||||
| D348Y | −0.588 | Neutral | −0.41 | 2 | 1 | |
| T362I | −1.722 | Neutral | −0.35 | 3 | 1 | |
| T393I | −0.613 | Neutral | 0.1 | 2 | 1 | |
| S | 0.598 | Neutral | −0.93 | 405 | ||
| −0.435 | Neutral | −1.14 | 80 | |||
| E583D | −0.819 | Neutral | −0.44 | 3 | 14 | |
| 0.986 | Neutral | −0.84 | 12 | |||
| T572I | −0.649 | Neutral | 0 | 3 | 10 | |
| 0.002 | Neutral | −0.67 | 5 | |||
| −1.126 | Neutral | −0.98 | 3 | |||
| −0.796 | Neutral | −0.86 | 3 | |||
| S12F | −0.65 | Neutral | 0.14 | 2 | 2 | |
| −0.159 | Neutral | 2 | ||||
| S155I | −0.503 | Neutral | 0 | 4 | 2 | |
| 0.579 | Neutral | −0.61 | 2 | |||
| 0.396 | Neutral | −0.58 | 2 | |||
| −1.084 | Neutral | −0.65 | 2 | |||
| 0.183 | Neutral | −0.85 | 2 | |||
| A879S | −0.361 | Neutral | 0.54 | 7 | 2 | |
| H1083Q | −1.006 | Neutral | −0.34 | 5 | 2 | |
| −0.09 | 2 | 2 | ||||
| −0.902 | Neutral | −1.22 | 9 | 1 | ||
| S13I | −1.187 | Neutral | 0.27 | 0 | 1 | |
| −0.443 | Neutral | −1.38 | 4 | 1 | ||
| −2.112 | Neutral | −0.68 | 7 | 1 | ||
| −0.115 | Neutral | −0.72 | 6 | 1 | ||
| −0.113 | Neutral | −0.92 | 7 | 1 | ||
| N148Y | −0.177 | Neutral | 0.1 | 4 | 1 | |
| 0.244 | Neutral | −0.98 | 6 | 1 | ||
| 0.958 | Neutral | −0.52 | 4 | 1 | ||
| S162I | 0.231 | Neutral | 0.02 | 1 | 1 | |
| −0.299 | Neutral | −1.02 | 7 | 1 | ||
| S255F | −0.423 | Neutral | −0.03 | 3 | 1 | |
| 0.485 | Neutral | −1.13 | 8 | 1 | ||
| 0.154 | Neutral | −0.64 | 9 | 1 | ||
| Q271R | −0.48 | Neutral | −0.27 | 5 | 1 | |
| 0.2 | 4 | 1 | ||||
| 0.445 | Neutral | 1 | ||||
| D574Y | 0.858 | Neutral | 0.36 | 2 | 1 | |
| −0.917 | Neutral | 1 | ||||
| H655Y | −0.814 | Neutral | 0.08 | 4 | 1 | |
| A688V | 0.498 | Neutral | −0.37 | 5 | 1 | |
| A701V | 0.597 | Neutral | −0.25 | 4 | 1 | |
| M731I | −0.598 | Neutral | −0.25 | 3 | 1 | |
| 0.072 | Neutral | −0.61 | 1 | |||
| 1.024 | Neutral | −1.55 | 1 | |||
| T827I | −0.378 | Neutral | −0.45 | 6 | 1 | |
| A892V | −1.901 | Neutral | 0.2 | 1 | 1 | |
| −0.2 | 3 | 1 | ||||
| T1077I | −1.511 | Neutral | −0.13 | 1 | 1 | |
| −0.604 | Neutral | 1 | ||||
| 1 | ||||||
| K1181R | −0.522 | Neutral | −0.48 | 7 | 1 | |
| N1187K | −0.467 | Neutral | −0.29 | 4 | 1 | |
| Q1201K | 1.409 | Neutral | −0.29 | 3 | 1 | |
| −0.09 | 2 | 1 | ||||
| D1259Y | 3.924 | Neutral | −0.21 | 3 | 1 |
Substitutions with a PROVEAN score below −2.5 (type: deleterious), a large decrease in stability (ΔΔG < −0.5), or both are shown in bold.
Functional assessment of non-synonymous amino acid substitutions in the six SARS-CoV-2 accessory proteins (ORF3a, ORF6, ORF7a, ORF7b, ORF8, ORF10). The functional effect of each mutation is predicted using two measures (PROVEAN score and stability value).
| Protein | Substitution | PROVEAN score | Type | ΔΔG | RI | Freq. |
|---|---|---|---|---|---|---|
| ORF3a | G18V | −1.571 | Neutral | −0.28 | 6 | 1 |
| K21Q | 0.657 | Neutral | −0.47 | 1 | 1 | |
| −1.638 | Neutral | 2 | ||||
| 1 | ||||||
| 4 | ||||||
| 1 | ||||||
| −0.657 | Neutral | 1 | ||||
| 1 | ||||||
| −1.638 | Neutral | 1 | ||||
| 234 | ||||||
| K66N | 3.486 | Neutral | −0.16 | 1 | 1 | |
| R68I | −1.562 | Neutral | 0.17 | 3 | 1 | |
| 2.638 | Neutral | 1 | ||||
| 1 | ||||||
| 0.3 | 6 | 3 | ||||
| 0.2 | 5 | 1 | ||||
| 1 | ||||||
| 1 | ||||||
| R134L | −1.543 | Neutral | −0.47 | 9 | 1 | |
| 0.724 | Neutral | 1 | ||||
| −0.29 | 0 | 2 | ||||
| 0.21 | 0 | 2 | ||||
| S171L | −2.238 | Neutral | −0.22 | 0 | 2 | |
| T175I | 2.562 | Neutral | −0.04 | 4 | 1 | |
| ORF6 | −0.24 | 4 | 1 | |||
| ORF7a | Q94E | −1 | Neutral | −0.24 | 2 | 2 |
| 2 | ||||||
| 1 | ||||||
| 1 | ||||||
| V71I | −0.667 | Neutral | −0.24 | 5 | 1 | |
| −1.263 | Neutral | 1 | ||||
| ORF7b | 0.23 | 1 | ||||
| A43T | 0 | Neutral | −0.44 | 5 | 1 | |
| ORF8 | 2.333 | Neutral | 19 | |||
| 1 | ||||||
| −1.056 | Neutral | 1 | ||||
| A14S | 0.833 | Neutral | −0.47 | 6 | 1 | |
| A51V | −1.222 | Neutral | −0.06 | 2 | 1 | |
| −0.722 | Neutral | 1 | ||||
| A65V | 1.222 | Neutral | 0.02 | 1 | 1 | |
| −8.778 | 1 | |||||
| −0.278 | Neutral | 1 | ||||
| ORF10 | NA | 1 |
Substitutions with a PROVEAN score below −2.5 (type: deleterious), a large decrease in stability (ΔΔG < −0.5), or both are shown in bold.
The 57 deleterious amino acid substitutions in different SARS-CoV-2 proteins, with their putative functional domains and physicochemical property changes. Mutations with a large decrease in stability (ΔΔG < −0.5) are shown in bold.
| Protein | Substitution | Putative functional domain | Hydropathy change | Chemical property change |
|---|---|---|---|---|
| ORF1ab | | NSP3 | Hydrophobic to charge | Aliphatic to acidic |
| ORF1ab | | NSP4 | Hydrophilic (unchanged) | Aliphatic to sulfur containing |
| ORF1ab | | NSP5 (3CLpro) | Hydrophobic (unchanged) | Aliphatic to aromatic |
| ORF1ab | | NSP5 (3CLpro) | Hydrophobic to hydrophilic | Aliphatic to aliphatic |
| ORF1ab | N3405L | NSP5 (3CLpro) | Hydrophilic to hydrophobic | Acidic amide to aliphatic |
| ORF1ab | | NSP7 | Charge to hydrophilic | Acidic to aliphatic |
| ORF1ab | S3983F | NSP8 | Hydrophilic to hydrophobic | Hydroxyl containing to aromatic |
| ORF1ab | | NSP8 | Charge to hydrophilic | Basic to sulfur containing |
| ORF1ab | R3993L | NSP8 | Charge to hydrophilic | Basic to aliphatic |
| ORF1ab | A4271V | NSP10 | Hydrophobic (unchanged) | Aliphatic (unchanged) |
| ORF1ab | A4273V | NSP10 | Hydrophobic (unchanged) | Aliphatic (unchanged) |
| ORF1ab | | NSP12 (RdRp) | Charge to hydrophilic | Acidic to aliphatic |
| ORF1ab | | NSP12 (RdRp) | Hydrophobic (unchanged) | Aliphatic (unchanged) |
| ORF1ab | | NSP12 (RdRp) | Hydrophobic (unchanged) | Aliphatic to aromatic |
| ORF1ab | | NSP13 (helicase) | Hydrophilic (unchanged) | Aliphatic to sulfur containing |
| ORF1ab | | NSP13 (helicase) | Hydrophobic (unchanged) | Cyclic to aliphatic |
| ORF1ab | | NSP13 (helicase) | Hydrophobic (unchanged) | Aromatic to aliphatic |
| ORF1ab | G6039V | NSP14 (exonuclease) | Hydrophilic to hydrophobic | Aliphatic to aliphatic |
| ORF1ab | | NSP14 (exonuclease) | Charge to hydrophilic | Basic to sulfur containing |
| ORF1ab | | NSP14 (exonuclease) | Hydrophilic to charge | Acidic amide to acidic |
| ORF1ab | | NSP14 (exonuclease) | Hydrophobic (unchanged) | Cyclic to aliphatic |
| ORF1ab | D6900Y | NSP16 | Charge to hydrophobic | Acidic to aromatic |
| M | R107L | Topological domain | Charge to hydrophobic | Basic to aliphatic |
| N | | NTD | Charge to hydrophilic | Basic to hydroxyl containing |
| N | A134V | NTD | Hydrophobic (unchanged) | Aliphatic (unchanged) |
| N | S180I | SR-rich linker | Hydrophilic (unchanged) | Hydroxyl containing to aliphatic |
| N | | SR-rich linker | Charge to hydrophobic | Basic to aliphatic |
| N | S193I | SR-rich linker | Hydrophilic to hydrophobic | Hydroxyl containing to aliphatic |
| N | S194L | SR-rich linker | Hydrophilic to hydrophobic | Hydroxyl containing to aliphatic |
| N | | SR-rich linker | Charge to hydrophilic | Basic to aliphatic |
| N | | CTD | Hydrophobic to hydrophilic | Cyclic to hydroxyl containing |
| N | | CTD | Hydrophobic to hydrophilic | Cyclic to hydroxyl containing |
| S | C301F | S1 (N-terminal) | Hydrophilic (unchanged) | Sulfur containing to aromatic |
| S | A930V | S2 (HR-1) | Hydrophobic (unchanged) | Aliphatic (unchanged) |
| S | | S2 (between HR1 and HR2) | Charge to hydrophilic | Acidic to aromatic |
| S | C1243F | S2 (cytoplasm domain) | Hydrophilic to hydrophobic | Sulfur containing to aromatic |
| S | C1250F | S2 (cytoplasm domain) | Hydrophilic to hydrophobic | Sulfur containing to aromatic |
| ORF3a | | | Hydrophobic to hydrophilic | Aliphatic to hydroxyl containing |
| ORF3a | | TM-1 | Hydrophobic (unchanged) | Aliphatic to aromatic |
| ORF3a | | TM-1 | Hydrophobic to charge | Cyclic to basic |
| ORF3a | | TM-1 | Hydrophobic (unchanged) | Aliphatic to aromatic |
| ORF3a | | TM-1 | Hydrophilic to charge | Acidic amide to basic |
| ORF3a | | TM-2 | Hydrophobic (unchanged) | Aliphatic to aromatic |
| ORF3a | H93Y | Ion channels | Charge to hydrophilic | Basic to aromatic |
| ORF3a | A103V | Ion channels | Hydrophobic (unchanged) | Aliphatic (unchanged) |
| ORF3a | | Ion channels | Hydrophobic (unchanged) | Aliphatic to aromatic |
| ORF3a | | Ion channels | Hydrophobic to hydrophilic | Aromatic to sulfur containing |
| ORF3a | T151I | C-terminal | Hydrophilic to hydrophobic | Hydroxyl containing to aliphatic |
| ORF3a | D155Y | C-terminal | Charge to hydrophilic | Acidic to aromatic |
| ORF6 | E13D | | Charge (unchanged) | Acidic (unchanged) |
| ORF7a | G38V | Luminal domain | Hydrophilic (unchanged) | Aliphatic to aliphatic |
| ORF7a | | Luminal domain | Hydrophobic (unchanged) | Cyclic to aliphatic |
| ORF7a | | Luminal domain | Charge (unchanged) | Acidic to basic |
| ORF7b | S31L | | Hydrophilic to hydrophobic | Hydroxyl containing to aliphatic |
| ORF8 | G8E | N-terminal (hydrophobic region) | Hydrophilic to charge | Aliphatic to acidic |
| ORF8 | P85L | | Hydrophobic (unchanged) | Cyclic to aliphatic |
| ORF10 | | | Hydrophobic (unchanged) | Aliphatic to aromatic |