Stefanie Weber, Christina Ramirez, Walter Doerfler.
Abstract
Severe Acute Respiratory Syndrome Coronavirus-2 (SARS-CoV-2) was first identified in Wuhan, China, late in 2019. Nine months later (Sept. 23, 2020), the virus had infected > 31.6 million people around the world and caused > 971,000 (3.07 %) fatalities in 220 countries and territories. Research on the genetics of the SARS-CoV-2 genome, its mutants and their penetrance can aid future defense strategies. By analyzing sequence data deposited between December 2019 and the end of May 2020, we have compared the nucleotide sequences of 570 SARS-CoV-2 genomes from China, Europe, the US, and India to the sequence of the Wuhan isolate. During worldwide spread among human populations, at least 10 distinct hotspot mutations were selected and were found in up to > 80 % of viral genomes. Many of these mutations led to amino acid exchanges in replication-relevant viral proteins. Mutations in the SARS-CoV-2 genome would also impinge upon the secondary structure of the viral RNA molecule and its repertoire of interactions with essential cellular and viral proteins. The increasing frequency of SARS-CoV-2 mutation hotspots might select for dangerous viral pathogens. Alternatively, in a 29,900-nucleotide genome, there might be a limit to the number of mutable and selectable sites which, when exhausted, could prove disadvantageous to viral survival. The speed at which novel SARS-CoV-2 mutants are selected and dispersed around the world could pose problems for the development of vaccines and therapeutics.
Keywords: Consequences for secondary and tertiary structures of viral RNA; Impact on replication-relevant viral proteins; Questions about immunogenesis and vaccine development; Selection of viral hotspot mutations; Sequence comparisons between 570 viral genomes to Wuhan isolate; Severe acute respiratory syndrome Coronavirus-2 (SARS-CoV-2)
Year: 2020 PMID: 32979477 PMCID: PMC7513834 DOI: 10.1016/j.virusres.2020.198170
Source DB: PubMed Journal: Virus Res ISSN: 0168-1702 Impact factor: 3.303
Fig. 1. Visual example of the analytical method used in mutant evaluation. Screenshot of the segment from nucleotide (nt) 28,840 to nt 28,920 of the SARS-CoV-2 nucleotide sequence of different samples from Europe and the US. DNA sequences were aligned at the GGG → AAC mutation at position 28,881 nt. For DNA alignments, the AlignX tool of the program Vector NTI Advance 11.0 was used. The top row presents the sequence of the original 2019 Wuhan, China isolate (NC_045512.2), which served as the reference for all sequence comparisons.
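The comparison described in the caption, reading each aligned sample against the reference row and noting where the bases differ, can be sketched in a few lines of Python. This is only an illustration and not the Vector NTI workflow the authors used: the function name `find_differences` is hypothetical, the flanking bases of the toy segments are invented, and only the GGG → AAC change starting at position 28,881 follows the caption.

```python
def find_differences(reference: str, sample: str, start: int) -> list[tuple[int, str, str]]:
    """Compare two pre-aligned, equal-length segments base by base.

    `start` is the 1-based genome coordinate of the first base in both strings.
    Returns (position, reference_base, sample_base) for every mismatch.
    """
    if len(reference) != len(sample):
        raise ValueError("segments must be aligned to the same length")
    return [
        (start + offset, ref_base, sample_base)
        for offset, (ref_base, sample_base) in enumerate(zip(reference.upper(), sample.upper()))
        if ref_base != sample_base
    ]


# Toy 9-base segments: the flanking bases are hypothetical and NOT copied from
# NC_045512.2; only the GGG -> AAC change at 28,881-28,883 follows the caption.
toy_reference = "ACTGGGTCA"
toy_sample = "ACTAACTCA"

differences = find_differences(toy_reference, toy_sample, start=28878)
for position, ref_base, sample_base in differences:
    print(f"{position}: {ref_base} -> {sample_base}")
```

Applied to every sample row of a real alignment, a tally of the positions returned this way would give the per-region counts of the kind reported in the synopsis table.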
Synopsis of Data.
| Genome Position | Mutation | China | Europe | Germany | Munich* | USA I | USA II | India |
|---|---|---|---|---|---|---|---|---|
| 241nt | CG → TG | 0/98 | 80/99 | 4/62 | 14/14 | 76/111 | 74/96 | 82/99 |
| 1,059nt | CC → TC | 0/99 | 5/99 | 21/62 | 0/14 | 42/111 | 45/97 | 0/99 |
| 1,440nt | GC → AC | 0/99 | 0/99 | 15/62 | 0/14 | 3/112 | 0/97 | 0/99 |
| 1,917nt | CT → TT | 0/99 | 0/99 | 0/62 | 0/14 | 0/112 | 11/97 | 0/99 |
| 2,891nt | GC → AC | 0/99 | 0/99 | 15/62 | 0/14 | 3/112 | 0/97 | 0/99 |
| 3,037nt | CT → TT | 2/99 | 80/99 | 41/62 | 14/14 | 75/111 | 72/97 | 81/99 |
| 6,446nt | GT → AT | 0/99 | 0/99 | 0/62 | 10/14 | 0/112 | 1/97 | 0/99 |
| 8,782nt | CC → TC | 29/99 | 5/99 | 1/62 | 0/14 | 15/112 | 15/97 | 7/99 |
| 14,408nt | CT → TT | 2/99 | 81/99 | 39/62 | 0/14 | 78/112 | 71/97 | 80/99 |
| 17,747nt | CT → TT | 0/99 | 0/99 | 0/62 | 0/14 | 8/112 | 12/97 | 0/99 |
| 17,858nt | AT → GT | 0/99 | 0/99 | 0/62 | 0/14 | 8/112 | 12/97 | 0/99 |
| 18,060nt | CT → TT | 0/99 | 0/99 | 0/62 | 0/14 | 9/112 | 11/97 | 0/99 |
| 22,444nt | CC → TC | 0/99 | 0/99 | 0/62 | 0/14 | 0/112 | 1/97 | 26/99 |
| 23,403nt | AT → GT | 0/99 | 81/99 | 1/62 | 14/14 | 77/112 | 72/97 | 80/99 |
| 25,563nt | GA → TA | 0/99 | 7/99 | 21/62 | 0/14 | 65/112 | 54/97 | 43/99 |
| 26,735nt | CA → TA | 0/99 | 1/99 | 0/62 | 0/14 | 0/112 | 1/97 | 39/99 |
| 28,144nt | TA → CA | 29/99 | 5/99 | 1/62 | 0/14 | 15/112 | 15/97 | 7/99 |
| 28,854nt | CA → TA | 2/99 | 0/99 | 1/62 | 0/14 | 3/112 | 3/97 | 29/99 |
| 28,881nt | GGG → AAC | 2/99 | 35/99 | 9/62 | 0/14 | 3/112 | 6/97 | 2/99 |
Survey of all sequence comparisons of hotspot mutations. Synopsis of the most frequent SARS-CoV-2 mutations collected from 570 nucleotide sequences deposited in NCBI GenBank from China, Europe, Germany, *Munich, the US (I and II), and India. Each cell gives the number of genomes carrying the mutation over the number of genomes compared. Hotspot mutations (highlighted by enhanced print in the original table) arose as SARS-CoV-2 expanded from China to different countries and populations. The * relates to the work by Böhmer et al., 2020, who followed 16 COVID-19 patients from the Munich, Germany area; SARS-CoV-2 sequence data were published for only 14 of them.
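The fractions in the table translate directly into the percentage frequencies quoted in the abstract ("up to > 80 %"). A minimal sketch of that conversion follows; the counts are copied from two rows of the table, while the dictionary layout and the helper name `percent` are illustrative assumptions.

```python
# (mutant genomes, genomes compared), copied from two rows of the synopsis table.
counts = {
    "23,403nt": {"China": (0, 99), "Europe": (81, 99), "India": (80, 99)},
    "8,782nt": {"China": (29, 99), "Europe": (5, 99), "India": (7, 99)},
}


def percent(mutant: int, total: int) -> float:
    """Share of compared genomes carrying the mutation, in percent (one decimal)."""
    return round(100 * mutant / total, 1)


frequencies = {
    position: {region: percent(*pair) for region, pair in regions.items()}
    for position, regions in counts.items()
}

for position, regions in frequencies.items():
    print(position, regions)
```

In this subset, the 23,403nt mutation is absent from the Chinese sample but present in roughly four fifths of the European and Indian genomes, whereas 8,782nt shows the opposite trend.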
Codon Changes SARS-CoV-2.
| Genome Position | DNA Sequence Original → Mutation | Amino Acid Original → Mutation | ORF → Product |
|---|---|---|---|
| 1,059nt | CC → TC | ACC (Threonine) → ATC (Isoleucine) | ORF1ab mature peptide → nsp 2 |
| 1,440nt | GC → AC | GGC (Glycine) → GAC (Aspartic Acid) | ORF1ab mature peptide → nsp 2 |
| 1,917nt | CT → TT | ACT (Threonine) → ATT (Isoleucine) | ORF1ab mature peptide → nsp 2 |
| 2,891nt | GC → AC | GCA (Alanine) → ACA (Threonine) | ORF1ab mature peptide → nsp 3 |
| 6,446nt* | GT → AT | GTT (Valine) → ATT (Isoleucine) | ORF1ab → ORF1ab polyprotein segment 1 |
| 14,408nt | CT → TT | CCT (Proline) → CTT (Leucine) | ORF1ab → ORF1ab polyprotein segment 2 |
| 17,747nt | CT → TT | CCT (Proline) → CTT (Leucine) | ORF1ab mature peptide → helicase |
| 17,858nt | AT → GT | TAT (Tyrosine) → TGT (Cysteine) | ORF1ab mature peptide → helicase |
| 23,403nt | AT → GT | GAT (Aspartic Acid) → GGT (Glycine) | Surface glycoprotein |
| 28,144nt | TA → CA | TTA (Leucine) → TCA (Serine) | ORF8 → ORF8 protein |
| 28,854nt | CA → TA | TCA (Serine) → TTA (Leucine) | Nucleocapsid phosphoprotein |
| 28,881nt | GGG → AAC | AGG GGA (Arginine Glycine) → AAA CGA (Lysine Arginine) | Nucleocapsid phosphoprotein |
Coding changes in mutants. A listing of amino acid exchanges due to the SARS-CoV-2 RNA mutations with the highest frequencies. The reading frames (proteins) affected are also listed. Mutations that leave the amino acid sequence unchanged have not been included. These mutations might still have altered the structure of the SARS-CoV-2 RNA and affected its ability to bind to or interact with cellular and viral proteins. The * relates to the publication by Böhmer et al., 2020. The mutation at position 23,403 has been investigated by Korber et al. (2020) for its potential to enhance viral infectivity.
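The amino acid exchanges in the table can be checked by translating the original and mutated codons with the standard genetic code. The sketch below does this for a few rows; it is not the authors' procedure, the names `CODON_TABLE` and `translate` are illustrative, and codons are written as DNA (T instead of U), as in the table.

```python
from itertools import product

# Standard genetic code in the conventional T, C, A, G ordering;
# "*" marks a stop codon. One-letter amino acid codes are used.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string into one-letter amino acid codes."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# (original codons, mutated codons) copied from rows of the table above.
codon_changes = {
    "1,059nt": ("ACC", "ATC"),
    "14,408nt": ("CCT", "CTT"),
    "23,403nt": ("GAT", "GGT"),
    "28,881nt": ("AGGGGA", "AAACGA"),
}

for position, (original, mutated) in codon_changes.items():
    print(f"{position}: {translate(original)} -> {translate(mutated)}")
```

For position 23,403 this gives D → G (aspartic acid to glycine), and for 28,881 it gives RG → KR, in agreement with the table.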