Takahiko Koyama, Daniel Platt, Laxmi Parida.
Abstract
OBJECTIVE: To analyse genome variants of severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2).
Year: 2020 PMID: 32742035 PMCID: PMC7375210 DOI: 10.2471/BLT.20.253591
Source DB: PubMed Journal: Bull World Health Organ ISSN: 0042-9686 Impact factor: 9.408
Number of gene variants in SARS-CoV-2 genomes, 2019–2020
| Genome segment (a) | Missense mutation | Synonymous mutation | Non-coding region: mutation | Non-coding region: deletion | Non-coding region: insertion | In-frame deletion | In-frame insertion | Frameshift deletion | Stop-gained | Total |
|---|---|---|---|---|---|---|---|---|---|---|
| | 1905 | 1344 | 0 | 0 | 0 | 57 | 2 | 7 | 13 | 3328 |
| | 394 | 260 | 0 | 0 | 0 | 27 | 0 | 0 | 6 | 687 |
| | 169 | 71 | 0 | 0 | 0 | 5 | 0 | 1 | 1 | 247 |
| | 27 | 15 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 43 |
| | 53 | 71 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 124 |
| | 28 | 11 | 0 | 0 | 0 | 2 | 0 | 0 | 2 | 43 |
| | 59 | 29 | 0 | 0 | 0 | 1 | 0 | 2 | 6 | 97 |
| | 68 | 26 | 0 | 0 | 0 | 1 | 0 | 0 | 7 | 102 |
| | 20 | 12 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 34 |
| | 246 | 126 | 0 | 0 | 0 | 6 | 0 | 0 | 0 | 378 |
| Intergenic | 0 | 0 | 0 | 7 | 2 | 0 | 0 | 0 | 0 | 9 |
| 5′-UTR | 0 | 0 | 260 | 50 | 37 | 0 | 0 | 0 | 0 | 347 |
| 3′-UTR | 0 | 0 | 224 | 85 | 27 | 0 | 0 | 0 | 0 | 336 |
E: envelope protein; M: membrane glycoprotein; N: nucleocapsid phosphoprotein; ORF: open reading frame; S: spike glycoprotein; SARS-CoV-2: severe acute respiratory syndrome coronavirus 2; UTR: untranslated region.
a Genes are in italics.
Note: We compared 10 022 genomes to the NC_045512 genome sequence.
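The coding-region categories in this table (missense, synonymous, stop-gained) follow from how a nucleotide substitution changes the encoded amino acid. The Python sketch below illustrates that classification for a single codon. It assumes the standard genetic code and a substitution confined to one codon; the function name `classify_codon_change` is hypothetical, and this is not the authors' annotation pipeline.

```python
from itertools import product

# Standard genetic code with codons enumerated in TCAG order
# (NCBI translation table 1); "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_change(ref_codon: str, alt_codon: str) -> str:
    """Label a substitution within one codon (illustrative helper)."""
    # Accept RNA spelling by mapping U to T before the lookup.
    ref_aa = CODON_TABLE[ref_codon.upper().replace("U", "T")]
    alt_aa = CODON_TABLE[alt_codon.upper().replace("U", "T")]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "stop-gained"
    # Any other amino acid change is reported as missense here; a lost
    # stop codon is not tabulated above and is not handled separately.
    return "missense"
```

For instance, a codon change such as GAT to GGT replaces aspartate (D) with glycine (G) and would be labelled missense, whereas TTT to TTC leaves phenylalanine unchanged and would be labelled synonymous.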
Number of variants in the open reading frame 1ab of SARS-CoV-2 genomes, by final cleaved protein, 2019–2020
| Final protein (a) | Missense mutation | Synonymous mutation | Non-coding region: mutation | Non-coding region: deletion | Non-coding region: insertion | In-frame deletion | In-frame insertion | Frameshift deletion | Stop-gained | Total |
|---|---|---|---|---|---|---|---|---|---|---|
| NSP1 | 64 | 45 | 0 | 0 | 0 | 13 | 0 | 1 | 0 | 123 |
| NSP2 | 237 | 130 | 0 | 0 | 0 | 5 | 0 | 0 | 0 | 372 |
| NSP3 | 547 | 349 | 0 | 0 | 0 | 16 | 0 | 2 | 3 | 917 |
| NSP4 | 116 | 113 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 232 |
| 3CLPro | 67 | 54 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 121 |
| NSP6 | 82 | 67 | 0 | 0 | 0 | 4 | 1 | 2 | 0 | 156 |
| NSP7 | 27 | 21 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 48 |
| NSP8 | 60 | 25 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 87 |
| NSP9 | 29 | 22 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 52 |
| NSP10 | 25 | 25 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 52 |
| RdRp | 194 | 157 | 0 | 0 | 0 | 2 | 0 | 1 | 3 | 357 |
| Helicase | 148 | 101 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 249 |
| ExoN | 141 | 118 | 0 | 0 | 0 | 11 | 0 | 1 | 2 | 273 |
| endoRNase | 92 | 67 | 0 | 0 | 0 | 3 | 0 | 0 | 0 | 162 |
| OMT | 76 | 50 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 128 |
3CLPro: 3C-like protease; ExoN: 3′-5′ exonuclease; NSP: non-structural protein; OMT: O-methyltransferase; RdRp: RNA-dependent RNA polymerase; SARS-CoV-2: severe acute respiratory syndrome coronavirus 2.
a The open reading frame 1ab gene codes for a polyprotein, which a viral protease cleaves into several proteins after translation.
Note: We compared 10 022 genomes to the NC_045512 genome sequence.
Variants of SARS-CoV-2 genomes observed in more than 100 samples, 2019–2020
| Genomic change | Type of mutation | Gene/protein | Amino acid change | No. of samples |
|---|---|---|---|---|
| | Synonymous | | F924F/F106F | 6334 |
| | Missense | | P4715L/P323L | 6319 |
| | Missense | | D614G | 6294 |
| | Non-coding | | NA | 5928 |
| | Missense | | Q57H | 2893 |
| | Missense | | T265I/T85I | 2442 |
| | Missense | | L84S | 1669 |
| | Synonymous | | S2839S/S76S | 1598 |
| | Missense | | 203_204delinsKR | 1573 |
| | Synonymous | | L5932L/L7L | 1178 |
| | Missense | | Y5865C/Y541C | 1166 |
| | Missense | | P5828L/P504L | 1147 |
| | Missense | | L3606F/L37F | 1070 |
| | Synonymous | | Y4847Y/Y455Y | 844 |
| | Missense | | G251V | 769 |
| | Synonymous | | L6668L/L216L | 452 |
| | Synonymous | | R5661R/R337R | 325 |
| | Missense | | P765S/P585S | 274 |
| | Synonymous | | N5020N/N628N | 267 |
| | In-frame deletion | | D448del/D268del | 250 |
| | Synonymous | | L6205L/L280L | 234 |
| | Missense | | I739V/I559V | 232 |
| | Missense | | T175M | 221 |
| | Missense | | S3884L/S25L | 185 |
| | Synonymous | | Y717Y/Y537Y | 170 |
| | Missense | | G392D/G212D | 164 |
| | Missense | | S24L | 164 |
| | Non-coding | | NA | 163 |
| | Missense | | A876T/A58T | 159 |
| | Missense | | S194L | 155 |
| | Missense | | V378I/V198I | 139 |
| | Synonymous | | D128D | 139 |
| | Synonymous | | L139L | 138 |
| | Missense | | A6245V/A320V | 137 |
| | Missense | | P13L | 136 |
| | Missense | | S197L | 136 |
| | Missense | | F3071Y/F308Y | 136 |
| | Missense | | G196V | 132 |
| | Non-coding | | NA | 131 |
| | Missense | | V13L | 128 |
| | Synonymous | | N824N | 118 |
| | Non-coding | | NA | 115 |
| | Missense | | V62L | 113 |
| | Synonymous | | A69A | 106 |
| | Non-coding deletion | | NA | 106 |
| | Non-coding deletion | | NA | 105 |
| | Synonymous | | H83H/H83H | 105 |
| | Synonymous | | T723T | 102 |
| | Missense | | P971L/T1198K | 101 |
del: deletion; delins: deletion–insertion; ExoN: 3′-5′ exonuclease; NSP: non-structural protein; M: membrane glycoprotein; N: nucleocapsid phosphoprotein; NA: not applicable; ORF: open reading frame; RdRp: RNA-dependent RNA polymerase; SARS-CoV-2: severe acute respiratory syndrome coronavirus 2; S: spike glycoprotein; UTR: untranslated region.
Note: We compared 10 022 genomes to the NC_045512 genome sequence.
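Several entries in the amino acid change column carry two coordinates separated by a slash, for example P4715L/P323L. One plausible reading, treated here as an assumption, is that the first coordinate is the position in the open reading frame 1ab polyprotein and the second is the position in the cleaved protein. Under that reading, the difference between the two positions is the offset of the cleaved protein within the polyprotein. The Python sketch below extracts that offset for simple substitutions; the function name and the regular expression are hypothetical.

```python
import re
from typing import Optional

# Matches a simple substitution such as "P4715L":
# reference residue, position, new residue.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def polyprotein_offset(label: str) -> Optional[int]:
    """Return first position minus second position, or None.

    None is returned when the label is not a pair of simple substitutions
    describing the same residue change (for example deletions or
    single-coordinate labels such as "D614G").
    """
    parts = label.split("/")
    if len(parts) != 2:
        return None
    first, second = (_SUBSTITUTION.match(part) for part in parts)
    if first is None or second is None:
        return None
    # Both halves must name the same reference and new residue.
    if (first.group(1), first.group(3)) != (second.group(1), second.group(3)):
        return None
    return int(first.group(2)) - int(second.group(2))
```

Applied to the table, `polyprotein_offset("P4715L/P323L")` and `polyprotein_offset("Y4847Y/Y455Y")` both give 4392, which under the stated assumption would place the two variants in the same cleaved protein.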
Fig. 1. A graphical representation of variants found in SARS-CoV-2 genomes, 2019–2020
Major clades of SARS-CoV-2 genomes, 2019–2020
| Clade/sublevel 1/sublevel 2 | First observation: date | First observation: accession no. | First observation: country | No. of samples |
|---|---|---|---|---|
| | Dec 2019 | MN90894 | China | 670 |
| | 24 Jan 2020 | EPI_ISL_422425 | China | 1889 |
| D614G/Q57H/ | 26 Feb 2020 | EPI_ISL_418219 | France | 469 |
| D614G/Q57H/T265I | 21 Feb 2020 | EPI_ISL_418218 | France | 2391 |
| D614G/203_204delinsKR/ | 25 Feb 2020 | EPI_ISL_412912 | Germany | 1330 |
| D614G/203_204delinsKR/T175M | 1 Mar 2020 | EPI_ISL_413647 and EPI_ISL_417688 | Portugal and Iceland | 215 |
| | 30 Dec 2019 | MT291826 | China | 525 |
| L84S/P5828L | 20 Feb 2020 | EPI_ISL_413456 | United States | 1137 |
| | 18 Jan 2020 | EPI_ISL_408481 | China | 182 |
| L3606F/V378I/ | 18 Jan 2020 | EPI_ISL_412981 | China | 127 |
| L3606F/G251V/ | 29 Jan 2020 | EPI_ISL_412974 | Italy | 419 |
| L3606F/G251V/P765S | 20 Feb 2020 | EPI_ISL_415128 | Brazil | 260 |
| | 8 Feb 2020 | EPI_ISL_410486, | France | 248 |
| | 25 Feb 2020 | EPI_ISL_414497 | Germany | 160 |
Del: deletion; delins: deletion–insertion; SARS-CoV-2: severe acute respiratory syndrome coronavirus 2.
a The reference genome (NC_045512) used in this study belongs to the basal clade.
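The clade labels in this table are nested: each sublevel adds one variant to the level above it. The Python sketch below shows one way such labels could be matched against the variants found in a sample. It is illustrative only and not the authors' method. It assumes that the first component of each label names a top-level clade, and that variant names are unambiguous within a sample; in practice they would likely need to be qualified by gene. The names `LABELS`, `CLADE_PATHS` and `deepest_clade` are hypothetical.

```python
from typing import Optional, Set

# Clade labels copied from the table, with trailing slashes dropped.
LABELS = [
    "D614G/Q57H",
    "D614G/Q57H/T265I",
    "D614G/203_204delinsKR",
    "D614G/203_204delinsKR/T175M",
    "L84S/P5828L",
    "L3606F/V378I",
    "L3606F/G251V",
    "L3606F/G251V/P765S",
]

# Every prefix of a label is treated as a level in the hierarchy
# (an assumption of this sketch), so "D614G" alone is also a path.
CLADE_PATHS = sorted(
    {
        tuple(label.split("/")[:depth])
        for label in LABELS
        for depth in range(1, len(label.split("/")) + 1)
    }
)


def deepest_clade(sample_variants: Set[str]) -> Optional[str]:
    """Return the deepest path whose variants all occur in the sample."""
    matches = [path for path in CLADE_PATHS if set(path) <= sample_variants]
    if not matches:
        return None  # none of the listed defining variants is present
    # Ties at the same depth resolve to the first path in sorted order.
    return "/".join(max(matches, key=len))
```

A sample carrying D614G, Q57H and T265I would be assigned `D614G/Q57H/T265I`, while a sample carrying only L84S would be assigned `L84S`.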
Fig. 2. Base pair changes observed in SARS-CoV-2 genomes, 2019–2020
Fig. 3. Annotation of SARS-CoV-2 variants in the alignment of the amino acid sequence of the spike protein from several coronaviruses, 2019–2020
Fig. 4. Phylogenetic tree for the SARS-CoV-2 genomes, 2019–2020