Jian Yu1,2,3, Shanshan Sun1,2, Qianqian Tang1,2, Chengzhuo Wang1,2, Liangchen Yu4, Lulu Ren4, Jun Li3, Zhenhua Zhang1,2.
Abstract
Coronavirus disease 2019 (COVID-19) is a severe respiratory disease caused by the highly infectious severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). As the COVID-19 pandemic continues, mutations of SARS-CoV-2 accumulate. These mutations may not only make the virus spread faster, but also render current vaccines less effective. In this study, we established a reference sequence for each clade defined using the GISAID typing method. Homology analysis of the reference sequences confirmed a low mutation rate for SARS-CoV-2: the latest clade, GRY, had the lowest homology with the other clades (99.89%-99.93%), while homology among the remaining clades was greater than or equal to 99.95%. Variation analyses showed that the earliest genotypes S, V, and G had 2, 3, and 3 characterizing mutations in the genome, respectively. The G-derived clades GR, GH, and GV had 5, 6, and 13 characterizing mutations in the genome, respectively. A total of 28 characterizing mutations were found in the genome of the latest clade, GRY. In addition, we found differences in the geographic distribution of the clades: G, GH, and GR are prevalent in the USA, while GV and GRY are common in the UK. Our work may facilitate the custom design of antiviral strategies depending on the molecular characteristics of SARS-CoV-2.
Keywords: SARS-CoV-2; characterizing mutation; high-frequency mutation; reference sequence; variation analyses
Year: 2021 PMID: 34821382 PMCID: PMC9015442 DOI: 10.1002/jmv.27476
Source DB: PubMed Journal: J Med Virol ISSN: 0146-6615 Impact factor: 20.693
Comparison of homology between different clades of SARS‐CoV‐2
| Homology (%) | L | S | V | G | GH | GR | GV | GRY |
|---|---|---|---|---|---|---|---|---|
| L | 100 | 99.99 | 99.99 | 99.99 | 99.98 | 99.98 | 99.96 | 99.91 |
| S | | 100 | 99.98 | 99.98 | 99.97 | 99.97 | 99.95 | 99.90 |
| V | | | 100 | 99.98 | 99.97 | 99.97 | 99.95 | 99.90 |
| G | | | | 100 | 99.99 | 99.99 | 99.97 | 99.92 |
| GH | | | | | 100 | 99.98 | 99.96 | 99.91 |
| GR | | | | | | 100 | 99.96 | 99.93 |
| GV | | | | | | | 100 | 99.89 |
| GRY | | | | | | | | 100 |
Abbreviation: SARS‐CoV‐2, severe acute respiratory syndrome coronavirus 2.
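The ranges quoted in the abstract can be checked directly against the homology table. The snippet below is only an illustrative sketch: the values are transcribed by hand from the table, and the names `CLADES`, `UPPER`, and `pairwise` are introduced here for illustration and do not come from the study.

```python
# Pairwise homology (%) transcribed from the table above (upper triangle only).
# Row i starts at the diagonal entry for CLADES[i].
CLADES = ["L", "S", "V", "G", "GH", "GR", "GV", "GRY"]
UPPER = [
    [100, 99.99, 99.99, 99.99, 99.98, 99.98, 99.96, 99.91],
    [100, 99.98, 99.98, 99.97, 99.97, 99.95, 99.90],
    [100, 99.98, 99.97, 99.97, 99.95, 99.90],
    [100, 99.99, 99.99, 99.97, 99.92],
    [100, 99.98, 99.96, 99.91],
    [100, 99.96, 99.93],
    [100, 99.89],
    [100],
]


def pairwise(upper, names):
    """Map each unordered pair of distinct clades to its homology value."""
    pairs = {}
    for i, row in enumerate(upper):
        for offset, value in enumerate(row):
            j = i + offset
            if i != j:  # skip the 100% diagonal
                pairs[(names[i], names[j])] = value
    return pairs


PAIRS = pairwise(UPPER, CLADES)
gry = [v for pair, v in PAIRS.items() if "GRY" in pair]
others = [v for pair, v in PAIRS.items() if "GRY" not in pair]

print("GRY vs. other clades:", min(gry), "to", max(gry))
print("minimum among remaining clades:", min(others))
```

With the values as transcribed, the GRY comparisons span 99.89 to 99.93 and the lowest value among the remaining clades is 99.95, in agreement with the abstract.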
Characterizing mutations at nucleotide level of SARS‐CoV‐2 based on reference sequences
| Region | Position | L | S | V | G | GH | GR | GV | GRY |
|---|---|---|---|---|---|---|---|---|---|
| 5ʹUTR | 204 | G | T | ||||||
| 241 | C | T | T | T | T | T | |||
| 1a | 445 | T | C | ||||||
| 913 | C | T | |||||||
| 1059 | C | T | |||||||
| 3037 | C | T | T | T | T | T | |||
| 3267 | C | T | |||||||
| 5388 | C | A | |||||||
| 5986 | C | T | |||||||
| 6286 | C | T | |||||||
| 6954 | T | C | |||||||
| 8782 | C | T | |||||||
| 11083 | G | T | |||||||
| 11288‐11296 | TCTGGTTTT | del | |||||||
| 1b | 14408 | C | T | T | T | T | T | ||
| 14676 | C | T | |||||||
| 14805 | C | T | |||||||
| 15279 | C | T | |||||||
| 16176 | T | C | |||||||
| 21255 | G | C | |||||||
| S | 21766‐21771 | ACATGT | del | ||||||
| 21994‐21996 | TTA | del | |||||||
| 22227 | C | T | |||||||
| 23063 | A | T | |||||||
| 23271 | C | A | |||||||
| 23403 | A | G | G | G | G | G | |||
| 23604 | C | A | |||||||
| 23709 | C | T | |||||||
| 24506 | T | G | |||||||
| 24914 | G | C | |||||||
| 3a | 25563 | G | T | ||||||
| 26144 | G | T | |||||||
| M | 26801 | C | G | ||||||
| 8 | 27944 | C | T | ||||||
| 27972 | C | T | |||||||
| 28048 | G | T | |||||||
| 28111 | A | G | |||||||
| 28144 | T | C | |||||||
| N | 28274 | A | del | ||||||
| 28280‐28282 | GAT | CTA | |||||||
| 28881‐28883 | GGG | AAC | AAC | ||||||
| 28932 | C | T | |||||||
| 28977 | C | T | |||||||
| 10 | 29645 | G | T |
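As a rough illustration of how nucleotide-level differences like those tabulated above could be listed, the sketch below compares a query sequence with a reference after both have been aligned. This record does not describe the alignment or variant-calling tools used in the study, so the function `call_differences`, its handling of gaps and ambiguous bases, and the toy sequences are all assumptions made for illustration.

```python
def call_differences(ref_aln: str, query_aln: str):
    """List substitutions and deletions in a query relative to a reference.

    Both inputs are assumed to be rows of the same pairwise alignment,
    with '-' marking gaps. Positions are 1-based reference coordinates.
    Insertions relative to the reference and ambiguous 'N' calls in the
    query are skipped in this simplified sketch.
    """
    if len(ref_aln) != len(query_aln):
        raise ValueError("aligned sequences must have equal length")

    diffs = []
    ref_pos = 0
    for r, q in zip(ref_aln.upper(), query_aln.upper()):
        if r == "-":
            continue  # insertion in the query: no reference coordinate
        ref_pos += 1
        if q == r or q == "N":
            continue
        diffs.append((ref_pos, r, "del" if q == "-" else q))
    return diffs


# Toy sequences (not real SARS-CoV-2 data).
toy_ref = "ACGTACGT"
toy_query = "ATGT-CGT"
print(call_differences(toy_ref, toy_query))
```

For the toy input, the function reports a C-to-T substitution at position 2 and a deletion of A at position 5.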
Characterizing mutations at the amino acid level of SARS‐CoV‐2 based on reference sequences
| Region | Position | L | S | V | G | GH | GR | GV | GRY |
|---|---|---|---|---|---|---|---|---|---|
| 1a | 265 (NSP2_85) | T | I | ||||||
| 1001 (NSP3_183) | T | I | |||||||
| 1708 (NSP3_890) | A | D | |||||||
| 2230 (NSP3_1412) | I | T | |||||||
| 3606 (NSP6_37) | L | F | |||||||
| 3675‐3677 (NSP6_106‐108) | SGF | del | |||||||
| 1b | 314 | P | L | L | L | L | L | ||
| S | 69 | H | del | ||||||
| 70 | V | del | |||||||
| 144 | Y | del | |||||||
| 222 | A | V | |||||||
| 501 | N | Y | |||||||
| 570 | A | D | |||||||
| 614 | D | G | G | G | G | G | |||
| 681 | P | H | |||||||
| 716 | T | I | |||||||
| 982 | S | A | |||||||
| 1118 | D | H | |||||||
| 3a | 57 | Q | H | ||||||
| 251 | G | V | |||||||
| 8 | 27 | Q | stop | ||||||
| 84 | L | S | |||||||
| N | 3 | D | L | ||||||
| 203 | R | K | K | ||||||
| 204 | G | R | R | ||||||
| 220 | A | V | |||||||
| 235 | S | F | |||||||
| 10 | 30 | V | L |
Figure 1. The changes in the rate of characterizing mutations in each clade over time
Figure 2. High‐frequency mutations (mutation rate between 20% and 50%) of each clade. (A) High‐frequency mutations identified by comparing 100 downloaded sequences with the corresponding reference sequences; (B) mutation rates of the high‐frequency mutations as determined on GISAID; (C)–(H) changes in the rate of high‐frequency mutations over time in clades S, V, G, GH, GV, and GRY, respectively
Figure 3. The changes in the rate of characterizing mutations in key regions over time
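The high‐frequency criterion stated in the Figure 2 caption (a mutation rate between 20% and 50%) can be expressed as a simple filter over per-sequence mutation lists. The sketch below is a minimal illustration under stated assumptions: the function name `high_frequency`, the inclusive treatment of both bounds, and the mutation labels in the toy data are hypothetical and are not taken from the study.

```python
from collections import Counter


def high_frequency(samples, low=0.20, high=0.50):
    """Return {mutation: rate} for mutations whose rate lies in [low, high].

    `samples` is a list with one collection of mutation labels per
    sequence; the rate is the fraction of sequences carrying the mutation.
    Treating both bounds as inclusive is an assumption of this sketch.
    """
    n = len(samples)
    if n == 0:
        return {}
    # Count each mutation at most once per sequence.
    counts = Counter(m for sample in samples for m in set(sample))
    return {m: c / n for m, c in counts.items() if low <= c / n <= high}


# Toy data: 10 sequences with hypothetical mutation labels.
toy_samples = (
    [["C100T", "A200G"]] * 3   # both mutations present
    + [["A200G"]] * 6          # only A200G present
    + [["G300T"]]              # a rare mutation
)
print(high_frequency(toy_samples))
```

In the toy data, `C100T` occurs in 3 of 10 sequences and is retained, while `A200G` (9 of 10) and `G300T` (1 of 10) fall outside the window.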