Ovinu Kibria Islam, Hassan M Al-Emran, Md Shazid Hasan, Azraf Anwar, Md Iqbal Kabir Jahid, Md Anwar Hossain.
Abstract
The SARS-CoV-2 coronavirus is responsible for the current COVID-19 pandemic, with an ongoing toll of over 5 million infections and 333,000 deaths worldwide within the first 5 months. Insight into the phylodynamics and mutation variants of this virus is vital to understanding the nature of its spread in different climate conditions. The incidence rate of COVID-19 is increasing at an alarming pace within subtropical South-East Asian nations with high temperatures and humidity. To understand this spread, we analysed 444 genome sequences of SARS-CoV-2 available on the GISAID platform from six South-East Asian countries. Multiple sequence alignments and maximum-likelihood phylogenetic analyses were performed to analyse and characterize the non-synonymous (NS) mutant variants circulating in this region. Global mutation distribution analysis showed that the majority of the mutations found in this region are also prevalent in Europe and North America, and the concurrent presence of these mutations at a high frequency in other countries indicates possible transmission routes. Unique spike protein and non-structural protein mutations were observed circulating within confined areas of a given country. We divided the circulating viral strains into four major groups and three subgroups on the basis of the most frequent NS mutations. Strains with a unique set of four co-evolving mutations were found to be circulating at a high frequency specifically within India. Group 2 strains, characterized by two co-evolving NS mutations that alter RdRp (P323L) and the spike (S) protein (D614G), were found to be common in Europe and North America. These European and North American variants have rapidly emerged as dominant strains within South-East Asia, increasing from 0% prevalence in January to 81% by May 2020. These variants may have an evolutionary advantage over their ancestral types and could pose a serious threat to South-East Asia in the coming winter.
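The grouping and prevalence figures in the abstract can be illustrated with a minimal sketch. It assumes each sequence has already been reduced to a collection month and a set of NS mutation labels; the `NSP12_`/`Spike_` label prefixes, the function names and the toy records are all hypothetical and are not taken from the study's pipeline or data.

```python
from collections import defaultdict

# Group 2 is described in the abstract by two co-occurring NS mutations:
# P323L in RdRp and D614G in spike. The label format (RdRp written as
# NSP12) is an assumption made for this sketch.
GROUP2_MARKERS = frozenset({"NSP12_P323L", "Spike_D614G"})


def is_group2(mutations):
    """Return True when a sequence carries both Group 2 marker mutations."""
    return GROUP2_MARKERS <= set(mutations)


def monthly_prevalence(records):
    """Map each month to the percentage of its sequences that are Group 2.

    records: iterable of (month, mutations) pairs.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for month, mutations in records:
        totals[month] += 1
        hits[month] += int(is_group2(mutations))
    return {month: 100.0 * hits[month] / totals[month] for month in totals}


# Made-up toy records for illustration only, NOT data from the study.
toy_records = [
    ("2020-01", set()),
    ("2020-01", set()),
    ("2020-05", {"NSP12_P323L", "Spike_D614G"}),
    ("2020-05", {"Spike_D614G"}),  # only one marker, so not Group 2
]

print(monthly_prevalence(toy_records))
```

Requiring both markers rather than either one mirrors the abstract's description of the two mutations as co-evolving; a sequence with D614G alone is counted outside the group in this sketch.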
Keywords: COVID-19; SARS-CoV-2; genome sequence; non-synonymous mutation; phylogeny
Year: 2020 PMID: 32701194 PMCID: PMC7405211 DOI: 10.1111/tbed.13748
Source DB: PubMed Journal: Transbound Emerg Dis ISSN: 1865-1674 Impact factor: 4.521
SARS‐CoV‐2 cases identified in the South‐East Asia region by country, and the mutations carried by the 60 selected strains. Case reports run until 23 May 2020
| Country | India | Bangladesh | Indonesia | Thailand | Sri Lanka | Nepal | Overall |
|---|---|---|---|---|---|---|---|
| Selected sequences for detailed study | 30 | 10 | 8 | 7 | 4 | 1 | 60 |
| Collection dates | 31 Jan–6 May | 18 Apr–13 May | 17 Mar–14 Apr | 22 Jan–3 Apr | 4–31 Mar | 13 Jan | 13 Jan–13 May |
| Non‐synonymous (N‐S) mutations | 39 | 22 | 12 | 8 | 13 | 0 | 78 |
| Unique N‐S mutations | 11 | 4 | 2 | 3 | 1 | 0 | 21 |
| Spike protein N‐S mutations | 8 | 2 | 4 | 2 | 1 | 0 | 13 |
| E protein N‐S mutation | 1 | 0 | 0 | 0 | 0 | 0 | 1 |
| M protein N‐S mutations | 1 | 0 | 0 | 1 | 1 | 0 | 3 |
| N protein N‐S mutations | 4 | 6 | 1 | 2 | 3 | 0 | 9 |
| NS3 N‐S mutations | 1 | 3 | 1 | 1 | 2 | 0 | 6 |
| NS7b N‐S mutations | 0 | 0 | 0 | 1 | 0 | 0 | 1 |
| NS8 N‐S mutations | 1 | 2 | 0 | 1 | 0 | 0 | 2 |
| NSP2 N‐S mutations | 8 | 1 | 0 | 1 | 1 | 0 | 10 |
| NSP3 N‐S mutations | 5 | 2 | 1 | 2 | 1 | 0 | 11 |
| NSP4 N‐S mutations | 2 | 0 | 0 | 0 | 0 | 0 | 2 |
| NSP5 N‐S mutations | 0 | 2 | 0 | 0 | 0 | 0 | 2 |
| NSP6 N‐S mutations | 1 | 0 | 1 | 0 | 1 | 0 | 1 |
| NSP8 N‐S mutations | 0 | 1 | 0 | 0 | 0 | 0 | 1 |
| NSP12 N‐S mutations | 3 | 1 | 4 | 1 | 1 | 0 | 6 |
| NSP13 N‐S mutations | 2 | 1 | 0 | 0 | 0 | 0 | 3 |
| NSP14 N‐S mutations | 2 | 1 | 0 | 1 | 1 | 0 | 5 |
| NSP15 N‐S mutations | 0 | 0 | 0 | 1 | 1 | 0 | 2 |
Unique mutations with amino acid substitutions observed solely in South‐East Asia until 23 May 2020
| Mutation site | Country | No. of viruses (study) | No. of viruses (total) |
|---|---|---|---|
| E_L65M | India | 1 | 1 |
| M_L29J | India | 1 | 1 |
| N_K347N | Indonesia | 2 | 2 |
| NS3_I263M | Sri Lanka | 1 | 1 |
| NSP2_G212D | Thailand | 1 | 1 |
| NSP2_I120F | Bangladesh | 4 | 9 |
| NSP2_L266I | India | 1 | 1 |
| NSP3_L1553J | India | 1 | 1 |
| NSP3_N1337S | Bangladesh | 1 | 1 |
| NSP3_S1485Y | India | 1 | 1 |
| NSP5_D92G | Bangladesh | 3 | 4 |
| NSP12_V880I | India | 1 | 4 |
| NSP13_K40R | India | 1 | 2 |
| NSP13_T214I | India | 1 | 1 |
| NSP14_D415G | Thailand | 1 | 1 |
| NSP14_S434N | India | 1 | 1 |
| NSP14_V459I | Bangladesh | 1 | 1 |
| Spike_A829T | Thailand | 3 | 31 |
| Spike_A930V | India | 1 | 1 |
| Spike_E471Q | India | 1 | 1 |
| Spike_S116C | Indonesia | 1 | 1 |
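As a consistency check, the rows of the table above can be tallied by country and compared with the "Unique N‐S mutations" row of the first table. The sketch below copies the values from the two tables; the variable and function names are illustrative and not part of the study.

```python
from collections import Counter

# (mutation, country, viruses in study set, viruses in total),
# copied row by row from the unique-mutation table above.
UNIQUE_MUTATIONS = [
    ("E_L65M", "India", 1, 1),
    ("M_L29J", "India", 1, 1),
    ("N_K347N", "Indonesia", 2, 2),
    ("NS3_I263M", "Sri Lanka", 1, 1),
    ("NSP2_G212D", "Thailand", 1, 1),
    ("NSP2_I120F", "Bangladesh", 4, 9),
    ("NSP2_L266I", "India", 1, 1),
    ("NSP3_L1553J", "India", 1, 1),
    ("NSP3_N1337S", "Bangladesh", 1, 1),
    ("NSP3_S1485Y", "India", 1, 1),
    ("NSP5_D92G", "Bangladesh", 3, 4),
    ("NSP12_V880I", "India", 1, 4),
    ("NSP13_K40R", "India", 1, 2),
    ("NSP13_T214I", "India", 1, 1),
    ("NSP14_D415G", "Thailand", 1, 1),
    ("NSP14_S434N", "India", 1, 1),
    ("NSP14_V459I", "Bangladesh", 1, 1),
    ("Spike_A829T", "Thailand", 3, 31),
    ("Spike_A930V", "India", 1, 1),
    ("Spike_E471Q", "India", 1, 1),
    ("Spike_S116C", "Indonesia", 1, 1),
]

# Non-zero entries of the "Unique N-S mutations" row in the first table.
EXPECTED_PER_COUNTRY = {
    "India": 11, "Bangladesh": 4, "Indonesia": 2, "Thailand": 3, "Sri Lanka": 1,
}


def unique_per_country(rows):
    """Count unique mutation sites listed for each country."""
    return Counter(country for _, country, _, _ in rows)


def unique_per_protein(rows):
    """Count unique mutation sites per protein (the label before '_')."""
    return Counter(name.split("_", 1)[0] for name, _, _, _ in rows)


per_country = unique_per_country(UNIQUE_MUTATIONS)
assert dict(per_country) == EXPECTED_PER_COUNTRY
assert sum(per_country.values()) == 21  # "Overall" column of the first table

print(per_country)
print(unique_per_protein(UNIQUE_MUTATIONS))
```

The per-country counts agree with the first table (11, 4, 2, 3 and 1, totalling 21), so the two tables are consistent on this row.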
FIGURE 1. Phylogenetic relationships of the 60 selected SARS‐CoV‐2 strain sequences, separated into four clusters
FIGURE 2. (a) Frequencies of the 10 most recurrent mutations in South‐East Asian countries compared with worldwide frequencies (in percent). (b) Number of sequences in each group and subgroup
FIGURE 3. Mutation sites with amino acid substitutions in the spike protein found in South‐East Asia. The receptor‐binding domain (RBD) chain is marked in green. The most abundant mutation, D614G, is marked in yellow, and mutations residing in the RBD are marked with green balls
FIGURE 4. Cluster‐based time plot of the 444 complete genome sequences available in GISAID from January to May 2020. Bar charts indicate the frequency of mutants identified in each month. The line graph indicates the number of COVID‐19 infections and deaths
FIGURE 5. A total of 78 non‐synonymous mutations were found in 60 sequences from South‐East Asia. Markers show the number of these mutations present at relatively high frequency in the respective countries (Supporting Information S3)
FIGURE 6. Transmission map constructed with Nextstrain from 329 sequences, showing possible transmission routes into six South‐East Asian countries. The map also separates the sequences by clade