Madhvi Joshi1, Apurvasinh Puvar1, Dinesh Kumar1, Afzal Ansari1, Maharshi Pandya1, Janvi Raval1, Zarna Patel1, Pinal Trivedi1, Monika Gandhi1, Labdhi Pandya1, Komal Patel1, Nitin Savaliya1, Snehal Bagatharia2, Sachin Kumar3, Chaitanya Joshi1.
Abstract
Humanity has seen numerous pandemics during its course of evolution. The list includes several incidents from the past, such as measles, Ebola, severe acute respiratory syndrome (SARS), and Middle East respiratory syndrome (MERS). The latest addition to this list is coronavirus disease 2019 (COVID-19), caused by the novel coronavirus, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). As of August 18, 2020, COVID-19 had affected over 21 million people in more than 180 countries, with 0.7 million deaths across the globe. Genomic technologies have enabled us to understand the genomic constitution of pathogens, their virulence, evolution, and rate of mutation. To date, more than 83,000 viral genomes have been deposited in public repositories, such as GISAID and NCBI. At the time of writing, India is the third most affected country by COVID-19, with 2.7 million cases and > 53,000 deaths. Gujarat is the 11th most affected state, with a death rate of 3.48% compared to the national average of 1.91%. In this study, a total of 502 SARS-CoV-2 genomes from Gujarat were sequenced and analyzed to understand their phylogenetic distribution and variants against global and national sequences. Further, variants from deceased and recovered patients from Gujarat and the world were analyzed to understand their role in pathogenesis. Among the missense mutations present in the Gujarat SARS-CoV-2 genomes, C28854T (Ser194Leu) had an allele frequency of 47.62% and 7.25% in deceased patients from the Gujarat and global datasets, respectively. In contrast, allele frequencies of 35.16% and 3.20% were observed in recovered patients from the Gujarat and global datasets, respectively. It is a deleterious mutation present in the nucleocapsid (N) gene and is significantly associated with mortality in Gujarat patients with a p-value of 0.067 and in the global dataset with a p-value of 0.000924. The other deleterious variant identified in deceased patients from Gujarat (p-value of 0.355) and the world (p-value of 2.43E-06) is G25563T, which is located in Orf3a and plays a potential role in viral pathogenesis. SARS-CoV-2 genomes from Gujarat form distinct clusters under the GH clade of GISAID. This study will shed light on the viral haplotype in SARS-CoV-2 samples from Gujarat, India.
Keywords: COVID-19; SARS-CoV-2 (2019-nCoV); genomic surveillance; haplotyping; mutation analysis; viral epidemiology
Year: 2021 PMID: 33815459 PMCID: PMC8017293 DOI: 10.3389/fgene.2021.586569
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
FIGURE 1. Mutation spectrum profile of 502 SARS-CoV-2 genomes from 46 locations representing 20 districts of Gujarat, India, including synonymous and missense mutations. The top mutations included C241T, C3037T, C14408T/Pro314Leu, C18877T, A23403G/Asp614Gly, G25563T/Gln57His, and C26735T with frequency >55%.
FIGURE 2. Phylogenetic distribution of lineages from 502 SARS-CoV-2 viral genomes from Gujarat, India, with reference to Wuhan/Hu-1/2019 (EPI_ISL_402125). The maximum likelihood phylogenetic tree was built using the Augur tree implementation pipeline with IQ-TREE 2 and default parameters. The time-resolved phylogenetic tree, annotated with the selected metadata, was constructed using TreeTime and visualized in FigTree.
FIGURE 3. Distribution of the GISAID clades of SARS-CoV-2 genomes from the global and Gujarat datasets as of 18th August 2020. The majority of the viral genomes from Gujarat fall under the GH (n = 278) and G (n = 180) clades.
FIGURE 4. Synonymous and missense mutation profiles of the SARS-CoV-2 viral genomes, Gujarat (n = 502), India (n = 1,821), and global (n = 79,518). Only mutations with frequency >5% are plotted.
FIGURE 5. Venn diagram representing the mutually common and exclusive synonymous and missense mutations among SARS-CoV-2 viral genomes, Gujarat (n = 502), India (n = 1,821), and global (n = 79,518).
The overall comparison of missense and synonymous mutation frequency profiles of the Gujarat (n = 502), India (n = 1,821), and global (n = 79,518) datasets.

| Gene | Nucleotide change | Amino acid change | Gujarat (n) | India (n) | Global (n) | Gujarat (%) | India (%) | Global (%) | Prediction score | Predicted effect | p-value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 5’ UTR | C241T | | 470 | 1,133 | 60,265 | 93.63 | 62.22 | 75.79 | #N/A | #N/A | 1.23505E-58 |
| ORF1ab | C313T | | 10 | 362 | 1,178 | 1.99 | 19.88 | 1.48 | 0.84 | Benign/tolerated | 0 |
| ORF1ab | C1059T | Thr265Ile | 2 | 7 | 14,114 | 0.40 | 0.38 | 17.75 | 0.03 | Deleterious | 3.3988E-104 |
| ORF1ab | A2292C | Gln676Pro | 75 | 0 | 0 | 14.94 | 0.00 | 0.00 | 0.05 | Deleterious | 0 |
| ORF1ab | C2836T | | 209 | 17 | 21 | 41.63 | 0.93 | 0.03 | 0.17 | Benign/tolerated | 0 |
| ORF1ab | C3037T | | 474 | 1,145 | 61,503 | 94.42 | 62.88 | 77.34 | 0.66 | Benign/tolerated | 3.45605E-65 |
| ORF1ab | C3634T | | 38 | 78 | 26 | 7.57 | 4.28 | 0.03 | 0.40 | Benign/tolerated | 0 |
| ORF1ab | C4084T | | 34 | 1 | 35 | 6.77 | 0.05 | 0.04 | 0.72 | Benign/tolerated | 0 |
| ORF1ab | G4300T | | 68 | 1 | 41 | 13.55 | 0.05 | 0.05 | 0.84 | Benign/tolerated | 0 |
| ORF1ab | G4354A | | 0 | 116 | 0 | 0.00 | 6.37 | 0.00 | 1.00 | Benign/tolerated | 0 |
| ORF1ab | C5700A | Ala1812Asp | 9 | 348 | 8 | 1.79 | 19.11 | 0.01 | 0.38 | Benign/tolerated | 0 |
| ORF1ab | C6312A | Thr2016Lys | 12 | 432 | 882 | 2.39 | 23.72 | 1.11 | 0.03 | Deleterious | 0 |
| ORF1ab | C6573T | Ser2103Phe | 3 | 114 | 206 | 0.60 | 6.26 | 0.26 | 0.36 | Benign/tolerated | 0 |
| ORF1ab | C8782T | | 22 | 65 | 5,526 | 4.38 | 3.57 | 6.95 | 0.67 | Benign/tolerated | 1.08234E-08 |
| ORF1ab | C8917T | | 1 | 107 | 90 | 0.20 | 5.88 | 0.11 | 1.00 | Benign/tolerated | 0 |
| ORF1ab | G11083T | Leu3606Phe | 16 | 362 | 8,060 | 3.19 | 19.88 | 10.14 | 0.01 | Deleterious | 1.98676E-46 |
| ORF1ab | C13730T | Ala4489Val | 11 | 471 | 1,034 | 2.19 | 25.86 | 1.30 | 0.00 | Deleterious | 0 |
| ORF1ab | C14408T | Pro4715Leu | 447 | 1,110 | 61,641 | 89.04 | 60.96 | 77.52 | 0.31 | Benign/tolerated | 9.93477E-70 |
| ORF1ab | C14805T | | 0 | 5 | 6,799 | 0.00 | 0.27 | 8.55 | 1.00 | Benign/tolerated | 2.09768E-45 |
| ORF1ab | C15324T | | 41 | 73 | 1,588 | 8.17 | 4.01 | 2.00 | 1.00 | Benign/tolerated | 2.2731E-28 |
| ORF1ab | A16512G | | 53 | 0 | 13 | 10.56 | 0.00 | 0.02 | 1.00 | Benign/tolerated | 0 |
| ORF1ab | C18568T | Leu6102Phe | 72 | 1 | 50 | 14.34 | 0.05 | 0.06 | 0.01 | Deleterious | 0 |
| ORF1ab | C18877T | | 286 | 147 | 2,075 | 56.97 | 8.07 | 2.61 | 1.00 | Benign/tolerated | 0 |
| ORF1ab | C19154T | Thr6297Ile | 47 | 0 | 5 | 9.36 | 0.00 | 0.01 | 0.21 | Benign/tolerated | 0 |
| ORF1ab | A20268G | | 0 | 3 | 4,650 | 0.00 | 0.16 | 5.85 | 1.00 | Benign/tolerated | 1.27368E-30 |
| S | G21724T | Leu54Phe | 95 | 4 | 304 | 18.92 | 0.22 | 0.38 | 0.69 | Benign/tolerated | 0 |
| S | C22444T | | 218 | 96 | 201 | 43.43 | 5.27 | 0.25 | 1.00 | Benign/tolerated | 0 |
| S | A23403G | Asp614Gly | 472 | 1,142 | 61,751 | 94.02 | 62.71 | 77.66 | 0.30 | Benign/tolerated | 2.08832E-67 |
| S | C23929T | | 12 | 408 | 858 | 2.39 | 22.41 | 1.08 | 1.00 | Benign/tolerated | 0 |
| ORF3a | C25528T | Leu46Phe | 0 | 110 | 194 | 0.00 | 6.04 | 0.24 | 0.00 | Deleterious | 0 |
| ORF3a | G25563T | Gln57His | 290 | 147 | 18,045 | 57.77 | 8.07 | 22.69 | 0.00 | Deleterious | 1.1597E-125 |
| ORF3a | G26144T | Gly251Val | 0 | 4 | 5,385 | 0.00 | 0.22 | 6.77 | 0.00 | Deleterious | 1.93496E-35 |
| M | C26735T | | 277 | 154 | 797 | 55.18 | 8.46 | 1.00 | 1.00 | Benign/tolerated | 0 |
| ORF8 | T28144C | Leu84Ser | 20 | 75 | 5,636 | 3.98 | 4.12 | 7.09 | 0.37 | Benign/tolerated | 1.70788E-07 |
| N | C28311T | Pro13Leu | 13 | 413 | 1,151 | 2.59 | 22.68 | 1.45 | 0.00 | Deleterious | 0 |
| N | C28854T | Ser194Leu | 201 | 106 | 1,948 | 40.04 | 5.82 | 2.45 | 0.05 | Deleterious | 0 |
| N | GGG28881AAC | ArgGly203LysArg | 11 | 642 | 26,021 | 2.19 | 35.25 | 32.72 | 0.00 | Deleterious | 5.3828E-48 |
| 3’ UTR | C29750T | | 75 | 0 | 42 | 14.94 | 0.00 | 0.05 | #N/A | #N/A | 0 |
| 3’ UTR | G29868A | | 0 | 353 | 42 | 0.00 | 19.38 | 0.05 | #N/A | #N/A | 0 |
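
The percentage columns in the table above follow directly from the genome counts and the dataset sizes (502, 1,821, and 79,518). The sketch below recomputes them for three rows and applies the >5% cut-off mentioned for Figure 4. The function and variable names are illustrative assumptions, not taken from the study's own pipeline.

```python
from typing import Dict, List

# Dataset sizes as stated in the table caption.
DATASET_SIZES: Dict[str, int] = {"Gujarat": 502, "India": 1821, "Global": 79518}

# Number of genomes carrying each mutation, copied from three table rows.
COUNTS: Dict[str, Dict[str, int]] = {
    "C241T": {"Gujarat": 470, "India": 1133, "Global": 60265},
    "G25563T": {"Gujarat": 290, "India": 147, "Global": 18045},
    "C28854T": {"Gujarat": 201, "India": 106, "Global": 1948},
}


def allele_frequency(count: int, total: int) -> float:
    """Percentage of genomes in a dataset that carry the mutation."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * count / total


def frequency_profile(counts: Dict[str, Dict[str, int]]) -> Dict[str, Dict[str, float]]:
    """Per-mutation, per-dataset frequencies rounded to two decimals."""
    return {
        mutation: {
            dataset: round(allele_frequency(n, DATASET_SIZES[dataset]), 2)
            for dataset, n in per_dataset.items()
        }
        for mutation, per_dataset in counts.items()
    }


def above_threshold(
    profile: Dict[str, Dict[str, float]], dataset: str, threshold: float = 5.0
) -> List[str]:
    """Mutations whose frequency in the given dataset exceeds the threshold (%)."""
    return [m for m, freqs in profile.items() if freqs[dataset] > threshold]


if __name__ == "__main__":
    profile = frequency_profile(COUNTS)
    for mutation, freqs in profile.items():
        print(mutation, freqs)
    print("Above 5% globally:", above_threshold(profile, "Global"))
```

Comparing the recomputed values with the tabulated percentages is a quick consistency check on the counts.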
FIGURE 6. Frequency of missense mutations in SARS-CoV-2 viral genome from global dataset. (A) Bar chart for global deceased versus recovered patients. (B) Venn diagram of the global deceased versus recovered patients. (C) Bar chart for the Gujarat deceased versus recovered patients. (D) Venn diagram of the Gujarat deceased versus recovered patients.
Comparison of missense mutation frequency in deceased vs recovered patients from global dataset.

| Nucleotide change | Amino acid change | Deceased (n) | Recovered (n) | Deceased (%) | Recovered (%) | Prediction score | Predicted effect | p-value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| C14408T | Pro4715Leu | 245 | 1,450 | 88.77 | 78.59 | 0.31 | Benign/tolerated | 8.28E-05 |
| A23403G | Asp614Gly | 205 | 1,403 | 74.28 | 76.04 | 0.30 | Benign/tolerated | 0.522342 |
| G25563T | Gln57His | 112 | 495 | 40.58 | 26.83 | 0.00 | Deleterious | 2.43E-06 |
| GGG28881AAC | ArgGly203LysArg | 101 | 579 | 39.45 | 31.38 | 0.00 | Deleterious | 0.083557 |
| C1059T | Thr265Ile | 23 | 206 | 8.33 | 11.17 | 0.03 | Deleterious | 0.157376 |
| C28854T | Ser194Leu | 20 | 59 | 7.25 | 3.20 | 0.05 | Deleterious | 0.000924 |
| G25088T | Val1176Phe | 27 | 5 | 9.78 | 0.27 | #N/A | #N/A | 1.19E-33 |
| T28144C | Leu84Ser | 13 | 148 | 4.71 | 8.02 | 0.37 | Benign/tolerated | 0.052701 |
| T12503C | Tyr4080His | 0 | 109 | 0.00 | 5.91 | 0.00 | Deleterious | 3.38E-05 |
| G11083T | Leu3606Phe | 7 | 94 | 2.54 | 5.09 | 0.01 | Deleterious | 0.062656 |
| G25770T | Arg126Ser | 0 | 79 | 0.00 | 4.28 | 0.00 | Deleterious | 0.000459 |
Comparison of missense mutation frequency in deceased vs recovered patients from Gujarat dataset.

| Nucleotide change | Amino acid change | Deceased (n) | Recovered (n) | Deceased (%) | Recovered (%) | Prediction score | Predicted effect | p-value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| A23403G | Asp614Gly | 62 | 241 | 98.41 | 94.14 | 0.30 | Benign/tolerated | 0.164016 |
| C14408T | Pro4715Leu | 61 | 234 | 96.83 | 91.41 | 0.31 | Benign/tolerated | 0.144062 |
| G25563T | Gln57His | 39 | 142 | 61.90 | 55.47 | 0.00 | Deleterious | 0.355651 |
| C28854T | Ser194Leu | 30 | 90 | 47.62 | 35.16 | 0.00 | Deleterious | 0.067355 |
| G16078A | Val5272Ile | 7 | 10 | 11.11 | 3.91 | 0.00 | Deleterious | 0.022562 |
| G23311T | Glu583Asp | 5 | 10 | 7.94 | 3.91 | 0.33 | Benign/tolerated | 0.175819 |
| C23277T | Thr572Ile | 4 | 5 | 6.35 | 1.95 | 0.57 | Benign/tolerated | 0.059057 |
| G21724T | Leu54Phe | 3 | 39 | 4.76 | 15.23 | 0.69 | Benign/tolerated | 0.027646 |
| C18568T | Leu6102Phe | 2 | 33 | 3.17 | 12.89 | 0.01 | Deleterious | 0.027074 |
| A2292C | Gln676Pro | 2 | 31 | 3.17 | 12.11 | 0.05 | Deleterious | 0.036972 |
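
The p-values in the two deceased-versus-recovered tables above can be approximated from the counts alone. The sketch below assumes a Pearson chi-square test on a 2x2 table (mutation present or absent, deceased or recovered) without continuity correction. This excerpt does not name the test used for these tables, so that choice is an assumption, and the function name is illustrative. Under this assumption, the C28854T rows come out close to the tabulated 0.067 (Gujarat) and 0.000924 (global).

```python
import math
from typing import Tuple


def chi_square_2x2(
    deceased_with: int, deceased_total: int, recovered_with: int, recovered_total: int
) -> Tuple[float, float]:
    """Pearson chi-square (no continuity correction) for a 2x2 table.

    Rows are deceased / recovered patients; columns are genomes with /
    without the mutation. Returns (chi-square statistic, p-value).
    """
    a = deceased_with
    b = deceased_total - deceased_with
    c = recovered_with
    d = recovered_total - recovered_with
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be non-zero")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With one degree of freedom, the chi-square upper tail equals
    # erfc(sqrt(chi2 / 2)), so no external statistics package is needed.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # C28854T (Ser194Leu): counts copied from the tables above.
    for label, args in {
        "Gujarat": (30, 63, 90, 256),
        "Global": (20, 276, 59, 1845),
    }.items():
        chi2, p = chi_square_2x2(*args)
        print(f"{label}: chi2 = {chi2:.3f}, p = {p:.6f}")
```

Using `math.erfc` keeps the example dependency-free; a statistics library would be the more usual choice when more than one degree of freedom or an exact test is needed.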
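Chi-square test analysis of the deceased and recovered patients for gender and age group.

| Category | Group | Gujarat deceased (n) | Gujarat recovered (n) | Global deceased (n) | Global recovered (n) | p-value |
| --- | --- | --- | --- | --- | --- | --- |
| Total sample | | 63 | 256 | 276 | 1,845 | 0.00118 |
| Gender | Male | 37 | 178 | 203 | 1,002 | 0.89596 |
| Gender | Female | 26 | 78 | 73 | 843 | 2.7E-08 |
| Age (years) | 0–40 | 2 | 94 | 18 | 865 | 0.97648 |
| Age (years) | 41–60 | 28 | 115 | 101 | 675 | 0.03783 |
| Age (years) | > 60 | 33 | 47 | 157 | 305 | 0.20849 |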
FIGURE 7. Overall comparison of the missense mutations in SARS-CoV-2 genome. Gujarat (R = 256, D = 63) and Global (R = 1,845, D = 276), where “R” is the number of genomes from recovered patients, and “D” is the number of genomes from deceased patients.
FIGURE 8. Distinct cluster of the viral isolate with mutation C28854T/Ser194Leu/N gene in Gujarat SARS-CoV-2 genomes. This cluster is visualized at http://covid.gbrc.org.in/nextstrain.php using the Nextstrain virus genome analysis pipeline.