Shilpa Chatterjee, Choon-Mee Kim, You Mi Lee, Jun-Won Seo, Da Young Kim, Na Ra Yun, Dong-Min Kim.
Abstract
To investigate specific genomic features and mutation patterns, whole and near-complete SARS-CoV-2 genome sequences were analyzed. Clinical samples were collected from 18 COVID-19-positive patients and subjected to nucleic acid purification. Cell culture was performed to obtain various SARS-CoV-2 isolates. Whole-genome analysis was performed using next-generation sequencing, and phylogenetic analyses were conducted to determine the genetic diversity of the isolates. The next-generation sequencing data identified 8 protein-coding regions with 17 mutated proteins. We identified 51 missense point mutations, as well as deletions in the 5′ and 3′ untranslated regions. The phylogenetic analysis revealed that V and GH are the dominant clades of SARS-CoV-2 circulating in the Gwangju region of South Korea. Moreover, statistical analysis confirmed significant differences in viral load (P < 0.001) and number of mutations (P < 0.0001) between the 2 mutually exclusive SARS-CoV-2 clades, which indicates frequent genomic alterations in SARS-CoV-2 in patients with high viral load. Our results provide an in-depth analysis of the SARS-CoV-2 whole genome, which we believe can shed light on SARS-CoV-2 pathogenesis and mutation patterns and can aid the development of prevention methods and therapeutics, as well as future research into the pathogenesis of SARS-CoV-2.
Year: 2022 PMID: 35790838 PMCID: PMC9255444 DOI: 10.1038/s41598-022-14989-y
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.996
Data on patient characteristics, sample types, symptoms at collection, clades, and number of mutations found in the SARS-CoV-2 genomes isolated from COVID-19 patients.
| Patient (sex/age) | Sample type | Symptoms at sample collection | Symptom severity | GISAID clade | Nextstrain clade | Nucleotide mutations | Amino acid mutations |
|---|---|---|---|---|---|---|---|
| M/46 | Nasopharynx | Coughing, chills | Mild to moderate | V | 19A | 7 | 5 |
| M/30 | Nasopharynx | Coughing, sore throat, chills | Mild to moderate | V | 19A | 5 | 4 |
| | Sputum | | | V | 19A | 4 | 3 |
| M/30 | Sputum | Febrile sensation | Mild to moderate | V | 19A | 5 | 3 |
| | Nasopharynx | | | V | 19A | 5 | 3 |
| F/29 | Nasopharynx | Sore throat, myalgia, chills | Mild to moderate | V | 19A | 6 | 3 |
| | Nasopharynx | | | V | 19A | 5 | 3 |
| | Sputum | | | V | 19A | 6 | 3 |
| | Nasopharynx | | | V | 19A | 5 | 3 |
| | Nasopharynx | | | V | 19A | 6 | 3 |
| | Nasopharynx | | | V | 19A | 6 | 4 |
| M/68 | Nasopharynx | Febrile sensation, chills, fever | Mild to moderate | GH | 20C | 12 | 6 |
| F/53 | Nasopharynx | Myalgia, fever | Mild to moderate | GH | 20C | 12 | 7 |
| F/81 | Nasopharynx | Coughing, rhinorrhea | Mild to moderate | GH | 20C | 16 | 11 |
| M/68 | Sputum | Chills, fever | Severe | GH | 20C | 20 | 13 |
| M/76 | Sputum | Fever, cough, dizziness, chills | Mild to moderate | GH | 20C | 14 | 8 |
| M/74 | Sputum | Cough, phlegm | Mild to moderate | GH | 20C | 21 | 9 |
| F/51 | Sputum | Sore throat | Mild to moderate | GH | 20C | 20 | 8 |
| F/37 | Nasopharynx | Asymptomatic | Asymptomatic | GH | 20C | 24 | 15 |
| M/83 | Nasopharynx | Hypotension, low oxygen saturation | Critical/fatal | GH | 20C | 19 | 7 |
| | Nasopharynx | | | GH | 20C | 19 | 7 |
| | Cell supernatant | | | GH | 20C | 20 | 9 |
| | Cell supernatant | | | GH | 20C | 20 | 8 |
| | Plasma | | | GH | 20A | 17 | 9 |
| | Nasopharynx | | | GH | 20C | 17 | 7 |
| | Sputum | | | GH | 20C | 17 | 9 |
| M/82 | Nasopharynx | Fever | Mild to moderate | GH | 20C | 22 | 12 |
| | Nasopharynx | | | GH | 20C | 22 | 12 |
| F/93 | Cell supernatant, nasopharynx | Myalgia | Mild to moderate | GH | 20C | 15 | 10 |
| F/66 | Cell supernatant, sputum | Dyspnea | Mild to moderate | GH | 20C | 22 | 10 |
| F/78 | Cell supernatant, sputum | Sore throat, myalgia, chills | Mild to moderate | GH | 20C | 22 | 9 |
| F/87 | Cell supernatant, nasopharynx | Hypotension, low oxygen saturation | Critical/fatal | GH | 20C | 20 | 7 |
Figure 1. Phylogenetic analysis of complete/near-complete genomes of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) isolates. The evolutionary tree was constructed by the maximum likelihood (ML) method using MEGA X software. The tree is drawn to scale, with branch lengths reflecting the number of substitutions per site (indicated below the branches). Tree confidence was evaluated with 1000 bootstrap replicates; only bootstrap values greater than 70 are shown. The evolutionary distances were computed using the Tamura-Nei method and are presented in units of base substitutions per site. This analysis involved 56 nucleotide sequences. Clades are indicated on the right.
Figure 2. Schematic mapping of the mutations in the SARS-CoV-2 whole genome. The full-length (29,903 bp) SARS-CoV-2 betacoronavirus RNA genome consists of ORF1a, encoding 10 nonstructural proteins (nsp1–10), and ORF1ab, encoding 16 nonstructural proteins (nsp1–16), toward the 5′ end, downstream of the 5′ untranslated region (UTR). The structural proteins correspond to 4 genes toward the 3′ end, upstream of the 3′ UTR: the spike (S), envelope (E), membrane (M), and nucleocapsid (N) genes. The mutations and the resulting changes in amino acid residues (with position numbers) are shown individually in the diagram.
Figure 3. Graph representing the number of mutations per sample for wave 1 and wave 2. The average number of mutations per sample is 3.36 for the 11 wave 1 isolates and 9.19 for the 21 wave 2 isolates.
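The two averages quoted in the Figure 3 caption can be checked against the per-sample patient table. The sketch below rests on two assumptions that the record does not state explicitly: that wave 1 corresponds to the 11 V-clade samples and wave 2 to the 21 GH-clade samples (the isolate counts match), and that the averages refer to amino acid mutations rather than nucleotide mutations. The variable names are illustrative only.

```python
from statistics import mean

# Amino acid mutation counts per sample, transcribed from the patient table.
# Assumption (not stated in the record): V-clade samples are the wave 1
# isolates and GH-clade samples are the wave 2 isolates.
wave1_aa = [5, 4, 3, 3, 3, 3, 3, 3, 3, 3, 4]
wave2_aa = [6, 7, 11, 13, 8, 9, 8, 15, 7, 7, 9,
            8, 9, 7, 9, 12, 12, 10, 10, 9, 7]

wave1_mean = round(mean(wave1_aa), 2)
wave2_mean = round(mean(wave2_aa), 2)

print(len(wave1_aa), wave1_mean)  # 11 3.36
print(len(wave2_aa), wave2_mean)  # 21 9.19
```

Under these assumptions the computed means agree with the caption, which suggests that the figure plots amino acid changes per sample.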
Figure 4. Graph representing the number of protein-level alterations observed in first- and second-wave samples.
Statistical analysis showing significant differences in viral load and number of mutations between the two clades in clinical samples from patients with COVID-19.
| Clade | Nucleotide mutations: median | Nucleotide mutations: IQR* | Nucleotide mutations: P** | Amino acid mutations: median | Amino acid mutations: IQR | Amino acid mutations: P | Viral load (E gene): median | Viral load (E gene): IQR | Viral load (E gene): P | Viral load (RdRp gene): median | Viral load (RdRp gene): IQR | Viral load (RdRp gene): P |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| V clade (n = 11) | 5 | 1 | < 0.0001 | 3 | 1 | < 0.0001 | 1.59E+07 | 6.25E+07 | < 0.001 | 3.42E+07 | 8.72E+07 | < 0.001 |
| GH clade (n = 21) | 20 | 5 | | 9 | 3.5 | | 6.13E+08 | 7.50E+09 | | 8.98E+08 | 7.60E+09 | |
*Interquartile range.
**Probability value.
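
The mutation medians and interquartile ranges reported for the two clades can be reproduced from the per-sample counts in the patient table. This is a minimal sketch, not the authors' procedure: the record does not say which software or quartile convention was used, and the `method="exclusive"` setting of Python's `statistics.quantiles` is an assumption that happens to match the reported IQRs. The helper name `median_iqr` is hypothetical.

```python
from statistics import median, quantiles

# Per-sample mutation counts transcribed from the patient table,
# grouped by GISAID clade (V: 11 samples, GH: 21 samples).
nucleotide = {
    "V": [7, 5, 4, 5, 5, 6, 5, 6, 5, 6, 6],
    "GH": [12, 12, 16, 20, 14, 21, 20, 24, 19, 19, 20,
           20, 17, 17, 17, 22, 22, 15, 22, 22, 20],
}
amino_acid = {
    "V": [5, 4, 3, 3, 3, 3, 3, 3, 3, 3, 4],
    "GH": [6, 7, 11, 13, 8, 9, 8, 15, 7, 7, 9,
           8, 9, 7, 9, 12, 12, 10, 10, 9, 7],
}


def median_iqr(values):
    """Return (median, IQR) using the (n + 1) * p quartile convention.

    Assumption: the "exclusive" method is chosen only because it matches
    the reported values; the record does not name the convention used.
    """
    q1, _, q3 = quantiles(values, n=4, method="exclusive")
    return median(values), q3 - q1


for clade in ("V", "GH"):
    print(clade, median_iqr(nucleotide[clade]), median_iqr(amino_acid[clade]))
# V (5, 1.0) (3, 1.0)
# GH (20, 5.0) (9, 3.5)
```

Other quartile conventions give different IQRs for these small samples (the `"inclusive"` method, for example, yields 4 rather than 5 for the GH nucleotide counts), so the match depends on the convention assumed here.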