Suruchi Gupta1, Prosenjit Paul2, Ravail Singh1.
Abstract
The spread of SARS-CoV-2 is a global concern that has taken a heavy toll on human health. Researchers across the globe have been working to devise strategies to combat this dreadful disease. Studies focused on genetic variability help in designing effective drugs and vaccines. With this in mind, the present study reports the genome-wide mutations detected in 108 SARS-CoV-2 genomes sampled worldwide. We identified a few hypervariable regions localized in the orf1ab, spike, and nucleocapsid genes. These nucleotide polymorphisms affected both codon usage and amino acid usage patterns. Altogether, the present study provides valuable information that should be helpful to ongoing research on 2019-nCoV vaccine development.
Keywords: Amino acids; Genetic variations; Hypervariable region; Mutations; S-protein
Year: 2021 PMID: 33969236 PMCID: PMC8096765 DOI: 10.1016/j.genrep.2021.101185
Source DB: PubMed Journal: Gene Rep ISSN: 2452-0144
Fig. 1 Flow chart representing the overall work done from retrieval to analysis of genomic sequences.
Fig. 2 Phylogenetic tree of 69 SARS CoV-2 genomes representing different cities/states of India.
Fig. 3 Phylogenetic tree of SARS CoV-2 genome sampled from 13 selected Indian states.
Fig. 4 Phylogenetic tree constructed using 108 (95 global and 13 Indian states) SARS CoV-2 genomes (method: Maximum-likelihood).
Single Nucleotide Polymorphism and amino acid changes identified in the entire Coronavirus genome (World Data).
| Genes | Total SNP | SNP led to change in amino acid | Synonymous mutations |
|---|---|---|---|
| orf1ab | 37 | 16 | 21 |
| spike | 8 | 4 | 4 |
| Orf3a | 4 | 4 | 0 |
| Membrane | 3 | 3 | 0 |
| Orf8 | 3 | 3 | 0 |
| Nucleocapsid | 2 | 2 | 0 |
| Genomic | 26 | | |
| Total | 83 | 32 | 25 |
Hypervariable regions identified from the selected SARS CoV-2 genome sequences (World Data).
| Genomic position | Gene | Reference nucleotide | Changed nucleotide | Reference codon (amino acid) | Changed codon (amino acid) | Countries involved |
|---|---|---|---|---|---|---|
| 250 | Genomic | C | T | NA | NA | Belarus, Estonia, Slovakia, Tamil Nadu, Portugal, Ghana, Philippines, Cambodia, Pakistan, Turkey, Egypt, Mexico, Norway, Kenya, Czech Republic, Netherlands, Morocco, Spain, Saudi Arabia, Italy, Bosnia, Nigeria, Croatia, Georgia, USA, Jamaica, Argentina, Serbia, Ahmedabad (India) |
| 1068 | orf1ab | C | T | ACC (Threonine) | ATC (Isoleucine) | Bosnia, Croatia, Nigeria, USA, Georgia, Jamaica, Argentina |
| 3046 | orf1ab | C | T | TTC (Phenylalanine) | TTT (Phenylalanine) | Belarus, Estonia, Slovakia, Tamil Nadu, Portugal, Ghana, Philippines, Cambodia, Pakistan, Turkey, Egypt, Mexico, Norway, Kenya, Czech Republic, Netherlands, Morocco, Spain, Saudi Arabia, Italy, Bosnia, Nigeria, Croatia, Georgia, USA, Jamaica, Argentina, Serbia, Ahmedabad (India) |
| 11092 | orf1ab | G | T | TTG (Leucine) | TTT (Phenylalanine) | Pakistan, Turkey, Egypt, Delhi (India), Bihar (India), Singapore, UAE, Kazakhstan |
| 23412 | Spike | A | G | GAT (Aspartate) | GGT (Glycine) | Belarus, Estonia, Slovakia, Tamil Nadu, Portugal, Ghana, Philippines, Cambodia, Pakistan, Turkey, Egypt, Mexico, Norway, Kenya, Czech Republic, Netherlands, Morocco, Spain, Saudi Arabia, Italy, Bosnia, Nigeria, Croatia, Georgia, USA, Jamaica, Argentina, Serbia, Ahmedabad (India) |
| 25572 | orf3a | G | T | CAG (Glutamine) | CAT (Histidine) | Belarus, Estonia, Slovakia, Tamil Nadu, Portugal, Ghana, Philippines, Cambodia, Pakistan, Turkey, Egypt, Mexico, Norway, Kenya, Czech Republic, Netherlands, Morocco, Spain, Saudi Arabia, Italy, Bosnia, Nigeria, Croatia, Georgia, USA, Jamaica, Argentina, Serbia, Ahmedabad (India) |
| 28320 | Nucleocapsid | C | T | CCC (Proline) | CTC (Leucine) | Pakistan, Turkey, Egypt, Israel, Mumbai (India), Delhi (India), Bihar (India), Singapore |
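
The distinction between amino-acid-changing and synonymous substitutions used in these tables can be checked mechanically from the codon pairs. The following is a minimal sketch, assuming the standard genetic code (NCBI translation table 1); the names `CODON_TABLE`, `translate`, and `classify` are hypothetical helpers introduced here for illustration and are not taken from the study's pipeline.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), listed in T, C, A, G order
# for the first, second, and third codon positions. "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map every codon (e.g. "GAT") to its one-letter amino acid code (e.g. "D").
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(codon: str) -> str:
    """Return the one-letter amino acid for a DNA (or RNA) codon."""
    return CODON_TABLE[codon.upper().replace("U", "T")]


def classify(ref_codon: str, alt_codon: str) -> str:
    """Label a codon change as 'synonymous' or 'nonsynonymous'."""
    same = translate(ref_codon) == translate(alt_codon)
    return "synonymous" if same else "nonsynonymous"


if __name__ == "__main__":
    # Codon pairs copied from the hypervariable-region table above.
    for ref, alt in [("ACC", "ATC"), ("TTC", "TTT"), ("GAT", "GGT")]:
        print(ref, translate(ref), "->", alt, translate(alt), classify(ref, alt))
```

Running each reference/changed codon pair through `classify` gives a quick consistency check on the "SNP led to change in amino acid" and "Synonymous mutations" columns.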
Fig. 5 Multiple sequence alignment of SARS CoV-2 genomes (65 countries including 5 Indian states) representing the hypervariable regions across the entire genome. Mutations at genomic positions 1068 bp, 3046 bp and 11092 bp are localized in the orf1ab gene; mutations at 23412 bp and 25572 bp are localized in the spike gene; the mutation at 28320 bp is localized in the Nucleocapsid (N) gene.
Single Nucleotide Polymorphism and subsequent amino acid changes identified in the SARS CoV-2 genome sequences retrieved from selected states of India.
| Genes | Total SNP | SNP led to change in amino acid | Synonymous mutations |
|---|---|---|---|
| orf1ab | 24 | 14 | 10 |
| spike | 8 | 4 | 4 |
| Orf3a | 2 | 1 | 1 |
| Membrane | 1 | 0 | 1 |
| Orf8 | 1 | 1 | 0 |
| Nucleocapsid | 1 | 1 | 0 |
| Genomic | 10 | | |
| Total | 47 | 21 | 16 |
Hypervariable regions identified from SARS CoV-2 genome sequences retrieved from selected states of India.
| Genomic position | Gene | Reference nucleotide | Changed nucleotide | Reference codon (amino acid) | Changed codon (amino acid) | States involved |
|---|---|---|---|---|---|---|
| 241 | Genomic | C | T | NA | NA | Ahmedabad, Rajasthan, Kolkata, Odisha, Punjab, Tamil Nadu |
| 3037 | orf1ab | C | T | TTC (Phenylalanine) | TTT (Phenylalanine) | Ahmedabad, Rajasthan, Kolkata, Odisha, Punjab, Tamil Nadu |
| 6312 | orf1ab | C | A | ACA (Threonine) | AAA (Lysine) | Assam, Bihar, Delhi, Mumbai |
| 11083 | orf1ab | G | T | TTG (Leucine) | TTT (Phenylalanine) | Assam, Bihar, Delhi, Ladakh, Mumbai |
| 13730 | orf1ab | C | T | GCT (Alanine) | GTT (Valine) | Assam, Bihar, Delhi, MP, Mumbai |
| 23403 | Spike | A | G | GAT (Aspartic acid) | GGT (Glycine) | Ahmedabad, Rajasthan, Kolkata, Odisha, Punjab, Tamil Nadu |
| 23929 | Spike | C | T | TAC (Tyrosine) | TAT (Tyrosine) | Assam, Bihar, Delhi, |
| 28311 | Nucleocapsid | C | T | CCC (Proline) | CTC (Leucine) | Bihar, Delhi, Mumbai |
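
Several positions in the world table differ from positions in the Indian table by exactly 9 bp and carry the same nucleotide change (for example 250 and 241, or 23412 and 23403). This may reflect different alignment coordinates for the two datasets, although that is an assumption and is not stated in the source. The sketch below simply pairs up positions under a fixed offset; the function name `pair_by_offset` and the offset value are illustrative choices, not part of the study.

```python
# Hypervariable positions copied from the two tables (world and Indian data).
WORLD_POSITIONS = [250, 1068, 3046, 11092, 23412, 25572, 28320]
INDIA_POSITIONS = [241, 3037, 6312, 11083, 13730, 23403, 23929, 28311]


def pair_by_offset(world, india, offset):
    """Return (world, india) position pairs where world - offset is in india."""
    india_set = set(india)
    return [(pos, pos - offset) for pos in world if pos - offset in india_set]


if __name__ == "__main__":
    # ASSUMPTION: a constant 9 bp shift between the two coordinate systems.
    for world_pos, india_pos in pair_by_offset(WORLD_POSITIONS, INDIA_POSITIONS, 9):
        print(f"world {world_pos} <-> India {india_pos}")
```

Positions that find no partner under this offset (such as 1068 and 25572 in the world table) are not listed among the Indian hypervariable regions.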
Fig. 6 Multiple sequence alignment of SARS CoV-2 genomes from 13 selected states of India representing the hypervariable regions across the entire genome. Mutations at genomic positions 3037 bp, 6312 bp, 11083 bp and 13730 bp are localized in the orf1ab gene; mutations at 23403 bp and 23929 bp are localized in the spike gene; the mutation at 28311 bp is localized in the Nucleocapsid (N) gene. Initially we retrieved two sequences from each Indian state, which were grouped into 13 states based on their sequence similarity.
Fig. 7 Docked structure of S protein with Remdesivir.
Fig. 8 Docked structure of S protein with NAG.