Indrajit Saha, Nimisha Ghosh, Debasree Maity, Nikhil Sharma, Kaushik Mitra.
Abstract
Severe Acute Respiratory Syndrome Coronavirus-2 (SARS-CoV-2) is a threat to the human population and has caused a worldwide pandemic. Thousands of people are infected by the virus every day, and India is no exception. In this situation, a vaccine is the primary prevention strategy for containing the wave of the COVID-19 pandemic, and genome-wide analysis of SARS-CoV-2 is important for understanding its genetic variability. This motivated us to analyse 566 Indian SARS-CoV-2 sequences using four multiple sequence alignment techniques, viz. ClustalW, MUSCLE, ClustalO and MAFFT, to align the sequences and then identify the mutations as substitutions, deletions, insertions and SNPs. A consensus of these results, called Consensus Multiple Sequence Alignment (CMSA), is then prepared to obtain the final list of mutations so that the advantages of all four alignment techniques are preserved. The analysis shows 767 unique substitutions, 2025 unique deletions and 54 unique SNPs in Indian SARS-CoV-2 genomes. More precisely, 4 of the 54 SNPs are present in close to 60% of the virus population. The results of this experiment can be useful for virus classification and for designing and defining the dose of a vaccine for the Indian population.
Keywords: Multiple sequence alignment; Point mutation; SARS-CoV-2; SNP
Year: 2020 PMID: 32889094 PMCID: PMC7462517 DOI: 10.1016/j.meegid.2020.104522
Source DB: PubMed Journal: Infect Genet Evol ISSN: 1567-1348 Impact factor: 3.342
Fig. 1. (A) Pipeline of the workflow. (B) Examples of mutations such as substitution, deletion and SNP. (C) Venn diagram representing the consensus results of the four alignment techniques. (D) BioCircos plot representing the whole virus genome, with the frequency of mutations shown in different tracks. (E) SNPs present in more than 10% of the Indian SARS-CoV-2 genomes.
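To illustrate how substitutions, deletions and insertions might be read off an alignment, here is a minimal Python sketch. It is not the authors' pipeline: the function name `call_mutations`, the gap character `-`, and the convention of reporting an insertion at the coordinate of the preceding reference base are all assumptions made for this example. It also does not try to distinguish SNPs from other substitutions, since the record does not state the criterion used.

```python
def call_mutations(ref_aln: str, sample_aln: str):
    """Compare two rows of an alignment column by column (hypothetical helper).

    Returns a list of (reference_coordinate, kind, ref_base, sample_base),
    where reference_coordinate is 1-based on the ungapped reference.
    """
    if len(ref_aln) != len(sample_aln):
        raise ValueError("aligned sequences must have equal length")

    mutations = []
    ref_pos = 0  # number of reference bases consumed so far
    for r, s in zip(ref_aln.upper(), sample_aln.upper()):
        if r != "-":
            ref_pos += 1
        if r == s:
            continue  # identical bases, or a gap in both rows
        if r == "-":
            kind = "insertion"     # sample has a base the reference lacks
        elif s == "-":
            kind = "deletion"      # reference base is missing in the sample
        else:
            kind = "substitution"  # note: ambiguous bases such as N also land here
        mutations.append((ref_pos, kind, r, s))
    return mutations


# Toy alignment, not real SARS-CoV-2 data.
print(call_mutations("ACGT-A", "ATG-CA"))
# [(2, 'substitution', 'C', 'T'), (4, 'deletion', 'T', '-'), (4, 'insertion', '-', 'C')]
```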
Mutation results of different methods on Indian SARS-CoV-2 Genomes.
| Method | All Mutations | Substitution | Deletion | Insertion | SNP |
|---|---|---|---|---|---|
| ClustalW | 3384 | 933 | 2449 | 2 | 64 |
| MUSCLE | 3344 | 848 | 2494 | 2 | 65 |
| ClustalO | 3397 | 893 | 2502 | 2 | 66 |
| MAFFT | 3396 | 888 | 2506 | 2 | 66 |
| CMSA | 2792 | 767 | 2025 | 0 | 54 |
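The CMSA row is smaller than every individual method in each column, which is what one would expect if the consensus keeps only mutations reported by all four aligners. The sketch below shows that idea as a set intersection. Treating the consensus as a strict intersection is an assumption here, and the function name `consensus_mutations` and the toy mutation tuples are hypothetical.

```python
def consensus_mutations(calls_by_method):
    """Return the mutations reported by every method (strict intersection).

    calls_by_method maps a method name to an iterable of hashable mutation
    records, e.g. (coordinate, kind, ref_base, sample_base).
    """
    sets = [set(calls) for calls in calls_by_method.values()]
    if not sets:
        return set()
    return set.intersection(*sets)


# Toy calls, not taken from the paper.
toy_calls = {
    "ClustalW": [(241, "substitution", "C", "T"), (100, "deletion", "A", "-")],
    "MUSCLE":   [(241, "substitution", "C", "T"), (101, "deletion", "A", "-")],
    "ClustalO": [(241, "substitution", "C", "T"), (100, "deletion", "A", "-")],
    "MAFFT":    [(241, "substitution", "C", "T")],
}
print(consensus_mutations(toy_calls))
# {(241, 'substitution', 'C', 'T')}
```

In this toy case the deletion is dropped because the aligners place it at different coordinates, which is one way a consensus can end up with fewer calls than any single method.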
SNP mutations present in more than 10% of the Indian SARS-CoV-2 genomes.
| Coordinate of Mutation | Frequency of Mutation in Genomes | Type of Mutation | Change in Nucleotide | Change in Amino Acid | Avg. Entropy | Mapped with Coding Region |
|---|---|---|---|---|---|---|
| 241 | 342 | Substitution | C > T | S > L | 0.7012 | 5’-UTR |
| 3037 | 340 | Substitution | C > T | S > F | 0.6944 | ORF1ab |
| 14,410 | 332 | Substitution | C > T | P > L | 0.7174 | ORF1ab |
| 18,879 | 117 | Substitution | C > T | S > F | 0.5216 | ORF1ab |
| 22,446 | 69 | Substitution | C > T | T > I | 0.3702 | Spike |
| 23,405 | 334 | Substitution | A > G | D > G | 0.7125 | Spike |
| 23,931 | 165 | Substitution | C > T | T > I | 0.6789 | Spike |
| 25,565 | 122 | Substitution | G > T | R > I | 0.5207 | ORF3a |
| 26,737 | 112 | Substitution | C > T | T > I | 0.4969 | Membrane |
| 28,313 | 174 | Substitution | C > T | P > L | 0.6883 | Nucleocapsid |
| 28,856 | 71 | Substitution | C > T | S > L | 0.3899 | Nucleocapsid |
| 28,883 | 64 | Substitution | G > A | R > K | 0.3825 | Nucleocapsid |
| 28,884 | 64 | Substitution | G > A | G > N | 0.3848 | Nucleocapsid |
| 28,885 | 64 | Substitution | G > C | G > T | 0.3848 | Nucleocapsid |
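The frequencies in the table can be turned into percentages of the 566 analysed genomes to check the 10% cut-off and the "close to 60%" statement in the abstract. The short Python sketch below does this arithmetic using the counts from the table. The 58% threshold used to pick out the SNPs "close to 60%" is an assumption chosen for illustration, not a value given in the source.

```python
TOTAL_GENOMES = 566  # number of Indian SARS-CoV-2 sequences analysed

# Coordinate -> number of genomes carrying the SNP (from the table above).
snp_counts = {
    241: 342, 3037: 340, 14410: 332, 18879: 117, 22446: 69,
    23405: 334, 23931: 165, 25565: 122, 26737: 112, 28313: 174,
    28856: 71, 28883: 64, 28884: 64, 28885: 64,
}


def prevalence(count, total=TOTAL_GENOMES):
    """Percentage of genomes carrying a mutation."""
    return 100.0 * count / total


# Every listed SNP should clear the 10% cut-off.
above_cutoff = all(prevalence(c) > 10 for c in snp_counts.values())

# SNPs "close to 60%": the 58% threshold is an illustrative assumption.
near_60 = sorted(pos for pos, c in snp_counts.items() if prevalence(c) >= 58)

for pos in near_60:
    print(pos, f"{prevalence(snp_counts[pos]):.1f}%")
```

With this threshold the selection contains four coordinates (241, 3037, 14410 and 23405), matching the four SNPs the abstract describes as present in close to 60% of the virus population.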