
Characterizing genomic variants and mutations in SARS-CoV-2 proteins from Indian isolates.

Jayanta Kumar Das, Antara Sengupta, Pabitra Pal Choudhury, Swarup Roy.

Abstract

SARS-CoV-2 is mutating and creating divergent variants by altering the composition of essential constituent proteins. Pharmacologically, it is crucial to understand the diverse mechanisms of mutation for stable vaccine or anti-viral drug design. Our current study concentrates on all the constituent proteins of 469 SARS-CoV-2 genome samples derived from Indian patients. However, the study may easily be extended to samples from across the globe. We perform clustering analysis to identify unique variants in each of the SARS-CoV-2 proteins. A total of 536 mutated positions within the coding regions of SARS-CoV-2 proteins are detected among the identified variants from Indian isolates. We quantify mutations by focusing on the unique variants of each SARS-CoV-2 protein. We report the average number of mutations per variant, the percentage of mutated positions, synonymous and non-synonymous mutations, and mutations occurring at the three codon positions. Our study reveals the six most susceptible proteins: ORF1ab, Spike (S), Nucleocapsid (N), ORF3a, ORF7a, and ORF8. Several non-synonymous substitutions are observed to be unique in different SARS-CoV-2 proteins. A total of 57 possibly deleterious amino acid substitutions are predicted, which may affect protein function. Several mutations show a large decrease in protein stability and are observed in putative functional domains of the proteins that might have some role in disease pathogenesis. We also observe a considerable number of physicochemical property changes among the above deleterious substitutions.
© 2021 Published by Elsevier Inc.

Keywords:  COVID-19; Codon position; Deleterious substitutions; Functional domain; Non-synonymous mutations; Protein stability

Year:  2021        PMID: 33623833      PMCID: PMC7893251          DOI: 10.1016/j.genrep.2021.101044

Source DB:  PubMed          Journal:  Gene Rep        ISSN: 2452-0144


Introduction

Due to the massive outbreak of COVID-19, caused by the highly infectious novel coronavirus SARS-CoV-2, the world is passing through a difficult situation. Seven species of human coronaviruses that cause disease in humans have been reported so far. Four of them (HCoV-229E, HKU1, NL63 and OC43) cause mild respiratory infections that can easily be treated. However, three species, all beta coronaviruses (SARS-CoV, MERS-CoV, and SARS-CoV-2), are severe in nature and can lead to fatal consequences (Andersen et al., 2020). The scientific community is trying hard to decipher the pathogenesis mechanism of SARS-CoV-2 and its therapeutic control in silico, using various computational tools. An exhaustive study is available in Das et al. (2020a). Scientists have observed a number of variants of the novel coronavirus SARS-CoV-2, reported from different geographical regions (Joshi and Paul, 2020; Sardar et al., 2020; Chang et al., 2020). Most of the evolutionary changes in the genome of viruses occur due to mutation. In some cases, they are due to insertion or deletion in the genome. In the course of evolution, variations bring novelty (Baer, 2008). Small variations might be beneficial or detrimental for the organism (Loewe and Hill, 2010). Mutational study helps in understanding viral transmission, replication efficiency, and the magnitude of virulence of the pathogen (Eaaswarkhanth et al., 2020). A minor change in the genome might lead to variation in the functionality of the constituent proteins of the organism (Chaudhuri, 2020). Previous studies revealed significant alteration in structural and pathogenic properties due to even a single point mutation in virus proteins (André et al., 2019; Sakai et al., 2017). Characterizing mutations in different functional domains of the SARS-CoV-2 genome might help in designing a potential vaccine (Kaur et al., 2020).
Determining the mutation types (synonymous or non-synonymous), which strongly influence gene regulation, is vital for understanding the role of regulatory variation during evolution (DiMaio and Nathans, 1982; Foy et al., 2003). Studying the mutations at different codon positions is essential, particularly for quantifying synonymous and non-synonymous amino acid substitutions (Plotkin and Kudla, 2011). Though non-synonymous mutations are of primary importance (from the codon usage bias point of view) because they alter the amino acid, synonymous mutations also have a strong impact (Plotkin and Kudla, 2011; Kristofich et al., 2018; Gustafsson et al., 2004). It is worth mentioning that changes in the physicochemical properties of nucleotides (purine, R, or pyrimidine, Y) due to mutations have remarkable biological significance (Lyons and Lauring, 2017; Sengupta et al., 2018; Guo et al., 2017). It is reported that evolutionary constraints differ between codon positions because of the functional constraints imposed by the genetic code and the physicochemical properties of the encoded amino acids (Bofkin and Goldman, 2007; Simmons, 2017; Plotkin and Kudla, 2011). For example, mutations at the 2nd position of a codon directly change the property of the replaced amino acid (hydrophobic to hydrophilic and vice versa). This change is due to transversion (A ↔ C, A ↔ T, G ↔ C or G ↔ T) (Haig and Hurst, 1991; Wolfenden et al., 1979; Błażej et al., 2017), although the transitions A ↔ G and C ↔ T are the most frequent single point mutations (Beletskii and Bhagwat, 1996; Błażej et al., 2017). Further, changes in the physicochemical properties of amino acids have a significant functional role (Das et al., 2019; Basak et al., 2017). Hence, understanding the genetic diversity is important, as it might hint towards the susceptible antigen targets of SARS-CoV-2.
This understanding can be used for potential therapeutic and prophylactic interventions in order to prevent this deadly outbreak. Mutations in SARS-CoV-2 proteins may lead to different phenotypic changes, and hence the virus can adapt to new hosts and environments. In addition, codon bias study helps in revealing the host-virus interaction mechanism in SARS-CoV-2 (Dilucca et al., 2020; Kurland, 1991; Das et al., 2020b). A detailed in-silico study of putative mutations in SARS-CoV-2 is of utmost importance for understanding any significant pattern and its possible impact on the functional and structural characteristics of the virus. India is the country with the second-largest number of SARS-CoV-2 infections in the world. Due to the volume of the study, we restricted our current study to Indian isolates only. However, our study can easily be extended to variants from any other part of the world. Although a couple of studies have been carried out to learn various crucial facts about the SARS-CoV-2 genome from Indian patient samples (Kaur et al., 2020; Saha et al., 2020; Samaddar et al., 2020), certain facts are yet to be explored. Therefore, in this work, we broadly focus on the mutational study of SARS-CoV-2 genomes isolated from Indian patients, as discussed in the following section.

Material and methods

Collection of SARS-CoV-2 genome sequences extracted from Indian patients

We collect SARS-CoV-2 genome sequences isolated from Indian patients that are archived in public repositories. Several protein-coding genes are present in each SARS-CoV-2 genome. SARS-CoV-2 encodes different types of essential proteins: (i) the non-structural polyprotein (ORF1ab); (ii) the structural proteins Spike glycoprotein (S), Envelope (E), Membrane (M) and Nucleocapsid (N); and (iii) the accessory proteins ORF3a, ORF6, ORF7a, ORF7b, ORF8, and ORF10 (Kim et al., 2020; Yadav et al., 2020; Ruan et al., 2003; Gordon et al., 2020). The complete topological structure (position) of all SARS-CoV-2 proteins is shown in Table 1. Each of these proteins is highly essential and has diverse functional roles. The first full genome sequence of SARS-CoV-2 from an Indian sample was reported in February 2020 (Yadav et al., 2020). We collect sequences from the NCBI database (Supplementary-1) and find around 469 complete SARS-CoV-2 nucleotide sequences. Protein-wise, we extract the coding region from each nucleotide sequence and ignore noisy sequences. The final list of unique sequences obtained is used for subsequent analysis (Table 2).
Table 1

Topological structure of all SARS-CoV-2 proteins shown by respective genomic location. For each protein, the range of CDS region and amino acids used in this paper are numbered starting from 1 to length of the nucleotide or protein sequence.

Gene/protein | Genome location (nucleotide) | Protein length (aa) | Nucleotide location used | Amino acid location used
ORF1ab | 266–21,555 | 7096 | 1–21,288 | 1–7096
S | 21,563–25,384 | 1273 | 1–3819 | 1–1273
ORF3a | 25,393–26,220 | 275 | 1–825 | 1–275
E | 26,245–26,472 | 75 | 1–225 | 1–75
M | 26,523–27,191 | 222 | 1–666 | 1–222
ORF6 | 27,202–27,387 | 61 | 1–183 | 1–61
ORF7a | 27,394–27,759 | 121 | 1–363 | 1–121
ORF7b | 27,756–27,887 | 43 | 1–129 | 1–43
ORF8 | 27,894–28,259 | 121 | 1–363 | 1–121
N | 28,274–29,533 | 419 | 1–1257 | 1–419
ORF10 | 29,558–29,674 | 38 | 1–114 | 1–38
Table 2

The number of collected samples from Indian isolates, unique variant, sample to variant ratio in each SARS-CoV-2 protein.

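Protein | # collected samples | # noise-free samples | # unique variants | Sample to variant ratio
ORF1ab | 462 | 400 | 262 | 1.52
E | 460 | 445 | 3 | 148.33
M | 460 | 457 | 18 | 25.38
N | 463 | 455 | 53 | 8.58
S | 462 | 436 | 90 | 4.84
ORF3a | 459 | 445 | 33 | 13.48
ORF6 | 460 | 459 | 3 | 153.0
ORF7a | 460 | 454 | 11 | 41.27
ORF7b | 456 | 455 | 3 | 151.66
ORF8 | 461 | 451 | 11 | 41.00
ORF10 | 460 | 460 | 3 | 153.33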

Workflow design

Protein-specific nucleotide sequences are first clustered to extract the set of unique sequences (unique variants). Next, the unique sequences (one representative per cluster) are aligned using multiple sequence alignment. As the reference we use the sequences of SARS-CoV-2 proteins from Wuhan-Hu-1 (accession no. NC_045512). We compare every variant with the reference sequence to identify and localize mutations. We consider only single point mutations (substitutions). Observed mutations are then analyzed in terms of the number of synonymous and non-synonymous substitutions, the distribution of nucleotide mutations over the three codon positions (1st/2nd/3rd), and the types of nucleotide mutations and amino acid substitutions. We then characterize non-synonymous amino acid substitutions and their biological implications using various computational tools.
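The comparison step of this workflow can be illustrated with a minimal Python sketch. It is not the authors' script: the function name `call_substitutions` and the toy sequences are hypothetical, and it assumes that the variant is already aligned to the reference, in frame, with the same length and only A/C/G/T characters. Labels follow the paper's notation, for example [A1841G, D614G].

```python
from itertools import product

# Standard genetic code, built from the usual TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def call_substitutions(reference, variant):
    """List point substitutions between two aligned, in-frame coding sequences."""
    if len(reference) != len(variant):
        raise ValueError("sequences must be aligned to the same length")
    calls = []
    for i, (ref_nt, var_nt) in enumerate(zip(reference, variant)):
        if ref_nt == var_nt:
            continue
        start = i - i % 3  # first base of the codon containing position i
        ref_aa = CODON_TABLE[reference[start:start + 3]]
        var_aa = CODON_TABLE[variant[start:start + 3]]
        calls.append({
            "nt": f"{ref_nt}{i + 1}{var_nt}",        # 1-based nucleotide label
            "aa": f"{ref_aa}{i // 3 + 1}{var_aa}",   # 1-based amino acid label
            "synonymous": ref_aa == var_aa,
        })
    return calls


# Toy 4-codon sequences (hypothetical, not real SARS-CoV-2 data).
ref = "ATGGATTACAAA"  # M D Y K
var = "ATGGGTTATAAA"  # M G Y K
calls = call_substitutions(ref, var)
for c in calls:
    kind = "synonymous" if c["synonymous"] else "non-synonymous"
    print(f"[{c['nt']}, {c['aa']}] {kind}")
```

Note that when two substitutions fall in the same codon, this sketch translates the whole variant codon, so both calls report the combined amino acid change.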

Computational tools and techniques used

We use the web-based tools PROVEAN and I-Mutant for functional assessment of single point mutations. PROVEAN (Protein Variation Effect Analyzer), a web server, is used to predict the impact of any non-synonymous amino acid substitution or indel on the biological function of a protein (Choi et al., 2012). The tool predicts two kinds of substitution effect, deleterious or neutral, by combining a substitution matrix score, the alignment, and the position of the substitution together with the neighborhood that surrounds the site of variation. The cut-off for the PROVEAN score is set at −2.5: a score below it indicates a deleterious substitution; otherwise the substitution is neutral. For predicting stability changes due to mutation, we use I-Mutant (Capriotti et al., 2005). The tool is based on a Support Vector Machine (SVM) and predicts the change in Gibbs free energy of unfolding (ΔΔG in kcal/mol, indicating increased or decreased stability) for each non-synonymous substitution. A predicted ΔΔG < −0.5 indicates a large decrease in stability, ΔΔG > 0.5 indicates a large increase in stability, and −0.5 ≤ ΔΔG ≤ 0.5 signifies neutral stability. We use simple Python scripting for the rest of the quantitative analysis. We report the functionally important mutations identified using the above tools, highlighting the various putative functional domains of SARS-CoV-2 proteins. We also study wild-type and new amino acid changes in two categories of physicochemical properties: hydropathy profile (Aftabuddin and Kundu, 2007) and side-chain structure (Das et al., 2016). The categorizations are as follows.
Hydropathy-based classes: The three classes are Hydrophobic (F, M, W, I, V, L, P, A), Hydrophilic (N, C, Q, G, S, T, Y), and Charged (R, D, E, H, K).
Side-chain-based classes: According to this grouping, the twenty (20) amino acids are clustered into eight groups: Acidic (D, E), Basic (R, H, K), Aromatic (F, W, Y), Aliphatic (A, G, I, L, V), Cyclic (P), Sulfur-containing (C, M), Hydroxyl-containing (S, T), and Acidic amide (N, Q).
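The thresholds and class definitions above can be written down as a small Python sketch. The function names are hypothetical, the scores passed in the example are illustrative values rather than PROVEAN or I-Mutant output, and treating a ΔΔG of exactly ±0.5 as neutral is an assumption of this sketch.

```python
# Physicochemical classes as listed in the text.
HYDROPATHY = {
    "Hydrophobic": "FMWIVLPA",
    "Hydrophilic": "NCQGSTY",
    "Charged": "RDEHK",
}
SIDE_CHAIN = {
    "Acidic": "DE",
    "Basic": "RHK",
    "Aromatic": "FWY",
    "Aliphatic": "AGILV",
    "Cyclic": "P",
    "Sulfur-containing": "CM",
    "Hydroxyl-containing": "ST",
    "Acidic amide": "NQ",
}


def class_of(aa, scheme):
    """Return the name of the class that contains amino acid `aa`."""
    for name, members in scheme.items():
        if aa in members:
            return name
    raise ValueError(f"unknown amino acid: {aa!r}")


def provean_label(score, cutoff=-2.5):
    """Scores below the cut-off are called deleterious, otherwise neutral."""
    return "deleterious" if score < cutoff else "neutral"


def stability_label(ddg):
    """Three-way stability call from a predicted ddG value in kcal/mol."""
    if ddg < -0.5:
        return "large decrease"
    if ddg > 0.5:
        return "large increase"
    return "neutral"  # assumption: boundary values count as neutral


def property_change(wild, new):
    """Class of the wild-type and new residue under both schemes."""
    return {
        "hydropathy": (class_of(wild, HYDROPATHY), class_of(new, HYDROPATHY)),
        "side_chain": (class_of(wild, SIDE_CHAIN), class_of(new, SIDE_CHAIN)),
    }


# D -> G, as in the Spike substitution D614G; the scores below are made up.
print(property_change("D", "G"))
print(provean_label(-3.1), stability_label(-1.2))
```

For D to G, the sketch reports a change from Charged to Hydrophilic and from Acidic to Aliphatic under the two groupings.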

Results and discussion

Our first objective is to find unique variants by clustering the SARS-CoV-2 gene sequences. We then identify point mutations (substitutions) in each observed variant by comparing it with the reference sequence. Observed mutations occurring at different codon positions are then classified and quantified from different perspectives, as discussed below.

Clustering of unique variants

The majority of the input genomes are redundant with respect to sequence similarity. We cluster them based on sequence similarity and take one sequence from each cluster as the cluster representative (termed a unique variant). We use a string matching technique to cluster the sequences, where identical sequences are put in a single cluster (Table 2). The cardinality of each cluster indicates the number of identical sequences in that cluster. We report clusters by variant numbering, i.e., v1, v2, …, vn, where n is the number of clusters or variants for each SARS-CoV-2 protein. The group of identical sequences belonging to a cluster (or variant) for each SARS-CoV-2 protein is reported with accession numbers (Supplementary-1). We draw a phylogenetic tree for each SARS-CoV-2 protein from all distinct variants and report it in Supplementary-2. Our analysis shows that the number of distinct variants in the ORF1ab, S, N and ORF3a proteins is comparatively higher than in other SARS-CoV-2 proteins, suggesting that these proteins are more susceptible to mutation.
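Clustering by exact string matching amounts to grouping accessions by their sequence. The following Python sketch shows one way to do this; the function name, the toy accessions, and the choice to number variants by decreasing cluster size are assumptions for illustration, not the authors' numbering rule.

```python
from collections import defaultdict


def cluster_identical(sequences):
    """Group accessions whose sequences are identical.

    `sequences` maps accession -> nucleotide sequence. Returns a list of
    (variant_id, sequence, accessions), largest cluster first.
    """
    groups = defaultdict(list)
    for accession, seq in sequences.items():
        groups[seq.upper()].append(accession)
    # Largest cluster first; ties broken by sequence for a stable order.
    ordered = sorted(groups.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    return [(f"v{i}", seq, accs) for i, (seq, accs) in enumerate(ordered, start=1)]


# Toy input with hypothetical accession names.
samples = {
    "acc1": "ATGGAT",
    "acc2": "ATGGAT",
    "acc3": "ATGGGT",
    "acc4": "atggat",
    "acc5": "ATGAAT",
}
clusters = cluster_identical(samples)
for variant_id, seq, accs in clusters:
    print(variant_id, seq, len(accs), accs)

# Sample to variant ratio, as reported per protein in Table 2.
ratio = len(samples) / len(clusters)
print(round(ratio, 2))
```

The cardinality of each cluster is simply `len(accs)`, and the sample to variant ratio is the number of noise-free samples divided by the number of clusters.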

Indian vs. world-wide variants

We compare Indian variants with the variants collected from nine (09) major countries: China (CHN), Bangladesh (BGD), Japan (JPN), Saudi Arabia (SAU), France (FRA), Germany (DEU), Greece (GRC), Italy (ITA), and the United States (USA). The protein-specific unique variants observed in all the above countries are reported in Fig. 1(A). We observe a high percentage of unique variants in BGD isolates, followed by Indian isolates. However, the percentage is only indicative (not conclusive), as the number of samples available differs between countries. We also quantify the variants from the nine other countries that match Indian variants (Fig. 1(B)). Interestingly, protein-specific variants shared between India and other countries are relatively rare.
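Because variants are defined by exact sequence identity, counting the variants shared between India and another country reduces to a set intersection. A minimal sketch follows, with a hypothetical function name and toy sequences in place of the real per-country variant sets.

```python
def shared_variant_counts(india, others):
    """Count variants of one protein shared between India and each country.

    `india` is a set of unique variant sequences; `others` maps a country
    code to its own set of unique variant sequences.
    """
    return {country: len(india & variants) for country, variants in others.items()}


# Toy variant sets for a single protein (hypothetical sequences).
india = {"ATGGAT", "ATGGGT", "ATGAAT"}
others = {
    "BGD": {"ATGGAT", "ATGGGT", "ATGCAT"},
    "ITA": {"ATGGAT"},
    "JPN": {"ATGTTT"},
}
common = shared_variant_counts(india, others)
print(common)
```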
Fig. 1

Comparison of unique variant among ten different countries. (A) Percentage of unique variant in each SARS-CoV-2 protein. The number at the top of the bar indicates the number of noise-free collected samples; (B) number of common unique variant between India and other country.

Quantification of observed mutations in SARS-CoV-2 proteins

Among the distinct variants of each SARS-CoV-2 protein except ORF1ab, we take the variant identical to NC_045512 as the reference sequence. We then compare the other variants with it to study nucleotide-level substitutions. The frequency distribution of the number of mutations for each protein is shown in Fig. 2. In five proteins (ORF1ab, ORF3a, S, N, M), variants carry one or more mutations, and the average number of mutations per variant is relatively higher (Fig. 3). In the other six proteins (ORF6, ORF7a, ORF7b, ORF8, ORF10, and E), we observe only a single mutation in each variant. Upon examining mutations in SARS-CoV-2 proteins, we observe several substitutions, the majority of which are associated with a single variant. The protein-wise mutations are highlighted and reported for all the variants associated with more than one sample (Supplementary-3). Among the structural proteins, we list mutations only for M, N, and S (which have mutations in more than one sample). In the case of the E protein, only a single mutation is observed in each variant. Among the accessory proteins, mutations in more than one sample are observed in ORF3a and ORF8; most of these are non-synonymous. We discuss below a few top variants and mutations observed in our candidate SARS-CoV-2 proteins.
Fig. 2

Distribution of the observed number of mutations (x-axis) and the relative frequency of the number of variants (y-axis) for each SARS-CoV-2 protein. In five proteins (ORF1ab, ORF3a, S, N, M), multiple mutations are observed in different variants, whereas in six proteins (ORF6, ORF7a, ORF7b, ORF8, ORF10 and E), exactly one mutation is found in each variant.

Fig. 3

Average number of mutations per variant. Proteins are ranked by average number of mutations, from highest (left) to lowest (right).

ORF1ab protein: We observe several mutations in ORF1ab because it is a polyprotein consisting of sixteen non-structural proteins. We compare all Indian SARS-CoV-2 ORF1ab variants with the reference sequence (NC_045512). We observe mutations in 40 variants that are associated with more than one sample. The top non-synonymous mutation ([C14144T, P4715L]) is observed in 233 variants covering 359 samples. Here, the first entry in brackets gives the nucleotide mutation and its position, whereas the second gives the amino acid substitution and its position. The majority of the mutations are synonymous. Several non-synonymous mutations are associated with five or more samples: [A2027C, Q676P], [C18304T, L6102F], [C18890T, T6297I], [G15814A, V5272I], [G10818T, L3606F], [C6047A, T2016K], [C13466T, A4489V], [G4601T, S1534I], [C9173T, T3058I], [C14161A, L4721I]. Further, several synonymous mutations observed in more than one sample are [C2772T, F924F], [C18613T, L6205L], [C2571T, C857C], [G4035T, V1345V], [A16248G, L5416L], [C15060T, N5020N], [C3369T, N1123N], [C3819T, D1273D], [C8517T, S2839S], [C11355T, F3785F].
Envelope (E) protein: In the case of the E protein, 443 (out of 445) sequences exactly match the reference sequence (NC_045512). The only two mutations, both non-synonymous and each seen in a single instance, are [G184T, V62F] and [G223T, V75F].
Membrane (M) protein: A total of 224 (out of 457) samples matching NC_045512 are found among the Indian genomes. In the M protein, significant synonymous mutations are observed. [C213T, Y71Y] is observed in 9 variants covering 223 samples (≈50%). In addition, one non-synonymous mutation ([C425T, A142V], in 2 variants covering two samples) and one synonymous mutation ([G429T, V143V], in 2 variants covering four samples) are also observed.
Nucleocapsid (N) protein: A total of 204 (out of 455) sequences matching the reference N protein are observed in Indian samples. We observe mutations in fifty-three (53) variants. Each of them is associated with more than one sample. The mutation in the top variant (v2) is [C581T, S194L], which is found in 19 variants covering a total of 158 samples. The other important non-synonymous mutations associated with more than one sample are [C38T, P13L], [G605A, S202N], [G608A, R203K], [G609A, R203K], [G610C, G204R], [C614T, T205I], [G578T, S193I], and one synonymous mutation is [G578T, S193I], observed in 2.
Spike (S) protein: We observe only 11 samples (out of 436) that are identical to the reference S protein. A total of twenty (20) variants in the S protein are found to be associated with mutations in more than one sample. The mutations in each variant are synonymous, non-synonymous, or both. For example, in variant v2 we observe the mutations [A1841G, D614G] and [T2367C, Y789Y], which are non-synonymous and synonymous, respectively, and are associated with 164 samples. Similarly, variant v3 shows two synonymous mutations ([C882T, D294D] and [T2367C, Y789Y]) and two non-synonymous mutations ([G162T, L54F] and [A1841G, D614G]), found in 63 samples. A few findings are consistent with previously reported results. For example, the D614G substitution was reported in ≈60% of Indian samples (Saha et al., 2020). In our candidate dataset, we observe the D614G substitution in ≈93% of samples, covering 77 variants. The majority of these substitutions are in variant v2, along with the synonymous mutation [T2367C, Y789Y] in the same variant. A few other important mutations are found in five or more samples: the non-synonymous mutations [G162T, L54F], [G1749T, E583D], [G2031T, Q677H] and the synonymous mutations [T2367C, Y789Y], [C882T, D294D], [G906T, T302T], [T328C, L110L].
ORF3a protein: In the ORF3a protein, only 190 samples (out of 445) from Indian SARS-CoV-2 genomes are identical to the reference sequence. We observe mutations in eight (08) ORF3a variants associated with more than one sample. The top variant is v2, with only the non-synonymous mutation [G171T, Q57H], found in 17 variants covering a total of 234 samples. This non-synonymous mutation (Q57H) lies in the ion channel domain, consistent with a previous study (Issa et al., 2020), and shows a considerably higher frequency (≈53%) in Indian SARS-CoV-2 genomes than the 17.43% reported in that global study. Another mutation, G251V, was found in 9.71% of the genomes in that study, but we did not observe it in the Indian SARS-CoV-2 candidate genomes. The other important mutations associated with more than one sample comprise six non-synonymous mutations ([C121T, L41F], [C277T, H93Y], [G67T, A23S], [C452T, T151I], [G463T, D155Y], [C512T, S171L]) and one synonymous mutation ([C246T, N82N]).
ORF6 protein: In the case of the ORF6 protein, 457 (out of 459) sequences from Indian SARS-CoV-2 samples exactly match the reference sequence. Similar to the E protein, we observe two mutations, each associated with only one sample: one is synonymous ([C12T, L4L]) and the other is non-synonymous ([G39T, E13D]).
ORF7a and ORF7b proteins: For both proteins, all the sequences (except two) from the Indian SARS-CoV-2 genomes match the reference sequence (NC_045512). In the ORF7a protein, two non-synonymous mutations ([C280G, Q94E], [G283A, E95K]) are observed, each associated with two samples in a single variant. In the ORF7b protein, the only two non-synonymous mutations are [C92T, S31L] and [G127A, A43T], each with one sample and one variant.
ORF8 protein: The majority (423 out of 451) of the sequences are identical to the reference sequence. The top two variants are v2 (non-synonymous mutation [T251C, L84S]) with sample frequency 19, and v3 (synonymous mutation [G108T, P36P]) with sample frequency 2.
ORF10 protein: In the ORF10 protein, 457 out of 460 sequences from the Indian SARS-CoV-2 genomes are identical to the reference sequence. The only non-synonymous mutation is [L37F] and the only synonymous mutation is [C109T], with sample frequencies 2 and 1, respectively.
It can be noted that some of the observed mutations are common to different variants (Supplementary-3). Therefore, with respect to mutation types, those variants are highly similar. Overall, we observe a total of 536 mutated positions located in different SARS-CoV-2 proteins in Indian isolates (Table 3). The ORF3a protein shows the highest percentage of mutated locations (≈3.99%), followed by the N protein. We observe only a few mutated locations (two each) in the E, ORF6, ORF7b and ORF10 proteins.
Table 3

Number of mutated positions (or locations) in each SARS-CoV-2 protein.

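Gene | Gene length | # mutated positions | Mutated positions (%)
ORF1ab | 21,291 | 328 | 1.541
E | 228 | 2 | 0.877
M | 669 | 15 | 2.242
N | 1260 | 49 | 3.889
S | 3822 | 83 | 2.172
ORF3a | 828 | 33 | 3.986
ORF6 | 186 | 2 | 1.075
ORF7a | 336 | 10 | 2.976
ORF7b | 132 | 2 | 1.515
ORF8 | 366 | 10 | 2.732
ORF10 | 117 | 2 | 1.709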
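The percentages in Table 3 are the number of mutated positions divided by the gene length. As a quick check, the following Python sketch recomputes them from the gene lengths and counts printed in the table; the dictionary name `TABLE3` is only a label chosen for this example.

```python
# (gene length, number of mutated positions), copied from Table 3.
TABLE3 = {
    "ORF1ab": (21291, 328),
    "E": (228, 2),
    "M": (669, 15),
    "N": (1260, 49),
    "S": (3822, 83),
    "ORF3a": (828, 33),
    "ORF6": (186, 2),
    "ORF7a": (336, 10),
    "ORF7b": (132, 2),
    "ORF8": (366, 10),
    "ORF10": (117, 2),
}

# Percentage of mutated positions per gene.
percent = {gene: 100.0 * mutated / length for gene, (length, mutated) in TABLE3.items()}
total_mutated = sum(mutated for _, mutated in TABLE3.values())

# Print genes from the highest to the lowest percentage.
for gene, value in sorted(percent.items(), key=lambda kv: -kv[1]):
    print(f"{gene:7s} {value:.3f}")
print("total mutated positions:", total_mutated)
```

The total over all genes is 536, and ORF3a and N have the two highest percentages, in line with the text.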

Characterizing the mutations into synonymous and non-synonymous categories

We account for both synonymous and non-synonymous mutations irrespective of codon position. The 541 nucleotide mutations observed at 536 locations are characterized as synonymous or non-synonymous (see Fig. 4(A)). Overall, the percentage of non-synonymous mutations is higher (≈62%, count 333) than that of synonymous mutations (≈38%, count 208). The percentages of synonymous and non-synonymous mutations for all SARS-CoV-2 proteins are shown in Fig. 4(B). Non-synonymous mutations dominate in every protein except M, and the E and ORF7b proteins show 100% non-synonymous mutations.
Fig. 4

Quantification of synonymous and non-synonymous mutation. (A) Percentage of synonymous vs. non-synonymous mutation type in three codon positions taking all proteins together; (B) percentage of non-synonymous and synonymous mutation type in all SARS-CoV-2 protein.

Quantifying mutations in three different positions of codon

In any coding region, a mutation may occur at any of the three codon positions. Mutations at the third (3rd) position of the codon, the least functionally constrained position, are mostly synonymous. In contrast, the majority of the mutations at the 1st and 2nd positions of the codon are non-synonymous and alter the amino acid. The second codon position is the most functionally constrained, as any change at this position causes a non-synonymous change in the coding sequence. We observe mutations at all three codon positions for the ORF1ab, M, N, S, ORF3a, and ORF8 genes (Fig. 5(A)). In the E, ORF6, and ORF10 genes, mutations are observed at the 1st, 3rd, and 1st codon positions, respectively. We observe mutations at the 1st and 2nd positions of ORF7b codons. We do not observe any mutation at the 2nd codon position in E, ORF6, or ORF10 (Table 4). It is worth mentioning that the most highly mutated genes (ORF1ab, M, N, S, ORF3a) show a higher percentage of mutations at the third position of the codon, most of which are in the synonymous category.
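With positions numbered from 1 within each coding region (Table 1), the codon position of a nucleotide follows directly from its index. The sketch below, with hypothetical helper names, parses a few mutation labels quoted in this paper and checks that the codon number agrees with the amino acid position in each label.

```python
import re
from collections import Counter

LABEL = re.compile(r"([ACGT])(\d+)([ACGT])")


def codon_position(nt_pos):
    """1, 2 or 3 for a 1-based nucleotide position within a coding region."""
    return (nt_pos - 1) % 3 + 1


def codon_number(nt_pos):
    """1-based index of the codon (amino acid) containing the position."""
    return (nt_pos - 1) // 3 + 1


# Nucleotide label -> amino acid label, as quoted in the text.
examples = {
    "A1841G": "D614G",   # S
    "C14144T": "P4715L",  # ORF1ab
    "G171T": "Q57H",     # ORF3a
    "C213T": "Y71Y",     # M
    "G184T": "V62F",     # E
}

positions = {}
for nt_label, aa_label in examples.items():
    nt_pos = int(LABEL.fullmatch(nt_label).group(2))
    # The codon number must equal the residue number in the amino acid label.
    assert codon_number(nt_pos) == int(aa_label[1:-1])
    positions[nt_label] = codon_position(nt_pos)

tally = Counter(positions.values())
print(positions)
print(tally)
```

For these five labels, the synonymous change C213T and the non-synonymous change G171T both fall at the 3rd codon position, while A1841G and C14144T fall at the 2nd.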
Fig. 5

Percentage of mutation in SARS-CoV-2 proteins in each codon position (1st, 2nd and 3rd). (A) Protein-wise in each codon position, and (B) aggregate by all proteins in three different codon positions; (C) overall percentage of synonymous vs. non-synonymous mutation taking all codon positions and proteins.

We also account for the overall mutations taking place in all SARS-CoV-2 proteins (Fig. 5(B)). Mutations at the 1st and 3rd codon positions are almost equally frequent (38% each), whereas mutations at the 2nd position are comparatively fewer (24%). More than 90% of the mutations at the 1st codon position are non-synonymous, whereas around 80% of the mutations at the 3rd codon position are synonymous (Fig. 5(C)). Protein-wise mutations at the three codon positions are reported in Table 4. For non-synonymous mutations, the percentage at the 1st codon position is highest (≥50%) for the E, ORF10, and ORF7b proteins, whereas the percentage at the 2nd codon position is highest (≥50%) for ORF8 and ORF7b; for ORF6, the highest percentage of mutations occurs at the 3rd codon position.
Table 4

Percentage of synonymous (syn) and non-synonymous (non-syn) mutation in three different codon positions (CP)-1st/2nd/3rd in each of the SARS-CoV-2 protein.

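Protein | CP | Type | Percentage
E | 1st | Non-syn | 100.00
M | 1st | Non-syn | 26.67
M | 1st | Syn | 6.67
N | 1st | Non-syn | 33.96
N | 1st | Syn | 1.89
ORF10 | 1st | Non-syn | 50.00
ORF1ab | 1st | Non-syn | 22.80
ORF1ab | 1st | Syn | 2.74
ORF3a | 1st | Non-syn | 33.33
ORF7a | 1st | Non-syn | 40.00
ORF7b | 1st | Non-syn | 50.00
ORF8 | 1st | Non-syn | 30.00
S | 1st | Non-syn | 20.48
S | 1st | Syn | 1.20
M | 2nd | Non-syn | 20.00
N | 2nd | Non-syn | 26.42
ORF1ab | 2nd | Non-syn | 26.44
ORF3a | 2nd | Non-syn | 30.30
ORF7a | 2nd | Non-syn | 20.00
ORF7b | 2nd | Non-syn | 50.00
ORF8 | 2nd | Non-syn | 60.00
S | 2nd | Non-syn | 28.92
M | 3rd | Syn | 46.67
N | 3rd | Non-syn | 9.43
N | 3rd | Syn | 28.30
ORF10 | 3rd | Syn | 50.00
ORF1ab | 3rd | Non-syn | 8.51
ORF1ab | 3rd | Syn | 39.51
ORF3a | 3rd | Non-syn | 9.09
ORF3a | 3rd | Syn | 27.27
ORF6 | 3rd | Non-syn | 50.00
ORF6 | 3rd | Syn | 50.00
ORF7a | 3rd | Syn | 40.00
ORF8 | 3rd | Syn | 10.00
S | 3rd | Non-syn | 15.66
S | 3rd | Syn | 33.73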

Characterizing nucleotide mutation types in non-synonymous category

There are twelve possible nucleotide changes that can occur due to nucleotide mutation (see Methods and Materials). Protein-wise mutation counts at the three codon positions are shown in Fig. 6(A). A majority of the nucleotide mutations are observed at the 1st codon position. Considering the three codon positions (Fig. 6(A)), all 12 nucleotide mutation types are observed in the ORF1ab protein, followed by the S protein. ORF3a and N show comparatively fewer mutation types (8 and 7, respectively). Similarly, in the E, ORF10, and ORF6 proteins only a single mutation type is observed.
Fig. 6

(A) Number of distinct mutation types in the non-synonymous category for each protein and each codon position; (B) number of distinct mutation types in the non-synonymous category aggregated over all codon positions.

Quantification of nucleotide mutation types shows the highest values for G > T and C > T, and the protein hit count is also maximal for these two mutation types (Fig. 7(A) and (B)). In terms of percentage (considering all codon positions), the two most frequent mutations are C > T (≈32%) and G > T (≈30%). Further, these two mutations (C > T and G > T) are each observed in 9 (out of 11) SARS-CoV-2 proteins, followed by G > A (07) and A > C (05). The T > A mutation is observed to be rare (only 2). Of the two most frequent mutations, G > T is within the top two (by percentage) at all three codon positions (Table 5), whereas C > T is within the top two only at the 1st and 2nd codon positions. The percentages of two other important mutations, G > A (at the 1st codon position) and A > G (at the 2nd codon position), are 17% and 12%, respectively. Further, we observe diversity across codon positions in individual proteins (Table 6). For example, G > T mostly occurs at the 2nd codon position (ORF1ab, S), at the 1st codon position (E, M, N, ORF3a), and at the 3rd codon position (ORF6).
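The two quantities discussed here, the frequency of each mutation type and the number of proteins in which it occurs (the protein hit count), can be tallied as in the Python sketch below. It uses only a small subset of the non-synonymous mutations named in the text, so its counts are illustrative and will not reproduce the percentages in Fig. 7 or Table 5; the helper names are hypothetical.

```python
import re
from collections import Counter, defaultdict

PURINES = {"A", "G"}
LABEL = re.compile(r"([ACGT])(\d+)([ACGT])")


def mutation_type(label):
    """Turn a label such as 'C14144T' into a mutation type such as 'C>T'."""
    ref, _, alt = LABEL.fullmatch(label).groups()
    return f"{ref}>{alt}"


def is_transition(mut_type):
    """Purine<->purine or pyrimidine<->pyrimidine change."""
    ref, alt = mut_type.split(">")
    return (ref in PURINES) == (alt in PURINES)


# Subset of non-synonymous mutations quoted in the text, per protein.
NON_SYN = {
    "ORF1ab": ["C14144T", "A2027C", "C18304T", "G10818T"],
    "S": ["A1841G", "G162T", "G1749T", "G2031T"],
    "ORF3a": ["G171T", "C121T", "C277T", "G67T"],
}

type_counts = Counter()
proteins_hit = defaultdict(set)
for protein, labels in NON_SYN.items():
    for label in labels:
        mtype = mutation_type(label)
        type_counts[mtype] += 1
        proteins_hit[mtype].add(protein)

n_transitions = sum(n for t, n in type_counts.items() if is_transition(t))
n_transversions = sum(type_counts.values()) - n_transitions

for mtype, n in type_counts.most_common():
    print(mtype, n, "proteins:", len(proteins_hit[mtype]))
print("transitions:", n_transitions, "transversions:", n_transversions)
```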
Fig. 7

Quantification of nucleotide mutation types in the non-synonymous category. (A) Percentage of each type of nucleotide mutation; (B) number of SARS-CoV-2 proteins associated with each mutation type.

Table 5

The percentage of nucleotide mutation type for all non-synonymous cases shown for three codon positions independently and arranged by highest to lowest percentage. Mut-type: mutation type.

Mut-type (1st) | Percentage (1st) | Mut-type (2nd) | Percentage (2nd) | Mut-type (3rd) | Percentage (3rd)
G>T | 33.09 | C>T | 48.98 | G>T | 68
C>T | 25.00 | G>T | 16.33 | A>C | 6
G>A | 16.91 | A>G | 12.24 | G>A | 6
A>G | 6.62 | T>C | 7.48 | G>C | 6
A>C | 5.88 | G>A | 6.12 | C>A | 4
T>C | 3.68 | C>A | 2.72 | T>A | 4
G>C | 2.94 | A>C | 1.36 | T>G | 4
C>A | 2.21 | G>C | 1.36 | A>T | 2
C>G | 1.47 | T>G | 1.36 | - | -
A>T | 0.74 | A>T | 0.68 | - | -
T>A | 0.74 | C>G | 0.68 | - | -
T>G | 0.74 | T>A | 0.68 | - | -
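The ranking statements made about Table 5 can be checked with a short sketch. The percentages below are transcribed from Table 5; the dictionary layout and the helper `top_types` are illustrative assumptions, not part of the original analysis.

```python
# Percentages of non-synonymous mutation types per codon position (Table 5).
TABLE5 = {
    1: {"G>T": 33.09, "C>T": 25.00, "G>A": 16.91, "A>G": 6.62, "A>C": 5.88,
        "T>C": 3.68, "G>C": 2.94, "C>A": 2.21, "C>G": 1.47, "A>T": 0.74,
        "T>A": 0.74, "T>G": 0.74},
    2: {"C>T": 48.98, "G>T": 16.33, "A>G": 12.24, "T>C": 7.48, "G>A": 6.12,
        "C>A": 2.72, "A>C": 1.36, "G>C": 1.36, "T>G": 1.36, "A>T": 0.68,
        "C>G": 0.68, "T>A": 0.68},
    3: {"G>T": 68, "A>C": 6, "G>A": 6, "G>C": 6, "C>A": 4, "T>A": 4,
        "T>G": 4, "A>T": 2},
}


def top_types(position: int, n: int = 2) -> list[str]:
    """Return the n most frequent mutation types at a codon position.
    Ties keep the table's row order because sorted() is stable."""
    ranked = sorted(TABLE5[position].items(), key=lambda kv: kv[1], reverse=True)
    return [mut_type for mut_type, _ in ranked[:n]]
```

With these values, G>T is the top type at the 1st and 3rd positions and the second at the 2nd position, while C>T does not appear among the 3rd-position types at all. Note that three types tie for second place at the 3rd position (6% each).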
Table 6

Percentage of nucleotide mutation type for all non-synonymous cases shown by three codon positions for all proteins. CP: codon position; Mut-type: mutation type.

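Protein | CP | Mut-type | Percentage
ORF1ab | 2 | C>T | 23.16
ORF1ab | 3 | G>T | 11.58
ORF1ab | 1 | C>T | 10.53
ORF1ab | 1 | G>T | 10.53
ORF1ab | 2 | A>G | 8.42
ORF1ab | 1 | G>A | 7.37
ORF1ab | 2 | T>C | 4.74
ORF1ab | 1 | A>G | 4.21
ORF1ab | 2 | G>A | 2.63
ORF1ab | 2 | G>T | 2.63
ORF1ab | 1 | A>C | 2.11
ORF1ab | 1 | T>C | 2.11
ORF1ab | 2 | C>A | 1.58
ORF1ab | 2 | A>C | 1.05
ORF1ab | 3 | C>A | 1.05
ORF1ab | 3 | G>C | 1.05
ORF1ab | 3 | A>C | 0.53
ORF1ab | 2 | A>T | 0.53
ORF1ab | 1 | C>A | 0.53
ORF1ab | 1 | C>G | 0.53
ORF1ab | 3 | G>A | 0.53
ORF1ab | 1 | G>C | 0.53
ORF1ab | 1 | T>A | 0.53
ORF1ab | 2 | T>A | 0.53
ORF1ab | 1 | T>G | 0.53
ORF1ab | 2 | T>G | 0.53
E | 1 | G>T | 100.00
M | 1 | C>T | 28.57
M | 2 | C>T | 28.57
M | 1 | G>T | 28.57
M | 2 | G>T | 14.29
N | 2 | C>T | 18.92
N | 1 | G>T | 18.92
N | 1 | G>A | 10.81
N | 2 | G>T | 10.81
N | 1 | C>T | 8.11
N | 2 | G>A | 5.41
N | 1 | G>C | 5.41
N | 3 | G>T | 5.41
N | 3 | A>C | 2.70
N | 1 | A>G | 2.70
N | 1 | C>A | 2.70
N | 3 | G>A | 2.70
N | 2 | G>C | 2.70
N | 3 | G>C | 2.70
S | 2 | C>T | 18.52
S | 2 | G>T | 18.52
S | 3 | G>T | 14.81
S | 1 | G>T | 12.96
S | 1 | C>T | 5.56
S | 1 | A>C | 3.70
S | 2 | A>G | 3.70
S | 3 | T>A | 3.70
S | 3 | A>C | 1.85
S | 1 | A>T | 1.85
S | 1 | C>A | 1.85
S | 1 | G>A | 1.85
S | 2 | G>A | 1.85
S | 3 | G>A | 1.85
S | 1 | G>C | 1.85
S | 2 | G>C | 1.85
S | 1 | T>C | 1.85
S | 3 | T>G | 1.85
ORF3a | 1 | G>T | 20.83
ORF3a | 1 | C>T | 16.67
ORF3a | 2 | C>T | 16.67
ORF3a | 2 | G>T | 12.50
ORF3a | 1 | A>C | 4.17
ORF3a | 3 | A>T | 4.17
ORF3a | 2 | C>G | 4.17
ORF3a | 1 | G>A | 4.17
ORF3a | 3 | G>T | 4.17
ORF3a | 2 | T>C | 4.17
ORF3a | 2 | T>G | 4.17
ORF3a | 3 | T>G | 4.17
ORF6 | 3 | G>T | 100.00
ORF7a | 1 | G>A | 33.33
ORF7a | 1 | C>G | 16.67
ORF7a | 1 | C>T | 16.67
ORF7a | 2 | C>T | 16.67
ORF7a | 2 | G>T | 16.67
ORF8 | 2 | C>T | 33.33
ORF8 | 1 | G>T | 22.22
ORF8 | 1 | A>C | 11.11
ORF8 | 2 | C>A | 11.11
ORF8 | 2 | G>A | 11.11
ORF8 | 2 | T>C | 11.11
ORF7b | 2 | C>T | 50.00
ORF7b | 1 | G>A | 50.00
ORF10 | 1 | C>T | 100.00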

Quantification of non-synonymous amino acid substitutions

As highlighted earlier, there are a total of 380 amino acid substitutions. Out of the 333 non-synonymous substitutions, we observe only 86 distinct substitution types in the Indian SARS-CoV-2 genomes (Table 7).
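For readers who want to reproduce the synonymous versus non-synonymous split, the following is a minimal sketch that classifies a single-base codon change with the standard genetic code. It is not the authors' implementation; the helper `classify_codon_change` is hypothetical, and the codons in the usage note are illustrative (this section does not give the actual codons).

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_codon_change(ref_codon: str, position: int, alt_base: str):
    """Apply a single-base change at a 1-based codon position and report
    (reference amino acid, mutated amino acid, category)."""
    codon = ref_codon.upper().replace("U", "T")
    alt = alt_base.upper().replace("U", "T")
    if len(codon) != 3 or position not in (1, 2, 3) or alt not in BASES:
        raise ValueError("expected a 3-base codon, position 1-3, and a base")
    mutated = codon[: position - 1] + alt + codon[position:]
    ref_aa, alt_aa = CODON_TABLE[codon], CODON_TABLE[mutated]
    category = "synonymous" if ref_aa == alt_aa else "non-synonymous"
    return ref_aa, alt_aa, category
```

For instance, changing the 2nd base of GAT to G gives GGT, an Asp to Gly (D > G) non-synonymous substitution, whereas changing the 3rd base of TTC to T leaves phenylalanine unchanged and is synonymous.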
Table 7

Amino acid substitution type by associated protein and number of mutated locations in that protein.

Substitution type | Protein (#mutated position)
A>D | ORF1ab-(2)
A>S | M-(1), N-(2), ORF1ab-(4), ORF3a-(3), ORF8-(1), S-(3)
A>T | ORF1ab-(2), ORF7b-(1)
A>V | M-(2), N-(2), ORF1ab-(16), ORF3a-(1), ORF8-(2), S-(4)
C>F | ORF1ab-(1), S-(3)
C>Y | ORF1ab-(1)
D>E | ORF1ab-(1)
D>G | ORF1ab-(4)
D>N | N-(1), ORF1ab-(2)
D>Y | N-(3), ORF1ab-(6), ORF3a-(1), S-(3)
E>D | ORF1ab-(5), ORF6-(1), S-(2)
E>G | ORF1ab-(1)
E>K | ORF1ab-(3), ORF7a-(1)
E>Q | N-(1), ORF1ab-(1), S-(1)
F>L | ORF1ab-(1), S-(1)
G>A | S-(1)
G>C | N-(1), ORF1ab-(3)
G>D | ORF1ab-(3), S-(1)
G>E | ORF8-(1)
G>R | N-(2), ORF1ab-(1)
G>S | N-(1), ORF1ab-(2), S-(1)
G>T | N-(2)
G>V | ORF1ab-(2), ORF3a-(1), ORF7a-(1), S-(1)
G>W | N-(1)
H>Q | ORF3a-(1), S-(1)
H>R | ORF1ab-(2)
H>Y | M-(1), N-(1), ORF1ab-(5), ORF3a-(1), S-(1)
I>K | ORF1ab-(1)
I>L | ORF1ab-(1), ORF8-(1)
I>T | ORF1ab-(4), ORF3a-(1)
K>E | ORF1ab-(1)
K>N | ORF1ab-(7), ORF3a-(1)
K>Q | ORF3a-(1), S-(2)
K>R | ORF1ab-(7), S-(1)
K>T | ORF1ab-(1)
L>F | M-(1), N-(1), ORF10-(1), ORF1ab-(10), ORF3a-(3), ORF7a-(1), S-(2)
L>I | ORF1ab-(1)
L>P | ORF1ab-(2)
L>S | ORF8-(1)
L>V | ORF1ab-(1)
L>W | ORF3a-(1)
M>I | N-(2), ORF1ab-(10), S-(3)
N>D | ORF1ab-(2)
N>H | ORF1ab-(1)
N>K | S-(1)
N>L | ORF1ab-(2)
N>Y | S-(1)
P>A | ORF1ab-(1)
P>L | N-(1), ORF1ab-(7), ORF7a-(1), ORF8-(1)
P>R | ORF3a-(1)
P>S | N-(2), ORF1ab-(5), S-(1)
P>T | N-(1)
Q>E | ORF7a-(1)
Q>H | ORF1ab-(2), S-(4)
Q>K | S-(1)
Q>P | ORF1ab-(1)
Q>R | ORF1ab-(2), S-(1)
R>C | ORF1ab-(2)
R>G | N-(1)
R>I | ORF3a-(1)
R>K | N-(2)
R>L | M-(1), N-(1), ORF1ab-(1), ORF3a-(1)
R>M | S-(1)
R>Q | ORF1ab-(1)
R>S | N-(1)
S>F | ORF1ab-(3), S-(2)
S>G | ORF1ab-(2)
S>I | N-(3), ORF1ab-(1), S-(3)
S>L | N-(1), ORF1ab-(2), ORF3a-(1), ORF7b-(1)
S>N | N-(1)
S>P | ORF1ab-(2)
S>R | ORF1ab-(2)
S>T | ORF1ab-(1)
T>A | ORF1ab-(3)
T>I | N-(3), ORF1ab-(15), ORF3a-(2), S-(4)
T>K | ORF1ab-(1)
T>M | ORF1ab-(1)
T>N | ORF8-(1)
V>A | ORF1ab-(3)
V>F | E-(2), M-(1), ORF1ab-(5), ORF3a-(1)
V>G | ORF1ab-(1)
V>I | ORF1ab-(4), ORF3a-(1), ORF7a-(1)
V>L | ORF1ab-(2), ORF8-(1), S-(1)
W>C | ORF3a-(1)
W>L | S-(2)
Y>H | ORF1ab-(1), S-(1)
We rank amino acid substitutions by the number of substituted positions (Fig. 8(A)). The top substitution is A > V, which is observed at 27 positions in six different proteins (Fig. 8(B)). The substitution L > F has the widest spread, occurring in seven proteins: M (1), N (1), ORF10 (1), ORF1ab (10), ORF3a (3), ORF7a (1), and S (2). Overall, it is observed at nineteen (19) different positions in the Indian SARS-CoV-2 isolates. Similarly, several other important substitutions, in terms of the number of substituted positions and associated SARS-CoV-2 proteins, can be seen in Fig. 8(A) and (B). Further, a few substitutions are observed uniquely in individual SARS-CoV-2 proteins. For example, A > D, C > Y, D > E, and D > G are observed only in the ORF1ab protein, while G > A, N > K, N > Y, and R > M are observed only in the Spike (S) protein. Several other unique substitutions, with their count and type in each SARS-CoV-2 protein, are reported in Fig. 9 and Table 7, respectively. It is to be noted that the four most highly mutated proteins are ORF1ab, S, N, and ORF3a. The number of mutations per variant in SARS-CoV-2 proteins of Indian isolates is shown in Fig. 3.
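The ranking described above can be reproduced from Table 7 with a few lines. The sketch below uses a handful of Table 7 rows in their original "Protein-(count)" notation; the helper names `parse_entry` and `summarize` are illustrative assumptions.

```python
import re

# A few rows of Table 7, keyed by substitution type.
TABLE7_SAMPLE = {
    "A>V": "M-(2),N-(2),ORF1ab-(16),ORF3a-(1),ORF8-(2),S-(4)",
    "L>F": "M-(1),N-(1),ORF10-(1),ORF1ab-(10),ORF3a-(3),ORF7a-(1),S-(2)",
    "T>I": "N-(3),ORF1ab-(15),ORF3a-(2),S-(4)",
    "C>Y": "ORF1ab-(1)",
}


def parse_entry(entry: str) -> dict[str, int]:
    """Turn 'M-(2),N-(2)' into {'M': 2, 'N': 2}."""
    return {prot: int(n) for prot, n in re.findall(r"([A-Za-z0-9]+)-\((\d+)\)", entry)}


def summarize(table: dict[str, str]) -> list[tuple[str, int, int]]:
    """Return (substitution, mutated positions, proteins hit),
    sorted by the number of mutated positions in descending order."""
    rows = []
    for substitution, entry in table.items():
        counts = parse_entry(entry)
        rows.append((substitution, sum(counts.values()), len(counts)))
    return sorted(rows, key=lambda row: row[1], reverse=True)
```

On this sample, A > V comes first with 27 positions in six proteins, and L > F totals 19 positions across seven proteins, matching the figures quoted in the text. A substitution whose entry parses to a single protein (such as C > Y) is unique to that protein.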
Fig. 8

(A) The amino acid substitution type observed with more than two mutated positions in SARS-CoV-2 genome. (B) The amino acid substitution type associated with more than two SARS-CoV-2 proteins.

Fig. 9

Count of non-synonymous amino acid substitution types in each SARS-CoV-2 protein.


Functional assessment of non-synonymous amino acid substitutions

Non-synonymous substitutions are important because they alter the amino acid, which can affect the structure and function of the target protein. To understand the functional alteration caused by non-synonymous substitutions, we use PROVEAN (Choi and Chan, 2015) to predict whether each mutation is deleterious or neutral. We calculate ΔΔG values (Capriotti et al., 2005) to predict the change in stability (increase, decrease, or neutral). We report the PROVEAN and ΔΔG scores in Fig. 10.
Fig. 10

Categorization of non-synonymous amino acid substitutions by percentage of deleterious and neutral mutation types, as predicted by the PROVEAN score. (A) Percentage shown for SARS-CoV-2 proteins taking all codon positions together; (B) percentage shown for the three codon positions in each of the SARS-CoV-2 proteins.

It is to be noted that the percentage of deleterious substitutions is comparatively low for structural proteins (except E) and high for accessory proteins. We predict a total of 57 deleterious substitutions (out of 333 non-synonymous substitutions), as shown in Tables 8, 9, and 10 for ORF1ab, structural, and accessory proteins, respectively. All these substitutions are also listed with their NCBI protein accession numbers (Supplementary-4). Considering the codon positions of all the deleterious substitutions, we observe that they occur mostly at the 2nd codon position (≈51%), followed by the 1st (40%) and 3rd (9%) codon positions (Fig. 10). Moreover, a few neutral mutations with a considerable decrease in stability are observed, which might affect protein structural conformation. For example, the D614G mutation, which occurs at the 2nd codon position, is neutral but shows a large decrease in stability. This mutation can potentially decrease the structural stability (Maitra et al., 2020). The replacement of Asp with Gly at this position results in an enhancement of local conformational entropy (Ramakrishnan and Ramachandran, 1965). The most frequently observed non-synonymous mutations, Q57H in the ORF3a protein and S194L in the N protein, occur at the 3rd and 2nd codon positions, respectively. In the ORF7b, ORF8, and M proteins, deleterious substitutions occur only at the 2nd codon position, whereas in ORF6 and ORF10 they occur at the 3rd and 1st positions, respectively. In all other proteins, deleterious substitutions are observed at two or all three codon positions.
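The two-way labelling used here (deleterious or neutral by PROVEAN score, and large decrease in stability by ΔΔG) can be sketched as below. The cutoffs follow the table notes in this section (PROVEAN < −2.5, ΔΔG < −0.5); the function name is hypothetical, and the ΔΔG values in the usage note are read from Tables 9 and 10 on the assumption that ΔΔG is given to two decimals in those rows.

```python
PROVEAN_CUTOFF = -2.5  # scores below this are labelled deleterious
DDG_CUTOFF = -0.5      # ΔΔG below this is labelled a large decrease in stability


def classify_substitution(provean_score: float, ddg: float) -> tuple[str, str]:
    """Label a substitution by predicted functional effect and stability change."""
    effect = "deleterious" if provean_score < PROVEAN_CUTOFF else "neutral"
    stability = "large decrease" if ddg < DDG_CUTOFF else "no large decrease"
    return effect, stability
```

Under those assumed readings, D614G in S (PROVEAN 0.598, ΔΔG −0.93) comes out as neutral with a large decrease in stability, Q57H in ORF3a (−3.286, −0.97) as deleterious with a large decrease, and S194L in N (−4.272, −0.38) as deleterious without a large decrease.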
Table 8

The non-synonymous amino acid substitutions in ORF1ab protein with the predicted PROVEAN score and ΔΔG prediction value.

SubstitutionPROVEAN scoreTypeΔΔG predictionRIFreq.
G30S−0.673Neutral−1.1581
D33N−0.733Neutral−1.3342
V38F−0.553Neutral−1.4891
G112C−1.223Neutral−1.0871
D147E−1.123Neutral0.0141
V169A0.027Neutral−1.781
G192D−1.198Neutral−1.2381
L204F0.327Neutral−1.1182
S212L0.097Neutral0.2411
T265I−0.693Neutral−0.6761
T283I−0.088Neutral−0.5772
P309A−0.135Neutral−1.881
P309L0.518Neutral−0.7251
G327D−1.072Neutral−0.8951
K338R−0.685Neutral0.0521
A339V−0.465Neutral0.0711
E347D−0.548Neutral−0.3961
H417Y0.379Neutral0.2671
S443P−0.678Neutral−0.2321
G519S−0.633Neutral−1.284
Q575R−0.331Neutral−0.4963
E633D−0.233Neutral−0.3775
E658K−0.707Neutral−0.4471
G662R−1.425Neutral−0.3571
Q676P−0.531Neutral−0.58769
V682L0.035Neutral−1.0271
T882I−0.691Neutral−0.121
P892S0.996Neutral−1.3571
E940D0.515Neutral−0.4141
G989V0.35Neutral−0.3651
D1036G−0.887Neutral−1.5481
P1054L−1.268Neutral−0.4402
T1055I−0.496Neutral−0.3823
E1126D−0.535Neutral−0.5261
P1158S−0.909Neutral−1.7595
H1160Y0.734Neutral0.1143
V1211F−0.667Neutral−0.751
E1251K−0.511Neutral−0.761
A1268T0.092Neutral−0.7842
A1283V−0.232Neutral−0.1522
A1298V0.22Neutral−0.0111
T1429I0.457Neutral−0.551
A1432V0.864Neutral0.0721
S1534I0.319Neutral0.3617
I1551T−0.057Neutral−1.9841
T1573A−2.402Neutral−1.4791
M1588I−0.746Neutral−0.0802
D1625Y−2.143Neutral0.0122
S1733G−1.551Neutral−1.0782
M1769I−0.349Neutral−0.1153
A1812D−0.753Neutral−0.6336
T1822I−0.406Neutral0.101
L1853F−0.808Neutral−1.1961
T1854A−0.326Neutral−1.2881
T1854I−0.193Neutral−0.2531
T1874I−1.364Neutral−0.1631
D1939G−0.936Neutral−1.2861
D1940Y−0.872Neutral−0.1621
Q1943H−0.464Neutral−0.7861
K1973R−0.294Neutral−0.3711
S2015R−0.501Neutral−0.1705
T2016K−0.166Neutral−0.86410
K2029E−0.63Neutral−0.561
K2029N−0.431Neutral−0.6421
P2046L−1.038Neutral−0.6553
T2093I0.565Neutral0.141
S2103F−0.372Neutral0.2461
L2146P−1.386Neutral−1.6171
S2242P0.105Neutral−0.0631
I2307T−0.03Neutral−2.3481
L2323V−0.361Neutral−1.4781
H2357Y0.301Neutral0.3873
S2488F2.899Neutral−0.0521
K2511N−0.966Neutral−0.3802
H2520R−0.243Neutral−0.2932
A2593V−1.178Neutral−0.2401
A2732D−3.463Deleterious−0.6863
P2739L−1.595Neutral−0.6241
H2831Y3.17Neutral0.2761
A2891V−0.835Neutral031
D2980G0.071Neutral−1.2751
A2994V−1.769Neutral−0.0503
T3058I1.463Neutral−0.4847
G3072C−5.058Deleterious−1.0762
M3087I0.614Neutral−0.5653
T3150I0.112Neutral−0.4312
S3158G−0.785Neutral−1.3571
L3338F−3.068Deleterious−1.0962
K3353R−1.343Neutral−0.1311
V3377G−6.124Deleterious−2.5191
Q3390R−0.324Neutral−0.3341
N3405L−4.454Deleterious−0.0505
P3447S−1.913Neutral−1.791
T3453A−0.882Neutral−0.8371
V3475F−2.291Neutral−1.4291
K3499R−0.421Neutral−0.2525
L3606F−1.432Neutral−1614
I3618T−1.397Neutral−1.4971
M3655I0.174Neutral−0.7571
D3681N−0.466Neutral−1.1171
L3711F−0.348Neutral−1.2172
I3731T−0.744Neutral−2.3692
V3759F−1.765Neutral−1.5291
E3909G−3.759Deleterious−1.1791
E3962K−0.041Neutral−0.3431
S3983F−2.722Deleterious−0.3472
R3993C−6.175Deleterious−0.8651
R3993L−5.422Deleterious−0.371
K4069T−2.268Neutral−0.4861
V4073I−0.106Neutral−0.5581
K4081R−0.921Neutral−0.3371
M4116I−0.46Neutral−0.7451
K4176N0.651Neutral−0.3631
V4181I−0.046Neutral−0.771
A4271V−3.278Deleterious−0.2511
A4273V−3.349Deleterious−0.2311
K4451N−0.49Neutral−0.6522
K4483N−1.326Neutral−0.4241
A4487V0.357Neutral−0.2421
A4489V−2.346Neutral−0.31110
D4532G−3.086Deleterious−1.0861
A4577V−1.878Neutral−0.1741
M4588I−1.074Neutral−0.8181
I4593L0.213Neutral−0.971
E4670D−0.609Neutral−0.5952
P4715L−0.446Neutral−0.836359
L4721I−1.085Neutral−1.2977
V4746A−2.528Deleterious−2.0191
M4855I−1.728Neutral−0.8183
C4856F−0.483Neutral−0.2131
L5030F−2.739Deleterious−0.9973
T5035I−0.622Neutral−0.4321
T5036M−1.529Neutral−0.2921
M5060I−0.117Neutral−0.5671
A5091S−1.821Neutral−0.8292
Q5214H1.016Neutral−0.7762
V5272I−0.551Neutral−0.17218
D5285Y−1.381Neutral0.2732
T5300I−0.542Neutral−0.4241
S5305L−2.332Neutral0.2241
P5377S−0.897Neutral−1.7392
H5488Y0.534Neutral0.1971
E5492Q−2.053Neutral−0.771
G5530C−2.742Deleterious−0.8244
H5569R1.004Neutral−0.151
V5571F−1.592Neutral−1.6282
Y5577H−0.845Neutral−1.5372
S5583T−0.858Neutral−0.6641
P5624L−5.36Deleterious−0.6372
R5766Q0.366Neutral−0.9181
F5823L−3.989Deleterious−1.1751
A5926S0.351Neutral−1.08101
N5928H−0.711Neutral−0.7791
K5957R−0.861Neutral−0.0901
I5970K−1.93Neutral−2.1891
M5997I−0.985Neutral−0.9681
G6039V−6.16Deleterious−0.1411
A6044V2.2Neutral0.1421
P6065S0.176Neutral−1.6782
L6082F−0.771Neutral−1.0461
R6088C−5.465Deleterious−1.273
L6102F−1.397Neutral−1.05467
S6180R−1.897Neutral0.1641
A6199S−2.053Neutral−0.6281
D6249Y0.823Neutral−0.1631
K6274N−0.353Neutral−0.1833
T6297I−0.448Neutral−0.8345
N6313D−3.422Deleterious−0.6271
P6368L−6.762Deleterious−0.7961
V6385L−0.789Neutral−1.0373
K6464N−1.404Neutral−0.4221
T6500I−1.557Neutral−0.4251
A6533V−0.465Neutral−0.3553
D6580Y−0.868Neutral−0.5951
G6581D−2.423Neutral−1.1272
A6589V−0.154Neutral−0.1731
V6600A−2.262Neutral−1.8494
L6614F−1.53Neutral−1.2871
A6623T0.442Neutral−0.7661
V6688I−0.141Neutral−0.4941
M6723I−0.049Neutral−0.8561
C6742Y−0.068Neutral−0.1801
D6900Y−3.735Deleterious−0.4111
L6909F−0.541Neutral−1.0451
A6914S−0.017Neutral−1.0452
A6914V−0.428Neutral−0.0311
K6958R−0.492Neutral−0.1752
P7034L0.713Neutral−0.8851
N7083D−1.153Neutral−0.4221

The substitutions with either a deleterious PROVEAN score (< −2.5), a large decrease in stability (ΔΔG < −0.5), or both are shown in bold.

Table 9

The functional assessment of non-synonymous amino acid substitutions in four structural SARS-CoV-2 proteins (E, M, N, S). The functional effect of each mutation is predicted using two different measures (PROVEAN score and stability value).

ProteinSubstitutionPROVEAN scoreTypeΔΔG predictionRIFreq.
EV62F0Neutral−1.6581
V75F−1.414Neutral−1.5791
MA142V0.18Neutral0.2552
L29F−1.646Neutral−0.9171
A63V−1.937Neutral0.1401
A69S−1.991Neutral−0.8291
V70F−1.365Neutral−1.66101
R107L−4.03Deleterious−0.3341
H125Y0.799Neutral0.0651
NS194L−4.272Deleterious−0.382158
P13L−1.23Neutral−0.48323
S202N−0.404Neutral−0.78020
R203K−1.604Neutral−0.93614
R203K−1.604Neutral−0.93614
G204R−1.656Neutral−0.52514
T205I−1.562Neutral−0.5336
S193I−2.755Deleterious−0.3624
G97S−1.98Neutral−1.3383
A156S−0.457Neutral−0.8393
P6T−0.223Neutral−1.182
S33I−1.372Neutral0.2762
S180I−3.465Deleterious−0.1432
M234I0.044Neutral−0.0312
D22N−0.541Neutral−1.461
D22N−0.541Neutral−1.461
E31Q0.054Neutral−0.751
G34W−1.609Neutral−0.1321
R92S−3.718Deleterious−1.2361
G120R−0.733Neutral−0.2911
A134V−2.811Deleterious−0.1221
L139F−0.697Neutral−0.8581
D144Y−1.764Neutral0.211
A152S1.463Neutral−0.9291
R191L−3.269Deleterious−0.5831
R203G−3.247Deleterious−1.671
R203K−1.604Neutral−0.9361
G204R−1.656Neutral−0.5251
G204T−1.76Neutral−0.9671
A218V0.171Neutral0.2111
M234I0.044Neutral−0.0311
G236C−2.269Neutral−0.2751
H300Y−1.577Neutral0.4651
P302S−4.043Deleterious−1.371
P344S−4.031Deleterious−1.4681
D348Y−0.588Neutral−0.4121
T362I−1.722Neutral−0.3531
T393I−0.613Neutral0.121
SD614G0.598Neutral−0.933405
L54F−0.435Neutral−1.14480
E583D−0.819Neutral−0.44314
R78M0.986Neutral−0.84712
T572I−0.649Neutral0310
Q677H0.002Neutral−0.6755
L5F−1.126Neutral−0.9833
Q690H−0.796Neutral−0.8663
S12F−0.65Neutral0.1422
W152L−0.159Neutral−0.8972
S155I−0.503Neutral042
M177I0.579Neutral−0.6152
G181A0.396Neutral−0.5812
W258L−1.084Neutral−0.6572
A706S0.183Neutral−0.8592
A879S−0.361Neutral0.5472
H1083Q−1.006Neutral−0.3452
C1243F−4.53Deleterious−0.0922
F2L−0.902Neutral−1.2291
S13I−1.187Neutral0.2701
Y28H−0.443Neutral−1.3841
G35V−2.112Neutral−0.6871
T76I−0.115Neutral−0.7261
K97Q−0.113Neutral−0.9271
N148Y−0.177Neutral0.141
M153I0.244Neutral−0.9861
E156D0.958Neutral−0.5241
S162I0.231Neutral0.0211
Q173H−0.299Neutral−1.0271
S255F−0.423Neutral−0.0331
G261S0.485Neutral−1.1381
A262S0.154Neutral−0.6491
Q271R−0.48Neutral−0.2751
C301F−8.689Deleterious0.241
E471Q0.445Neutral−0.5971
D574Y0.858Neutral0.3621
Q613H−0.917Neutral−0.8661
H655Y−0.814Neutral0.0841
A688V0.498Neutral−0.3751
A701V0.597Neutral−0.2541
M731I−0.598Neutral−0.2531
K795Q0.072Neutral−0.6131
P809S1.024Neutral−1.5581
T827I−0.378Neutral−0.4561
A892V−1.901Neutral0.211
A930V−3.727Deleterious−0.231
T1077I−1.511Neutral−0.1311
V1104L−0.604Neutral−0.711
D1153Y−3.275Deleterious−1.5221
K1181R−0.522Neutral−0.4871
N1187K−0.467Neutral−0.2941
Q1201K1.409Neutral−0.2931
C1250F−5.057Deleterious−0.0921
D1259Y3.924Neutral−0.2131

The substitutions with either a deleterious PROVEAN score (< −2.5), a large decrease in stability (ΔΔG < −0.5), or both are shown in bold.

Table 10

The functional assessment of non-synonymous amino acid substitutions in six SARS-CoV-2 accessory proteins (ORF3a, ORF6, ORF7a, ORF7b, ORF8, ORF10). The functional effect of each mutation is predicted using two different measures (PROVEAN score and stability value).

ProteinSubstitutionPROVEAN scoreTypeΔΔG predictionRIFreq.
ORF3aG18V−1.571Neutral−0.2861
K21Q0.657Neutral−0.4711
A23S−1.638Neutral−0.8692
I35T−2.619Deleterious−2.3991
L41F−2.724Deleterious−1.0874
P42R−5.495Deleterious−0.9671
V50I−0.657Neutral−0.8481
L53F−3.962Deleterious−1.0971
A54S−1.638Neutral−0.681
Q57H−3.286Deleterious−0.97234
K66N3.486Neutral−0.1611
R68I−1.562Neutral0.1731
V77F2.638Neutral−1.3781
L86W−3.943Deleterious−1.1311
H93Y−3.943Deleterious0.363
A103V−2.876Deleterious0.251
L108F−3.4Deleterious−1.2461
W131C−7.752Deleterious−1.2981
R134L−1.543Neutral−0.4791
A143S0.724Neutral−0.9591
T151I−4.886Deleterious−0.2902
D155Y−6.829Deleterious0.2102
S171L−2.238Neutral−0.2202
T175I2.562Neutral−0.0441
ORF6E13D−2.786Deleterious−0.2441
ORF7aQ94E−1Neutral−0.2422
E95K−2.614Deleterious−0.682
G38V−6.526Deleterious−0.441
P45L−10Deleterious−0.741
V71I−0.667Neutral−0.2451
L116F−1.263Neutral−0.8571
ORF7bS31L−6Deleterious0.2311
A43T0Neutral−0.4451
ORF8L84S2.333Neutral−2.29819
G8E−3.056Deleterious−0.611
T12N−1.056Neutral−0.7111
A14S0.833Neutral−0.4761
A51V−1.222Neutral−0.0621
V62L−0.722Neutral−0.851
A65V1.222Neutral0.0211
P85L−8.778Deleterious−0.7371
I121L−0.278Neutral−0.7951
ORF10L37FNADeleterious−0.9961

The substitutions with either a deleterious PROVEAN score (< −2.5), a large decrease in stability (ΔΔG < −0.5), or both are shown in bold.

We also predict the stability impact of single point mutations; most of the substitutions show a large decrease in stability. Of the 57 deleterious single point substitutions, 32 (7 from structural proteins) show a large decrease in stability (ΔΔG < −0.5). We highlight the functional domains of all the deleterious non-synonymous substitutions in Table 11. Additionally, we study the changes in physicochemical properties caused by these substitutions. A number of substitutions change both the hydropathy class and the side-chain chemical class (Table 11).
Table 11

The 57 deleterious amino acid substitutions in different SARS-CoV-2 proteins, with their putative functional domains and physicochemical property changes. The mutations with a large decrease in stability (ΔΔG < −0.5) are shown in bold.

Protein | Substitution | Putative functional domain | Hydropathy change | Chemical property change
ORF1ab | A2732D | NSP3 | Hydrophobic to charge | Aliphatic to acidic
ORF1ab | G3072C | NSP4 | Hydrophilic (unchanged) | Aliphatic to sulfur containing
ORF1ab | L3338F | NSP5 (3CLpro) | Hydrophobic (unchanged) | Aliphatic to aromatic
ORF1ab | V3377G | NSP5 (3CLpro) | Hydrophobic to hydrophilic | Aliphatic to aliphatic
ORF1ab | N3405L | NSP5 (3CLpro) | Hydrophilic to hydrophobic | Acidic amide to aliphatic
ORF1ab | E3909G | NSP7 | Charge to hydrophilic | Acidic to aliphatic
ORF1ab | S3983F | NSP8 | Hydrophilic to hydrophobic | Hydroxyl containing to aromatic
ORF1ab | R3993C | NSP8 | Charge to hydrophilic | Basic to sulfur containing
ORF1ab | R3993L | NSP8 | Charge to hydrophilic | Basic to aliphatic
ORF1ab | A4271V | NSP10 | Hydrophobic (unchanged) | Aliphatic (unchanged)
ORF1ab | A4273V | NSP10 | Hydrophobic (unchanged) | Aliphatic (unchanged)
ORF1ab | D4532G | NSP12 (RdRp) | Charge to hydrophilic | Acidic to aliphatic
ORF1ab | V4746A | NSP12 (RdRp) | Hydrophobic (unchanged) | Aliphatic (unchanged)
ORF1ab | L5030F | NSP12 (RdRp) | Hydrophobic (unchanged) | Aliphatic to aromatic
ORF1ab | G5530C | NSP13 (helicase) | Hydrophilic (unchanged) | Aliphatic to sulfur containing
ORF1ab | P5624L | NSP13 (helicase) | Hydrophobic (unchanged) | Cyclic to aliphatic
ORF1ab | F5823L | NSP13 (helicase) | Hydrophobic (unchanged) | Aromatic to aliphatic
ORF1ab | G6039V | NSP14 (exonuclease) | Hydrophilic to hydrophobic | Aliphatic to aliphatic
ORF1ab | R6088C | NSP14 (exonuclease) | Charge to hydrophilic | Basic to sulfur containing
ORF1ab | N6313D | NSP14 (exonuclease) | Hydrophilic to charge | Acidic amide to acidic
ORF1ab | P6368L | NSP14 (exonuclease) | Hydrophobic (unchanged) | Cyclic to aliphatic
ORF1ab | D6900Y | NSP16 | Charge to hydrophobic | Acidic to aromatic
M | R107L | Topological domain | Charge to hydrophobic | Basic to aliphatic
N | R92S | NTD | Charge to hydrophilic | Basic to hydroxyl containing
N | A134V | NTD | Hydrophobic (unchanged) | Aliphatic (unchanged)
N | S180I | SR-rich linker | Hydrophilic (unchanged) | Hydroxyl containing to aliphatic
N | R191L | SR-rich linker | Charge to hydrophobic | Basic to aliphatic
N | S193I | SR-rich linker | Hydrophilic to hydrophobic | Hydroxyl containing to aliphatic
N | S194L | SR-rich linker | Hydrophilic to hydrophobic | Hydroxyl containing to aliphatic
N | R203G | SR-rich linker | Charge to hydrophilic | Basic to aliphatic
N | P302S | CTD | Hydrophobic to hydrophilic | Cyclic to hydroxyl containing
N | P344S | CTD | Hydrophobic to hydrophilic | Cyclic to hydroxyl containing
S | C301F | S1 (N-terminal) | Hydrophilic (unchanged) | Sulfur containing to aromatic
S | A930V | S2 (HR-1) | Hydrophobic (unchanged) | Aliphatic (unchanged)
S | D1153Y | S2 (between HR1 and HR2) | Charge to hydrophilic | Acidic to aromatic
S | C1243F | S2 (cytoplasm domain) | Hydrophilic to hydrophobic | Sulfur containing to aromatic
S | C1250F | S2 (cytoplasm domain) | Hydrophilic to hydrophobic | Sulfur containing to aromatic
ORF3a | I35T | - | Hydrophobic to hydrophilic | Aliphatic to hydroxyl containing
ORF3a | L41F | TM-1 | Hydrophobic (unchanged) | Aliphatic to aromatic
ORF3a | P42R | TM-1 | Hydrophobic to charge | Cyclic to basic
ORF3a | L53F | TM-1 | Hydrophobic (unchanged) | Aliphatic to aromatic
ORF3a | Q57H | TM-1 | Hydrophilic to charge | Acidic amide to basic
ORF3a | L86W | TM-2 | Hydrophobic (unchanged) | Aliphatic to aromatic
ORF3a | H93Y | Ion channels | Charge to hydrophilic | Basic to aromatic
ORF3a | A103V | Ion channels | Hydrophobic (unchanged) | Aliphatic (unchanged)
ORF3a | L108F | Ion channels | Hydrophobic (unchanged) | Aliphatic to aromatic
ORF3a | W131C | Ion channels | Hydrophobic to hydrophilic | Aromatic to sulfur containing
ORF3a | T151I | C-terminal | Hydrophilic to hydrophobic | Hydroxyl containing to aliphatic
ORF3a | D155Y | C-terminal | Charge to hydrophilic | Acidic to aromatic
ORF6 | E13D | - | Charge (unchanged) | Acidic (unchanged)
ORF7a | G38V | Luminal domain | Hydrophilic (unchanged) | Aliphatic to aliphatic
ORF7a | P45L | Luminal domain | Hydrophobic (unchanged) | Cyclic to aliphatic
ORF7a | E95K | Luminal domain | Charge (unchanged) | Acidic to basic
ORF7b | S31L | - | Hydrophilic to hydrophobic | Hydroxyl containing to aliphatic
ORF8 | G8E | N-terminal (hydrophobic region) | Hydrophilic to charge | Aliphatic to acidic
ORF8 | P85L | - | Hydrophobic (unchanged) | Cyclic to aliphatic
ORF10 | L37F | - | Hydrophobic (unchanged) | Aliphatic to aromatic
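The two property columns of Table 11 can be derived mechanically from residue classes. The sketch below is an illustration only: the class memberships are inferred from the labels that most rows of Table 11 use (the placement of methionine, which does not occur in the table, is an assumption), and the function name is hypothetical. A few rows of Table 11 carry labels that differ from this inferred scheme, so the output should not be expected to match every row.

```python
import re

# Residue classes inferred from Table 11; "M" is placed by assumption.
HYDROPATHY = {
    "hydrophobic": "AVLIPFWM",
    "hydrophilic": "GCSTNQY",
    "charged": "DEKRH",
}
CHEMICAL = {
    "aliphatic": "GAVLI",
    "aromatic": "FWY",
    "sulfur containing": "CM",
    "hydroxyl containing": "ST",
    "basic": "KRH",
    "acidic": "DE",
    "acidic amide": "NQ",
    "cyclic": "P",
}


def _class_of(residue: str, scheme: dict[str, str]) -> str:
    for name, members in scheme.items():
        if residue in members:
            return name
    raise ValueError(f"unknown residue: {residue}")


def describe_change(substitution: str) -> tuple[str, str]:
    """Return (hydropathy change, chemical change) for e.g. 'A2732D'."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", substitution)
    if match is None:
        raise ValueError(f"cannot parse substitution: {substitution}")
    ref, _, alt = match.groups()

    def change(scheme: dict[str, str]) -> str:
        before, after = _class_of(ref, scheme), _class_of(alt, scheme)
        return f"{before} (unchanged)" if before == after else f"{before} to {after}"

    return change(HYDROPATHY), change(CHEMICAL)
```

For example, `describe_change("P302S")` reports a hydrophobic to hydrophilic change and a cyclic to hydroxyl containing change, in line with the corresponding N-protein row of Table 11.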
The ORF1ab polyprotein consists of several non-structural proteins (NSP1–NSP16). A few deleterious substitutions are detected in putative functional domains of these proteins. The 3-chymotrypsin-like cysteine protease (3CLpro) and RNA-dependent RNA polymerase (RdRp) regions are located in NSP5 and NSP12, respectively (Table 11). They play a major role in antiviral drug discovery for SARS-CoV-2 and other coronavirus diseases (ul Qamar et al., 2020; Anand et al., 2003; Calligari et al., 2020), so the mutations detected in these functional domains might affect protein function and stability. Three deleterious substitutions each are detected in the 3CLpro and RdRp regions of the ORF1ab polyprotein. A few deleterious substitutions with a large decrease in stability are detected in two other important functional domains, namely the helicase in NSP13 (Chen et al., 2020; Yu et al., 2012) and the exonuclease in NSP14 (Yuen et al., 2020). These domains are also being investigated as targets for inhibiting coronaviruses (Chen et al., 2020; Shannon et al., 2020; Yu et al., 2012).

The Membrane (M) protein is one of the most abundant structural proteins of coronaviruses and interacts with other structural proteins (He et al., 2004; Naskalska et al., 2019). A single deleterious substitution is observed in its topological domain (Bianchi et al., 2020).

The N protein contains two distinct RNA-binding domains, the N-terminal domain (NTD, residues 44–179) and the C-terminal domain (CTD, residues 247–363) (Zeng et al., 2020), responsible for RNA binding and oligomerization, respectively. These two regions are connected by an intrinsically disordered central Ser/Arg (SR)-rich linker (Kang et al., 2020), which is responsible for primary phosphorylation. Studies on the nucleocapsid protein of other coronaviruses associate several residues of the N-terminal domain with RNA binding and virus infectivity (Saikatendu et al., 2007; Tan et al., 2006; Grossoehme et al., 2009). Among the nine deleterious substitutions observed in the N protein (Table 11), 2 are in the NTD, 5 in the SR-rich linker, and 2 in the CTD (Kang et al., 2020).

The S1 subunit (residues 14–685) and the S2 subunit (residues 686–1273) of the Spike protein are responsible for receptor binding and membrane fusion, respectively (Huang et al., 2020). The N-terminal domain (residues 14–305) belongs to the S1 subunit. The S2 subunit consists of several sub-domains, including heptapeptide repeat sequence 1 (HR1, residues 912–984), HR2 (residues 1163–1213), and the cytoplasm domain (residues 1237–1273) (Xia et al., 2020). We observe five deleterious substitutions in the Spike protein: one each in S1 (N-terminal), S2 (HR1), and the S2 region between HR1 and HR2, and two in the S2 cytoplasm domain (Huang et al., 2020; Walls et al., 2020).

The SARS-CoV ORF3a protein has three major transmembrane domains, (i) transmembrane domain 1 (TM-1, approx. residues 34–56), (ii) transmembrane domain 2 (TM-2, approx. residues 77–99), and (iii) transmembrane domain 3 (TM-3, approx. residues 103–125), and a C-terminal domain of about 160 amino acid residues (Hofman, 1993; Zeng et al., 2004). Using these approximate SARS-CoV residue ranges, we observe four mutations in the TM-1 domain and one in the TM-2 domain of the SARS-CoV-2 ORF3a protein. Four important deleterious substitutions are observed in the ion channel domain (Domain II, residues 91–133) (Issa et al., 2020), which is linked to the pro-apoptotic function of ORF3a as observed for other SARS coronaviruses (Chan et al., 2009; Lu et al., 2010). One of the observed mutations, W131C, is located in the cysteine-rich region of the ORF3a protein, which overlaps the third membrane-spanning domain (Zeng et al., 2004). This mutation adds a cysteine residue to that region, which may alter interchain disulfide linkages with the Spike protein or other viral structural proteins. Additionally, two mutations in the C-terminal domain are observed between the last two cysteine residues (Zeng et al., 2004).

Neither ORF6 nor ORF8 has any transmembrane region, but ORF8 has a hydrophobic signal peptide (residues 1–15) and a chain (residues 61–121) (Alam et al., 2020). Nevertheless, both proteins play significant roles in innate immune suppression during viral infection, regulation of molecular functions, virus growth, replication, and host interactions (Alam et al., 2020; Li et al., 2020; Mohammad et al., 2020). A single deleterious substitution (E13D) is found in the ORF6 protein and two in the ORF8 protein, one of which (G8E) lies in the hydrophobic region (Alam et al., 2020; Mohammad et al., 2020).

The ORF7a domain of Indian SARS-CoV-2 isolates consists of seven β strands (Alam et al., 2020). A similar result is reported for the ORF7a protein of SARS-CoV (Nelson et al., 2005; Bartlam et al., 2007). ORF7a consists of an N-terminal signal peptide (residues 1–15), a luminal domain (residues 16–96), a transmembrane segment (residues 97–117), and a 5-residue cytoplasmic tail. Assuming a domain organization similar to that of SARS-CoV, three deleterious substitutions are identified in the luminal domain; two of them (G38V and P45L) are located before and after the 3rd β strand.
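The residue ranges quoted above lend themselves to a simple interval lookup, sketched here for the N and S proteins. This is an illustrative helper rather than the authors' procedure: the function name is hypothetical, and the boundaries of the SR-rich linker (180–246) and of the region between HR1 and HR2 (985–1162) are assumptions inferred from the neighbouring ranges, not values stated in the text.

```python
import re

# (domain name, first residue, last residue), using the ranges quoted in the text.
DOMAINS = {
    "N": [
        ("NTD", 44, 179),
        ("SR-rich linker", 180, 246),  # assumed: the gap between NTD and CTD
        ("CTD", 247, 363),
    ],
    "S": [
        ("S1 (N-terminal)", 14, 305),
        ("S2 (HR1)", 912, 984),
        ("S2 (between HR1 and HR2)", 985, 1162),  # assumed gap region
        ("S2 (HR2)", 1163, 1213),
        ("S2 (cytoplasm domain)", 1237, 1273),
    ],
}


def domain_of(protein: str, substitution: str):
    """Return the listed domain containing the substituted residue, or None."""
    match = re.fullmatch(r"[A-Z](\d+)[A-Z]", substitution)
    if match is None:
        raise ValueError(f"cannot parse substitution: {substitution}")
    position = int(match.group(1))
    for name, start, end in DOMAINS.get(protein, []):
        if start <= position <= end:
            return name
    return None
```

Applied to the deleterious substitutions of Table 11, this places S194L of the N protein in the SR-rich linker and A930V of the S protein in HR1. Positions outside the listed ranges, such as residue 614 of S, return `None` because only the sub-domains named in the text are included.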

Conclusion

In this study, we thoroughly investigated and characterized the mutations observed in Indian SARS-CoV-2 genomes. We reported variants and mutations, in both the synonymous and non-synonymous categories, observed in all the SARS-CoV-2 proteins. We highlighted position-specific mutations in the codons. Non-synonymous amino acid substitutions were analyzed further to predict their functional impact and their effect on protein stability. Our study reported a total of 536 mutated positions in the coding regions of SARS-CoV-2 proteins. ORF3a is the most heavily mutated protein (≈4% of its total length), followed by three structural proteins (N, M, S). However, in both ORF3a and N, we observed fewer mutation types than in ORF1ab and S. The number of variants and the number of mutations per variant are highest for ORF1ab, followed by the Spike protein. Interestingly, non-synonymous mutations outnumber synonymous mutations (except in the M protein). Mutations in the E and ORF7b proteins are all non-synonymous. Our analysis further reveals that most of the deleterious substitutions with a decrease in stability occur at the 2nd codon position and in putative functional domains. The single point mutation G > T is most abundant at both the 1st and 3rd codon positions, whereas C > T shows its maximum occurrence at the 2nd codon position. These conclusions are drawn purely from computational analysis and need experimental confirmation. Though we restricted our current study to Indian isolates, it may easily be extended to any other strains. The overall analysis might help in better understanding the possible roles of these mutations in virulence, infectivity, and virus release in SARS-CoV-2. A further comparative study of the significant mutations observed in Indian isolates may be performed with strains collected from the rest of the world.
The following are the supplementary data related to this article.

Supplementary-1

Accession numbers of the SARS-CoV-2 sequences collected from Indian isolates.

Supplementary-2

The phylogenetic tree for each of the SARS-CoV-2 proteins.

Supplementary-3

The variants with their frequency, respective nucleotide mutation and type, amino acid change, and number of mutations for each of the SARS-CoV-2 proteins.

Supplementary-4

The observed non-synonymous substitutions with the protein accession number in each of the SARS-CoV-2 proteins.

CRediT authorship contribution statement

Jayanta Kumar Das: Conceptualization, Data curation, Methodology, Software, Formal analysis, Visualization, Writing - original draft, review & editing. Antara Sengupta: Methodology, Software, Validation, Visualization, Review & editing. Pabitra Pal Choudhury: Review & editing. Swarup Roy: Conceptualization, Supervision, Visualization, Review & editing.

Declaration of competing interest

The authors declare that they have no competing interests.