
Differential mutation profile of SARS-CoV-2 proteins across deceased and asymptomatic patients.

Abstract

BACKGROUND: SARS-CoV-2 infection has spread at an alarming rate, with many places showing multiple peaks in incidence. The present study analyzes a total of 332 SARS-CoV-2 genome sequences from 114 asymptomatic and 218 deceased patients from twenty-one different countries to assess their mutation profile and to establish the correlation between clinical status and the observed mutations.
METHODS: Mutations were mined using the GISAID CoVSurver (www.gisaid.org/epiflu-applications/covsurver-mutations-app) with the reference sequence 'hCoV-19/Wuhan/WIV04/2019', present in NCBI with Accession number NC_045512.2. The impact of the mutations on SARS-CoV-2 proteins was predicted using PredictSNP1 (loschmidt.chemi.muni.cz/predictsnp1), a meta-server integrating six predictor tools: SIFT, PhD-SNP, PolyPhen-1, PolyPhen-2, MAPP and SNAP. The iStable integrated server (predictor.nchu.edu.tw/iStable) was used to predict shifts in protein stability due to mutations.
RESULTS: A total of 372 variants were observed in the 332 SARS-CoV-2 sequences, with several variants present in multiple patients, accounting for a total of 1596 incidences. Asymptomatic-specific and deceased-specific mutants constituted 32% and 62% of the repertoire respectively, indicating their partial exclusivity. However, the most prevalent mutations were those present in both groups. Although some parts of the genome are more variable than others, there was a clear difference between incidence and prevalence. Non-structural protein 3 (NSP3), with 68 variants, had a total of only 105 incidences, whereas Spike protein had 346 incidences with just 66 variants. Amongst the Deleterious variants, NSP3 had the highest incidence of 25, followed by NSP2 (16), ORF3a (14) and N (14). Spike protein had just 7 Deleterious variants out of 66.
CONCLUSION: Deceased patients have more Deleterious than Neutral variants as compared to the asymptomatic ones. Further, it appears that the Deleterious variants which decrease protein stability are more significant in the pathogenicity of SARS-CoV-2.


Keywords: Asymptomatic; Deceased; Deleterious; Neutral; SARS-CoV-2; Stability

Year: 2021 PMID: 34303694 PMCID: PMC8299203 DOI: 10.1016/j.cbi.2021.109598

Source DB: PubMed Journal: Chem Biol Interact ISSN: 0009-2797 Impact factor: 5.192

Introduction

The causative agent of the ongoing COVID-19 global pandemic is Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), which belongs to the family Coronaviridae, characterized by a single-stranded, positive-sense RNA genome [1]. As of 31st March 2021, there were 5,52,566 active cases; 1,14,34,301 discharged cases and 1,62,468 deaths in India due to SARS-CoV-2 (https://www.mohfw.gov.in/). At the same time, as per WHO, there were 2,73,49,248 confirmed cases of COVID-19, including 27,87,593 deaths, worldwide (covid19.who.int). As the SARS-CoV-2 infection spread at an alarming rate and many places showed multiple peaks in incidence [2,3,4], the virus accrued mutations in the process. Multiple studies have assessed the impact of these mutations [5,6,7,8]. These reports have focused on novel and recurrent variations and their impact on infectivity and antigenicity. However, assessing the impact of mutations on SARS-CoV-2 is an ongoing process. Since the mutational profile of SARS-CoV-2 has been reported to be country specific, country-wise analysis of mutations assumes significance. For example, mutations at nucleotide positions 17746, 17857 and 18060 are exclusively present in North America [9]. Further, some proteins of SARS-CoV-2 play a critical role in viral infection and pathogenesis. One instance is the Spike (S) protein, whose interaction with the angiotensin-converting enzyme 2 (ACE2) receptor is pivotal for viral entry into the host cell. Seventeen residues in the receptor binding domain (RBD) of the S protein make contact with the ACE2 receptor, and mutations affecting this interaction will directly impact viral pathogenesis [10,11]. A brief summary of all the proteins of SARS-CoV-2 and their functions is provided in Table 1. The proteins are listed in descending order of the number of mutations observed in the present study.

Table 1

Summary of proteins of SARS-CoV-2 and mutations observed in the study.

S No	Protein Feature	ORF Name	Protein Name	RefSeq protein Id	AA Length	Function	Observed Mutations	Mutations/100 Residues
1	Non-Structural Protein	ORF1a	NSP3 (Papain like Proteinase/PLPro)	YP_009725299	1945	Proteolytic cleavage [31].	68	3.5
2	Structural Protein	ORF 2	Surface Glycoprotein (S Protein)	YP_009724390	1273	Crucial for viral infection, integration, and ingress into the host cell [32].	66	5.18
3	Structural Protein	ORF9	Nucleocapsid Phosphoprotein (N Protein)	YP_009724397	419	Binds with RNA and facilitates the assembly of vRNPs (viral RNA–protein) into virions [33].	29	6.92
4	Non-Structural Protein	ORF1a	NSP2	YP_009725298	638	Disruption of the infected cells microenvironment [31].	27	4.23
5	Non-Structural Protein	ORF1b	NSP12 (RNA Dependent RNA Polymerase)	YP_009725307	932	This RdRp (RNA-dependent RNA-polymerase) protein aids in viral replication and transcription [31].	21	2.25
6	Accessory Protein	ORF3a	ORF3a Protein	YP_009724391	275	Viroporin (ion channel protein) that facilitates viral release [26].	20	7.27
7	Non-Structural Protein	ORF1b	NSP13 (Helicase)	YP_009725308	601	Unwinding RNA helixes [34]	18	3
8	Non-Structural Protein	ORF1b	NSP14 (Exo-ribonuclease)	YP_009725309	527	Proofreading exoribonuclease [35].	17	3.23
9	Non-Structural Protein	ORF1a	NSP4 (Contains Transmembrane Domain 2)	YP_009725300	500	Involved in the creation of two-membrane vesicles [36].	14	2.8
10	Non-Structural Protein	ORF1b	NSP15 (Endo-RNAse)	YP_009725310	346	Destroys viral dsRNA in order to elude recognition by the host [37].	14	4.05
11	Non-Structural Protein	ORF1a	NSP5 (3C-like proteinase/3CLPro)	YP_009725301	306	Cleaving polypeptides to form nonstructural proteins [31].	11	3.59
12	Non-Structural Protein	ORF1a	NSP6 (Putative Transmembrane Domain)	YP_009725302	290	Inhibits auto-phagosomal expansion and the degradation of viral components in the lysosome [36].	9	3.1
13	Accessory Protein	ORF8	ORF8 Protein	YP_009724396	121	Inhibits IFN-I signaling [35].	9	7.44
14	Non-Structural Protein	ORF1b	NSP16 (2′-O-Ribose-Methyltransferase)	YP_009725311	298	Pivotal role in mRNA cap methylation for host immune system evasion [31].	8	2.68
15	Accessory Protein	ORF7a	ORF7a Protein	YP_009724395	121	Induces IFN-I antagonism and inhibit STAT1 phosphorylation [27].	8	6.61
16	Structural Protein	ORF5	Membrane Glycoprotein (M Protein)	YP_009724393	222	M protein interacts with E, S, and N proteins to play a vital role in viral assembly and budding [38].	7	3.15
17	Structural Protein	ORF4	Envelope Protein (E Protein)	YP_009724392	75	Viroporin involved in viral synthesis, assembly, and release [39].	6	8
18	Non-Structural Protein	ORF1a	NSP1 (Leader Protein)	YP_009725297	180	Inhibitory effect on host gene expression and destroys host mRNA [28].	5	2.78
19	Accessory Protein	ORF6	ORF6 Protein	YP_009724394	61	Involved in mitigating the host's immune reaction and viral replication [40].	5	8.2
20	Non-Structural Protein	ORF1a	NSP8 (Primase)	YP_009725304	198	Interacts with NSP7 to form a hexa-decamer and functions as a primase during replication [41].	4	2.02
21	Non-Structural Protein	ORF1a	NSP10	YP_009725306	139	Stimulating the activities of NSP14 and NSP16 [36].	3	2.16
22	Non-Structural Protein	ORF1a	NSP7 (Primase)	YP_009725303	83	Interacts with NSP8 to form a hexa-decamer and functions as a primase during replication [42].	2	2.41
23	Non-Structural Protein	ORF1a	NSP9 (RNA-Binding Protein)	YP_009725305	113	Serves as a single stranded RNA binding protein in viral replication [28]	1	0.88
24	Non-Structural Protein	ORF1a	NSP11	YP_009725312	13	Its functionality is undefined	0	0
25	Accessory Protein	ORF7b	ORF7b Protein	YP_009725318	43	Induces IFN-I antagonism and inhibits STAT1 & STAT2 phosphorylation [35]	0	0
26	Accessory Protein	ORF10	ORF10 Protein	YP_009725255	38	Its functionality is undefined	0	0

In a previous study, we performed mutational analysis of 611 genomes from India, extracted in June 2020, and analyzed the impact of the mutations on proteins. Therein, we observed a difference in mutation profile between viral genomes from deceased and asymptomatic patients. While only 11% disease mutations were present in genomes from asymptomatic people, the corresponding value for deceased patients was more than threefold higher at 38%, implying a possible correlation between the nature of mutations and clinical outcome. However, one lacuna of that study was the very limited number of samples with available clinical metadata (30 asymptomatic and 15 deceased) [12]. In the present study, we expand our sample size by using sequences from twenty-one different countries in an attempt to ascertain the differential mutation profile of deceased and asymptomatic samples and to understand the possible clinical correlation, if any.

Methods

Sequence Congregation

On 10th November 2020, we retrieved 6853 sequences from GISAID (www.gisaid.org) using the data filter ~ virus name: hCoV-19 - Host: Human - Complete – High Coverage. These were then screened for availability of clinical status, which revealed 122 and 542 sequences from Asymptomatic and Deceased patients respectively. The final screening parameter was age, wherein all samples from patients over 60 years were excluded. Multiple risk factors are associated with mortality due to SARS-CoV-2, including age, hypertension and diabetes. Since a detailed clinical profile was not available for most samples, only age was used as an exclusion criterion. Multiple reports suggest that deaths of elderly people infected with SARS-CoV-2 are often due to co-morbidities and a weakened immune system [13,14]. Since our aim was to ascertain the impact of mutations, if any, on deaths due to SARS-CoV-2, we excluded the samples where the chance of other factors contributing to death was high. Thereafter, a total of 332 SARS-CoV-2 sequences from 114 Asymptomatic and 218 Deceased patients from twenty-one different countries were included in the study. Details of the studied sequences are provided in Table 2 and Supplementary file 1. These included 200 males, 72 females and 60 patients for whom gender information was not available. Ten countries had no asymptomatic patients and five countries had no deceased patients. It is quite plausible that the absence of such sequences from these countries is attributable to a lack of metadata in the GISAID repository and does not necessarily imply the absence of such patients in those countries.
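The two screening steps described above (availability of clinical status, then the age cut-off) can be sketched as a simple filter. This is a minimal illustration and not the authors' pipeline: the record fields (`accession`, `status`, `age`) are hypothetical, and dropping records with a missing age is an assumption that the text does not state.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SampleRecord:
    accession: str          # hypothetical identifier field
    status: Optional[str]   # "Asymptomatic", "Deceased", or None if not recorded
    age: Optional[int]      # patient age in years, None if not recorded


MAX_AGE = 60  # samples from patients over 60 years were excluded


def select_samples(records: List[SampleRecord]) -> List[SampleRecord]:
    """Keep records with a usable clinical status and an age not above MAX_AGE."""
    kept = []
    for rec in records:
        if rec.status not in ("Asymptomatic", "Deceased"):
            continue  # clinical status unavailable or not of interest
        if rec.age is None or rec.age > MAX_AGE:
            continue  # assumption: records without an age are dropped too
        kept.append(rec)
    return kept


if __name__ == "__main__":
    demo = [
        SampleRecord("toy_1", "Deceased", 45),
        SampleRecord("toy_2", "Deceased", 72),
        SampleRecord("toy_3", None, 30),
        SampleRecord("toy_4", "Asymptomatic", 60),
    ]
    print([r.accession for r in select_samples(demo)])
```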

Table 2

Country-wise distribution of samples used in the study.

S No	Country	Total	Asymptomatic	Deceased
01	Bangladesh	5	5	0
02	Belgium	2	1	1
03	Brazil	62	2	60
04	Colombia	3	0	3
05	Costa Rica	2	0	2
06	Czech Republic	3	1	2
07	Dominican Republic	3	3	0
08	India	68	28	40
09	Indonesia	2	0	2
10	Italy	3	1	2
11	Japan	59	59	0
12	Kuwait	2	2	0
13	Lebanon	1	0	1
14	Mexico	4	0	4
15	Oman	1	0	1
16	Russia	3	0	3
17	Saudi Arabia	70	0	70
18	South Africa	1	0	1
19	Sri Lanka	1	0	1
20	Turkey	8	8	0
21	United States	29	4	25
	Total	332	114	218
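The totals in the last row, and the counts of countries lacking one of the two patient groups quoted in the text, can be re-derived from the rows of Table 2. The short check below simply transcribes the table; nothing in it goes beyond the listed values.

```python
# (country, asymptomatic, deceased) transcribed from Table 2
TABLE_2 = [
    ("Bangladesh", 5, 0), ("Belgium", 1, 1), ("Brazil", 2, 60),
    ("Colombia", 0, 3), ("Costa Rica", 0, 2), ("Czech Republic", 1, 2),
    ("Dominican Republic", 3, 0), ("India", 28, 40), ("Indonesia", 0, 2),
    ("Italy", 1, 2), ("Japan", 59, 0), ("Kuwait", 2, 0), ("Lebanon", 0, 1),
    ("Mexico", 0, 4), ("Oman", 0, 1), ("Russia", 0, 3),
    ("Saudi Arabia", 0, 70), ("South Africa", 0, 1), ("Sri Lanka", 0, 1),
    ("Turkey", 8, 0), ("United States", 4, 25),
]

total_asymptomatic = sum(a for _, a, _ in TABLE_2)
total_deceased = sum(d for _, _, d in TABLE_2)
no_asymptomatic = [c for c, a, _ in TABLE_2 if a == 0]
no_deceased = [c for c, _, d in TABLE_2 if d == 0]

print(len(TABLE_2), total_asymptomatic, total_deceased,
      total_asymptomatic + total_deceased)       # 21 114 218 332
print(len(no_asymptomatic), len(no_deceased))    # 10 5
```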


Mining of mutations in proteins

The non-synonymous mutations in the selected 332 sequences were mined using the GISAID CoVSurver (www.gisaid.org/epiflu-applications/covsurver-mutations-app) web resource for sequence variance analysis. The sequence datasets were aligned with the reference sequence ‘hCoV-19/Wuhan/WIV04/2019’. The reference amino acid sequence of each protein for further analysis was downloaded from the same server. The reference sequence is identical to the NCBI sequence with Accession number NC_045512.2 from Wuhan, China. It has also been used as the reference in our earlier studies on SARS-CoV-2 genomes from India [12,15].
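Non-synonymous variants are written in this paper in the usual substitution notation, for example D614G (reference residue, position, new residue). As a minimal sketch of how such labels might be handled downstream, the parser below splits a label into its three parts. The function and class names are hypothetical, and the exact export format of CoVSurver is not assumed here; only the bare `D614G`-style label is handled.

```python
import re
from typing import NamedTuple


class Substitution(NamedTuple):
    ref: str        # reference amino acid (one-letter code)
    position: int   # 1-based residue position in the protein
    alt: str        # substituted amino acid (one-letter code)


_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label: str) -> Substitution:
    """Parse a label such as 'D614G'; stray spaces (e.g. 'K417 N') are tolerated."""
    compact = "".join(label.split())
    match = _SUBSTITUTION.match(compact)
    if match is None:
        raise ValueError(f"not a single amino-acid substitution: {label!r}")
    ref, position, alt = match.groups()
    return Substitution(ref, int(position), alt)


if __name__ == "__main__":
    print(parse_substitution("D614G"))
    print(parse_substitution("P323L"))
```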

Pathogenicity Prediction of mutations

The impact of the mutations on SARS-CoV-2 proteins was estimated using PredictSNP1 (loschmidt.chemi.muni.cz/predictsnp1), a meta-server integrating six predictor tools: SIFT, PhD-SNP, PolyPhen-1, PolyPhen-2, MAPP and SNAP. The SNAP and PhD-SNP tools are based on supervised machine learning algorithms that have been trained on massive datasets to “learn” to distinguish between pathogenic and benign variants. The PolyPhen-1 and PolyPhen-2 predictors use protein sequence and structure information to predict how a mutation influences the protein phenotype on the basis of its position in the protein structure. The MAPP and SIFT tools, on the other hand, determine the pathogenicity of mutations from the sequence and evolutionary conservation of specific amino acids across various species. PredictSNP extracts the individual scores of these tools to homogenize the assessment and generates its own confidence score as a percentage of expected accuracy, ranging from 0 to 100%. Its developers also designed three distinct datasets (PMD, MMP and OVERFIT) in order to eliminate bias, duplicity and inconsistency [16]. The variants are then categorized as ‘Neutral’ or ‘Deleterious’, and these predictions are more robust and accurate than those provided by any individual predictor.
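To illustrate the general idea of combining several predictors into one call, the sketch below implements a simplified confidence-weighted majority vote. It is an assumption-laden toy and not PredictSNP's actual scoring scheme: the confidence values, the tie-breaking rule (a score of exactly zero is reported as Neutral) and the function name are all hypothetical.

```python
from typing import List, Tuple

LABELS = ("Deleterious", "Neutral")


def consensus(predictions: List[Tuple[str, float]]) -> Tuple[str, float]:
    """Combine (label, confidence) pairs into one label and a score in [-1, 1].

    Each predictor votes +confidence for 'Deleterious' and -confidence for
    'Neutral'; the sum is normalized by the total confidence.
    """
    signed_sum = 0.0
    weight_sum = 0.0
    for label, confidence in predictions:
        if label not in LABELS:
            raise ValueError(f"unknown label: {label!r}")
        sign = 1.0 if label == "Deleterious" else -1.0
        signed_sum += sign * confidence
        weight_sum += confidence
    if weight_sum == 0.0:
        raise ValueError("no predictor reported a non-zero confidence")
    score = signed_sum / weight_sum
    # Assumption: a tie (score == 0) is reported as Neutral.
    return ("Deleterious" if score > 0 else "Neutral"), score


if __name__ == "__main__":
    toy_votes = [("Deleterious", 0.8), ("Neutral", 0.6), ("Deleterious", 0.7)]
    print(consensus(toy_votes))
```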

Stability shifts of protein mutations

The iStable integrated server (predictor.nchu.edu.tw/iStable) combines two machine-learning based predictors, i-Mutant and MUpro, along with thermodynamic parameters in order to predict shifts in protein stability related to mutations. i-Mutant, an SVM-based tool, predicts the change in protein stability attributable to the mutant form as a free energy change value [DDG = DG (New Protein) – DG (Wild Type) in kcal/mol]. It classifies the variant as decreasing (DDG<0) or increasing (DDG>0) protein stability. MUpro, an SVM and neural network-based tool, expresses the impact of a mutation on protein stability as a predictive confidence score (Conf. Score). The score varies from −1 to 1, with Conf. Score <0 signifying a decrease and Conf. Score >0 an increase in protein stability. iStable utilizes sequence information and predicts the meta-result as an increase or decrease in stability along with a confidence score, where a higher score indicates more confidence in the prediction [17].
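The sign convention above can be written out directly. The sketch below follows the definition given in the text (DDG = DG of the new protein minus DG of the wild type, with a negative value meaning decreased stability); the handling of DDG exactly equal to zero is an assumption, since the text does not cover that case, and the input values are made up.

```python
def ddg(dg_new_protein: float, dg_wild_type: float) -> float:
    """Free energy change in kcal/mol: DDG = DG(New Protein) - DG(Wild Type)."""
    return dg_new_protein - dg_wild_type


def stability_class(ddg_value: float) -> str:
    """Classify a variant by the sign of DDG, as described for i-Mutant."""
    if ddg_value < 0:
        return "decrease"
    if ddg_value > 0:
        return "increase"
    return "no change"  # assumption: DDG == 0 is not addressed in the text


if __name__ == "__main__":
    value = ddg(dg_new_protein=-1.5, dg_wild_type=-1.0)  # toy numbers
    print(value, stability_class(value))
```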

Results and discussion

Mutation incidence and prevalence

A total of 372 variants were observed in 332 SARS-CoV-2 sequences, with several variants being incident in multiple patients, accounting for a total of 1596 incidences. The distribution of variants across different proteins of SARS-CoV-2 is shown in Fig. 1, Table 1 and Supplementary file 2. The distribution and incidence of variants can be analyzed from several aspects.
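The distinction between the number of distinct variants and the total number of incidences can be made concrete with a small tally. The sketch below uses a toy dataset of three samples; the sample names and the `protein:substitution` label format are hypothetical, while the three variants themselves are ones mentioned in this paper.

```python
from collections import Counter
from typing import Dict, List


def tally(variants_by_sample: Dict[str, List[str]]) -> Counter:
    """Count in how many samples each variant occurs (once per sample)."""
    counts: Counter = Counter()
    for variants in variants_by_sample.values():
        counts.update(set(variants))
    return counts


toy_samples = {
    "sample_1": ["NSP12:P323L", "Spike:D614G"],
    "sample_2": ["Spike:D614G"],
    "sample_3": ["NSP12:P323L", "Spike:D614G", "N:R203K"],
}

counts = tally(toy_samples)
n_variants = len(counts)              # distinct variants
n_incidences = sum(counts.values())   # total incidences across samples
print(n_variants, n_incidences)       # 3 6
print(counts.most_common(1))          # most prevalent variant in the toy data
```

In the toy data, three distinct variants account for six incidences, in the same way that 372 variants account for 1596 incidences in the study.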

Fig. 1

Distribution of variant sites of SARS-CoV-2 proteins across Gender; Patient status (Deceased/Asymptomatic); Pathogenicity (Deleterious/Neutral) and Stability.

First, the incidence of mutations with respect to gender is summarized in Fig. 2A. Patients for whom gender information was not available are listed as unknown. There were 45 mutations (12%) present in both males and females, and these variants were the most prevalent ones, accounting for around 69% of the total incidences. In contrast, the 221 male-specific and 75 female-specific mutations (59% and 20%) accounted for a meagre 17% and 5% of the total incidence respectively. This difference can partly be attributed to the sample set being skewed in favor of males. Klein and Flanagan have reviewed studies assessing the variable nature of immune responses in males and females [18]. However, since the most incident variants in our study are present in both genders, even assuming that the variations can alter the immune response, the chance of their doing so in a gender-dependent manner seems low.

Fig. 2

Details of mutations of SARS-CoV-2 proteins. A) Gender-wise distribution of variants and their prevalence in the studied genomes. B) Distribution of observed variants according to clinical status of patients (Deceased/Asymptomatic/Both). C) Most prevalent variants across proteins observed in the study. D) Timeline of incidence of observed variations. E) Age-wise distribution of samples of the present study.

Though there were 60 patients of unknown gender, the fact that 90% of the overall mutations in the studied population are present in males, females or both means that the data from samples with unknown gender will not have much impact on the gender-wise distribution and incidence of mutants. Further, almost all of these 60 patients were asymptomatic and were from the Diamond Princess cruise ship from Japan, which has become a benchmark study of how wearing masks reduces viral load such that, even if a person gets infected, the chances of the infection being mild or asymptomatic are enhanced substantially [19,20]. We then moved to the second aspect of our analysis, patient status, where we screened for mutations specific to the deceased or asymptomatic patients. The distribution and incidence of mutations as per patient status are shown in Fig. 2B and Supplementary file 2. Interestingly, 22 mutations present in both deceased and asymptomatic patients accounted for 952 (60%) of the total incidence of mutations. Asymptomatic-specific mutants, comprising 119 variants (32%), accounted for only 246 (15%) of the total incidence, whereas the corresponding values for Deceased-specific mutants were 231 variants (62%) with 398 (24%) incidences. The details of these mutations are provided in Table 3 and Supplementary file 2, and their impact on proteins is discussed later.
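The shares quoted for the three patient-status groups follow from the reported counts. The sketch below recomputes them from the numbers given in the text; since it prints one decimal place, the last digit may differ slightly from the whole-number percentages quoted above.

```python
# (distinct variants, total incidences) per group, as reported in the text
GROUPS = {
    "Both": (22, 952),
    "Asymptomatic only": (119, 246),
    "Deceased only": (231, 398),
}


def shares(groups):
    """Return totals and, per group, (% of variants, % of incidences)."""
    total_variants = sum(v for v, _ in groups.values())
    total_incidences = sum(i for _, i in groups.values())
    table = {
        name: (100.0 * v / total_variants, 100.0 * i / total_incidences)
        for name, (v, i) in groups.items()
    }
    return total_variants, total_incidences, table


total_variants, total_incidences, table = shares(GROUPS)
print(total_variants, total_incidences)  # 372 1596
for name, (pct_variants, pct_incidences) in table.items():
    print(f"{name}: {pct_variants:.1f}% of variants, "
          f"{pct_incidences:.1f}% of incidences")
```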

Table 3

Distribution of Deleterious (Red) and Neutral (Black) mutations along with D/N ratio (Deleterious/Neutral) across different proteins of SARS-CoV-2 in Deceased (De) and Asymptomatic (As) patients. Both (Bo) denotes presence in Asymptomatic as well as Deceased patients.

The impact of codon bias in SARS-CoV-2 genomes on the incidence of mutations has been reported, wherein the most abundant mutations were C > U (46%), G > U (18.2%), U > C (9.4%) and A > G (8.8%), implying an increase in U as a viral strategy to adapt to host codon usage. This results in some mutations being more prevalent than others [21]. In the present study as well, some mutations are evidently represented at a much higher rate than others. The fifteen most prevalent mutations, along with their incidence frequencies and localizations, are shown in Fig. 2C. P323L (NSP12) is the most prevalent variant, followed by D614G (Spike protein), with over two hundred incidences each. Most of these prevalent variants are common to both males and females, which is expected from the gender-wise distribution data. The timeline for accumulation of mutants is shown in Fig. 2D; the maximum number of mutations accrued in April 2020, when the incidence of COVID-19 was at its peak. The age-wise distribution of mutants is shown in Fig. 2E. An absolute correlation between age and incidence cannot be drawn due to uneven representation of the age groups. Furthermore, since the samples are from different countries and the virus is behaving differently across geographical locations, the data have to be interpreted with caution.

Another study encompassing 48,635 SARS-CoV-2 sequences reported a total of 353,341 mutation events compared to the Wuhan reference genome. Of these, 256 samples were identical to the reference whereas 48,379 samples possessed at least one mutation. An average of 7.23 mutations per sample was reported in that study, which also highlighted country-specific average incidence of mutations [22]. Looking at the mutation incidence with reference to individual countries therefore seemed rational. We analyzed the data in terms of number of mutants per sample for each country, as represented in Fig. 3 and Supplementary file 4. Saudi Arabia, with the maximal representation of 70 samples in the study, contributed 3.93 variants per sample, which is amongst the lowest in the group. In contrast, Bangladesh, with the highest value of 10 variants per sample, is represented by just 5 samples. Thus, it is evident that some nations are exhibiting more variations in SARS-CoV-2 than others. Differences in the health and genetic profile of individuals might be contributing factors. We also looked at the variants-per-sample data for deceased and asymptomatic samples separately for each country, as shown in Supplementary file 4. For asymptomatic patients, Bangladesh had the highest value of 10 variants per sample, followed by India (6.36); for deceased patients, Indonesia had the highest value of 9.5, followed by Brazil (6.75). The fact that some countries had no representation in asymptomatic or deceased samples makes any direct inference implausible, but a geography-dependent evolution of the virus is supported by the observed data.
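The variants-per-sample figure used for the country comparison is an average of per-sample variant counts within each country. A minimal sketch of that computation follows; the input records and country labels are toy values chosen for illustration and are not taken from the supplementary data.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def variants_per_sample(samples: Iterable[Tuple[str, int]]) -> Dict[str, float]:
    """Average number of variants per sample for each country.

    `samples` yields (country, number of variants found in that sample).
    """
    variant_totals: Dict[str, int] = defaultdict(int)
    sample_counts: Dict[str, int] = defaultdict(int)
    for country, n_variants in samples:
        variant_totals[country] += n_variants
        sample_counts[country] += 1
    return {c: variant_totals[c] / sample_counts[c] for c in variant_totals}


if __name__ == "__main__":
    toy = [("Country A", 10), ("Country A", 8), ("Country B", 3)]
    print(variants_per_sample(toy))
```

As the text notes for Bangladesh, an average based on very few samples should be read with caution.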

Fig. 3

Country wise distribution of variants of SARS-CoV-2 proteins.

Impact of mutations on proteins of SARS-CoV-2

The impact of mutations on SARS-CoV-2 proteins was assessed from three aspects, as highlighted in Fig. 4. These include the distribution of variants across SARS-CoV-2 proteins, the pathogenicity of the variants in terms of Deleterious or Neutral, and the stability of proteins in the presence of the variants.

Fig. 4

Mutation incidence and its impact on SARS-CoV-2 proteins. A) Number of variants and their prevalence in the studied genomes. B) Pathogenicity prediction of the variants in terms of Deleterious or Neutral. C) Stability shift prediction of the observed variants.

The distribution of variants across different SARS-CoV-2 proteins revealed some interesting observations. First, the number of mutations in some proteins was higher than in others, as evident in Table 1, Fig. 4A and Supplementary file 2. NSP3 housed a maximum of 68 variants, closely followed by Spike protein with 66 variants. N protein and NSP2, with 29 and 27 variants respectively, were a distant third and fourth. However, a greater number of mutations does not necessarily imply that they are prevalent in the studied samples. To ascertain this, we calculated the incident mutations per hundred amino acids for all the proteins, as shown in Table 1. The proteins have been sorted as per the number of observed mutations. NSP3, with 68 mutations, had 3.5 mutations per hundred residues, whereas ORF8, with just 9 mutations, had a corresponding value of 7.44, indicating an unequal distribution of mutations across SARS-CoV-2 proteins. A bias in the incidence of mutations in SARS-CoV-2 proteins has been reported previously as well, with surface glycoprotein, nucleocapsid, ORF1ab and ORF8 exhibiting a higher frequency of mutations than envelope, membrane, ORF6, ORF7a and ORF7b. Further, the emergence and stability of variants accounted for their widespread global distribution [23,24]. Thus, some parts of the genome are more variable whereas others are more conserved. This has led to the emergence of novel variants around the world with varied clinical manifestations.

Secondly, there was a clear difference between incidence and prevalence. NSP3, with 68 variants, had a total incidence of only 105, whereas Spike protein had 346 incidences (66 variants). In comparison, N protein and NSP12 had 328 (29 variants) and 288 (21 variants) incidences respectively. This clearly shows that a greater number of variants does not necessarily imply higher prevalence. In order to ascertain the significance of differential incidence and prevalence, we analyzed the pathogenicity of the variants as Deleterious or Neutral, as shown in Fig. 4B, Table 3 and Supplementary file 3. Their presence across Asymptomatic and Deceased samples from different countries is discussed in the next section. The basic premise of the study was that a protein having more variants will contribute to viral evolution only if those variants are Deleterious; Neutral mutations would not affect the protein per se. In terms of the number of Deleterious variants, NSP3 had the highest with 25, followed by NSP2 (16), ORF3a (14) and N (14). The NSP3 protein is responsible for proteolytic activity but is not the only such protein. NSP5 also has similar activity and has 4 Deleterious mutations associated exclusively with deceased patients. Spike protein had just 7 Deleterious variants out of 66, which partly explains why, despite so many variants, it has not yet had much impact on viral pathogenesis. Though several mutations in the S protein, such as K417N, E484K, N501Y, D614G and P681H, and the importance of the RBD to the virulence of SARS-CoV-2 have been reported, with emerging variants their impact needs to be constantly monitored [10,11,25]. Interestingly, only four proteins, NSP1, NSP2, ORF3a and ORF7a, had more Deleterious than Neutral mutants. NSP1 is involved in inhibiting host gene expression and NSP2 in disrupting the microenvironment of infected cells. ORF3a and ORF7a are associated with virion release and evasion of the host immune response respectively. The presence of more Deleterious mutations therein suggests viral adaptability [26,27,28]. Further, NSP7 and ORF6 had only Deleterious mutants, indicating their susceptibility to mutations (Fig. 4B, Table 3). Thus, even a protein with very few mutations can be a pivotal factor in viral evolution.

The protein stability in light of the mutations was then assessed and is represented in Fig. 4C. The majority of variants resulted in a decrease in protein stability across all proteins. The three proteins with the most variants that increased protein stability were NSP3 (19), Spike (18) and N (12). NSP3 (49) and Spike (48) also had the highest numbers of variants that decreased protein stability. The impact of these variants is significant when considered individually, but they are not present in the population in isolation, and hence the cumulative impact of mutations occurring together also needs to be assessed.
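The length normalization used in Table 1 is a simple ratio: observed mutations divided by protein length, scaled to one hundred residues. The sketch below reproduces a few Table 1 entries from the listed mutation counts and amino acid lengths; the function name is hypothetical.

```python
def mutations_per_100_residues(n_mutations: int, aa_length: int) -> float:
    """Observed mutations normalized to a protein length of 100 residues."""
    if aa_length <= 0:
        raise ValueError("protein length must be positive")
    return 100.0 * n_mutations / aa_length


# (observed mutations, amino acid length) taken from Table 1
TABLE_1_SUBSET = {
    "NSP3": (68, 1945),
    "Spike": (66, 1273),
    "N": (29, 419),
    "ORF8": (9, 121),
}

for protein, (n_mutations, aa_length) in TABLE_1_SUBSET.items():
    density = mutations_per_100_residues(n_mutations, aa_length)
    print(f"{protein}: {density:.2f} mutations per 100 residues")
```

The comparison between NSP3 and ORF8 shows why the normalization matters: the protein with the most variants is far from the most densely mutated one.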

Differential mutational profile across asymptomatic and deceased samples

The distribution of Deleterious and Neutral mutations across asymptomatic and deceased patients is shown in Table 3 and Supplementary file 3. There are 12 proteins in which no variant was common to both asymptomatic and deceased samples, suggesting a correlation between mutational status and disease profile. In contrast, N protein had the highest number, 4 variants, common to both sample sets. Moreover, except for N protein, NSP1, NSP3, NSP9 and NSP14, all other proteins had more variants associated with deceased patients than with asymptomatic ones. NSP5 and NSP7 had no mutations from asymptomatic patients, whereas only NSP9 had no variants from deceased patients. We then analyzed the ratio of Deleterious to Neutral mutations across asymptomatic and deceased samples, as shown in Table 3. Most proteins had a higher D/N ratio for deceased samples, indicating the implication of Deleterious mutations therein. The highest D/N ratio was 3.33 for ORF3a, followed by 1.4 for N protein. The fact that deceased patients have more Deleterious than Neutral variants as compared to the asymptomatic ones suggests that these mutations are correlated with the disease status of the samples. Therefore, the mutations exclusive to deceased and asymptomatic samples can serve as a benchmark for management of patients. A transversion (11083G > T) in the ORF1ab gene leading to substitution of leucine by phenylalanine in NSP6 has been shown to differentiate between symptomatic and asymptomatic patients. Further, two mutations, (26,144G > T) and (1,397G > A), were reported to distinguish symptomatic and asymptomatic patients respectively [29]. It has also been shown that asymptomatic patients harbor mutations at 9 nucleotide positions (C2939T; C3828T; G21784T; T21846C; T24631C; G28881A; G28882A; G28883C; G29810T) associated with several non-synonymous substitutions across ORF1ab (P892S; S1188L), S (K74N; I95T) and N (R203K, G204R) proteins [30]. The present study also exhibited the two variants of N protein. However, they are present in both males and females among asymptomatic as well as deceased patients. This signifies the constantly changing genetic landscape of SARS-CoV-2.

To further construe the link between the deceased host and the non-synonymous mutations, we analyzed the mutated protein clusters as follows: from Deceased individuals (De) with Deleterious (D) pathogenicity and Increase in Stability (IS), and from Deceased individuals (De) with Neutral (N) pathogenicity and Decrease in Stability (DS). The details are shown in Table 4 and Supplementary file 3. When we analyzed mutations only from deceased patients and with Deleterious pathogenicity, we observed that there were nine proteins in which all such mutations had decreasing stability, suggesting the possibility that decreased stability contributes to enhanced pathogenicity. Moreover, there were ten proteins which had mutations with increasing as well as decreasing stability. Amongst these, N protein had four variants with increasing stability as compared to three with decreasing stability. All other proteins had more stability-decreasing variants than stability-increasing ones. Thus, the variants with decreasing stability appear to be more significant in the pathogenicity of SARS-CoV-2.
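The two summaries used in this section, the D/N ratio and the grouping of proteins by the stability effect of their deceased-specific Deleterious variants, can be sketched as follows. The input values are toy data, the function names are hypothetical, and returning `None` when a protein has no Neutral variants is an assumption about how an undefined ratio might be handled.

```python
from collections import defaultdict
from typing import Iterable, List, Optional, Tuple


def dn_ratio(n_deleterious: int, n_neutral: int) -> Optional[float]:
    """Deleterious/Neutral ratio; None when there are no Neutral variants."""
    if n_neutral == 0:
        return None  # assumption: the ratio is treated as undefined
    return n_deleterious / n_neutral


def group_by_stability(
    variants: Iterable[Tuple[str, str]]
) -> Tuple[List[str], List[str]]:
    """Split proteins into 'only decreasing' and 'mixed' stability effects.

    `variants` yields (protein, effect) for deceased-specific Deleterious
    variants, with effect being 'increase' or 'decrease'. Proteins with
    only stability-increasing variants appear in neither list.
    """
    effects = defaultdict(set)
    for protein, effect in variants:
        if effect not in ("increase", "decrease"):
            raise ValueError(f"unknown stability effect: {effect!r}")
        effects[protein].add(effect)
    only_decrease = sorted(p for p, e in effects.items() if e == {"decrease"})
    mixed = sorted(p for p, e in effects.items() if e == {"increase", "decrease"})
    return only_decrease, mixed


if __name__ == "__main__":
    print(dn_ratio(7, 5))  # toy counts
    toy = [("E", "decrease"), ("N", "increase"), ("N", "decrease"), ("E", "decrease")]
    print(group_by_stability(toy))
```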

Table 4

Correlation between Deleterious mutations from Deceased individuals and Protein Stability. (Variants with increased stability are shown in green.)

Conclusions

A total of 372 variants were observed in 332 SARS-CoV-2 sequences, with several variants being incident in multiple patients, accounting for a total of 1596 incidences. Since some countries had no representation in asymptomatic or deceased samples, inference about geographical correlation is not plausible from the present dataset. However, the Deleterious mutations found in deceased patients can serve as a guide for patient management, which can be based on nine proteins (E; ORF7a; ORF8; NSP6; NSP8; NSP12; NSP13; NSP16 and S) in which twenty-two such mutations had decreasing stability, implying their contribution to enhanced pathogenicity.

Ethics approval

Not Applicable.

Availability of data and materials

All data pertaining to the study has been provided as Supplementary material of the manuscript.

CRediT authorship contribution statement

Rezwanuzzaman Laskar: Methodology, Investigation, Formal analysis, Validation. Safdar Ali: Conceptualization, Supervision, Formal analysis.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
