Genetic analysis of SARS-CoV-2 isolates collected from Bangladesh: Insights into the origin, mutational spectrum and possible pathomechanism.

Md Sorwer Alam Parvez1, Mohammad Mahfujur Rahman2, Md Niaz Morshed2, Dolilur Rahman2, Saeed Anwar3, Mohammad Jakir Hosen4.   

Abstract

As the coronavirus disease 2019 (COVID-19), caused by the severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2), rages across the world, killing hundreds of thousands and infecting millions, researchers are racing against time to elucidate the viral genome. Some Bangladeshi institutes have joined this race and sequenced a few isolates of the virus collected from Bangladesh. Here, we present a genomic analysis of these isolates. The analysis revealed that SARS-CoV-2 isolates sequenced from Dhaka and Chittagong belonged to lineages from Europe and India, respectively. Our analysis identified a total of 42 mutations, about half of which were synonymous, as well as three large deletions. Most of the missense mutations in Bangladeshi isolates were found to have weak effects on pathogenesis. Some mutations may make the virus less pathogenic than in other countries. Molecular docking analysis was performed to evaluate the effect of the mutations on the interaction between the viral spike proteins and the human ACE2 receptor, but no significant difference was observed. This study provides some preliminary insights into the origin of Bangladeshi SARS-CoV-2 isolates, their mutation spectrum and possible pathomechanism, which may give an essential clue for designing therapeutics and managing COVID-19 in Bangladesh.
Copyright © 2020 Elsevier Ltd. All rights reserved.

Keywords:  ACE2 receptor; Bangladeshi isolates; COVID-19; Mutation; SARS-CoV-2; Spike protein

Year:  2020        PMID: 33221119      PMCID: PMC7641529          DOI: 10.1016/j.compbiolchem.2020.107413

Source DB:  PubMed          Journal:  Comput Biol Chem        ISSN: 1476-9271            Impact factor:   2.877


Introduction

The coronavirus disease 2019 (COVID-19) is an infectious disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Human-to-human transmission of this viral infection occurs via droplets expelled during face-to-face talking, coughing and sneezing. The duration of exposure is a crucial factor in the transmission of infection from an infected person to a healthy person: prolonged exposure carries a high risk of transmission, while short exposure is less likely to transmit the virus. It takes an average of 11.5 days to develop the symptoms of the disease after successful transmission to a healthy person (Tang et al., 2020; Wiersinga et al., 2020). Common symptoms include fever, cough, fatigue, shortness of breath, nausea, vomiting, and diarrhoea. The disease has emerged as a critical, rapidly evolving global health crisis (Yin et al., 2020; Zheng et al., 2020). More than 6.5 million people have contracted the virus, and nearly 400 thousand have died (CSSE, 2020). In Bangladesh, COVID-19 was first reported on 7 March by the Institute of Epidemiology, Disease Control and Research (IEDCR) (Paul, 2020). Until the end of March, the infection rate was relatively low; however, as the non-therapeutic prevention measures enforced by the government faced enormous challenges, the infection rate rose drastically in April and kept on rising (Nabi and Shovon, 2020). People did not maintain the social distancing enforced by the government and tended to gather in crowded places (The Business Standard, 2020). Moreover, inadequate testing for COVID-19 diagnosis is a common criticism in Bangladesh (Tithila, 2020). As of 31 August 2020, nearly 313,000 confirmed cases had been reported in Bangladesh, with a total of 4281 deaths (IEDCR, 2020). SARS-CoV-2 is a positive-stranded RNA virus with a genome of ∼30 kb that encodes structural and non-structural proteins. 
SARS-CoV-2 is a spherical enveloped virus characterized by spike proteins projecting from the virion surface. The viral structure is formed by structural proteins, namely the spike (S), membrane (M), envelope (E) and nucleocapsid (N) proteins; the S, M and E proteins are embedded in the viral envelope, and the N protein is located in the core region (Ashour et al., 2020). Like other RNA viruses, SARS-CoV-2 is prone to frequent mutations, which makes it challenging to develop therapeutics and vaccines against the virus (Ruan et al., 2003; Wang et al., 2020). Sequence information on both the pathogen and the host would greatly facilitate the development of an effective therapeutic strategy or vaccine (Seib et al., 2009). Analysis of genome sequences obtained from a vast array of isolates collected from different regions could indicate the efficacy of the vaccines being developed (Amanat and Krammer, 2020). Hence, researchers across the world are racing against time to unravel the genomics of the virus. By the end of August 2020, more than 80,000 genome sequences of SARS-CoV-2 had been submitted from different countries, most of them from European countries (∼46,000). About 20,000 complete genome sequences had been submitted from the USA, while China had submitted about 1000. In Bangladesh, more than 300 isolates of the virus had been sequenced and deposited in the GISAID (Global Initiative on Sharing Avian Influenza Data) database by the end of August 2020. Unfortunately, there is as yet no study on the genomics of SARS-CoV-2 in Bangladeshi isolates. This study aimed to provide some preliminary insights into the genetic structure of all isolates reported in Bangladesh, along with their mutational spectrum. 
It presents the first study on SARS-CoV-2 genomes obtained from Bangladesh, which, in broader terms, would help the therapeutic strategy development and vaccination programs against the virus in the country.

Materials and methods

Retrieval of the genome sequences of SARS-CoV-2

By the end of August, more than 300 genome sequences of SARS-CoV-2 isolates from Bangladesh had been deposited in the GISAID database (https://www.gisaid.org/), and we retrieved all of them. As many Bangladeshi people returned during the COVID-19 outbreak, mainly from China, India, Saudi Arabia, Spain, Italy, Japan, Qatar, Canada, Kuwait, USA, France, Sweden, and Switzerland, the first deposited genome sequence from each of those countries was also retrieved. The sequence of the first isolate collected from China was used as the reference for further analysis.

Multiple sequence alignment and phylogenetic tree reconstruction

Multiple sequence alignment of all genome sequences of Bangladeshi isolates, along with those from other countries, was performed using the MUSCLE alignment program (Edgar, 2004). The alignment file was then used to reconstruct the phylogenetic tree with the Maximum Likelihood (ML) method in IQ-TREE (Nguyen et al., 2015). ModelFinder was used to select the best-fit substitution model for tree reconstruction (Kalyaanamoorthy et al., 2017). A total of 88 models were tested, and the best-fit model (GTR+F+I+G4) was selected according to the Bayesian Information Criterion (BIC). To assess branch support, ultrafast bootstrap was performed with UFBoot2, with the number of bootstrap replicates set to 1000 (Hoang et al., 2018). Finally, the reconstructed phylogenetic tree was visualized and analyzed with the iTOL online tool (Letunic and Bork, 2019).
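
The BIC-based model choice described above can be illustrated with a minimal Python sketch. It is not the ModelFinder implementation: the function names are hypothetical, and the log-likelihoods, parameter counts and alignment length in the toy data are made-up values chosen only to show how the criterion trades fit against model complexity.

```python
import math


def bic(log_likelihood: float, n_params: int, n_sites: int) -> float:
    """Bayesian Information Criterion: k * ln(n) - 2 * ln(L). Lower is better."""
    return n_params * math.log(n_sites) - 2.0 * log_likelihood


def best_model_by_bic(candidates: dict, n_sites: int):
    """Pick the model with the lowest BIC.

    `candidates` maps a model name to (log-likelihood, number of free parameters).
    Returns the best model name and the full score table.
    """
    scores = {name: bic(lnl, k, n_sites) for name, (lnl, k) in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical values for illustration only (not results from the study).
toy_candidates = {
    "JC": (-52000.0, 1),
    "HKY+F": (-51500.0, 5),
    "GTR+F+I+G4": (-51400.0, 11),
}
best_name, bic_scores = best_model_by_bic(toy_candidates, n_sites=29900)
print(best_name)
```

In this toy input the richest model wins because its gain in log-likelihood outweighs the penalty for its extra parameters; with a smaller gain, a simpler model would be selected instead.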

Identification of nucleotide variations in Bangladeshi Strain

To identify the nucleotide variations, we performed multiple sequence alignment using Clustal Omega (Sievers and Higgins, 2014; Madeira et al., 2019), with the sequence of the China strain [EPI_ISL_402124] as the reference genome. The alignment file was analyzed using the MVIEW program (Brown et al., 1998). As of 20 May, only 14 complete genome sequences of SARS-CoV-2 isolates from Bangladesh had been deposited in the database, and our further analysis was performed using these 14 sequences.
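
As a rough illustration of how nucleotide variations can be read off an alignment against a reference, the following Python sketch compares two already-aligned sequences column by column. It is an assumption-laden stand-in for the Clustal Omega/MVIEW workflow, not a reproduction of it; the function name and the toy sequences are hypothetical.

```python
def call_variants(ref_aln: str, query_aln: str):
    """Compare two aligned sequences of equal length ('-' marks a gap).

    Returns a list of (reference_position, ref_base, query_base) tuples.
    Positions are 1-based and counted on the ungapped reference.
    A gap in the query is reported as 'DEL'.
    """
    if len(ref_aln) != len(query_aln):
        raise ValueError("aligned sequences must have equal length")
    variants = []
    ref_pos = 0
    for r, q in zip(ref_aln.upper(), query_aln.upper()):
        if r != "-":
            ref_pos += 1
        if r == q or r == "-" or q == "N":
            # Match, insertion relative to the reference, or ambiguous base: skip.
            continue
        variants.append((ref_pos, r, "DEL" if q == "-" else q))
    return variants


# Toy aligned sequences (hypothetical, not taken from the isolates).
toy_variants = call_variants("ACGT-ACGT", "ACTTTAC-T")
print(toy_variants)
```

Insertions relative to the reference are deliberately ignored here so that reported positions stay in reference coordinates.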

Prediction and identification of the viral genes

FGENESV from SoftBerry (http://linux1.softberry.com/berry.phtml), a trained pattern/Markov chain-based viral gene prediction tool, was used to predict the genes and proteins from the viral genomes. Each predicted protein (for each viral genome) was identified using the Basic Local Alignment Search Tool (BLAST) at the National Center for Biotechnology Information (NCBI) interface (Madden, 2013). The identity of each protein was evaluated by comparison with the proteins of the reference strain.
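
The identity comparison against the reference proteins can be sketched as a percent-identity calculation over aligned residues. The helper below is a hypothetical illustration, not the BLAST scoring procedure, and the protein length of 1273 residues is an assumed value used only to show the arithmetic.

```python
def percent_identity(ref_aln: str, query_aln: str) -> float:
    """Percentage of aligned (non-gap) columns in which both residues are identical."""
    if len(ref_aln) != len(query_aln):
        raise ValueError("aligned sequences must have equal length")
    aligned = [(r, q) for r, q in zip(ref_aln, query_aln) if r != "-" and q != "-"]
    if not aligned:
        return 0.0
    matches = sum(1 for r, q in aligned if r == q)
    return 100.0 * matches / len(aligned)


# Assumed protein length for illustration; residues are placeholders.
reference = "A" * 1273
one_substitution = "A" * 622 + "G" + "A" * 650
two_substitutions = "A" * 622 + "G" + "A" * 494 + "L" + "A" * 155

print(round(percent_identity(reference, one_substitution), 2))
print(round(percent_identity(reference, two_substitutions), 2))
```

Under this assumed length, a single substitution lowers identity to about 99.92 % and two substitutions to about 99.84 %, which shows how small differences from 100 % map to individual residue changes.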

Detection of mutation Spectrum

Clustal Omega was again used for the multiple sequence alignment of each protein, and the alignments were further analyzed with MVIEW. Amino acid variations were identified in each protein by comparison with the corresponding protein of the reference strain. Nucleotide and amino acid variations were then compared to determine the types of mutations.
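
Comparing a nucleotide change with the resulting amino acid is what separates synonymous from missense mutations. A minimal Python sketch of that comparison, using the standard genetic code, is given below; the function name and the example codons are illustrative assumptions rather than the authors' procedure or data.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(ref_codon: str, alt_codon: str):
    """Translate both codons and label the change.

    Returns (reference amino acid, altered amino acid, mutation type).
    """
    ref_codon = ref_codon.upper().replace("U", "T")
    alt_codon = alt_codon.upper().replace("U", "T")
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        kind = "Synonymous"
    elif alt_aa == "*":
        kind = "Nonsense"
    elif ref_aa == "*":
        kind = "Stop-loss"
    else:
        kind = "Missense"
    return ref_aa, alt_aa, kind


# Illustrative codons (assumed, not read from the isolates).
print(classify_substitution("GAT", "GGT"))  # aspartate -> glycine
print(classify_substitution("CTT", "CTC"))  # leucine -> leucine
```

A single A>G change in the second codon position turns an aspartate codon into a glycine codon (missense), whereas a third-position change in a leucine codon leaves the amino acid unchanged (synonymous).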

Prediction of mutational effects

The structural and functional effects of the missense variants, along with the stability change, were analyzed using different prediction tools. I-Mutant was employed to analyze the stability change, with all parameters left at their defaults (Capriotti et al., 2005). Additionally, MutPred2 was used to predict the molecular consequences and functional effects of these mutations (Pejaver et al., 2017).

Homology modeling of spike proteins and model validation

The BLASTp program at the NCBI interface was used to find the most suitable template for homology modeling. Searching against the Protein Data Bank (PDB) identified the spike protein structure with PDB ID 6VSB as a suitable template, as it has 99.59 % sequence similarity and 94 % coverage with the target sequence. Homology modeling of all mutant spike proteins, along with the spike protein of the reference, was performed using SWISS-MODEL (Schwede et al., 2003). The predicted models were validated using Rampage and ERRAT (Colovos and Yeates, 1993; Lovell, 2002).

Molecular docking of Spike Protein with ACE2 receptor

The molecular docking approach was employed to investigate the interaction of the mutant spike proteins with the human ACE2 receptor. First, the crystal structure of human ACE2 (PDB ID: 6D0G) was obtained from the Protein Data Bank (PDB), and PyMOL was used to clean the structure by removing all complexed molecules and water (Berman et al., 2000; DeLano, 2002). The HDOCK web server was used to predict the interaction between the spike protein and the human ACE2 receptor through protein-protein molecular docking (Yan et al., 2017). PyMOL was also used to visualize the docking interactions.

Results

Retrieved genome sequence of the SARS-CoV-2

A total of 311 complete genome sequences of SARS-CoV-2 isolates from Bangladesh and 12 genome sequences from isolates of other countries (China, India, Saudi Arabia, Spain, Italy, Japan, Qatar, Canada, Kuwait, USA, France, Sweden, and Switzerland) were retrieved from GISAID. The Wuhan strain with accession number EPI_ISL_402124 was used as the reference strain.

Phylogenetic tree analysis

Phylogenetic tree analysis revealed that the Bangladeshi isolates initially collected from Dhaka were very close to those from Spain and Switzerland, whereas the isolates collected from Chittagong shared a common ancestor with those from India and the USA (Fig. 1). The isolates collected from Chittagong also clustered with those from Middle Eastern countries such as Kuwait and Saudi Arabia. Moreover, all the isolates collected before the pandemic started in Bangladesh clustered with the strain from China, indicating the same lineage of the virus. However, the phylogenetic distance of the isolates from this lineage increased over time.
Fig. 1

Maximum likelihood phylogenetic tree reconstructed with the sequences of all Bangladeshi isolates and other countries. The value in the nodes represents the bootstrap value of the branches where the branch length represents the evolutionary distance.


Predictions of the genes and proteins

FGENESV predicted the presence of 12 genes in the reference. Interestingly, all except five isolates (EPI_ISL_445213, EPI_ISL_445217, EPI_ISL_450342, EPI_ISL_450343, and EPI_ISL_450344) of Bangladesh showed a similar result. Isolates EPI_ISL_445213 and EPI_ISL_445217 were each found to have ten genes (missing the ORF7a and ORF10 genes), and isolates EPI_ISL_450343 and EPI_ISL_450344 have 11 genes (missing the ORF8 gene). Multiple sequence alignment revealed that most of the variation in Bangladeshi isolates occurred in the ORF1a polyprotein, surface glycoprotein, and nucleocapsid phosphoprotein. Remarkably, the envelope protein, ORF6, ORF8, and ORF10 were found to be 100 % identical to the reference sequence in most of the isolates (Table 1).
Table 1

Predicted number of genes and identity compared to the reference strain. (Legends: S1: EPI_ISL_437912; S2: EPI_ISL_445213; S3: EPI_ISL_445214; S4: EPI_ISL_445215; S5: EPI_ISL_445216; S6: EPI_ISL_445217; S7: EPI_ISL_445244; S8: EPI_ISL_450339; S9: EPI_ISL_450340; S10: EPI_ISL_450341; S11: EPI_ISL_450342; S12: EPI_ISL_450343; S13: EPI_ISL_450344; S14: EPI_ISL_450345; M: Missing).

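No | Protein | S1 | S2 | S3 | S4 | S5 | S6 | S7 | S8 | S9 | S10 | S11 | S12 | S13 | S14
1 | ORF1a Polyprotein | 99.98 | 99.93 | 99.95 | 99.95 | 100 | 99.95 | 100 | 99.95 | 99.98 | 99.98 | 99.95 | 99.98 | 99.98 | 99.98
2 | ORF1b Polyprotein | 99.96 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 99.96 | 100 | 100 | 100
3 | Surface Glycoprotein | 99.92 | 100 | 99.84 | 99.92 | 99.92 | 99.92 | 99.92 | 100 | 100 | 100 | 100 | 99.92 | 99.92 | 100
4 | ORF3a protein | 100 | 99.64 | 100 | 99.64 | 100 | 99.64 | 100 | 100 | 100 | 100 | 100 | 99.27 | 99.64 | 99.64
5 | Envelope protein | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100
6 | Membrane Glycoprotein | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100
7 | ORF6 protein | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 99.36 | 100 | 100
8 | ORF7a protein | 100 | M | 100 | 100 | 100 | M | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100
9 | ORF7b | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100
10 | ORF8 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 99.17 | 100 | 99.17 | 99.17 | M | M | 99.35
11 | Nucleocapsid phosphoprotein | 99.52 | 99.76 | 99.28 | 99.28 | 100 | 99.28 | 100 | 99.76 | 99.52 | 99.52 | 99.76 | 99.76 | 99.76 | 99.76
12 | ORF10 | 100 | M | 100 | 100 | 100 | M | 100 | 100 | 100 | 100 | M | 100 | 100 | 100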

Mutation Spectrum of Bangladeshi SARS-CoV-2 isolates

Analysis of all 14 Bangladeshi isolates revealed a total of 42 single nucleotide variants (Fig. 2); 24 of them were nonsynonymous (missense). In addition, three large deletions were found in those isolates (Table 2). Two of the deletions removed ORF7a in isolates EPI_ISL_445213 and EPI_ISL_445217. The third large deletion, spanning nucleotides 27911–28254, occurred in isolates EPI_ISL_450343 and EPI_ISL_450344 and removed ORF8 in both. Surprisingly, three consecutive mutations were found at nucleotide positions 28882–28884, resulting in two amino acid substitutions in the nucleocapsid phosphoprotein.
Fig. 2

Variations Plot of SARS-CoV-2 in Bangladeshi isolates.

Table 2

All mutations found in the coding regions of the 14 isolates compared to the reference strain. (Legends: S1: EPI_ISL_437912; S2: EPI_ISL_445213; S3: EPI_ISL_445214; S4: EPI_ISL_445215; S5: EPI_ISL_445216; S6: EPI_ISL_445217; S7: EPI_ISL_445244; S8: EPI_ISL_450339; S9: EPI_ISL_450340; S10: EPI_ISL_450341; S11: EPI_ISL_450342; S12: EPI_ISL_450343; S13: EPI_ISL_450344; S14: EPI_ISL_450345).

Strain | Mutation | Protein | Amino Acid Changes | Mutation Types
S11, S14 | 283:C>T | ORF1a Polyprotein | No change | Synonymous
S9, S10 | 602:C>T | ORF1a Polyprotein | No change | Synonymous
S1, S2, S3, S4, S6 | 1164:A>T | ORF1a Polyprotein | I300F | Missense
S1, S2, S3, S4, S5, S6, S7, S12, S13 | 3038:C>T | ORF1a Polyprotein | No change | Synonymous
S5 | 3689:C>T | ORF1a Polyprotein | No change | Synonymous
S2, S3, S4, S6 | 4445:G>T | ORF1a Polyprotein | No change | Synonymous
S8 | 6730:A>G | ORF1a Polyprotein | N2155S | Missense
S2, S3, S4, S6 | 8372:G>T | ORF1a Polyprotein | Q2702H | Missense
S8, S9, S10, S11, S14 | 8783:C>T | ORF1a Polyprotein | No change | Synonymous
S8, S9, S10, S11 | 10330:A>G | ORF1a Polyprotein | D3355G | Missense
S14 | 10871:G>T | ORF1a Polyprotein | K3353R | Missense
S2 | 10980:G>A | ORF1a Polyprotein | V3572M | Missense
S11 | 12120:C>T | ORF1a Polyprotein | P3952S | Missense
S8 | 12485:C>T | ORF1a Polyprotein | No change | Synonymous
S1, S2, S3, S4, S5, S6, S7, S12, S13 | 14409:C>T | ORF1ab Polyprotein | P214L | Missense
S5, S8, S9, S10, S11, S14 | 15325:C>T | ORF1ab Polyprotein | No change | Synonymous
S8 | 15739:C>T | ORF1ab Polyprotein | No change | Synonymous
S4 | 15896:C>T | ORF1ab Polyprotein | No change | Synonymous
S1 | 17020:G>T | ORF1ab Polyprotein | E1084D | Missense
S12, S13 | 18878:C>T | ORF1ab Polyprotein | No change | Synonymous
S11 | 19405:G>A | ORF1ab Polyprotein | V1883T | Missense
S12, S13 | 22445:C>T | Surface Glycoprotein | No change | Synonymous
S14 | 23321:C>T | Surface Glycoprotein | No change | Synonymous
S8, S9, S10, S11, S14 | 22469:G>T | Surface Glycoprotein | No change | Synonymous
S1, S2, S3, S4, S5, S6, S7, S12, S13 | 23404:A>G | Surface Glycoprotein | D623G | Missense
S3 | 24488:T>C | Surface Glycoprotein | F1118L | Missense
S12, S13 | 25495:G>T | ORF3a protein | No change | Synonymous
S14 | 25506:A>T | ORF3a protein | Q38L | Missense
S12 | 25512:C>T | ORF3a protein | S40L | Missense
S12, S13 | 25564:G>T | ORF3a protein | Q57H | Missense
S2, S4, S6 | 25907:G>T | ORF3a protein | G172C | Missense
S12, S13 | 26736:C>T | Membrane Glycoprotein | No change | Synonymous
S12 | 27282:G>T | ORF6 protein | W27L | Missense
S2 | 27432–27651:DEL | ORF7a protein | Whole protein deletion | Deletion
S6 | 27486–27613:DEL | ORF7a protein | Whole protein deletion | Deletion
S12, S13 | 27911–28254:DEL | ORF8 | Whole protein deletion | Deletion
S14 | 28098:C>T | ORF8 | A65V | Missense
S8, S9, S10, S11, S14 | 28145:T>C | ORF8 | L84S | Missense
S8, S9, S10, S11, S14 | 28879:G>A | Nucleocapsid phosphoprotein | S202N | Missense
S1, S2, S3, S4, S6 | 28882:G>A | Nucleocapsid phosphoprotein | R203K | Missense
S1, S2, S3, S4, S6 | 28883:G>A | Nucleocapsid phosphoprotein | R203K | Missense
S1, S2, S3, S4, S6 | 28884:G>C | Nucleocapsid phosphoprotein | G204R | Missense
S9, S10 | 29293:G>T | Nucleocapsid phosphoprotein | K373N | Missense
S2, S3, S4, S6 | 29404:A>G | Nucleocapsid phosphoprotein | D377G | Missense
S8, S9, S10, S11, S14 | 29643:G>A | ORF10 | No change | Synonymous

Mutational effects

Analysis of the 24 missense mutations predicted that 18 of them decrease structural stability. Mutations located in the ORF1a polyprotein and the surface glycoprotein were predicted to decrease the structural stability of both proteins (Table 3). Additionally, three mutations, occurring in the surface glycoprotein, ORF3a and ORF6, were predicted to alter molecular consequences, including loss of sulfation in the surface glycoprotein, loss of proteolytic cleavage in ORF3a and loss of an allosteric site in ORF6 (Table 4 and Supplementary Table 1).
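
The tally reported above can be cross-checked with a short Python sketch that counts the predicted effects listed in Table 3. The rows are transcribed from that table; the variable names are hypothetical, and the snippet only summarizes the published predictions rather than rerunning I-Mutant.

```python
from collections import Counter

# Rows transcribed from Table 3:
# (protein, amino acid change, predicted effect, DDG in kcal/mol)
TABLE3 = [
    ("ORF1a Polyprotein", "I300F", "Decrease", -1.79),
    ("ORF1a Polyprotein", "N2155S", "Decrease", -0.60),
    ("ORF1a Polyprotein", "Q2702H", "Decrease", -0.68),
    ("ORF1a Polyprotein", "D3355G", "Decrease", -0.95),
    ("ORF1a Polyprotein", "K3353R", "Increase", -0.13),
    ("ORF1a Polyprotein", "V3572M", "Decrease", -0.88),
    ("ORF1a Polyprotein", "P3952S", "Decrease", -1.21),
    ("ORF1b Polyprotein", "P214L", "Decrease", -0.83),
    ("ORF1b Polyprotein", "E1084D", "Decrease", -0.75),
    ("ORF1b Polyprotein", "V1883T", "Decrease", -1.46),
    ("Surface Glycoprotein", "D623G", "Decrease", -0.93),
    ("Surface Glycoprotein", "F1118L", "Decrease", -0.81),
    ("ORF3a protein", "Q38L", "Increase", 0.12),
    ("ORF3a protein", "S40L", "Increase", 0.40),
    ("ORF3a protein", "Q57H", "Decrease", -0.90),
    ("ORF3a protein", "G172C", "Decrease", -0.83),
    ("ORF6 protein", "W27L", "Decrease", -0.96),
    ("ORF8", "A65V", "Increase", 0.02),
    ("ORF8", "L84S", "Decrease", -2.29),
    ("Nucleocapsid phosphoprotein", "S202N", "Increase", -0.78),
    ("Nucleocapsid phosphoprotein", "R203K", "Decrease", -0.93),
    ("Nucleocapsid phosphoprotein", "G204R", "Decrease", -0.52),
    ("Nucleocapsid phosphoprotein", "K373N", "Increase", -0.10),
    ("Nucleocapsid phosphoprotein", "D377G", "Decrease", -0.44),
]

# Count how many substitutions fall into each predicted effect class.
effect_counts = Counter(effect for _, _, effect, _ in TABLE3)

# Substitution with the most negative DDG value in the table.
most_destabilizing = min(TABLE3, key=lambda row: row[3])

print(effect_counts)
print(most_destabilizing)
```

Counting by the predicted effect label rather than by the sign of the DDG value mirrors how the table reports the classification.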
Table 3

Prediction of the mutational effects on the structural stability.

Protein | Amino Acid Changes | SVM2 Prediction Effect | DDG Value (kcal/mol)
ORF1a Polyprotein | I300F | Decrease | −1.79
ORF1a Polyprotein | N2155S | Decrease | −0.60
ORF1a Polyprotein | Q2702H | Decrease | −0.68
ORF1a Polyprotein | D3355G | Decrease | −0.95
ORF1a Polyprotein | K3353R | Increase | −0.13
ORF1a Polyprotein | V3572M | Decrease | −0.88
ORF1a Polyprotein | P3952S | Decrease | −1.21
ORF1b Polyprotein | P214L | Decrease | −0.83
ORF1b Polyprotein | E1084D | Decrease | −0.75
ORF1b Polyprotein | V1883T | Decrease | −1.46
Surface Glycoprotein | D623G | Decrease | −0.93
Surface Glycoprotein | F1118L | Decrease | −0.81
ORF3a protein | Q38L | Increase | 0.12
ORF3a protein | S40L | Increase | 0.40
ORF3a protein | Q57H | Decrease | −0.90
ORF3a protein | G172C | Decrease | −0.83
ORF6 protein | W27L | Decrease | −0.96
ORF8 | A65V | Increase | 0.02
ORF8 | L84S | Decrease | −2.29
Nucleocapsid phosphoprotein | S202N | Increase | −0.78
Nucleocapsid phosphoprotein | R203K | Decrease | −0.93
Nucleocapsid phosphoprotein | G204R | Decrease | −0.52
Nucleocapsid phosphoprotein | K373N | Increase | −0.10
Nucleocapsid phosphoprotein | D377G | Decrease | −0.44
Table 4

Prediction of the mutational effects on the molecular consequences.

Protein Name | Mutation | Effects
Surface Glycoprotein | F1118L | Altered Ordered interface; Altered Disordered interface; Altered DNA binding; Loss of Sulfation at Y1119; Altered Metal binding
ORF3a | G172C | Loss of O-linked glycosylation at S171; Gain of Disulfide linkage at G172; Loss of Intrinsic disorder; Altered Transmembrane protein; Altered Ordered interface; Gain of Loop; Loss of Proteolytic cleavage at D173
ORF6 | W27L | Altered Ordered interface; Altered Disordered interface; Loss of Strand; Gain of Helix; Loss of Allosteric site at F22; Gain of Sulfation at Y31; Altered DNA binding; Altered Transmembrane protein

Prediction and validation of the homology models

In total, three models were generated using the template PDB ID: 6VSB: one for the spike protein of the reference strain and two for different mutant isolates from Bangladesh (Fig. 3). Two types of mutations were found in the spike proteins of the Bangladeshi isolates, and most of the isolates carried the D623G substitution. Only one strain, EPI_ISL_445214, was found to have two substitutions: D623G and F1118L. The validation scores of the three models were mostly similar to those of the template, which supports the reliability of these models (Table 5).
Fig. 3

Homology models of the spike proteins: (A) wild type; (B) model with one mutation, D623G; (C) model with two mutations, D623G and F1118L; (D) superimposition of all models. In B and C, the red dot marks the mutation site. In D, purple represents the wild-type model, cyan the model with one mutation, and green the model with two mutations.

Table 5

Model Validation assessment score.

Structures | Rampage Score: Favoured Region | Rampage Score: Allowed Region | ERRAT Score
Template | 95.8 % | 4.1 % | 76 %
Wild type | 92.9 % | 5.7 % | 83 %
Mutant Model 1 | 92.6 % | 5.3 % | 84.69 %
Mutant Model 2 | 92.8 % | 5.3 % | 83.78 %

Analysis of the interaction between spike proteins and human ACE2 receptor

The HDOCK server was used to predict the interaction between the above-mentioned 3D models (the reference spike protein and the two mutant models) and the human ACE2 receptor. Interestingly, the docking score against the human ACE2 receptor was identical for the three models, at −244.42 (Table 6), suggesting that the mutations in the spike protein do not hamper binding to the ACE2 receptor. For all three spike protein models, a single domain of the spike protein, spanning amino acids 345 to 527, rather than the whole protein was involved in the interactions. This domain was conserved in all isolates, resulting in similar interactions with ACE2 (Fig. 4).
Table 6

Molecular docking results of human ACE2 receptor against wild-type and mutant spike protein of SARS-CoV-2.

Models | Variations | HDOCK Score
Model 1 | Wild type | −244.42
Model 2 | D623G | −244.42
Model 3 | D623G, F1118L | −244.42
Fig. 4

Interaction of Spike protein with ACE2: (A) cartoon model and (B) surface model. Here, green represents the receptor binding domain (RBD) of spike protein, and cyan represents human ACE2.


Discussion

COVID-19 has become a global challenge for the scientific community, affecting millions of people and taking thousands of lives every day. Scientists worldwide are working hard to combat SARS-CoV-2, but no significant outcome has yet been obtained (Lake, 2020; Yuen et al., 2020). Along with other studies, genetic studies can give significant clues to understanding the pathogenesis of COVID-19. Together with the critical therapeutic targets, genomic sequence data may provide insights into the pattern of global spread, the diversity during the epidemic, and the dynamics of evolution, which are crucial to unravelling the molecular mechanism of COVID-19 (Khailany et al., 2020). This study gives insights into the transmission of SARS-CoV-2 and the genetic diversity of the isolates, and predicts the impacts of mutations in Bangladesh. It has been reported that during the COVID-19 outbreak about 600,000 people entered Bangladesh from other countries, including Spain and Switzerland (wsws, 2020). The phylogenetic study revealed that the Bangladeshi isolates found in Dhaka descended from European isolates, and most of the isolates from Chittagong descended from Indian isolates. India neighbours Bangladesh, and many people cross the Bangladesh-India border every day for business, education and medical treatment. The chance that India was the origin of the virus that caused the COVID-19 pandemic in Bangladesh is therefore very high. The Middle East could also be a potential source of the virus, as its isolates were very close to those collected from Chittagong. However, some isolates from Chittagong were close to the isolates from Dhaka. Dhaka is the capital city of Bangladesh and the sixth most densely populated city in the world. The virus may have spread to other regions of the country from this city, as it is the central hub of Bangladesh for finance, politics, entertainment, and education. 
It is not surprising that the SARS-CoV-2 isolates collected from Chittagong are close to the strains from the Middle East: most of the Bangladeshi migrants living in the Middle East are from Chittagong, and during the COVID-19 outbreak thousands of them returned to their home city (Dastider, 2018; Ullah, 2020). Moreover, the phylogenetic distance from the initially collected isolates increased over time, which indicates the extensive mutation that the virus underwent during human-to-human transmission in Bangladesh. Mutation of the viral genome is a ubiquitous mechanism by which viruses escape host defences. However, the mutation rate of SARS-CoV-2 is much lower than that of other RNA viruses, including seasonal flu viruses (Oberemok et al., 2020). This study found some variations in the SARS-CoV-2 isolates from Bangladesh that may affect the epidemiology and pathogenicity of the virus. A total of 42 mutations, about half of them synonymous, were identified along with large deletions in the coding regions. Some isolates were even found not to encode one or more accessory proteins, such as ORF7a, ORF8, and ORF10, because of a large deletion in the genome. An 80-nucleotide deletion in ORF7a was also reported by a study conducted in Arizona (Mercatelli and Giorgi, 2020). The absence of these accessory proteins may have adverse effects on viral replication or pathogenesis and on the expression of the structural protein E (Keng et al., 2006). Moreover, ORF8 is involved in crucial pathways for the adaptation of the coronavirus to human-to-human transmission. ORF7a, in turn, contributes to viral pathogenesis in the host by inhibiting Bone Marrow Stromal Antigen 2 (BST-2), which restricts the release of coronaviruses from affected cells. Loss of ORF7a causes a much more significant restriction of the spread of the virus within the host (Taylor et al., 2015; Decaro and Lorusso, 2020). 
Loss of these accessory proteins may make the virus less pathogenic, which could contribute to the lower infection rate and mortality compared with other countries (Keng et al., 2006). Additionally, many variations in structural and non-structural proteins that cause substitutions of one or more amino acids were found in the isolates from Bangladesh compared with the reference. Most of the mutations were predicted to affect the structural stability of the proteins rather than alter their molecular functions. Among the structural proteins, most variations were found in the surface glycoprotein (spike) and the nucleocapsid phosphoprotein. Spike proteins play a crucial role in viral entry into the cell by interacting with the human ACE2 receptor, while the nucleocapsid phosphoprotein is essential for packaging the viral genome into a helical ribonucleocapsid (RNP) and fundamental for viral self-assembly (Chang et al., 2014; Hoffmann et al., 2020). These functions may not be much affected by those mutations, as MutPred2 predicted that these mutations did not alter any molecular consequences of the proteins, which is consistent with the study conducted by Wrapp and colleagues (Wrapp et al., 2020). Interestingly, the D623G mutation was found in the spike protein of most isolates; it corresponds to the D614G mutation of the SARS-CoV-2 spike protein reported in many studies. The two differ only in amino acid numbering, owing to the use of a predictive gene model in this study. This mutation in the spike protein has now become the dominant genotype around the world and could boost the transmission of the virus (Grubaugh et al., 2020). However, several recent studies demonstrated that this mutation made no difference to hospitalization outcomes (Korber et al., 2020; Wagner et al., 2020; Lorenzo-Redondo et al., 2020). 
Moreover, our molecular docking analysis revealed that these mutations in the spike protein do not affect the interaction with the ACE2 receptor, which suggests that mutations in the spike protein may serve the better adaptation of SARS-CoV-2. This observation is also supported by two independent studies (Grubaugh et al., 2020; Isabel et al., 2020). Additionally, this study identified a domain in the spike protein (amino acids 345 to 527), rather than the whole protein, as being involved in the interaction with the human ACE2 receptor. This domain was conserved in all isolates reported in Bangladesh, so the mutations had no effect on the interaction. A recent study identified the receptor-binding domain of the spike protein, spanning amino acids 319 to 541, as interacting with the ACE2 receptor, which is similar to our findings (Lan et al., 2020).

Conclusion

SARS-CoV-2 isolates from Dhaka and Chittagong were close to European and Middle Eastern lineages. Large deletions in isolates EPI_ISL_445213, EPI_ISL_445217, EPI_ISL_450343, and EPI_ISL_450344 may explain the lower pathogenicity of COVID-19 in Bangladesh compared with other countries. Mutations in the spike protein of SARS-CoV-2 may promote further adaptation of this fatal virus and could make therapeutics that target the spike protein less effective. Our study gives novel insights into SARS-CoV-2 epidemiology in Bangladesh.

Ethical approval

Not required.

Data availability

All data supporting the findings of this study are available within the article and its supplementary materials.

Funding

SUST Research Center funds for MJH. SA is supported by the (1) Alberta Innovates Graduate Student Scholarship (AIGSS), and the (2) Maternal and Child Health (MatCH) Scholarship programs.

CRediT authorship contribution statement

Md. Sorwer Alam Parvez: Conceptualization, Methodology, Software, Data curation, Formal analysis, Visualization, Validation, Writing - original draft. Mohammad Mahfujur Rahman: Formal analysis, Validation, Investigation. Md. Niaz Morshed: Formal analysis, Validation, Investigation. Dolilur Rahman: Formal analysis, Writing - original draft. Saeed Anwar: Data curation, Writing - review & editing. Mohammad Jakir Hosen: Supervision, Conceptualization, Writing - review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.