Anika Tahsin1, Rubaiat Ahmed1, Piyash Bhattacharjee1, Maisha Adiba1, Abdullah Al Saba1, Tahirah Yasmin1, Sajib Chakraborty2, A K M Mahbub Hasan1, A H M Nurun Nabi3. 1. Laboratory of Population Genetics, Department of Biochemistry and Molecular Biology, University of Dhaka, Bangladesh. 2. Systems Cell-Signalling Laboratory, Department of Biochemistry and Molecular Biology, University of Dhaka, Bangladesh. 3. Laboratory of Population Genetics, Department of Biochemistry and Molecular Biology, University of Dhaka, Bangladesh. Electronic address: nabi@du.ac.bd.
Abstract
Since the emergence of SARS-CoV-2 at Wuhan in the Hubei province of China in 2019, the virus has accumulated various mutations, giving rise to many variants. According to the combinations of mutations acquired, these variants are classified into lineages and greatly differ in infectivity and transmissibility. In 2021 alone, a variant of interest (VoI) Mu (B.1.621), as well as, variants of concern (VoC) Delta (B.1.617.2) and Omicron (BA.1, BA.2) and later in 2022, BA.4, BA.5, and BA.2.12.1 have emerged. Since then, the world has seen prominent surges in the rate of infection during short periods of time. However, not all populations have suffered equally, which suggests a possible role of host genetic factors. Here, we investigated the strength of binding of the spike glycoprotein receptor-binding domain (RBD) of the SARS-CoV-2 variants: Mu, Delta, Delta Plus (AY.1), Omicron sub-variants BA.1, BA.2, BA.4, BA.5, and BA.2.12.1 with the human angiotensin-converting enzyme 2 (hACE2) missense variants prevalent in major populations. In this purpose, molecular docking analysis, as well as, molecular dynamics simulation was performed of the above-mentioned SARS-CoV-2 RBD variants with the hACE2 containing the single amino acid substitutions prevalent in African (E37K), Latin American (F40L), non-Finnish European (D355 N), and South Asian (P84T) populations, in order to predict the effects of the lineage-defining mutations of the viral variants on receptor binding. The effects of these mutations on protein stability were also explored. The protein-protein docking and molecular dynamics simulation analyses have revealed variable strength of attachment and exhibited altered interactions in the case of different hACE2-RBD complexes. In vitro studies are warranted to confirm these findings which may enable early prediction regarding the risk of transmissibility of newly emerging variants across different populations in the future.
Since the emergence of SARS-CoV-2 at Wuhan in the Hubei province of China in 2019, the virus has accumulated various mutations, giving rise to many variants. According to the combinations of mutations acquired, these variants are classified into lineages and greatly differ in infectivity and transmissibility. In 2021 alone, a variant of interest (VoI) Mu (B.1.621), as well as, variants of concern (VoC) Delta (B.1.617.2) and Omicron (BA.1, BA.2) and later in 2022, BA.4, BA.5, and BA.2.12.1 have emerged. Since then, the world has seen prominent surges in the rate of infection during short periods of time. However, not all populations have suffered equally, which suggests a possible role of host genetic factors. Here, we investigated the strength of binding of the spike glycoprotein receptor-binding domain (RBD) of the SARS-CoV-2 variants: Mu, Delta, Delta Plus (AY.1), Omicron sub-variants BA.1, BA.2, BA.4, BA.5, and BA.2.12.1 with the human angiotensin-converting enzyme 2 (hACE2) missense variants prevalent in major populations. In this purpose, molecular docking analysis, as well as, molecular dynamics simulation was performed of the above-mentioned SARS-CoV-2 RBD variants with the hACE2 containing the single amino acid substitutions prevalent in African (E37K), Latin American (F40L), non-Finnish European (D355 N), and South Asian (P84T) populations, in order to predict the effects of the lineage-defining mutations of the viral variants on receptor binding. The effects of these mutations on protein stability were also explored. The protein-protein docking and molecular dynamics simulation analyses have revealed variable strength of attachment and exhibited altered interactions in the case of different hACE2-RBD complexes. In vitro studies are warranted to confirm these findings which may enable early prediction regarding the risk of transmissibility of newly emerging variants across different populations in the future.
The novel coronavirus, named Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), also referred to as nCoV-19, was first identified in the Wuhan, Hubei province of China in late December 2019 as the causative agent of the Coronavirus Disease 2019 (COVID-19) [1]. Since then, the virus has spread all over the world, affecting more than 440 million people and causing more than 5.97 million deaths worldwide, as of March 2022. These numbers only represent the clinically diagnosed cases. The real number of infections is much higher because the virus can cause asymptomatic infections [2]. SARS-CoV-2 belongs to the Betacoronavirus genus under the Coronaviridae family. The single-stranded positive-sense RNA genome of the SARS-CoV-2 is approximately 29.9 kb in length. It is replicated by an RNA-dependent RNA polymerase (RdRP), encoded in the viral genome [3].The virus spreads via respiratory droplets, targets the respiratory tract of humans, and infects the human angiotensin-converting enzyme 2 (hACE2) positive lung cells, which also express transmembrane protease, serine 2 (TMPRSS2) e.g: nasal and bronchial epithelial cells, also alveolar type II pneumocytes [4]. Upon entering the body it interacts with hACE2 of cells via its spike glycoprotein (S). The spike protein has two distinct subunits, S1 and S2, which are responsible for receptor binding and membrane fusion, respectively. Receptor-binding domain (RBD) of the spike protein interacts directly with the hACE2 and is a part of the S1 subunit, spanning residues 319–541. Cathepsin L and TMPRSS2 mediated priming of the spike protein also facilitates membrane fusion of the virus, through cleavage at the S1/S2 junction and S2’ site [5].Many generic therapeutic regimens implemented in the case of previous outbreaks of coronaviruses, i.e. SARS-CoV and MERS-CoV have been repurposed for SARS-CoV-2 infection. Moreover, with advanced computer-aided drug design, many specific therapeutic components in the forms of small molecules, monoclonal antibodies, siRNA and miRNA have been proposed since the emergence of the novel coronavirus [[6], [7], [8]]. In this short period of time, several vaccines have been produced with considerable efficacy and many are currently in clinical trial [[9], [10], [11], [12]]. In addition, multi-epitope vaccine constructs have been designed through in silico approaches as potential prophylactics [13].Like other RNA viruses, the SARS-CoV-2 has a high mutation rate, and this has caused the emergence of many variants of this virus since its identification [14]. The mutations that are necessarily present in all isolates of a particular variant of the virus are called lineage-defining mutations for that variant. An isolate is assigned to a variant clade based on the presence of certain lineage-defining mutations [15]. Based on the nature of the mutations, the variants may exhibit alteration to many different characteristics of the virus such as transmissibility, disease severity, immune escape, increase in virulence, change in chemical manifestations of disease, and diagnostic or therapeutic escape. These mutations may also contribute to community transmission and bring about detrimental changes in COVID-19 epidemiology.According to the CoVariant portal of GISAID (https://covariants.org/), 27 variants have been identified so far. The World Health Organization (WHO) has declared 5 of them (Alpha, Beta, Gamma, Delta, Omicron) as Variant of Concern (VoC) and 2 of them (Mu, Lambda) as Variant of Interest (VoI). The VoCs are named according to their time of emergence with respect to the identification of the Wuhan strain, which is considered the reference for SARS-CoV-2. The Alpha variant emerged in the UK (reported in December 2020), Beta in South Africa (December 2020), Gamma in Brazil (January 2021), Delta and Delta plus in India (May 2021), and Omicron in South Africa (November 2021). The VoI Lambda emerged in Peru (December 2020) and Mu in Columbia (January 2021) [16]. The mutations cause changes in the respective protein's structures [17]. Mutations occurring in the spike protein have conferred the virus higher transmissibility and allow them to escape from the immune response generated by the monoclonal neutralizing antibody or vaccination [18]. There are nine mutations in the spike protein of Delta and Mu variant, 36 in Omicron BA.1, 31 in Omicron BA.2, 34 mutations shared by Omicron BA.4 and BA.5 and 33 in Omicron BA.2.12.1, compared to the Wuhan strain. Among these two are in the RBD for Delta variant, three for Mu, 15 for Omicron BA.1, 16 for Omicron BA.2, 17 for Omicron BA.4, BA.5, and 17 for Omicron BA.2.12.1.The host receptor hACE2 harbours many polymorphisms, of which, some are missense variants that result in changes in the amino acid residue of the protein, and subsequent conformational change. These may alter the structure of the hACE2 which may affect the attachment with the spike protein of SARS-CoV-2. The polymorphisms are inherited within and can be attributed to a population in a region, where one may prevail in a given population, and others may prevail in other populations inhabiting different regions. The population frequency of a particular polymorphism might appear as a small number in percent value, but technically it includes a very large number of people. For example, the P84T missense variant in the hACE2 occurs at a frequency of 3.58% in the South Asian population, according to gnomAD exomes r2.1.1 dataset, and is estimated to be harboured by almost 56 million people in India and Bangladesh alone [19]. Therefore, it is important to take into consideration the effect of hACE2 polymorphism while analysing the virulence of SARS-CoV-2. The mutations occurring within the viral genome as well as natural variants within its receptor, hACE2 may contribute to the binding affinity followed by the entry and transmission of the virus in a population or region-specific manner [20]. Thus, the unique combinations of population-specific hACE2 polymorphisms and SARS-CoV-2 spike protein mutations may elucidate the region-dependent emergence and preference of each SARS-CoV-2 variant for one or more particular populations.Molecular docking analysis provides a prediction of the binding affinity and stability of a complex. Therefore, it is possible to detect alterations in binding affinity between the reference and the variant complexes, resulting from the minute deviations of the interacting protein structures. An increase or decrease in binding affinity depends on the molecular interactions between the interacting molecules. Amino acid substitutions, depending on their physicochemical properties, often disrupt existing interactions and establish new interactions. Molecular Mechanics/Generalized Born Surface Area (MM/GBSA) is end-point free energy calculation methodology which is vigorously used in the prediction of binding free energy (BFE) and identification of appropriate protein-protein complexes. The technique in conjunction with per-residue energy decomposition of the binding free energy of a protein-protein complex is particularly useful in highlighting the key residues present in the binding interface. Furthermore, Molecular dynamics (MD) simulation provides insights into the changes in a protein's structure and behaviour in physiological circumstances with respect to time. Initially, proteins were conceptualized to be rigid entities as the X-ray diffraction method produces structures that are fixed in a crystal lattice [21]. However, just like all other macromolecules in physiological systems, proteins exist in a dynamic equilibrium. Advances in computational technologies have enabled us to simulate a physiological system in a computer system and observe the dynamics of any biological macromolecule in that system [22]. Through MD simulation, the stability and dynamics of protein-protein interactions can be observed, thus providing insights into the interactions between proteins, and the behaviour of each of the protein's constituent amino acids during the course of simulation runtime.Several in silico and experimental studies have shown promising associations of hACE2 polymorphisms with population-wise disease susceptibility and prevalence [[23], [24], [25], [26]]. Additionally, in silico and experimental studies have also been conducted to evaluate the binding affinity of the novel SARS-CoV-2 variants to hACE2 [[27], [28], [29]]. These approaches generate observations regarding variations in receptor-binding interactions based on either receptor hACE2 or the viral spike protein RBD. We aim to bridge the gap in the comparative assessment of binding interactions between polymorphic hACE2 and emerging SARS-CoV-2 variants.This study, firstly, aims to evaluate the structural effects of the mutations occurring in the spike glycoproteins of the recently emerged SARS-CoV-2 variants. Secondly, considering the structural deviations of the emerging SARS-CoV-2 variants induced by these mutations, we sought to investigate the role of polymorphisms of hACE2 on the variable susceptibility of these variants among different populations.
Methods and materials
Retrieval of amino acid sequences and sequence variants of SARS-CoV-2 spike glycoprotein and human ACE2
The complete reference sequence of SARS-CoV-2 surface glycoprotein (accession: YP_009724390) was retrieved from the NCBI virus database (https://www.ncbi.nlm.nih.gov/labs/virus/vssi/#/). Lineage defining mutations of the protein in the virus variants – Delta (21A), Delta Plus (21I), Mu (21H), and Omicron sub-variants BA.1 (21K), BA.2 (21L), BA.4 (22A), BA.2 (22B), and BA.2.12.1 (22C) were listed from the CoVariants portal on GISAID database (https://covariants.org/) (Supplementary Table 1).The reference sequence of human angiotensin-converting enzyme 2 (hACE2) (accession: Q9BYF1) was obtained from the UniProt database (https://www.uniprot.org/uniprot/Q9BYF1). The interacting domain of hACE2 with the receptor-binding domain (RBD) of the viral spike glycoprotein encompasses the amino acid residues: 30–41, 82–84, and 353–357 of the protein, as documented in UniProt. Missense variants of hACE2 protein were then filtered using these amino acid coordinates from the variant table of hACE2 gene in the Ensembl database (available at https://asia.ensembl.org/Homo_sapiens/Gene/Variation_Gene/Table?db=core;g=ENSG00000130234;r=X:15494566-15607236). Population frequencies of each variant were obtained from the gnomAD v2.1.1 dataset (https://gnomad.broadinstitute.org/). The most highly prevalent missense variant within a population was chosen as the representative missense variant of that population. For more than one variant classified into the same population, co-occurrence was tested for the variant pair using the Variant Co-Occurrence tool (https://gnomad.broadinstitute.org/variant-cooccurrence). Between the variants occurring in different haplotypes, the one with the higher population frequency was selected, as the representative variant of the respective population. Four SNPs were thus listed for four populations: African (AFR), Latin American (AMR), non-Finnish European (NFE), and South Asian (SAS); as reported in the gnomAD v2.1.1 dataset.
Mutational analysis of spike glycoprotein variants
Structure-based analysis of the lineage-defining mutations of the spike protein of the SARS-CoV-2 variants: Delta, Delta Plus, Mu, Omicron subvariants BA.1, BA.2, BA.4, BA.5 and BA.2.12.1 was performed using MAESTROweb (https://pbwww.services.came.sbg.ac.at/maestro/web/maestro/) [30], INPS-3D (https://inpsmd.biocomp.unibo.it/inpsSuite/default/index3D) [31], CUPSAT (http://cupsat.tu-bs.de/) [32], PremPS (https://lilab.jysw.suda.edu.cn/research/PremPS/) [33], and SDM2 (http://marid.bioc.cam.ac.uk/sdm2/) [34] web servers. These tools predict ΔΔG, the difference in free energy of unfolding between wild-type and mutant proteins. Based on their algorithm, the tools interpret results differently. For MAESTROweb and PremPS, if the ΔΔG value is less than zero (negative), the mutation is inferred as destabilising [35,36]. For SDM2, CUPSAT, and INPS-3D, ΔΔG values greater than zero (positive) indicate a stabilising mutation and vice versa [35]. For a given mutation, the prediction provided by the majority of the tools (3 out 5) is considered and has been used for further analysis.
Construction and validation of mutant models of the viral spike glycoprotein receptor-binding domain (RBD) and human ACE2 protein using homology modelling
The three-dimensional structure of SARS-CoV-2 spike glycoprotein RBD bound with hACE2 (PDB ID: 6M0J) was retrieved from the Protein Data Bank (https://www.rcsb.org/structure/6m0j). The lineage-defining mutations were incorporated into the amino acid sequences of the respective spike glycoprotein variants (Delta, Mu, and Omicron). Chain E (RBD) of the hACE2-RBD complex (PDB ID: 6M0J) served as the template for the construction of the RBD models of the SARS-CoV-2 variants on the Swiss-Model server (https://swissmodel.expasy.org/) [37]. Chain A (hACE2) of the hACE2-RBD complex (PDB ID: 6M0J) was entered along with the single amino acid substitutions specified for the populations into the Missense 3D server (http://missense3d.bc.ic.ac.uk/missense3d/) [38] to generate the population-specific mutant models of hACE2 protein.The modelled structures of both hACE2 and RBD variants were validated using Swiss-Model Structure Assessment (https://swissmodel.expasy.org/assess), ERRAT [39] & PROCHECK [40] (https://saves.mbi.ucla.edu/) and ProSA-web (https://prosa.services.came.sbg.ac.at/prosa.php) [41] servers.
Molecular docking assay and analysis of protein-protein interactions
Template-based protein-protein docking between the hACE2 and RBD was performed using the HDOCK server (http://hdock.phys.hust.edu.cn/) [42]. Protein-protein docking utilises the measure of intermolecular interactions to predict the strength of a molecular complex. The protein chains were prepared for docking experiments by removing water, heteroatoms, and ligand groups. The interacting residues for hACE2 were set as reported in the UniProt database as well as those detected by the PDBSum server (http://www.ebi.ac.uk/thornton-srv/databases/pdbsum) [43], using the hACE2-RBD complex (PDB ID: 6M0J) as reference. Using the same complex as the reference, the interacting residues of the RBD were set in a similar manner. The interacting residues of both hACE2 and RBD are presented in Table 1
. The docking energy scores were used for further analysis.
Table 1
List of interacting residues for hACE2 and SARS-CoV-2 Spike glycoprotein receptor-binding domain (RBD).
List of interacting residues for hACE2 and SARS-CoV-2 Spike glycoprotein receptor-binding domain (RBD).Protein-protein interactions across hACE2 and RBD interfaces were visualised by BIOVIA Discovery Studio Visualizer v21.1.0.20298. Details regarding the two-dimensional interactions between the RBD variants and hACE2 (wildtype and selected missense variants) molecules were generated using the PDBsum server.
Prediction of binding energy and decomposition of free energy contributions
The binding energy of the reference and variant hACE2 – RBD complexes were calculated using an end-point free energy calculation methodology, Molecular Mechanics/Generalized Born Surface Area (MM/GBSA) on the HawkDock server (http://cadd.zju.edu.cn/hawkdock/) [44]. The server analyses protein-protein interactions by integrating the ATTRACT docking algorithm, the HawkRank scoring function and the MM/GBSA free energy decomposition analysis. MM/GBSA is calculated based on the ff02 force field, the implicit solvent model and the GBOBC1 model (interior dielectric constant = 1). The whole system was minimized for 5000 steps with a cutoff distance of 12 Å for van der Waals interactions (2000 cycles of steepest descent and 3000 cycles of conjugate gradient minimizations).In MM/GBSA analysis, the binding free energy (ΔGbind) of the system can be expressed as a change in free energy, which can be calculated by Equation (1)Here, ΔEinternal denotes change in internal MM energy, ΔEelectrostatic indicates change in electrostatic energy, ΔEvdw depicts change in van der Waals energy, ΔGGB denotes the electrostatic solvation energy, ΔGSA depicts the deviation of solvation free energy combined with non-electrostatic solvation, and TΔS is the total system binding entropy.
Molecular dynamics simulation
To evaluate the changes in stability and binding patterns of the docked protein-protein complexes with time, we performed a 125 ns molecular dynamics (MD) simulation of the Omicron BA.1 and BA.2 complexes, as well as, the highest and lowest scoring complexes of Delta and Mu variants. The simulation was run using the Desmond v3.6 program in the Schrödinger suite [45]. A single-point charge (SPC) water model was used to solvate the system and an orthorhombic periodic boundary box with the dimension of 10 Å * 10 Å * 10 Å was employed to maintain a stable volume. Na+ and Cl− ions were used at a concentration of 0.15 M to electronically neutralize the system. The OPLS3e force field was employed to optimize the system. Ensembles of the NPT (constant number of particles, pressure, and temperature) were maintained at 300.0 K temperature and 1.01325 bar pressure. After an initial relaxation, the system was recorded at 125 ps intervals with an energy of 1.2. The results of the MD simulation are presented as the evaluation of the Root Mean Square Deviation (RMSD) and Root Mean Square Fluctuation (RMSF) data.
RMSD analysis
RMSD is a measurement of the average displacement of selected atoms in a particular timeframe with respect to a reference frame [46]. During the initialization of the simulation, the protein structures were aligned with the backbone frame, and the RMSD was calculated using Equation (2).Here, N is the number of selected atoms; tref is the reference time (usually the first frame is considered the reference frame, and time is taken as t = 0); Frame x is the location of the chosen atoms when superimposed with the reference frame. Every simulation frame underwent this calculation to provide the RMSD data.
RMSF analysis
RMSF is useful to observe the behavior of each amino acid residue during the simulation period. It is an indicator of local conformational change around the amino acid considered. The calculation is done by using Equation (3).where T = trajectory time interval used to calculate the RMSF values, tref = the reference time interval, ri = the location of residue i, r’I = the location of the atoms in residue i after their overlap upon that reference, and the angle brackets (<>) indicates that the square distance is averaged across the residue's atoms.
Results
Retrieved high-frequency population-specific hACE2 missense variants within the hACE2-RBD binding interface
Six unique SNPs, specifying the hACE2-RBD binding interface residues, were initially retrieved. The African and the Non-Finnish European population each had two missense variants within the specified regions. When tested for co-occurrence, both of these variant pairs appeared to occur on different haplotypes. For the rest of the populations, each had only one missense variant. The four hACE2 SNPs thus obtained according to the highest population frequency are provided in Table 2
.
Table 2
List of hACE2 SNPs representing the four populations according to their frequency along with their SNP IDs and resulting amino acid substitutions.
Population
AFR
AMR
NFE
SAS
dbSNP ID
rs146676783
rs924799658
rs961360700
rs759134032
Amino acid substitution
E37K
F40L
D355 N
P84T
Population frequency(gnomAD)
0.0512
0.0001
0.0175
0.0358
List of hACE2 SNPs representing the four populations according to their frequency along with their SNP IDs and resulting amino acid substitutions.
Validation of the predicted structures of RBD variants
The predicted RBD structures of the SARS-CoV-2 variants: Delta, Mu, Omicron BA.1, BA.2, BA.4 and BA.2.12.1 yielded favourable validation scores on Swiss-Model Structure Assessment, ERRAT, and ProSA-web servers (Table 3
).
Table 3
Validity scores of the predicted RBD structures of the SARS-CoV-2 variants.
Variant
Substitutions
RC favoured
QMEAN
ERRAT
ProSA
Delta
2
95.83%
−1.79
91.573
−6.04
Mu
3
96.35%
−1.61
90.3409
−5.83
Omicron BA.1
15
94.27%
−2.01
91.4773
−5.87
Omicron BA.2
16
95.31%
−2.11
91.4286
−5.68
Omicron BA.4
17
94.79%
−2.06
91.4286
−5.67
Omicron BA.2.12.1
17
95.31%
−2.08
91.4286
−5.7
Validity scores of the predicted RBD structures of the SARS-CoV-2 variants.All the modelled RBD structures produced QMEAN “Z-score” for absolute quality estimates of protein models well above −4.0 on Swiss-Model structure assessment indicating good quality models [47]. All RBD models scored an overall quality factor for non-bonded atomic interactions within a protein above 90 on ERRAT which satisfies the criteria for good quality model. The Ramachandran plots of the RBD models obtained from PROCHECK are presented in Supplementary Fig. 1. 90% or more residues of all the modelled RBD structures were in energetically most favoured region, hence, qualify as acceptable for further analyses [48]. The respective Z-scores of all RBD models, produced by ProSA-web server, were consistent with those determined experimentally as depicted in the Z-score plots (Supplementary Fig. 2).
Effects of spike protein mutations on protein stability and hACE2 binding affinity
Lineage-defining amino acid substitutions of the SARS-CoV-2 variants are presented in Fig. 1
.
Fig. 1
Lineage-defining amino acid substitutions of the SARS-CoV-2 variants.
Lineage-defining amino acid substitutions of the SARS-CoV-2 variants.Predictions on the stability of 48 amino acid substitutions found in the spike protein of SARS-CoV-2 variants we investigated, provided by different tools, are presented in Fig. 2
.
Fig. 2
Predictions on effects of mutations appearing in the spike protein of Delta, Delta plus, Mu, Omicron BA.1, BA.2 variants, BA.4 and BA.2.12.1 variants, as determined by five prediction tools. The columns represent single amino acid substitutions present in at least one SARS-CoV-2 variant mentioned above. The rows indicate the tools used for the prediction of the mutational effects. An amino acid substitution may have either of the effects: stabilising, destabilising or undetermined.
Predictions on effects of mutations appearing in the spike protein of Delta, Delta plus, Mu, Omicron BA.1, BA.2 variants, BA.4 and BA.2.12.1 variants, as determined by five prediction tools. The columns represent single amino acid substitutions present in at least one SARS-CoV-2 variant mentioned above. The rows indicate the tools used for the prediction of the mutational effects. An amino acid substitution may have either of the effects: stabilising, destabilising or undetermined.Six of these mutations (R346K, T376A, L452R, L452Q, T547K, N969K) were unanimously predicted to be destabilising, whereas, two of these (S371L, S371F) were unanimously – stabilising. 11 mutations (G142D, Y144S, Y145 N, R158G, S373P, G446S, Q493R, G496S, Y505H, N856K, Q954H) were passed as destabilising and nine mutations (T95I, A222V, S373F, D405 N, E484A, N501Y, H655Y, S704L, D796Y) were passed as stabilising by 80% of the prediction tools. 11 mutations (A27S, A67V, Y145D, L212I, V213G, R408S, K417 N, T478K, E484K, F486V, L981F) appeared to be destabilising, while, seven mutations (T19R, G339D, N440K, S477 N, D614G, N764K, D950 N) appeared as stabilising by more than half of the prediction tools employed. Two mutations (T19I and Q498R) were predicted to be destabilising by two of the five tools, undetermined by one tool, and stabilising by the rest of the tools, hence, undetermined.For the spike glycoprotein of the Delta variant, three mutations were destabilising out of the six single amino acid substitutions analysed. Two of these mutations (L452R and T478K) occupy the receptor-binding domain. In addition to these three destabilising mutations found in the Delta variant, a stabilising mutation (A222V) was identified outside the RBD of the Delta plus variant. In the case of the Mu variant, eight mutations were analysed and four of them were destabilising. The RBD of the Mu variant contained two destabilising mutations (R346K and E484K) and one stabilising mutation (N501Y).The Omicron sub-variants increased in both destabilising and stabilising mutations. For subvariant BA.1, 15 mutations were destabilising out of the 28 mutations analysed. These include the stabilising mutation N501Y, previously reported in Mu, and the destabilising mutation G496S. Omicron BA.2 had, however, 14 destabilising mutations out of 26 mutations analysed. This variant did not include destabilising mutation G496S. 13 out of 27 analysed mutations in Omicron BA.4, were shown to be destabilising, whereas, 12 were shown to be stabilising. For the most recent Omicron subvariant BA.2.12.1, 13 out of 27 mutations analysed were shown to be destabilising and 13 to be stabilising. Considering the emerging sequence of the variants, mutations are accumulating rapidly within the spike protein (Fig. 3
), particularly, among RBD residues.
Fig. 3
The pattern of stabilising and destabilising mutation accumulation for the emerging SARS-CoV-2 variants. The variants are presented sequentially, in order of their emergence in the virus's recent evolutionary timeline.
The pattern of stabilising and destabilising mutation accumulation for the emerging SARS-CoV-2 variants. The variants are presented sequentially, in order of their emergence in the virus's recent evolutionary timeline.
Docking energy score analysis of the hACE2-RBD complexes
The docking energy scores of thirty-five complexes of spike RBD variants with the reference and missense hACE2 variants prevalent in four major populations are depicted in Fig. 4
. The numerical values are provided in Supplementary Table 2.
Fig. 4
Docking energy scores of hACE2 reference and four singly mutated hACE2 variants (African, Latin American, Non-Finnish European, and South Asian) and the receptor-binding domains (RBD) of the spike glycoprotein from the Wuhan-Hu-1 (reference) and six emerging SARS-CoV-2 variants (Delta, Mu, Omicron BA.1, Omicron BA.2, Omicron BA.4, and Omicron BA.2.12.1).
Docking energy scores of hACE2 reference and four singly mutated hACE2 variants (African, Latin American, Non-Finnish European, and South Asian) and the receptor-binding domains (RBD) of the spike glycoprotein from the Wuhan-Hu-1 (reference) and six emerging SARS-CoV-2 variants (Delta, Mu, Omicron BA.1, Omicron BA.2, Omicron BA.4, and Omicron BA.2.12.1).Omicron BA.2 variant showed fairly the highest receptor-binding capacity (docking energy score = −367.74), surpassing the previous Omicron BA.1 variant (docking energy score = −366.84), in complex with the reference hACE2. The receptor-binding capacity of the Omicron BA.1 variant, however, remains high for the reference and all missense hACE2 variants we investigated. Especially, in the case of the NFE hACE2 containing the substitution D355 N, the Omicron BA.1 variant has the highest receptor-binding ability with the docking energy score of −381.24, compared to other missense hACE2 variants. However, the docking energy score was less negative for the Omicron BA.2 variant (−339.47) in complex with the NFE hACE2. For the AFR hACE2 containing the E37K substitution, the emerging variant spike RBDs exhibited a gradual increase in binding capacity, which slightly decreased in the case of the latest variant Omicron subvariant BA.2. Similar results were observed for the SAS hACE2 containing the P84T substitution, except for a more negative docking energy score in combination with the Delta RBD variant (−357.96). The receptor-binding ability of Omicron BA.1 and BA.2 was still very high with docking energy scores of −377.71 and −375.25, respectively. The reference and variant spike RBDs showed the most negative docking energy scores in the case of the AMR hACE2, containing the F40L substitution. This particular allele had a notably minor frequency (0.0001%). Nevertheless, the Omicron BA.2 manifested the highest receptor binding interaction with a docking energy score of −364.96 for AMR hACE2, compared with other populations. Among the most recently emerged Omicron sub-variants, Omicron BA.4 exhibits the least negative docking energy score −299.66 with the reference hACE2, compared with all variants analysed. However, the variant exhibits the most negative docking energy score (−338.16) with the NFE hACE2, compared with the other populations. Omicron BA.2.12.1, the closest relative to Omicron BA.2, produces a more negative docking energy score of −325.74. The variant shows the most negative docking energy score of −359.89 with SAS hACE2, much more towards the negative compared with Omicron BA.4 (docking energy score = −312.54). Interestingly, the recent Omicron sub-variant BA.4 exhibits a less negative docking energy score than the previous BA.1 and BA.2 overall.
Comparative analysis of the intermolecular interactions for different hACE2-RBD complexes
The summary of the interactions derived from the docked complexes of the hACE2 and RBD has been presented in Fig. 5
.
Fig. 5
Comparison of the type and number of bonds formed between thirty-five different hACE2-RBD variant complexes. Types of bonds include salt-bridge (blue), H-bond (salmon), and non-bonded (green) interactions. Points of interaction vary with the SARS-CoV-2 variants among the AFR, AMR, and NFE groups of hACE2, whereas, consistent among SAS hACE2.
Comparison of the type and number of bonds formed between thirty-five different hACE2-RBD variant complexes. Types of bonds include salt-bridge (blue), H-bond (salmon), and non-bonded (green) interactions. Points of interaction vary with the SARS-CoV-2 variants among the AFR, AMR, and NFE groups of hACE2, whereas, consistent among SAS hACE2.In the reference hACE2-RBD structural complex (6M0J), the following H-bonds were observed: Asp30-Lys417 [O⋯H–N], Glu35-Gln493 [O⋯H–N], Gln42-Gly446 [N–H⋯O], Lys353-Gly496 [N–H⋯O] and a salt bridge between Asp30 of hACE2 and Lys417 of RBD. The bond formation patterns of the thirty-five hACE2-RBD complexes are depicted in Fig. 6
and detailed in Supplementary Table 3.
Fig. 6
Comparison of the salt bridges and H-bonds between the thirty-five hACE2 – RBD variant complexes, with an emphasis on the RBD residues mutated in at least one SARS-CoV-2 variant. Salt bridges and H-bonds are depicted as dashed and solid lines, respectively. RBD residues interacting with more than one hACE2 residue appear more than once.
Comparison of the salt bridges and H-bonds between the thirty-five hACE2 – RBD variant complexes, with an emphasis on the RBD residues mutated in at least one SARS-CoV-2 variant. Salt bridges and H-bonds are depicted as dashed and solid lines, respectively. RBD residues interacting with more than one hACE2 residue appear more than once.The depiction compares the RBD residues which undergo mutations in at least one variant investigated and/or forms a salt bridge. The residues are Arg403, Lys417, Gly446, Glu484, Gln493, Gly496, Gln498, Asn501, Tyr505 of the spike protein RBD. The complex formed by the reference hACE2 and the spike RBD of the Wuhan-Hu-1 variant has been referred to as the ‘reference’ complex. The rest of the complexes are ‘variant’ complexes, in which either hACE2 or RBD is a variant.The notable H-bond and salt bridge interactions found within different variant hACE2-RBD complexes compared with the reference hACE2-RBD complex are depicted in Fig. 7
.
Fig. 7
3D interaction diagrams of some unique interactions found in different hACE2 - RBD variant complexes, compared with the Ref hACE2 - Wuhan-Hu-1 RBD complex. hACE2 and RBD variants are depicted in pink and cyan, respectively. The Wuhan-Hu-1 strain of SARS-CoV-2 is denoted as wild-type (wt). H-Bonds are presented as green dashed lines and the salt bridge between Asp38-Arg493 in NFE hACE2-Omicron BA.1 RBD complex is presented in brick-red dashed line.
3D interaction diagrams of some unique interactions found in different hACE2 - RBD variant complexes, compared with the Ref hACE2 - Wuhan-Hu-1 RBD complex. hACE2 and RBD variants are depicted in pink and cyan, respectively. The Wuhan-Hu-1 strain of SARS-CoV-2 is denoted as wild-type (wt). H-Bonds are presented as green dashed lines and the salt bridge between Asp38-Arg493 in NFE hACE2-Omicron BA.1 RBD complex is presented in brick-red dashed line.
Prediction of free energy of protein-protein complexes and free energy decomposition of protein-protein interactions
The binding free energies predicted for the hACE2 – RBD variant complexes are summarized in Supplementary Table 4. The binding free energies of hACE2 variants in complex with Omicron BA.1, BA.2, BA.4, BA.2.12.1 and Delta are highlighted in Fig. 8
.
Fig. 8
Comparison of binding free energy among Omicron BA.1, BA.2, BA.4, BA.2.12.1 and Delta RBD variant complexes with the reference and missense variants of hACE2. On the x-axis, VDW, ELE, GB, SA, and TOTAL denote the van der Waals Energy (kcal/mol), Electrostatic Energy (kcal/mol), Polar Solvation Energy (kcal/mol), Solvent Accessible Surface Area Energy (kcal/mol), Total Binding Energy (kcal/mol), respectively.
Comparison of binding free energy among Omicron BA.1, BA.2, BA.4, BA.2.12.1 and Delta RBD variant complexes with the reference and missense variants of hACE2. On the x-axis, VDW, ELE, GB, SA, and TOTAL denote the van der Waals Energy (kcal/mol), Electrostatic Energy (kcal/mol), Polar Solvation Energy (kcal/mol), Solvent Accessible Surface Area Energy (kcal/mol), Total Binding Energy (kcal/mol), respectively.Omicron BA.1 sub-variant exhibited a more negative binding free energy in complex with AFR hACE2 (−79.11 kcal/mol) and NFE hACE2 (−77.53 kcal/mol), compared with reference (−74.51 kcal/mol). Whereas, Omicron BA.2 sub-variant depicted a significantly less negative binding free energy in complex with NFE hACE2 (−62.14 kcal/mol), compared with both SAS hACE2 (−74.36 kcal/mol) and the reference hACE2 (−76.95 kcal/mol). The Delta variant produced a significantly more negative binding free energy with SAS hACE2 (−74.03 kcal/mol), compared with the reference hACE2 (−68.77 kcal/mol). Omicron BA.4 manifests a much less negative binding energy in complex with both the reference (−66.78 kcal/mol) and NFE hACE2 (−66.72 kcal/mol), similar the pattern observed in case of docking energy score. In case of Omicron BA.2.12.1, highly negative binding free energy was observed in complex with both the reference (−78.55 kcal/mol) and NFE hACE2 (−76.7 kcal/mol).The free energy contributions at the level of amino acid residues were calculated using free energy decomposition analyses. The differences in residual free energy contributions of wild type hACE2 residues and the mutated residues (E37K, F40L, D355 N, and P84T) caused by SNPs, highly prevalent in African (AFR), Latin American (AMR), Non-Finnish European (NFE), and South Asian (SAS) populations are depicted in Fig. 9
. The numerical values are provided in Supplementary Table 5.
Fig. 9
The differences in residual free energy contributions (in kcal/mol) of wild type hACE2 residues and the mutated residues (A) E37K, (B) F40L, (C) D355 N, and (D) P84T, representing African (AFR), Latin American (AMR), Non-Finnish European (NFE), and South Asian (SAS) populations, respectively.
The differences in residual free energy contributions (in kcal/mol) of wild type hACE2 residues and the mutated residues (A) E37K, (B) F40L, (C) D355 N, and (D) P84T, representing African (AFR), Latin American (AMR), Non-Finnish European (NFE), and South Asian (SAS) populations, respectively.All mutated residues except F40L, were shown to shift the free energy towards a more positive value. Although it seems to decrease the binding affinity at a single residue level, the overall the binding free energy does appear to follow a similar trend.RMSD analysis of the backbones of RBD of Omicron BA.1 with different hACE2 variants revealed that Ref hACE2 – Omicron BA.1 RBD complex was initially unstable but later stabilised eventually with some momentary fluctuations during the simulation run time. RMSD values of the reference and Omicron sub-variant complexes of BA.1 and BA.2 during the 125 ns MD simulation are presented in Fig. 10 (A) (i), (B) (i); Delta and Mu variant complexes in Fig. 11 (A) (i), (B) (i)
.
Fig. 10
(i) RMSD and (ii) RMSF plots of hACE2 – RBD complexes of variants (A) Omicron BA.1 and (B) Omicron BA.2 during the 125 ns MD simulation. In the RMSF plots, RBD residues mutated in each SARS-CoV-2 variant are labelled with flags.
Fig. 11
(i) RMSD and (ii) RMSF plots of hACE2 – RBD complexes of variants (A) Delta and (B) Mu during the 125 ns MD simulation. In the RMSF plots, RBD residues mutated ineach SARS-CoV-2 variant are labelled with flags.
(i) RMSD and (ii) RMSF plots of hACE2 – RBD complexes of variants (A) Omicron BA.1 and (B) Omicron BA.2 during the 125 ns MD simulation. In the RMSF plots, RBD residues mutated in each SARS-CoV-2 variant are labelled with flags.There is less deviation on the MD trajectory of the AFR hACE2 – Omicron BA.1 RBD complex at the initial stage, indicating more stable binding that was maintained for the runtime. The SAS hACE2 – Omicron BA.1 RBD complex also showed stronger initial interaction as it had lower RMSD at the beginning compared to the Ref hACE2 – Omicron BA.1 RBD complex. For the hACE2 complexes with Omicron BA.2 RBD, an overall greater stability was observed in the Ref hACE2 – Omicron BA.2 RBD complex. The AFR hACE2 – Omicron BA.2 RBD and NFE hACE2 – Omicron BA.2 RBD complexes showed some fluctuations at the beginning. However, with time, the complexes stabilised and exhibited lower RMSD compared to the Ref hACE2 – Omicron BA.2 RBD complex. Interestingly, we observed opposite trends in RMSD in case of the NFE hACE2 and SAS hACE2 complexes with BA.1 and BA.2 variants. The NFE hACE2 – Omicron BA.2 RBD and SAS- hACE2 – Omicron BA.2 RBD are greatly stabilised compared to their complexes with Ref hACE2. The average RMSD values of the different hACE2 complexes with Omicron BA.1 and BA.2 RBD variants are given in Table 4
.
Table 4
Average RMSD (Ǻ) of missense hACE2 variants in complexes with Omicron BA.1 and Omicron BA.2 RBD variants.
Average RMSD (Ǻ)
hACE2
Omicron BA.1
Omicron BA.2
Ref
2.7918
2.5584
AFR
2.5755
2.7644
AMR
2.8106
2.3169
NFE
3.3771
2.6362
SAS
2.6804
3.3519
Average RMSD (Ǻ) of missense hACE2 variants in complexes with Omicron BA.1 and Omicron BA.2 RBD variants.In case of SAS hACE2 – Delta RBD complex, very stable binding was observed compared with the Ref hACE2 – Delta RBD complex, although some minor fluctuations are observed momentarily. Mu RBD exhibited substantially higher RMSD values initially in complex with NFE hACE2, compared with that with AMR hACE2. The average RMSD values of the different hACE2 complexes with Delta and Mu RBD variants are given in Supplementary Table 6.RMSF values of the RBD of the reference and Omicron sub-variantsBA.1 and BA.2 complexed with the reference hACE2 and different hACE2 variants obtained during the 125 ns MD simulation are presented in
Fig. 10 (A) (ii), (B) (ii)); and RMSF values of the RBD of the Delta and Mu variants are presented in Fig. 11 (A) (ii), (B) (ii).(i) RMSD and (ii) RMSF plots of hACE2 – RBD complexes of variants (A) Delta and (B) Mu during the 125 ns MD simulation. In the RMSF plots, RBD residues mutated ineach SARS-CoV-2 variant are labelled with flags.RMSF analysis of the protein backbone of the SARS-CoV-2 RBD variants revealed significant variations in RMSF values in the Omicron BA.1 RBD variants interacting with different hACE2 variants. The Asn360 was stable while interacting with AFR and SAS hACE2, however, exhibited fluctuation while interacting with AMR hACE2. The wild type Asn370 residue and the mutated Ser371Leu residue in the RBD were stabilised while interacting with AFR hACE2 but fluctuated higher in the rest of the cases. Leu387 and Asp389 showed high fluctuations while interacting with AMR hACE2 but were most stable in case of AFR and SAS hACE2. The mutated Gly446Ser residue exhibited greater stability while interacting with AFR and SAS hACE2 compared with the rest. The RBD region spanning residues 496–508 of Omicron BA.1 harbours four mutations and they have stabilised while interacting with all hACE2 variants compared with the reference hACE2.In the Omicron BA.2 RBD variant complexes with different hACE2, greater fluctuations were observed at the RBD residues Asn360, Thr393, Gly413, Asp428, Asn460, Ile468, and Thr500 in interaction with the SAS hACE2 compared with the rest. The RBD residues of the AFR hACE2 – Omicron BA.2 RBD complex were most stabilised in terms of RMSF. Interestingly, we observed a high fluctuation in the Gly447 residue of the Omicron BA.2 RBD while interacting with reference hACE2. However, greatly reduced fluctuations were observed for interactions with SAS, NFE, and AFR hACE2. While interacting with AMR hACE2, the residue stabilised only moderately. On the other hand, the RBD residues of the Delta variant were observed to undergo major fluctuations in complex with reference hACE2, whereas, were substantially stabilised in complex with SAS hACE2.The N-terminal and C-terminal residues of all complexes show higher RMSF value due to their positions at the ends of the protein molecule which fail to take part in any secondary structure formation, resulting in greater flexibility.
Discussion
Among the SARS-CoV-2 variants of concern, we especially investigated the Delta variant and Omicron sub-variants BA.1 and BA.2, as well as BA.4, BA.5 and BA.2.12.1, which emerged recently in 2022. A single amino acid substitution (A222V) distinguishes Delta plus variant from its predecessor Delta (B.1.617.2) which is located outside the RBD of the spike glycoprotein. The receptor-binding domains of the Delta and the Delta plus variants appear to be identical. Hence, the spike RBDs of both the Delta and the Delta plus variants are represented as the ‘Delta’ RBD variant in our results to avoid redundancy. Similarly, Omicron subvariants BA.4 and BA.5 share an identical spike protein, hence, the observations for ‘Omicron BA.4’ apply to both BA.4 and BA.5 variants. A variant of interest Mu was also included in this report as it showed limited spread within specific regions and despite acquiring noteworthy mutations, fewer studies have been conducted on the transmission pattern of this variant.As hACE2 serves as the primary receptor for the entry of SARS-CoV-2, it is the first probable ‘genetic gateway’ that might determine disease progression [49]. The genetic polymorphisms of hACE2 are not evenly distributed across populations. Rather, the frequencies of these polymorphisms considerably differ among populations. Hence, a population may preferentially harbour one polymorphism at a much higher frequency, compared with the rest. Moreover, both experimental and in silico studies demonstrated that the binding interactions of SARS-CoV-2 vary in presence of hACE2 polymorphisms [50]. Furthermore, previous experiments have shown that the SARS-CoV-2 variants also interact differently with hACE2 [51]. The spread of the viral variants, thus, is likely to differ extensively among the host populations harbouring distinct hACE2 polymorphisms. However, none of the currently available studies have considered the RBD mutations and the hACE2 polymorphisms simultaneously. We combined both the recently emerged SARS-CoV-2 variants and the hACE2 single missense variants prevalent in major populations in a combinational approach to investigate the differences in the receptor-binding interactions between different RBD – hACE2 complexes. Our results indicate notable differences in receptor-binding interactions of the spike RBD of different SARS-CoV-2 variants within a population represented by a hACE2 single missense variant. The receptor-binding affinity of a particular SARS-CoV-2 variant also varies across populations carrying distinct single missense variants of hACE2. The findings may provide useful insights into the differential transmission rate of the emerging SARS-CoV-2 variants over individual populations.This study focuses particularly on the missense mutations of hACE2 protein, present within its binding interface with the RBD of the spike glycoprotein of SARS-CoV-2. Substitutions of these residues are more likely to impact the binding affinity of the hACE2-RBD complex. Accordingly, instead of the whole spike glycoprotein, our focus remained on its RBD, which directly interacts with hACE2. Furthermore, the hACE2 missense variants with the highest frequencies within a population were preferred as they represent a larger target population, broadening the scope of this study. The hACE2 missense variants assigned to the African (AFR), Latin American (AMR), non-Finnish European (NFE), and the South Asian (SAS) populations represent 5.12%, 0.001%, 1.75%, and 3.58% of the respective populations.Protein stability and protein-protein interactions are greatly affected by the accumulation of both stabilising and destabilising mutations. It is hence worthwhile to study the mutational pattern of the SARS-CoV-2 variants. This is particularly necessary for the evaluation of the efficacy of protective treatments against the disease, as well as, for predicting the fate of the ongoing pandemic. Stabilising mutations in the spike protein might increase its longevity under unfavourable conditions, whereas, destabilising mutations could significantly reduce the fitness of the protein. However, destabilising mutations tend to enhance protein flexibility, which often considerably affects the infectivity of a viral strain through an increase in receptor-binding affinity [52]. On the other hand, stabilising mutations render structural rigidity which may confer increased stability of the protein-protein complex [53]. In this study, the lineage-defining mutations of SARS-CoV-2 variants: Delta, Delta plus, Mu, Omicron sub-variants BA.1, BA.2, BA.4 and BA.2.12.1 were analysed. The destabilising mutation K417 N in the RBD region of the Omicron variants was reported to increase receptor-binding affinity [54]. This mutation along with five other notable mutations (N440K, G446S, E484A, Q493R and N501Y) have shown to markedly reduce the activity of monoclonal antibodies: bamlanivimab, etesevimab, casirivimab, and imdevimab [[55], [56], [57]] and moderate reduction in the activity in tixagevimab and sotrovimab [57]. Another destabilising mutation, L452R, present in the Delta variant was found to resist the activities of several monoclonal antibodies in therapeutic use [58]. The destabilising mutation E484K, present in the RBD of the spike glycoprotein of the Mu variant, exhibited an alteration in the complementarity of binding plasma polyclonal neutralizing antibodies [59]. The stabilising mutation N501Y, shared by the Mu, Omicron sub-variants BA.1, BA.2, BA.4 and BA.2.12.1 was found to increase the hACE2-binding affinity of the RBD [60]. This may also present a possible route of immune escape [61].We sought to explore the effects of the SARS-CoV-2 variant spike glycoprotein RBD mutations on the binding with hACE2 containing the most prevalent single amino acid substitutions in the RBD binding regions for the African, Latin American, non-Finnish European and South Asian populations using molecular docking. A more negative docking energy score should indicate higher stability between receptor and ligand, hence, lower binding energy, increased affinity, and stronger interaction [62]. Furthermore, MM/GBSA calculations are theoretically more rigorous than scoring functions as commonly used in many of the docking algorithms, and can be used to validate the docking energy scores generated in this manner [63].Previous studies have confirmed that Lys417, Tyr449, Phe456, Phe486, Gln493, Gly496, Gln498, Asn501 and Tyr505 are some key residues essential in the formation of a stronger complex with hACE2 [64,65]. Among which, a stable salt bridge between Lys417 of RBD and Asp30 of hACE2, along with three sustained H-bonds between RBD residues: Tyr449, Gln493 and Gln498 and hACE2 residues: Asp38, Glu35 and Lys353, distinguishes SARS-CoV-2 from SARS-CoV [66]. Hydrogen bonding plays a key role in hACE2-RBD binding among other intermolecular interactions. Mutations in the primary interface of hACE2-RBD induce stronger binding by increasing H-bonds as well as expanding buried solvent accessible surface [67]. Hence, we compared two types of interactions in terms of their number: salt-bridge, and hydrogen bonds between the reference and variant complexes. Increase and decrease in the quantity of these bonds fairly correlate with the strength, and to some extent, stability of the reference and variant complexes under investigation.Molecular Dynamics simulation was performed to further evaluate the stability of interactions between the RBD and hACE2 variants over time using Root Mean Square Deviation (RMSD) values of the hACE2-RBD complexes. The stability of a protein-protein complex decreases as the RMSD value increases [68]. MD simulations also revealed differences in fluctuation among the RBD variant residues while in complex with different hACE2 variants through Root Mean Square Fluctuation (RMSF) values. The RMSF values represent the degree of fluctuation for each of the amino acid residues of the molecule of interest. High RMSF values may be correlated with the flexibility of the protein which may alter a protein-protein binding [69].The N501Y substitution, first emerged in the Mu variant, had been reported to increase the affinity of the hACE2-RBD complex by forming new protein-protein interactions that modify internal structural dynamics [60,70]. Preferential selection of the mutation N501Y in more transmissible variants were also observed in an in vitro evolution study of the spike glycoprotein RBD [71] Our result is consistent with this finding as the binding free energy drastically moves towards the negative in case of the Mu variant in complex with the reference hACE2 (−80.43 kcal/mol) compared with that of the Delta variant (−68.77 kcal/mol) which lacks the given mutation, indicating higher binding affinity of the Mu variant for the reference hACE2. In comparison with the NFE hACE2, the Mu RBD manifests lower average RMSD (3.5072 Å) in complex with AMR-hACE2. Furthermore, key RBD residues of this variant, except Phe486, show lower RMSF while in complex with AMR hACE2 and much higher RMSF while in complex with NFE hACE2, when compared with reference hACE2. This finding aligns with the failure of the variant of South American origin to propagate into other populations, especially the Non-Finnish Europeans, as it revealed a less stable interaction with the NFE hACE2, compared with the AMR hACE2 [72].The Lys353-Tyr501 [N–H⋯O] bond was also present in the NFE hACE2 – Omicron BA.1 RBD complex which exhibited a much higher affinity (−77.53 kcal/mol) compared with the variant complex with reference hACE2 (−74.51 kcal/mol). However, a substantially high RMSD (3.3771 Å) was observed in the NFE hACE2 - Omicron BA.1 RBD complex compared with the reference hACE2 (2.7918 Å), which indicates the binding was not maintained properly during the simulation runtime. The Gly496 residue is mutated in the RBD of the Omicron BA.1, but not in the Omicron BA.2 variant. The G496S substitution in the Omicron BA.1 variant has been reported to decrease the binding affinity of the complex [60]. The mutated Ser496 Omicron BA.1 RBD variant forms H-bonds with Asp38 and Gln42 of the reference hACE2 with distances 2.62 Å and 3.22 Å. Whereas, unmutated Gly496 forms a single H-bond with Lys353 of the reference hACE2 with a distance of 2.77 Å. Nevertheless, the AFR hACE2 – Omicron BA.1 RBD complex exhibited the most negative binding free energy (−79.11 kcal/mol). The complex also reported the lowest average RMSD (2.5755 Å). This may explain the origin and initial spread of the strain in the African population as the mutations accumulated in this strain appear to yield an advantage in maintaining stable binding to AFR hACE2, as demonstrated by the MD simulation data.Our results suggest a higher affinity of the Omicron BA.2 variant, compared with its predecessor Omicron BA.1 variant. Furthermore, during the course of 125 ns MD simulation, Omicron BA.2 showed a lower average RMSD (2.5584 Å) in complex with the reference hACE2 compared with that of Omicron BA.1 (2.7918 Å). Hence, transmissibility is likely to increase for the Omicron BA.2 variant, which has caused a surge in the rate of infection. Since the WHO recognised Omicron as a Variant of Concern (VoC) on November 24, 2021, there have been 487,926 entries of complete genome sequences of Omicron BA.1 variant in the GISAID database [73] as of June 29, 2022. In contrast, there are 1,146,042 entries of complete genome of the Omicron BA.2 variant. This may represent the greater spread of Omicron BA.2 as the number of viral genome sequence deposition is representative of the circulating viral strain worldwide.The higher RMSF at residues 496–508 in both Omicron BA.1 and BA.2 complexes may relate to previous reports of reduced antibody response [74]. Although a lower RMSF indicates a more rigid behaviour of the residue in concern, it does not guarantee a more stable protein-protein interactions as protein-protein interactions largely rely on the conformational flexibility [75].The exceedingly low affinity of the reference hACE2 – Delta RBD variant complex, compared with a much higher affinity of the variant complex with the SAS hACE2 might be explained by comparing their patterns of interactions. The Delta RBD forms much fewer H-bonds with the reference hACE2, involving only four (Gln493, Gln498, Asn501, Phe486) of the key interacting residues mentioned above. Each of the residues exhibit high RMSF values in complex with the reference hACE2. Only Gln493 forms H-bonds with Lys31 and Gln35 of hACE2. Whereas, the Delta RBD residues manifest much lower RMSF values while interacting with SAS hACE2, of which Gln493 and Gln498 form H-bonds and Lys417 forms a salt bridge with hACE2 residues Lys31 and His34, Asp38, and Asp30, respectively. During the course of 125 ns MD simulation, the Delta variant maintained a much stable binding in complex with the SAS hACE2 compared with that of reference hACE2. This outcome is also notably consistent with the devastating spread of the variant in South-East Asia, as a large population of around 49.7 million in India alone would represent the SAS hACE2 harbouring the P84T substitution [76].Although stronger binding to the receptor yields a higher probability of receptor endocytosis of the virus and subsequently successful replication, it is not the only determinant of the rate of transmissibility of the virus and the severity of the disease. A furin-like protease, transmembrane serine protease 2 (TMPRSS2), plays an equal role in the virus's ability to infect a permissive cell. The protease cleaves the viral spike protein at the S1/S2 junction, enabling its fusogenic activity and allowing subsequent fusion with the host cell [77]. An earlier report has suggested the Delta variant outcompetes the Omicron BA.1 variant in infecting TMPRSS2 overexpressing VeroE6 cells in culture [78]. In our result, the Omicron BA.1 and BA.2 RBD variant exhibited much higher receptor affinity, compared with the Delta RBD variant. However, the Delta variant may overcome the lower receptor-binding affinity with the heightened ability of infection using TMPRSS2 activity. Nevertheless, Omicron variants have been shown to employ a TMPRSS2-independent pathway for infection [79]. Hence, the higher receptor-binding capacity of the Omicron variants may still imply higher transmissibility of these emerging variants.
Conclusions
Our study suggests variation in the transmissibility of the emerging SARS-CoV-2 variants across populations harbouring specific hACE2 missense variants. Although viral susceptibility at the population level depends on many factors related to individual hosts, this study may provide insights into transmissibility of emerging SARS-CoV-2 variants among different populations. Furthermore, our results also reinforce the available evidence supporting that the Omicron subvariant BA.2 shows the highest transmissibility as well as the increasing incidences of Omicron BA.2.12.1, among the recently emerged SARS-CoV-2 variants. The severity and outcome of disease, however, can only be predicted upon considering host immune responses and related factors. Nevertheless, predictions on variant transmissibility with accurate interpretations may encourage effective implementation of preventive measures to contain community transmission and the rate of infection caused by the newly emerging variants.
Author contributions
AHMNN conceived the idea and supervised the research work. AT, RA and PB retrieved the sequences. AT and RA performed modelling of the proteins. MA, AAS and SC performed the mutational analyses. AT, RA and MA performed the docking analyses. AT performed the MMGBSA analyses. RA performed the MD simulations and subsequent data analysis. AHMNN, TY, AT and RA prepared the first draft. AT, RA, PB, MA and TY prepared the revised manuscript. AHMNN, AKMMH and TY then curated the revised draft critically. All authors read and approved the revised manuscript.
Declaration of competing interest
The authors declare that there is no conflict of interest.
Authors: Kaiming Tao; Philip L Tzou; Janin Nouhin; Ravindra K Gupta; Tulio de Oliveira; Sergei L Kosakovsky Pond; Daniela Fera; Robert W Shafer Journal: Nat Rev Genet Date: 2021-09-17 Impact factor: 53.242
Authors: Christina Yek; Vu Sinh Nam; Rithea Leang; Daniel M Parker; Seng Heng; Kimsan Souv; Siv Sovannaroth; Mayfong Mayxay; Sazaly AbuBakar; R Tedjo Sasmono; Nhu Duong Tran; Hang Khanh Le Nguyen; Chanthap Lon; Kobporn Boonnak; Rekol Huy; Ly Sovann; Jessica E Manning Journal: Front Trop Dis Date: 2021-11-18
Authors: Abdullah Al Saba; Maisha Adiba; Piyal Saha; Md Ismail Hosen; Sajib Chakraborty; A H M Nurun Nabi Journal: Comput Biol Med Date: 2021-07-30 Impact factor: 4.589