Nisha Bhattarai1, Prabin Baral1, Bernard S Gerstman1,2, Prem P Chapagain1,2. 1. Department of Physics, Florida International University, Miami, Florida 33199, United States. 2. Biomolecular Sciences Institute, Florida International University, Miami, Florida 33199, United States.
Abstract
The novel coronavirus (SARS-CoV-2) pandemic that started in late 2019 is responsible for hundreds of millions of cases worldwide and millions of fatalities. Though vaccines are available, the virus is mutating to form new strains among which are the variants B.1.1.7 and B.1.351 that demonstrate increased transmissivity and infectivity. In this study, we performed molecular dynamics simulations to explore the role of the mutations in the interaction of the virus spike protein receptor binding domain (RBD) with the host receptor ACE2. We find that the hydrogen bond networks are rearranged in the variants and also that new hydrogen bonds are established between the RBD and ACE2 as a result of mutations. We investigated three variants: the wild-type (WT), B.1.1.7, and B.1.351. We find that the B.1.351 variant (also known as 501Y.V2) shows larger flexibility in the RBD loop segment involving residue K484, yet the RBD-ACE2 complex shows higher stability. Mutations that allow a more flexible interface that can result in a more stable complex may be a factor contributing to the increased infectivity of the mutated variants.
The novel coronavirus (SARS-CoV-2) pandemic that started in late 2019 is responsible for hundreds of millions of cases worldwide and millions of fatalities. Though vaccines are available, the virus is mutating to form new strains among which are the variants B.1.1.7 and B.1.351 that demonstrate increased transmissivity and infectivity. In this study, we performed molecular dynamics simulations to explore the role of the mutations in the interaction of the virus spike protein receptor binding domain (RBD) with the host receptor ACE2. We find that the hydrogen bond networks are rearranged in the variants and also that new hydrogen bonds are established between the RBD and ACE2 as a result of mutations. We investigated three variants: the wild-type (WT), B.1.1.7, and B.1.351. We find that the B.1.351 variant (also known as 501Y.V2) shows larger flexibility in the RBD loop segment involving residue K484, yet the RBD-ACE2 complex shows higher stability. Mutations that allow a more flexible interface that can result in a more stable complex may be a factor contributing to the increased infectivity of the mutated variants.
The highly contagious severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) is a
form of severe acute respiratory syndrome coronavirus (SARS) that had an outbreak in China
in 2003 and causes the COVID-19 disease in a number of animal species including
humans.[1−3] The first case of
SARS-CoV-2 was detected in December 2019 in Wuhan, China, and a pandemic was declared in
March 2020 due to its worldwide transmission. A person contracting this disease can have
symptoms including fever, dry cough, headache, breathing difficulties, and
pneumonia.[2−4] In January 2021, the number
of SARS-CoV-2 cases surpassed 100 million worldwide with over 2 million deaths. Vaccines
produced by Pfizer/BioNTech and Moderna have been authorized for emergency use in the United
States after demonstrating efficacy in clinical trials.[5]Despite the progress in vaccine administration, emergence of mutated variants is raising
concerns about vaccine efficacies. Notably, the viruses that emerged in the United Kingdom
(B.1.1.7 lineage) and separately in South Africa (B.1.351 lineage, also known as 501Y.V2)
are reported to increase the transmission of the virus and host immune
evasion.[6,7] In the
spike protein alone, the B.1.1.7 variant has amino acid deletions at H69, V70, and Y144 and
mutations N501Y, A570D, P681H, T716I, S982A, and D1118H, while the South African variant has
mutations L18F, D80A, D215G, R246I, K417N, E484K, N501Y, and A701V.[8] The
N501Y mutation at the receptor binding domain (RBD) that directly interacts with the human
receptor ACE2, as shown in Figure , is found to be
common in both of these mutants and is assumed to increase virus infectivity and
transmittivity.[9] In the South African variant, mutations in the RBD
(K417N, E484K, and N501Y) are thought to have functional significance.[10]
The B.1.1.7 variant is reported to have a 71% higher transmission rate than the other
variants.[11] The full significance of the B.1.351 variant is still
unclear, and it is assumed that this lineage could be associated with increased
transmissibility[10] and render current vaccines significantly less
effective.[12]
Figure 1
SARS-CoV-2 spike trimers with RBD-up structure complexed with ACE2 for (a) B.1.1.7 and
(b) B.1.351 variants after relaxing with molecular dynamics (MD) simulation. The mutated
residues are highlighted in purple.
SARS-CoV-2spike trimers with RBD-up structure complexed with ACE2 for (a) B.1.1.7 and
(b) B.1.351 variants after relaxing with molecular dynamics (MD) simulation. The mutated
residues are highlighted in purple.The occurrence of mutated variants with increased rates of virus transmission and immune
evasion has raised concern about the necessity of modifying existing vaccines and
treatments.[12,13]
Guidance for these interventions can benefit from an understanding of changes in the
molecular mechanisms of infection as a result of the mutations, which is also essential for
understanding the enhanced virulence of the mutated variants. Although some of the studies
have been performed with these variants,[14] the details of amino acids
that play crucial roles at the RBD–ACE2 interface yet remain unexplored. In this
study, we investigated the changes in molecular bonding patterns between the spike
protein-mutated RBD and the host ACE2 for the B.1.1.7 and B.1.351 variants compared to that
of the wild-type (WT). We find that the mutations have significant effects by changing the
amino acids responsible for hydrogen bonding between the RBD and ACE2.
Materials and Methods
System Preparation
A fully glycosylated SARS-CoV-2spike trimer in which one of the receptor binding domains
is facing upwards and complexed with ACE2 was retrieved from the archives of the
CHARMM-GUI[15,16]
COVID-19 repository (PDB ID 6VSB_6VW1).[17−19] This wild-type (WT)
structure contains modeled glycan chains at the glycosylated sites and has the missing
residues filled with GalaxyFill.[20,21] The mutations or deletions were then introduced to this WT structure
using CHARMM-GUI.[16] We prepared a total of nine independent systems:
the spike trimer complexed with ACE2, spikeRBD–ACE2 complex, and RBD-only system
for each of WT, B.1.1.7, and B.1.351 variants. Residues 1–1146 were considered in
the spike trimer structure and 19–614 in the ACE2 domain. Similarly, the RBD
residues 330–530 were used in the RBD-only as well as for the RBD–ACE2
complex systems, and residues 19–614 were used in the ACE2 domain. All nine systems
were set up using the CHARMM-GUI Solution Builder tool with standard parameters, including
a physiological salt concentration of 0.15 M KCl.
Molecular Dynamics Simulations
All-atom, explicit solvent molecular dynamics (MD) simulations were performed using the
NAMD 2.13 simulation package[22] with the Charmm36m force
field.[23,24] The
covalent bonds involving hydrogen atoms were constrained by using the SHAKE
algorithm,[25] and the pressure was controlled using the
Nose–Hoover Langevin piston method[26] with a piston decay of 25
fs and a period of 50 fs. The particle mesh Ewald method[27] was used to
calculate the long-range ionic interactions, and the temperature was controlled by using
the Langevin temperature coupling with a friction coefficient of 1 ps–1.
Minimization and equilibration were performed with NVT (constant volume and temperature)
conditions, and the production runs were performed under NPT (constant pressure and
temperature) conditions at 303 K using 2 fs time steps. The RBD-only systems were run for
600 ns each, and the RBD–ACE2 systems were run for 200 ns each. Additional replicas
of RBD-only systems were run for 300 ns. For all larger systems of spike protein
trimer–ACE2 complexes (∼900 000 atoms), the production runs were
limited to 50 ns each, which should be sufficient for the side-chain relaxations upon
mutation but not for large-scale conformational changes. Visualization and rendering of
trajectories were done with visual molecular dynamics (VMD).[28]
Dynamical Network Analysis
We used the Network Plugin in VMD[29,30] for dynamical network analysis calculations. Carma[31] and Catdcd[30] were used to calculate the covariance and correlations
between interfacial amino acid pairs during an MD trajectory. To generate the amino acid
community network, gncommunities[29] was employed. Using the
Girvan–Newman algorithm, the time-averaged connectivity of the nodes was determined
and the shortest path between the nodes for connections was also identified.[32]
Principal Component Analysis
The principal component analysis (PCA) of the Bio3D[33] software package
was used to study the coordinated motions of amino acids and to understand the
conformational motions of the atoms in all three systems. First, least-squares fitting is
used to remove the translational and rotational motions of the proteins during the MD
trajectory and the remaining motion of the protein atoms is analyzed.
Results and Discussion
The spike protein of the B.1.1.7 variant has deletions of H69, V70, and Y144 and mutations
N501Y, A570D, P681H, T716I, S982A, and D1118H amino acids. Similarly, the B.1.351 variant
has mutations L18F, D80A, D215G, R246I, K417N, E484K, N501Y, and A701V. Figure shows the mutations (highlighted in purple) in the spike
trimer complexed with ACE2 in both B.1.1.7 and B.1.351 variants. The RBD of B.1.1.7 has only
one mutation N501Y, whereas B.1.351 has N501Y and additional mutations K417N and E484K. To
investigate the interactions with the human host receptor ACE2 in more detail, we focused on
the RBD-only as well as the RBD–ACE2 complexes.
Structural Changes and Flexibility of the RBD Due to Mutations
Mutations in SARS-CoV-2 genes can lead to alterations in protein structures.[34] To investigate the changes in the RBD structure and dynamics due to
mutations, we performed MD simulations of the RBD-only systems, not complexed with ACE2,
for the three variants: WT, B.1.1.7 with the mutation N501Y, and B.1.351 with three
mutations N501Y, K417N, and E484K. Figure a–c shows the RBD for the WT, B.1.1.7, and B.1.351 variants after 600 ns
simulations. To visualize the structural flexibility of the RBD of each variant, we
overlapped 20 conformations, taken one frame every 25 ns of the last 500 ns trajectory, as
displayed in Figure S1. We colored each of these conformations according to the
residues’ root-mean-square fluctuations (rmsf), calculated for a 50 ns window that
is centered at that specific frame. The overlapped structures, combined with the color
gradients from blue–white–red, show that all variants have a flexible loop
region (red) consisting of residues 473–489. Simulation of the B.1.351 variant
shows significant structural changes in this loop region, which involves the mutation
E484K, as shown in Figure c,d. This loop is an
important RBD region as it is involved in binding with ACE2. To understand the
conformational changes of the loop in B.1.351, we visualized the hydrogen-bonding pattern.
For the B.1.351 variant, the initial hydrogen bonds break or weaken and the loop reorients
(Figure S2a). In Figure S2b, we plotted the acceptor–donor distances between the
important hydrogen-bond-forming residues for all three variants. The hydrogen bond
Y473–Y489 remains intact for all three variants, whereas other hydrogen bonds
fluctuate and differ. We superimposed five conformations of each variant from the last 500
ns (one conformation every 100 ns) and are displayed in Figure d, highlighting the flexible loop segment. To understand the
flexibility of the loop region, we calculated the average root-mean-square fluctuations
(rmsf) of the loop residues 480–489 for sliding windows of a width of 10 ns, with a
total of 60 windows (Figure e). The average rmsf
of the loop residues 480–489 shows that the residues in the B.1.351 show larger
fluctuations in general. While the loop in the WT shows occasional large motion, as shown
by the distance between the Cα atoms of residues 484 and 489, this motion
is different in the B.1.1.7 and B.1.351 variants (Figure f). The loop in the B.1.1.7 variant shows multiple semistable states. However,
the loop is quite flexible in B.1.351 (Figures e,f and S2 and Movie S1). This observation is reproduced in the 300 ns replica run, which
shows even clearer differences in the flexibility (Figure S3b,d). A flexible loop in B.1.351 may allow the RBDto bind with
ACE2 more easily by induced fit mechanism. To investigate the binding with ACE2, we
further explored the interactions in the RBD–ACE2 complexes.
Figure 2
RBD structures at the end of 600 ns simulation for (a) WT, (b) B.1.1.7, and (c)
B.1.351. The loop segments consisting of residues 473–489 are highlighted in
blue. Mutations in the variants are labeled with red fonts. (d) Structural comparison
of the five conformations of each variant, taken from the last 500 ns of the
simulation (i.e., one frame every 100 ns). (e) Root-mean-square fluctuations (rmsf) as
a function of time for residues 480–489 for all three variants: WT (green),
B.1.1.7 (blue), and B.1.351 (red). (f) Distance between the Cα atoms
of residues 484 and 489 showing large-scale motions of the loop segment. The rmsf and
the distance graphs for the replica runs are plotted in Figure S3b,d.
RBD structures at the end of 600 ns simulation for (a) WT, (b) B.1.1.7, and (c)
B.1.351. The loop segments consisting of residues 473–489 are highlighted in
blue. Mutations in the variants are labeled with red fonts. (d) Structural comparison
of the five conformations of each variant, taken from the last 500 ns of the
simulation (i.e., one frame every 100 ns). (e) Root-mean-square fluctuations (rmsf) as
a function of time for residues 480–489 for all three variants: WT (green),
B.1.1.7 (blue), and B.1.351 (red). (f) Distance between the Cα atoms
of residues 484 and 489 showing large-scale motions of the loop segment. The rmsf and
the distance graphs for the replica runs are plotted in Figure S3b,d.
RBD–ACE2 Interactions in the Variants
To understand the differences in the ACE2 binding, we explored the dynamics of the
RBD–ACE2 complexes for all three variants. The analyses of the interaction pattern
between the viral spike protein RBD and the receptor protein ACE2 from the 200 ns of MD
simulations show that the B.1.1.7 and B.1.351 mutations have a significant impact on the
association of residues in the interfacial region. Figure shows the conformations of each system at 100 ns. The mutated
residues are labeled, and the residues across the interface interacting within 3.5 Å
of each other are highlighted.
Figure 3
Snapshot of the RBD–ACE2 complex at 100 ns of MD simulation time: (a) WT, (b)
B.1.1.7, and (c) B.1.351 variants. Top row: residues interacting within 3.5 Å of
each other are highlighted in red for the RBD and blue for the ACE2 protein. The
mutated residues in the B.1.1.7 and B.1.351 are labeled. Bottom row: residues forming
interfacial hydrogen bonds between the viral spike RBD (green) and the host ACE2
(orange) WT, B.1.1.7, and B.1.351 variants. The highlighted box shows the residue
pairs forming hydrogen bonds that only occur in the SA variant.
Snapshot of the RBD–ACE2 complex at 100 ns of MD simulation time: (a) WT, (b)
B.1.1.7, and (c) B.1.351 variants. Top row: residues interacting within 3.5 Å of
each other are highlighted in red for the RBD and blue for the ACE2 protein. The
mutated residues in the B.1.1.7 and B.1.351 are labeled. Bottom row: residues forming
interfacial hydrogen bonds between the viral spikeRBD (green) and the host ACE2
(orange) WT, B.1.1.7, and B.1.351 variants. The highlighted box shows the residue
pairs forming hydrogen bonds that only occur in the SA variant.To determine how the B.1.1.7 and B.1.351 variants differ from the WT in RBD–ACE2
bonding, we calculated the number of hydrogen bonds formed during the last 100 ns of the
MD simulations. The residue pairs that have significant interaction between the RBD and
the ACE2 are identified from the hydrogen bond occupancy rate for each residue pair. Some
of the residues contributing to the formation of hydrogen bonds in the RBD–ACE2
complex for the WT, B.1.1.7, and B.1.351 variants are highlighted in Figure . Figure a–c shows the percentage of time that various interprotein hydrogen bonds
existed during the last 100 ns of the simulations for the WT, B.1.1.7, and B.1.351
variants. The H-bonds of the RBD–ACE2 residue pairs G502–K353,
T500–D355, Q498–Q42, and N487–Y83 occurred in all three complexes but
with different occupancy probabilities. Some H-bonds in the WT become less important or
disappear in the B.1.1.7 and B.1.351 variants but new hydrogen bonds S477–E22,
T478–E22, and K484–E75 with high occupancy are observed in the B.1.351
variant. H-bonds unique to each system (i.e., present in one and absent in the remaining
two) are displayed in Figure d.
Figure 4
Interaction matrix for the residue pairs contributing to interprotein hydrogen
bonding: (a) WT, (b) B.1.1.7, and (c) B.1.351 variants. The residue pairs are ordered
in the matrices according to the percentage of hydrogen bond occupancy for the WT. The
percentage of the hydrogen bonds that are unique to each variant is colored in green.
(d) Prominent hydrogen bonds in each variant.
Interaction matrix for the residue pairs contributing to interprotein hydrogen
bonding: (a) WT, (b) B.1.1.7, and (c) B.1.351 variants. The residue pairs are ordered
in the matrices according to the percentage of hydrogen bond occupancy for the WT. The
percentage of the hydrogen bonds that are unique to each variant is colored in green.
(d) Prominent hydrogen bonds in each variant.Figure shows that the strong salt-bridge
interaction K417–D30 between RBD–ACE2 is present in both the WT and B.1.1.7
variants but not in the B.1.351 variant with the K417N mutation. In addition, another WT
salt-bridge interaction E484–K31 is changed toK484–E75 in the B.1.351
variant. The S477–E22 and T478–E22 H-bond pairs in B.1.351 are especially
interesting because they are absent in the WT and B.1.1.7 variants. Both of these H-bonds
shown in Figure d are located at the terminal
region of the interface (highlighted within the dotted box in Figure
c). Overall, the hydrogen bond analysis shows one new H-bond
(Y501–K353) in B.1.1.7 and eight new H-bonds (N487–Q23, K484–E75,
Q493–K31, F490–K31, S477–E22, A475–Q23, Y505–E37, and
T478–E22) in B.1.351. Despite the larger flexibility of the RBD interfacial region
of the B.1.351 variant when not complexed with ACE2, the presence of the additional
hydrogen bonds at the RBD–ACE2 interface suggests an enhanced specificity and the
formation of a stable complex for this variant. We calculated the binding energies for the
RBD–ACE2 complexes using PRODIGY webserver,[35] which utilizes the
interfacial contacts and other properties such as polarity and charge at the interfaces to
predict the binding affinity. The binding affinity values calculated for five different
frames in the last 20 ns of the 200 ns run are shown in Table S2. These calculations show that the binding affinities are
essentially the same in these variants, suggesting that the complexes have similar
stabilities. To further explore the interprotein interactions and the stability of
complexes, we performed dynamical network analysis.
Dynamic Correlated Motions in the Variants
The bonding interactions described above between the viral RBD and the host ACE2 are
crucial in stabilizing the complex. The relative stability of the RBD–ACE2 complex
for the different variants can be elucidated by investigating the conformational
flexibility between the RBD and ACE2. We examined this by using a dynamical network
analysis method. Dynamical network analysis quantifies connection networks and communities
by examining correlated motions in proteins and nucleic acids.[29,36−38] It generates detailed information on the
connections of amino acids in the form of communities and gives information about how
strongly the amino acids are connected and how much their motion is correlated. Figure shows various amino acid communities in the
dynamic network obtained from the last 100 ns of the MD simulation. Amino acids are
represented as nodes centered at the Cα, and the correlated motion
between the amino acids is represented by the weighted edge (bar) between the nodes. The
thickness of the edges represents the strength of the connections between the nodes. For
nodes to be in the same community, they need to have a connection for more than 80% of the
time.[29]
Figure 5
Dynamic network community analysis for (a) WT, (b) B.1.1.7, and (c) B.1.351 variants
from the last 100 ns of the MD simulation trajectory. Communities that span the
interface between the RBD and ACE2 are numbered 1–4 and shown in an expanded
view for (d) WT (tan, red; 16 interprotein connections), (e) B.1.1.7 (silver, green,
red; 18 interprotein connections), and (f) B.1351 (purple, gray, red; 18 interprotein
connections).
Dynamic network community analysis for (a) WT, (b) B.1.1.7, and (c) B.1.351 variants
from the last 100 ns of the MD simulation trajectory. Communities that span the
interface between the RBD and ACE2 are numbered 1–4 and shown in an expanded
view for (d) WT (tan, red; 16 interprotein connections), (e) B.1.1.7 (silver, green,
red; 18 interprotein connections), and (f) B.1351 (purple, gray, red; 18 interprotein
connections).Figure a–c shows various dynamic network
communities obtained for all variants, with each community colored differently. To
visualize the interprotein interactions between the RBD and the ACE2, the communities that
span over both of these proteins are highlighted in the expanded view of the interfacial
region shown in Figure d–f. Amino acids
present within the same community have highly correlated motions. This means that if a
community spreads over the interface to include both the RBD and ACE2, the flexibility of
the RBD with respect to the ACE2 is reduced. The amino acids in the RBD and ACE2 that
compose each community and their interprotein connections are given in Table S1. The decrease in interfacial flexibility is notable for the B.1.351
variant, which has more and stronger interprotein connections (larger edge weights),
compared to that in the WT and the B.1.1.7 variant. From these findings of the number of
interprotein connections and the strength of the connections, it appears that the N501Y,
E484K, and K417N mutations in the B.1.351 variant allow the formation of a stable
RBD–ACE2 complex.We further explored the conformational motions of these RBD–ACE2 complexes by
using principal component analysis (PCA) with the 200 ns of the MD simulations. PCA is
helpful in analyzing the motion of complicated systems with many particles and many
internal degrees of freedom. The principal components describe the axes of maximal
variance of the distribution of the structure. Eigenvectors represent coordinated motions
in the fluctuations in atomic positions during the MD simulation, and the magnitude of the
fluctuations of atoms is given by the corresponding eigenvalues. The first few
eigenvectors represent the largest motion.Figure S4 shows the range of motion from the principal components (PCs) 1
and 2 for each amino acid for the WT, B.1.17, and B.1.351 variants. The blurrier RBD in
Figure S4a implies a greater range of motion for the WT, compared to that of
the B.1.351 variant in Figure S4c. The conformational motion of the first three PCA eigenvectors
PC1, PC2, and PC3 was projected onto 2D subspaces, as shown in Figure S4d–f. The conformational state of the system in each MD frame
is represented by the blue and red dots. These distributions show that the WT complex has
a greater conformational range than that of the mutated systems, with the B.1.351 showing
more clustered PC1/PC2 and PC1/PC3. These findings are consistent with the results of the
hydrogen bond and dynamical network analyses.
Conclusions
The SARS-CoV-2spike protein is the major surface protein of the virion that is primarily
responsible for the virus’s ability to enter the human host through the ACE2
receptor. The interfacial region of the spikeRBD–ACE2 complex plays a major role in
the binding of the virus to the host cell. The emergence of mutated variants in the B.1.1.7
and B.1.351 raises concerns about alterations in the interprotein amino binding in this
complex as well as in the efficacy of vaccines designed for the WT. Here, we have performed
MD simulations and determined changes in the amino acids and their binding at the
RBD–ACE2 interface due to the mutations in three different variants. We find that the
RBD interfacial region shows a much larger flexibility in B.1.351 when not complexed with
ACE2 but a stronger binding in the complex. The observed changes in the number and strength
of the interprotein hydrogen bonds imply an enhanced specificity and binding in the B.1.351
variant, suggesting an optimized flexible interface that allows stronger binding via induced
fit mechanism for this variant. Consistent with the stronger interprotein binding, dynamical
network analysis and principal component analysis show that the RBD–ACE2 complex with
the B.1.351 variant is less flexible. The question of how the flexibility of the RBD and the
subtle changes in the interactions with ACE2 affect the virus’s transmissibility or
the vaccine efficacy for a variant is difficult to address but these studies provide
information on the molecular interactions of RBD–ACE2 complexes in different
variants, and it can be important for understanding the molecular mechanism of the virus
entry into the cell and designing therapeutic interventions. Further studies are needed to
understand the full scope of the effects resulting from the mutations or deletions. For
example, the mutations may affect trimer formation or trimer stability. In fact, it is
recently shown that the D614G mutation enhances the spike protein trimer stability and
prevents premature release of S1 subunit.[39] It will be useful to know how
the mutations affect the springing up of the RBD from the closed conformation that is
shielded with glycans[40,41] or how the mutations in the rest of the spike, including the ones in the
furin cleavage site,[42,43]
affect the mechanism of virus entry, infectivity, or immune evasion.
Authors: Paulina Kaplonek; Deniz Cizmeci; Stephanie Fischinger; Ai-Ris Collier; Todd Suscovich; Caitlyn Linde; Thomas Broge; Colin Mann; Fatima Amanat; Diana Dayal; Justin Rhee; Michael de St Aubin; Eric J Nilles; Elon R Musk; Anil S Menon; Erica Ollmann Saphire; Florian Krammer; Douglas A Lauffenburger; Dan H Barouch; Galit Alter Journal: bioRxiv Date: 2021-08-31