Aditya K Padhi (1), Soumya Lipsa Rath (2), Timir Tripathi (3). (1) Laboratory for Structural Bioinformatics, Center for Biosystems Dynamics Research, RIKEN, 1-7-22 Suehiro, Tsurumi, Yokohama, Kanagawa 230-0045, Japan. (2) Department of Biotechnology, National Institute of Technology, Warangal, Telangana 506004, India. (3) Molecular and Structural Biophysics Laboratory, Department of Biochemistry, North-Eastern Hill University, Shillong 793022, India.
Abstract
The COVID-19 pandemic has emerged as a global medico-socio-economic disaster. Given the lack of effective therapeutics against SARS-CoV-2, scientists are racing to disseminate suggestions for rapidly deployable therapeutic options, including drug repurposing and repositioning strategies. Molecular dynamics (MD) simulations have provided the opportunity to make rational scientific breakthroughs in a time of crisis. Advances in these technologies in recent years have made them an indispensable tool for scientists studying protein structure, function, dynamics, and interactions, as well as for drug discovery. Integrating the structural data obtained from high-resolution methods with MD simulations has helped in comprehending the process of infection and pathogenesis, as well as SARS-CoV-2 maturation in host cells, in a short time. It has also guided the identification and prioritization of drug targets and new chemical entities, and the repurposing of drugs. Here, we discuss how MD simulation has been explored by the scientific community to accelerate and guide translational research on SARS-CoV-2 in the past year. We also consider future research directions in which MD simulations can help fill the existing gaps in COVID-19 research.
Molecular dynamics (MD) simulation is a numerical method to study many-particle systems,
such as molecules, clusters, and even macroscopic systems like gases, liquids, and solids.
Broadly, it is a form of computer simulation in which atoms and molecules are allowed to
interact for a fixed time period; the classical equations of motion are solved numerically
for all atoms to obtain the time evolution of the system. The initial
success of MD simulation in materials science and chemical physics paved the way for its
application to the broad, then largely unexplored, field of biological sciences.[1] It represents an
interface between wet- and dry-lab and, therefore, is often described as a “virtual
microscope” with high temporal and spatial resolution. In principle, once the trajectories of
all particles in a system are known, the
thermodynamic, dynamic, and physicochemical properties of the molecules can be extracted and
analyzed. As biological macromolecules exert their functions due to their dynamic rather
than static nature, MD simulation serves as an ideal approach to investigate the range of
accessible configurations and conformations of biomolecules as a function of time by the
simultaneous integration of Newton’s equations of motion.[2] Over
the past decades, MD simulations have been utilized in numerous studies, starting from
understanding biomolecular structure–dynamics–function relationships,
conformational dynamics, allostery, drug design, and structure prediction refinement, to
understanding disease pathophysiologies by mimicking physiological conditions and generating
experimentally testable hypotheses and predictions (Figure 1).[3−6] Inevitably, performing biochemical experiments on severe acute
respiratory syndrome coronavirus 2 (SARS-CoV-2) is time-consuming and requires sophisticated
safety protocols. In comparison, computational studies are faster and easier to perform
and provide information that is sometimes challenging to obtain from wet-lab
experiments.[7] Thus, MD simulation has emerged as a common and natural choice
for investigating biomolecular interactions and conformational dynamics.
Multiscale coarse-grained models have been used to understand the behavior of the complete
SARS-CoV-2 virion.[8] The experimentally determined high-resolution 3-D
structures of SARS-CoV-2 proteins have been used for simulation studies to determine their
detailed mechanistic attributes and dynamics and identify conformational changes. The
combination of docking and MD simulation-based binding free energy calculations has proven
valuable in understanding protein–protein interaction and identifying potential
inhibitors. In addition, the MD studies have revealed crucial information on
virus–host interactions. All of this information has helped accelerate COVID-19
research and has improved our knowledge of SARS-CoV-2 biology. In this Perspective, we
discuss how MD simulation has been utilized to address various aspects of SARS-CoV-2-induced
pathogenesis, with the specific intent being to help fill the gaps in our understanding of
the new disease.
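The numerical integration of Newton's equations of motion described above can be illustrated with a minimal sketch. It is not taken from any of the cited studies: it assumes a single particle in one dimension attached to a harmonic spring, uses reduced units, and applies the velocity Verlet scheme, which is one commonly used MD integrator. The names `velocity_verlet`, `harmonic_force`, `K_SPRING`, and `MASS` are illustrative only.

```python
import math

def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Integrate m * a = F(x) for one particle in 1-D with velocity Verlet.

    Returns a list of (time, position, velocity) tuples, one per step.
    """
    trajectory = [(0.0, x, v)]
    f = force(x)
    for step in range(1, n_steps + 1):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # complete the velocity update
        trajectory.append((step * dt, x, v))
    return trajectory

# Toy system (an assumption for illustration): a harmonic "bond" in reduced units.
K_SPRING = 1.0
MASS = 1.0

def harmonic_force(x):
    return -K_SPRING * x

def total_energy(x, v):
    return 0.5 * MASS * v * v + 0.5 * K_SPRING * x * x

traj = velocity_verlet(x=1.0, v=0.0, force=harmonic_force,
                       mass=MASS, dt=0.01, n_steps=1000)

# Compare with the analytical solution x(t) = cos(omega * t) for this toy case.
omega = math.sqrt(K_SPRING / MASS)
t_end, x_end, v_end = traj[-1]
print("numerical x:", x_end, "analytical x:", math.cos(omega * t_end))
print("energy drift:", total_energy(x_end, v_end) - total_energy(1.0, 0.0))
```

In a biomolecular simulation the same update is applied to every atom in three dimensions, with forces derived from a force field rather than a single spring, and the time step is typically on the order of femtoseconds.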
Figure 1
MD simulation system for the SARS-CoV-2 Mpro. A simulation box with two
monomers of the Mpro dimer (PDB ID: 6LU7) is shown in blue and purple cartoons. Water is shown as
transparent, and ions K+ and Cl– are shown in tan and cyan
van der Waals spheres, respectively. Reproduced with permission from ref (71). Copyright 2020 American Chemical Society.
Protein Interactions and Conformational Dynamics
Perhaps the most crucial contribution of MD simulations to COVID-19 research has been their
ability to reveal the structural dynamics and conformational arrangements of the viral
proteins and associated protein–protein interactions. MD simulations have been
instrumental in studying the structure, flexibility, packing, and interactions of SARS-CoV-2
proteins. In this section, we discuss the application of MD simulations for obtaining
information on the structure and dynamics of viral proteins.
Spike Glycoprotein
Spike glycoprotein (S-protein), one of the most prominent structures of SARS-CoV-2, is
present on the surface of the virus envelope and helps in attaching to the target cell
receptor, specifically the angiotensin-converting enzyme 2 (ACE2) receptor. The S-protein
is club-shaped and exists as a trimer, with each monomer consisting of two domains (S1 and
S2). It is also heavily glycosylated with N-linked and O-linked glycans. While the
S2-domain is embedded in the viral membrane, the S1-domain is exposed to the surface.
Since the S-protein is involved in the viral attachment and entry, numerous studies have
been carried out to understand the structural dynamics associated with the interactions of
S-protein with the ACE2 receptor. The X-ray crystal structures of S-protein revealed
interesting and unique dynamics.[9−11] The
S-protein can exist in three conformational states: open (or up), semiopen, and closed (or
down). These conformations refer to the structure of the receptor-binding domain (RBD,
residues 319–541) found on the S1 region of the protein. The transition of the
S-protein from open to closed conformation occurs via a semiopen conformation. In the
closed form, large numbers of intermolecular salt bridges and hydrogen-bonded interactions
were present between the monomers, which reduced the overall dynamics of the RBD. However,
interactions were gradually lost when it traversed to the open conformation, as revealed
by a steered MD study.[12] Interestingly, residues that bind to ACE2 were
solvent accessible in the closed conformation, but they could not bind to ACE2 due to
steric hindrance. The closed and open conformations occupied two different energy wells in
the free energy landscape, whereas the semiopen conformation had the intermediate energy
well. The comparison of 83 different β-coronavirus S-proteins revealed that the
incidence of an open and closed conformation was dependent on the interdomain contacts,
which also determine the surface antigenicity.
SARS-CoV-2 is highly adaptable to external environmental conditions; the S-protein
acquires the closed conformation at high temperatures and a more open conformation between
temperatures 20 and 40 °C, indicating temperature-sensitivity.[13]
The influence of the environment can also be ascertained from the well-known D614G mutation, which changes the conformation of RBD, allowing it to form better interactions
with ACE2. Introducing mutations that alter interdomain interactions altered the
RBD-domain, at times exposing its immunogenic receptor-binding regions.[14] Understanding such dynamics is crucial for the rational design of S-protein and ACE2
decoys for therapeutic applications.[15] The S1- and S2-domains have a
tiny contact area, which allows considerable conformational flexibility in S1. The RBD,
N-terminal domain (NTD, residues 14–305), and subdomains of S1 move along with the
connector domain as a single subunit. The connector domain comprises three
“hinges” called the “hip, knee, and ankle” regions. The hinges
allow the large RBD and the NTD of S1 to adjust their height and proximity to ACE2
receptors for stronger binding. A 2.5-μs-long MD simulation of the S-protein
revealed that the head region of the S-protein remained largely stable, but the connector
region with the three hinges was flexible.[16] Although the
spike’s head is heavily tilted, the movement of the hinge region allows the heads
to connect accurately to the membranes. Notably, the hinges are heavily N-glycosylated
(90% glycosylation), which protects them from antibody recognition and binding. Molecular
simulations and network modeling studies characterized the dynamics and mobility features
of the distinct functional states and revealed that the bending dynamics of the stalk
region helped the virus to scan the cell surface more efficiently[17]
(Figure 2). Another all-atom MD analysis
further identified the mechanisms of SARS-CoV-2 membrane fusion and showed that a trimeric
unit of SARS-CoV-2 S-protein was efficient in triggering initial stages of membrane
fusion, where the residues 816–855 of S-protein represent the fusion
peptide.[18]
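The free energy landscape picture used above, with separate wells for the closed, semiopen, and open states, is commonly estimated from how often a simulation visits each value of a chosen coordinate, using G = -kT ln(p / p_max). The following is a minimal sketch of that relation only. The coordinate (a hypothetical "RBD opening angle"), the sample values, and the function name `free_energy_profile` are invented for illustration and are not data or code from the cited studies.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)

def free_energy_profile(samples, lo, hi, n_bins, temperature=300.0):
    """Histogram a 1-D coordinate and convert populations to free energies.

    Returns a list of (bin_center, G) with G in kcal/mol relative to the
    most populated bin; empty bins are assigned infinity.
    """
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for s in samples:
        if lo <= s < hi:
            index = min(int((s - lo) / width), n_bins - 1)
            counts[index] += 1
    max_count = max(counts)
    if max_count == 0:
        raise ValueError("no samples fall inside [lo, hi)")
    kT = K_B * temperature
    profile = []
    for i, c in enumerate(counts):
        center = lo + (i + 0.5) * width
        g = -kT * math.log(c / max_count) if c > 0 else math.inf
        profile.append((center, g))
    return profile

# Invented toy samples: 60 "closed", 10 "semiopen", and 30 "open" frames.
toy_angles = [10.0] * 60 + [30.0] * 10 + [50.0] * 30
profile = free_energy_profile(toy_angles, lo=0.0, hi=60.0, n_bins=3)
for center, g in profile:
    print(f"angle ~ {center:5.1f}  G = {g:.3f} kcal/mol")
```

With real trajectories the populations are only meaningful if the states are sampled reversibly, which is why enhanced-sampling or steered approaches are often needed for slow transitions such as RBD opening.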
Figure 2
Functional dynamics and analysis of collective motions in the locked, closed, and
open states of the SARS-CoV-2 S-trimer prefusion form. (A) Mean-square fluctuations of
the locked state averaged over the three lowest-frequency modes for the cryo-EM
structure of the disulfide-stabilized SARS-CoV-2 S-trimer. (B) Essential mobility
profiles averaged over the three lowest-frequency modes for the cryo-EM structure of
the S-trimer in the closed state and (C) the open state are shown. Structural maps of
the essential mobility profiles for the locked state of the S-prefusion trimer (D),
closed state (E), and open state (F) are shown. Reproduced with permission from ref
(17). Copyright 2020 American Chemical
Society.
The SARS-CoV-2 S-protein is heavily glycosylated, with 22 N-linked and 4 O-linked
glycosylation sites. The heavy glycosylation helps the virus to evade the humoral immune
response of the host. The MD simulation studies helped significantly with understanding
the distribution and role of glycans in the S-protein.[19,20] While the stalk region is heavily
glycosylated, shielding it from possible antibodies, the RBD has glycan holes providing an
opportunity for the antibodies to bind. Only 62% of the RBD region is shielded from
antibodies compared to 90% of the stalk region. Amaro et al. performed a microsecond MD
simulation of the fully glycosylated S-protein on a realistic membrane system and
identified that the S1-domain has relatively fewer glycans; however, they play a
significant role in the S-protein binding and dynamics.[19] The NTD also
has fewer glycans, providing greater access to its epitopes. The glycan mutants N165A and
N234A shifted the conformation of the S-protein from an open to a closed state, thereby
affecting the protein function. These residues are crucial for stabilizing the open
conformation of the RBD-domain.[19] The N234 glycan stabilizes the open
conformation, whereas the N165 glycan resists the open-to-closed transition. Network
analysis also revealed that these glycans have a high betweenness centrality, which aids in the
“scissoring” motion between the NTD of the two monomers of the S-protein
trimer (Figure 3).[19]
Additionally, all-atom MD simulations of the up and down forms of a fully glycosylated
S-protein of SARS-CoV-2 revealed key interdomain interactions responsible for stabilizing
each form and inducing large-scale conformational transitions.[21]
Moreover, the huge size of the S-protein creates challenges in understanding its key
conformational transitions. To address this, Li and co-workers developed a reliable
simulation protocol that considers many short simulations, which resulted in obtaining
insights into the opening of the fusion peptide on a submicrosecond time-scale following
cleavage at the S2′ site.[22]
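Betweenness centrality, mentioned above in the network analysis of the glycans, counts how many shortest paths between other nodes pass through a given node. The sketch below implements Brandes' algorithm for an unweighted, undirected graph using only the standard library. The small graph and its node labels are hypothetical and do not correspond to any residue or glycan network from the cited work.

```python
from collections import deque

def betweenness_centrality(adj):
    """Unnormalized betweenness centrality (Brandes' algorithm).

    adj maps each node to a list of its neighbors (undirected graph).
    """
    bc = {v: 0.0 for v in adj}
    for s in adj:
        stack = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}   # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s] = 1
        dist[s] = 0
        queue = deque([s])
        while queue:                  # breadth-first search from s
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        while stack:                  # accumulate dependencies in reverse order
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: c / 2.0 for v, c in bc.items()}

# Hypothetical graph: two tightly connected clusters joined through node "G".
toy_graph = {
    "A1": ["A2", "A3"],
    "A2": ["A1", "A3"],
    "A3": ["A1", "A2", "G"],
    "G":  ["A3", "B1"],
    "B1": ["G", "B2", "B3"],
    "B2": ["B1", "B3"],
    "B3": ["B1", "B2"],
}
bc = betweenness_centrality(toy_graph)
print(bc)
```

In this toy graph the bridging node "G" scores highest because every path between the two clusters must pass through it, which is the sense in which a high-betweenness node mediates communication between domains.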
Figure 3
Glycan shield of the RBD-ACE2 interacting region. The accessible surface area of the
receptor-binding motif (RBM-A) and the area shielded by neighboring glycans in the
closed (A) and open (D) states. The values have been averaged across replicas and are
reported with standard deviation. Highlighted in cyan is the RBM-A area that remains
accessible in the presence of glycans, which is also graphically depicted on the
structure in the panels located below the plots. Molecular representation of closed
and open systems from the top (B and E, respectively) and side (C and F, respectively)
views. In panels E and F, RBD within chain A (cyan) is in the “up”
conformation and emerges from the glycan shield. Reproduced with permission from ref
(19). Copyright 2020 American Chemical
Society.
A comparison of the interface of the S-protein and ACE2 receptor of SARS-CoV and
SARS-CoV-2 using MD simulations revealed that the interfacial residues of the SARS-CoV S-protein were mutated when compared to those of SARS-CoV-2. Interestingly, an
approximately 1.5-fold increase was observed in the interaction energy between the
proteins after mutation. The contact between the S-protein and ACE2 receptor was studied
from a detailed contact map constructed from data obtained from all-atom MD simulations.
It was observed that the total interaction energies between SARS-CoV-2 S-protein and ACE2
receptor primarily increased due to the structural rearrangements, which increased the
electrostatic and van der Waals contacts. The surface electrostatics of SARS-CoV-2 showed
an increase in the negative charge on the S1-domain. Although the overall dynamics of
S-protein in SARS-CoV-2/ACE2 and SARS-CoV/ACE2 receptors were largely similar, the loop
dynamics in SARS-CoV-2 showed a higher correlation toward ACE2. Simulations also revealed
that ACE2 rearranges its structure to fit well with the S-protein.[23]
Millisecond MD simulation studies showed that mutations in the SARS-CoV-2 S-protein
present an evolutionary advantage for the virus. The new conformation permits the
S-protein to readily bind to the ACE2 at the cost of very little entropy, leading to an
increased rate of infectivity of the SARS-CoV-2 virus. Several crucial hydrogen bonds and
salt bridges have been observed in the RBD-ACE2 interface of SARS-CoV-2, which result
in a more favorable binding free energy; however, they were absent in SARS-CoV.[24−26]
Detailed MD and free energy simulations of the binary complexes of the RBD-domains of
SARS-CoV and SARS-CoV-2 with ACE2 revealed that enhanced binding of SARS-CoV-2 RBD to ACE2
occurs through complex networks of hydrogen-bonding and hydrophobic interactions. The
results elegantly showed the dynamic fluctuations and interplay of differential
hydrophobic contacts on one side of the RBD and hydrogen-bonding network extended to the
opposite end of RBD in SARS-CoV-2 (Figure 4).[27] Additionally, a key mutation at the 417 position
resulted in greater electrostatic complementarity.[27] Collectively,
these MD-derived data suggested that the interactions of the specific residues of
SARS-CoV-2 S-protein and corresponding human ACE2 residues were due to an extensive
network of hydrogen bonds and a large surface of noncovalent
interactions.[27,28]
In another study, researchers used an integrated approach combining coarse-grained MD
simulations with protein stability- and perturbation-based network analyses to examine the
role of mutations in the SARS-CoV-2 S-protein trimer to explain the dynamics and
allosteric regulation in great detail.[29]
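The interface contact maps discussed above are typically built by checking, frame by frame, which residue pairs across the two proteins lie within a distance cutoff and then averaging over the trajectory. The sketch below shows that bookkeeping under simplifying assumptions: each residue is represented by a single point (for example a C-alpha position), the 8 Å cutoff is a common but arbitrary choice, and the coordinates are invented. The function names are illustrative and not from the cited studies.

```python
import math

def contact_map(coords_a, coords_b, cutoff=8.0):
    """Boolean matrix: True where residue i of A is within cutoff of residue j of B."""
    return [[math.dist(a, b) <= cutoff for b in coords_b] for a in coords_a]

def contact_frequency(frames, cutoff=8.0):
    """Fraction of frames in which each A-B residue pair is in contact.

    frames is a list of (coords_a, coords_b) snapshots; each coords_* is a
    list of (x, y, z) tuples, one point per residue.
    """
    n_a = len(frames[0][0])
    n_b = len(frames[0][1])
    counts = [[0.0] * n_b for _ in range(n_a)]
    for coords_a, coords_b in frames:
        cmap = contact_map(coords_a, coords_b, cutoff)
        for i in range(n_a):
            for j in range(n_b):
                if cmap[i][j]:
                    counts[i][j] += 1.0
    n_frames = len(frames)
    return [[c / n_frames for c in row] for row in counts]

# Invented two-frame "trajectory" with two residues per protein (coordinates in Å).
toy_frames = [
    ([(0.0, 0.0, 0.0), (20.0, 0.0, 0.0)], [(5.0, 0.0, 0.0), (40.0, 0.0, 0.0)]),
    ([(0.0, 0.0, 0.0), (20.0, 0.0, 0.0)], [(6.0, 0.0, 0.0), (26.0, 0.0, 0.0)]),
]
freq = contact_frequency(toy_frames, cutoff=8.0)
print(freq)
```

Pairs with a frequency close to 1 are persistent contacts, while intermediate values indicate transient interactions of the kind associated with flexible interface loops.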
Figure 4
Characteristic dynamic fluctuations of the RBD-ACE2 complexes of SARS-CoV-2 and
SARS-CoV. Fluctuations depicted by the two lowest-frequency principal components, PC1
and PC2 (A), and dynamic conformations projected onto the two PC vectors (C) are
shown. In panel B, the RMSD and surface contact areas of the RBM in both complexes
during the 200 ns MD simulations are shown. In panel D, the tilt angles of the two
RBD-ACE2 complexes near the center of the N-terminal helix to the centers of mass
(COM) for RBD and ACE2 are shown, respectively. Reproduced from ref (27), open-access article distributed under CC-BY.
Copyright 2020 National Academy of Sciences, USA.
The analysis of atomistic MD simulations of glycosylated S-protein helped identify the
protein energetics and immunoreactive potential of different subdomains in the
S-protein.[30] Using energy decomposition and the matrix of low
coupling energies, different immunoreactive subdomains were identified, including the RBD,
regions corresponding to NTD, the central part of the S1A-domain, and a highly reactive
carbohydrate cluster in the S2-domain. A few other potential epitopes for antibody design
were also identified by analyzing the top 5% weakly coupled residue pairs on the
S-protein.[30]
One of the prominent genetic variations that emerged in SARS-CoV-2 is the insertion of a
novel furin cleavage site, which is absent in its predecessors. The dynamics of furin
protease binding to the furin cleavage site of S-protein were studied using protein
docking and MD simulations. The studies showed that the furin molecules bind at the middle
of the S-trimer at the adjacent sides. Subsequently, the furin protease binds tightly to
the S-protein/furin complex by burying a vast surface area. The protease residues that
interact with furin molecules were well-placed in the complex. A large number of van der
Waals and hydrogen-bonding interactions enabled the binding of protease to the S-protein.
When potential inhibitors were added along with furin, the resultant protease/S-protein
complex was found to have stronger binding affinities mediated by hydrogen-bond and polar
interactions and thus could act as a possible drug target.[31]
Main Protease
After the initiation and assembly of the SARS-CoV-2, proteolytic processing generates
functional subunits of the virus. The central proteolytic enzyme is the main protease
(Mpro), also called the 3C-like protease (3CLpro). The
Mpro is a homodimer comprising three structural domains (domain I, II, and
III). Domains I and II are made up of β-sheets and form the main catalytic subunit,
whereas domain III is a compact α-helical domain connected to domain II by a long
linker loop. The Mpro enzymes of SARS-CoV and SARS-CoV-2 share 96% sequence
identity. Although domains I and II are crucial for the catalytic activity, deletion of
domain III has been found to prevent the dimerization of the enzyme and significantly
reduce the enzyme’s activity. Like the other proteases, the catalytic activity of
Mpro depends on the generation of oxyanion holes, mediated by residues
Cys145, His41, and other catalytic residues having specificity for a Gln present in the
peptide substrate. To understand the molecular mechanism of Mpro action, a
2-μs-long MD simulation of the SARS-CoV-2 Mpro in various conformations
(monomer, dimer, with a short model peptide substrate, and without peptide substrate) was
performed.[32] The data revealed that mutations in the protease induced
rearrangements of domain II and domain III. The salt bridge interactions observed in
wild-type protease significantly changed in the mutants. The rearrangement of the domains
unfavorably impacts the binding of a model peptide substrate (mimicking the polyprotein
sequence recognized at the active site). The monomer form bound to the substrate
rigidifies the structure and prevents rotation of domain III, facilitating enzyme
dimerization required for the catalytic activity. A comparison of the dimer in peptide
bound and unbound states revealed a hydrophobic cluster that allosterically controls the
domain III rotation. Further analysis of the simulations indicated a 3₁₀
helical twist in the monomer near the enzyme active site. This reversible twist structure
blocks the peptide’s accessibility to active site residues in the monomeric form.
However, the active site residues were observed to be completely solvent-exposed in the
dimeric state.[32]
Comparison of the MD simulation-based analyses of protein residue interaction networks
between SARS-CoV and SARS-CoV-2 Mpro revealed that the overall topological
properties of both the structures differ only by 5%.[33] The SARS-CoV-2
Mpro is, on average, 1900% more sensitive than the SARS-CoV Mpro in
transmitting small structural changes across the complete protein through long-range
interactions. Thus, minor structural changes in SARS-CoV-2 Mpro lead to an
enormous increase in enzyme efficiency by transmitting small changes across the
network of long-range amino acid residue interactions in the structure (Figure 5).[33] The network of proteases in the presence of inhibitors
constructed from crystal structures revealed that although long-range transmission
occurred in the protease, most of the effect was concentrated around the inhibitor binding
site, specifically the catalytic Cys145 residue.[33] Although structural
changes were not evident, hidden complexity was deciphered using protein residue networks.
The elastic network models of the SARS-CoV-2 Mpro supported the long-range
interactions. Moreover, the protease has a highly dynamic and flexible structure. The
analysis also hinted that although the wild-type protease does not show cooperativity, it
could be introduced by mutations. The study helped identify novel residues in the protein,
which dynamically control the active site microenvironment and can be explored to design
noncompetitive inhibitors.[34] Additionally, eight residues around the
dimer interface of Mpro were found to be involved in regulating the enzyme
activity. Multiple pH-dependent MD simulations revealed the role of solvent pH on the
Mpro activity. The maximum number of native contacts and the optimal
concentration of the catalytic triad formed at pH 7. At pH above or below 7, the integrity
of the Mpro structure was seriously compromised.[35]
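The residue-network comparison above relies on centrality measures computed from a contact graph. As a rough illustration, the sketch below computes the standard subgraph centrality, defined as the diagonal of the matrix exponential of the adjacency matrix, SC(i) = sum over k of (A^k)_ii / k!. This is an assumption-laden toy: it uses a truncated Taylor series suitable only for small graphs, a three-node path graph instead of a protein, and it does not reproduce the long-range (LR) variant used in ref 33.

```python
import math

def subgraph_centrality(adjacency, n_terms=40):
    """Diagonal of exp(A) via a truncated Taylor series (small graphs only)."""
    n = len(adjacency)
    # term holds A^k / k!, starting from the identity matrix (k = 0).
    term = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total = [row[:] for row in term]
    for k in range(1, n_terms + 1):
        term = [[sum(term[i][m] * adjacency[m][j] for m in range(n)) / k
                 for j in range(n)]
                for i in range(n)]
        for i in range(n):
            for j in range(n):
                total[i][j] += term[i][j]
    return [total[i][i] for i in range(n)]

# Toy graph: a path of three nodes, 0 - 1 - 2.
path_graph = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
centrality = subgraph_centrality(path_graph)
print(centrality)

# For this path graph the central node has the closed form cosh(sqrt(2)).
print(math.cosh(math.sqrt(2.0)))
```

For networks the size of a protein, an eigendecomposition of the adjacency matrix is the usual route to the same quantity, since a truncated series converges slowly when the largest eigenvalue is large.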
Figure 5
Illustration of the Mpro topology. Subgraph (A, C) and LR
subgraph (B, D) centralities of the amino acid residues of chain A of SARS-CoV
Mpro (top) and of SARS-CoV-2 Mpro (bottom) are shown. Reproduced with
permission from ref (33). Copyright 2020 AIP
Publishing.
Nucleocapsid
The SARS-CoV-2 nucleocapsid (N-protein) is one of the crucial proteins involved in the
packaging of the viral genome. It can homo-oligomerize and associate with other viral
proteins or with the viral genome. The N-protein phase separates with the viral RNA,
indicating a strong correlation. Structurally, the N-protein has five domains: N- and
C-terminal domains (NTD and CTD), RNA binding domain, a central linker domain, and a
dimerization domain. The NTD and central domain are intrinsically disordered in structure.
Apart from the dimerization domain, the RNA binding domain can also form higher oligomers.
The structure and dynamics of the N protein were determined using single-molecule
spectroscopy in combination with atomistic MD simulations. The NTD and the RNA binding
domain are closely associated with each other. The central domain showed rapid
rearrangements, resulting in no links between the dimerization and the RNA binding
domains; however, the CTD and dimerization domain interacted with each other.[36] Computational data further revealed the formation of multiple transient
helices in the protein. A transient helix H3 from the central linker region was involved
in the RNA binding, while the adjacent transient helix H4 was involved in
protein–protein interactions. Similarly, in the CTD, the helices supported the
binding of the N-protein to RNA, membranes, or other proteins.[36] A
study using NMR-restraint-driven docking simulations combined with mutational analysis of
N-protein NTD revealed that the region is positively charged and can bind to both
single-stranded and double-stranded RNA molecules, thereby providing detailed insights
into RNA recognition. However, the binding surface was U-shaped, indicating that
single-stranded RNA would adopt a half-turn conformation for optimal binding to the
protein.[37]
Nonstructural Proteins
Apart from structural proteins, SARS-CoV-2 encodes several nonstructural proteins (NSP1
to NSP16) that are also known to perform various important functions.[38]
Several studies have utilized MD simulations to investigate the effect and binding modes
of approved and repurposed drugs, natural bioactive and metabolite molecules, and
inhibitors against NSPs.[39−45]
However, as they are at the early stages of possible therapeutics development, it is
outside the scope of the present perspective to discuss them in detail.
Effect of Mutations
Within one year of their discovery, the SARS-CoV-2 proteins have accumulated a vast number
of mutations, leading to the emergence of various strains. Understanding the roles of
mutations is critical in establishing infectivity, pathogenicity, and subsequent drug and
vaccine development. Toward this, several MD simulation-based studies revealed crucial
information. The MD simulations of 50 Mpro mutations of SARS-CoV-2 at the early
pandemic stage showed that several important residues, such as Gly15, Val157, and Pro184,
mutated more than once in SARS-CoV-2. The data also suggested that the introduction of the Glu48 mutation instead of Asp48 resulted in a novel “TSEEMLN” loop at the binding
pocket, and the residue Phe140 widens the substrate-binding surface.[46]
Another exciting study demonstrated that a key mutation at position 417 of the S-protein to
Lys417 establishes a salt bridge in the hydrophobic interface of the RBD-ACE2, resulting in
a higher electrostatic complementarity as well as enhanced hydrophobic packing as a result
of the elimination of four proline residues in the interacting loop of the SARS-CoV-2
complex.[27]
The effect of the Asp to Gly mutation at residue 614 (D614G) on the structure and thermodynamic
stability of the S-protein was analyzed by MD simulations, which revealed that the mutation
introduced structural mobility, decreased thermal stability, and further established a
strong binding affinity with furin as compared to the wild-type.[47] Using
large-scale atomistic MD simulations, it was shown that mutations distal from the RBD of the
S-protein affect the transmissibility of SARS-CoV-2. For instance, certain missense
mutations, although located 10 nm away from the RBD, enhanced the electrostatic and
hydration interactions between RBD and ACE2 (Figure 6).[48] Another study employed MD simulations and network
modeling and identified the regulatory centers of allosteric interactions for distinct
functional states of the wild-type and missense variants of the prefusion S-protein trimer
of SARS-CoV-2.[17] The mechanism of the enhanced binding affinity of
the N501Y mutant of the SARS-CoV-2 S-protein for the ACE2 receptor was investigated using extensive
MD simulations, where the hydrophobic interactions were found to be crucial for the
increased binding energy.[49]
Figure 6
Distal mutants 1 and 2 both weaken the binding affinity between the RBD and ACE2. (A)
Final simulation snapshot of the wild-type S-protein trimer. (B) Contact region between
the wild-type S-protein and ACE2, where the eight RBD-ACE2 intermolecular hydrogen bonds
are highlighted. (C, D) Similar to panel B, except for the mutants 1 and 2,
respectively. (E) Potential energy between the SARS-CoV-2 RBD and ACE2. (F) The number
of intermolecular hydrogen bonds (HB) between the RBD and ACE2. In panels E and F, the
averages and standard deviations are calculated from five parallel runs. In mutant 1,
the N679SPRRA684 residues were removed, and mutant 2 is denoted by
E682E683. Reproduced with permission from ref (48). Copyright 2020 American Chemical Society.
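The averages and standard deviations over five parallel runs reported in panels E and F can be computed along the following lines. This is a minimal sketch and not the procedure of ref (48): the function name `replica_stats`, the reduction of each run to its time average, and the use of the sample standard deviation are assumptions, and the numbers are synthetic placeholders rather than data from the study.

```python
import statistics


def replica_stats(runs):
    """Return (mean, sd) of a quantity across independent MD runs.

    runs: list of per-run time series (e.g., H-bond counts per frame).
    Each run is first reduced to its time average; the mean and the
    sample standard deviation are then taken across the runs.
    """
    if len(runs) < 2:
        raise ValueError("need at least two runs for a standard deviation")
    per_run_means = [statistics.fmean(series) for series in runs]
    return statistics.fmean(per_run_means), statistics.stdev(per_run_means)


# Synthetic placeholder data: five runs, four frames each (NOT from ref 48).
hbond_counts = [
    [8, 7, 9, 8],
    [7, 7, 8, 6],
    [9, 8, 8, 9],
    [6, 7, 7, 8],
    [8, 8, 7, 9],
]
mean, sd = replica_stats(hbond_counts)
print(f"RBD-ACE2 hydrogen bonds: {mean:.2f} +/- {sd:.2f}")
```

Treating each run as one independent sample, rather than pooling all frames, avoids underestimating the spread, because consecutive frames within a single trajectory are correlated.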
In a related study conducted to better understand the stability of the RBD/ACE2 complex,
extensive MD simulations revealed the effect of mutations identified in the RBD of
SARS-CoV-2 strains isolated from humans. The study demonstrated that most of the naturally
occurring mutations in the RBD confer a higher binding affinity for ACE2 when compared to the
wild-type Wuhan reference genome (NC_045512.2) (http://cov-glue.cvr.gla.ac.uk/#/home),
thereby highlighting the crucial role of certain hotspot residues at the interface of RBD
and ACE2.[50] On the other hand, it is important to understand the effect
of naturally occurring missense mutations in human ACE2 on the RBD-ACE2 interaction. To
address this, multiple computational approaches, including MD simulations, were used; they
showed that specific mutations affect the affinity between ACE2 and the S1 domain through
long-range interactions and through structural dynamics and conformational events arising
from the closed and open states of the S-protein of SARS-CoV-2.[51] An
artificial intelligence-based model showed the incidence of mutations based on duration,
dispersal, and frequency of occurrences.[52]
Drug Design and Discovery
All-atom MD simulation has been at the forefront of in silico
techniques used in drug design and discovery against COVID-19. The MD simulations have been
widely employed in understanding the viral protein–drug and protein–inhibitor
interactions, evaluating their binding affinities, energetics, and stability. Since the
beginning of the COVID-19 pandemic, researchers across the globe have extensively worked on
exploring and prioritizing hundreds of thousands of inhibitors and potential drugs (including
repurposed drugs) using MD simulations. Several repurposed and investigational drugs have
been proposed to target the key SARS-CoV-2 proteins, including (i) S-protein, (ii)
RNA-dependent RNA polymerase (RdRp), and (iii) Mpro/3CLpro.
Remdesivir, which targets the viral RdRp and inhibits viral RNA synthesis while evading
proofreading, has been studied
in detail using MD simulations. It is an adenosine nucleotide analogue that was
developed against the Ebola virus. During the early days of the pandemic, the binding of
remdesivir to the SARS-CoV-2 RdRp was investigated using MD simulations, and free energy
perturbation methods were used to understand its inhibition mechanism.[53]
The binding of remdesivir to the SARS-CoV-2 RdRp was significantly stronger than that of its
natural substrate, ATP. MD simulations coupled with ensemble docking of the apo and
remdesivir-bound RdRp revealed the blocking of template entry sites in the presence of
remdesivir. Analysis of the principal components suggested significant conformational
changes leading to the establishment of strong contacts between several catalytic residues,
including Ser759, Asp760, and Asp761.[54] Later, using molecular docking,
steered MD, and umbrella sampling, it was shown that remdesivir could strongly bind to both
SARS-CoV-2 RdRp (via electrostatic interactions) and Mpro (via van der Waals
interactions).[55]
MD simulations combined with free energy and fragment molecular orbital calculations showed
that the anti-HIV drugs, lopinavir and ritonavir, bind to the active site residues of
SARS-CoV-2 Mpro.[56] Ritonavir formed a higher number of
contacts than lopinavir and bound efficiently through several specific electrostatic,
dispersion, and charge-transfer interactions. Extensive MD simulations of the dimeric Mpro
revealed the dynamics and binding mechanisms of seven HIV inhibitors (darunavir, indinavir,
lopinavir, nelfinavir, ritonavir, saquinavir, and tipranavir); binding to the active site of
Mpro was observed at least twice in 28 simulations of 200 ns each. The microsecond-long
time-scale simulations further revealed a wide variation in the geometry of the binding
sites in Mpro and binding poses of the HIV-1 protease inhibitors. The
Mpro has a relatively flexible pocket when bound to inhibitors. Three different
binding sites were identified, suggesting that the inhibitor binding should not be
restricted only to the enzyme’s active site. The drugs were seen flipping and
changing their binding poses during MD simulations. The binding of the drugs also induced
conformational changes in the C-terminal of Mpro, which participates in
stabilizing the substrate or the ligand binding. Such data provide an opportunity to improve
and optimize the drugs for stronger binding and specificity for inhibiting the SARS-CoV-2 Mpro.[57]
An extensive MD simulation study on the anti-influenza drug umifenovir (Arbidol)
demonstrated that Arbidol binds at the RBD/ACE2 interface with a high affinity by forming
stronger intermolecular interactions with key residues of RBD compared to that with ACE2. It
was proposed that the binding of Arbidol induces structural rigidity in the virus
glycoprotein, resulting in restriction of the conformational rearrangements associated with
membrane attachment and virus entry.[58] In a related study, MD simulations
and structural analyses were carried out to highlight that Arbidol blocks the S-protein
trimerization, which is crucial for host cell adhesion and hijacking.[59] A
supervised MD study further explored the druggability of the SARS-CoV-2 S-protein and
showed that six FDA-approved drugs were capable of significantly blocking the RBD/ACE2
interaction.[60] A recent report employing all-atom MD simulations and
binding enthalpy calculations showed that an active site inhibitor (MLN-4760) could reduce
the adherence of S-protein with ACE2 by weakening and destabilizing the interactions at the
ACE2-RBD interface of SARS-CoV-2.[61] A similar work studied the effect of
small molecules SSAA09E2 and Nilotinib on the ACE2-RBD interface through MD simulations and
found that they interfere with hydrogen bonds at the interface and hence affect the flexibility of
the proteins.[62] Interestingly, stapled peptides that target the RBD of SARS-CoV-2 have
also been designed; analysis through MD simulations and binding free energy calculations
showed that they bind with a potency similar to
that of experimentally proven peptide inhibitors.[63]
Several drug repurposing studies have identified a plethora of potential drugs against
Mpro. The FDA-approved antivirals, lopinavir, ritonavir, tipranavir, and
raltegravir, and several HIV protease inhibitors (HPIs) were identified as potential
Mpro inhibitors.[64,65] An e-pharmacophore approach combined with MD simulations revealed that
drugs like binifibrate and bamifylline bind to the active site of Mpro.[66] A further study screened 1615 FDA-approved drugs and 4266 other approved
drugs, using an array of computational methods in combination with MD simulations, and found
that simeprevir (anti-HCV drug) and pyronaridine (antimalarial drug) bind
Mpro.[67] Several other drugs, including the anti-HIV drugs
indinavir and darunavir, also interact with Mpro.[68,69] Another study explored the mechanism
of covalent inhibition of Mpro by an α-ketoamide inhibitor.[70] Multiple MD simulations and theoretical approaches were used to investigate
the binding mechanisms of 19 marketed drugs to the ligand-binding pocket of Mpro.
The data demonstrated that ligand binding to the Mpro pocket could be improved if
a part of the ligand occupies a specific anchor site. This finding provides an opportunity
to explore new binding mechanisms and designs and optimize the related Mpro
inhibitors with higher binding affinities.[71] In addition, using MD
simulations and free energy calculations, several studies showed that natural compounds,
including polyphenols, flavones, coumarin, and green tea extract, could bind to the
Mpro.[53,55,72] Pyranonigrin A, a secondary fungal metabolite, was also
shown to bind to the Mpro with a high affinity.[73] The MD
simulations also showed that ligands, such as vitamins, retinoids, and steroids, could bind
to the free fatty acid pocket of the S-protein.[74] A free tool,
interactive MD in virtual reality (iMD-VR), has also been developed to create
Mpro complexes. The iMD-VR allows the users to perform flexible docking of the
Mpro inhibitors and oligopeptide substrates, interact with MD simulations, and
build protein complexes in a physically rigorous and flexible manner.[75]
The MD simulation-based lead identification studies on other SARS-CoV-2 proteins are also
available (reviewed in ref (76)). Additionally,
multitarget studies have also been performed, where MD simulations were used to understand
the binding dynamics and affinity of the inhibitors (reviewed in refs (76 and 77)).
However, a critical limitation of most of the studies is that they were not confirmed using
wet-lab approaches. Nonetheless, they provided significant information about the active site
interaction and dynamics of the enzymes and guided the improvement and optimization in the
design of the existing drugs for improved binding and inhibition.
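Many of the studies above rank compounds by computed binding free energies. To make the notion of "stronger binding" concrete, a difference in binding free energy can be converted into a ratio of dissociation constants through the standard relation ΔG = RT ln(Kd/c°). The sketch below is illustrative only: the function name `affinity_fold_change` and the default temperature of 298.15 K are assumptions, and the example value of -1.0 kcal/mol is hypothetical rather than taken from any cited study.

```python
import math

R_KCAL = 1.987204e-3  # molar gas constant in kcal/(mol*K)


def affinity_fold_change(ddg_kcal, temperature=298.15):
    """Fold change in affinity implied by ddG = dG_A - dG_B (kcal/mol).

    A negative ddG means ligand A binds more strongly than ligand B.
    The returned value is the ratio Kd_B / Kd_A.
    """
    return math.exp(-ddg_kcal / (R_KCAL * temperature))


# Hypothetical example: ligand A is favored by 1.0 kcal/mol over ligand B.
print(f"{affinity_fold_change(-1.0):.1f}-fold tighter binding")
```

Because the relation is exponential, an uncertainty in a computed free energy translates into a multiplicative uncertainty in the predicted affinity, which is one reason the lack of wet-lab confirmation noted above matters.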
Design of Protein-Based Therapeutics
MD simulations have also been used to validate novel computationally designed therapeutics,
such as the design of ACE2 decoys.[78] A high-throughput study used a
de novo protein design approach to construct 35 000 ACE2 decoys
computationally; the binding of the top-ranked 196 designs to the SARS-CoV-2 RBD was tested
experimentally (Figure 7).[78] The
data revealed that certain decoys were able to strongly neutralize a SARS-CoV-2 infection
in vitro. Notably, a single intranasal dose of decoy protected Syrian
hamsters from a subsequent lethal SARS-CoV-2 challenge.[78] In another
study, RBD-based peptides were computationally designed, and their ACE2 binding and
inhibiting potential were analyzed.[79]
Figure 7
Design and characterization of de novo ACE2 decoys. (A) ACE2 and its
binding motifs in complex with SARS-CoV-2 RBD are shown. (B) De novo
secondary structure elements were computationally generated. Seven combinations of the
secondary structure elements were considered. Reproduced with permission from ref
(78). Copyright 2020 The American Association
for the Advancement of Science.
Applications of Supercomputer-Assisted MD Simulations
Not only high-throughput computational methodologies but also massive large-scale hardware
has been effectively utilized to understand the molecular mechanisms of SARS-CoV-2-triggered
infection. Massive-scale MD simulations using state-of-the-art supercomputer machines have
been used to gain insights into the biology of SARS-CoV-2. The Amaro lab at the University
of California, San Diego used ∼250 000 processing cores and ∼4000
processor nodes to run MD simulations on one of the world’s top supercomputers named
“Frontera” to complete an all-atom simulation of the influenza virus
envelope.[80] Similarly, researchers at the RIKEN Center for Biosystems
Dynamics Research, Japan, used a drug discovery supercomputer MDGRAPE-4A to analyze the
structure-dynamics relationship of the Mpro of SARS-CoV-2 (https://www.riken.jp/en/news_pubs/news/2020/20200323_1/). Their MD model comprised
98 694 atoms, including 29 712 water molecules, and was simulated for a total of 10 μs.
The raw simulation data were made publicly
available on the Mendeley Data repository for researchers across the globe (https://data.mendeley.com/datasets/vpps4vhryg/2). Furthermore, researchers at the
Department of Energy’s Oak Ridge National Laboratory used a supercomputer called
“Summit” to perform MD simulations on 8000 compounds to screen for potent
binders to S-protein and identified 77 potential small-molecule drug compounds.[81] Interestingly, two research groups used
the Anton 2 supercomputer: one, at the University of Arkansas, to understand the activation process of the SARS-CoV-2 S-protein (https://www.psc.edu/tag/covid/, https://www.psc.edu/psc-covid-19-update-june-15-2020/), and the other,
at the University of California, Riverside, to
evaluate how the CRISPR-Cas gene-editing system recognizes the genetic material of
SARS-CoV-2 with microsecond-long MD simulations (https://insideucr.ucr.edu/stories/2020/06/17/uc-riverside-engineers-are-using-supercomputers-investigate-rapid-crispr-based).
Remarkably, over a million citizen scientists performed an unprecedented 0.1 s of MD
simulations through the Folding@home computing project to create the world’s first
Exascale computer and simulate protein dynamics of SARS-CoV-2.[82] This, in
turn, revealed how the S-protein of SARS-CoV-2 uses conformational masking to evade host
immunity and subsequently identified the hidden cryptic pockets that were not captured or
were extremely difficult to capture in experiments. A consortium of high-performance
computing (HPC) for research on COVID-19 was constituted in early 2020. The consortium
houses some of the largest supercomputers from academia, federal agencies, industries, and
national laboratories from around the world. It actively supports researchers by providing
computational grants for accelerating their research on COVID-19. Currently, more than 100
different research projects on COVID-19 are part of the consortium (https://covid19-hpc-consortium.org/).
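The simulation lengths quoted in this review can be put in perspective by converting them into numbers of integration steps. The following sketch does this arithmetic with integer femtoseconds. The 2 fs timestep is an assumption chosen only for illustration (the timesteps actually used in the cited studies are not stated here), and the helper name `n_steps` is hypothetical.

```python
FS_PER_NS = 10**6   # femtoseconds per nanosecond
FS_PER_US = 10**9   # femtoseconds per microsecond
FS_PER_S = 10**15   # femtoseconds per second


def n_steps(total_fs, timestep_fs=2):
    """Integration steps needed to cover total_fs of simulated time.

    timestep_fs defaults to 2 fs, an assumed (not source-stated) value.
    """
    return total_fs // timestep_fs


# Simulated times quoted in the text, expressed in femtoseconds.
mpro_hiv_total_fs = 28 * 200 * FS_PER_NS  # 28 simulations of 200 ns each
mdgrape_total_fs = 10 * FS_PER_US         # 10 us of Mpro on MDGRAPE-4A
folding_total_fs = FS_PER_S // 10         # 0.1 s aggregate on Folding@home

for label, total in [
    ("Dimeric Mpro with HIV inhibitors", mpro_hiv_total_fs),
    ("MDGRAPE-4A Mpro", mdgrape_total_fs),
    ("Folding@home aggregate", folding_total_fs),
]:
    print(f"{label}: {total / FS_PER_US:g} us, {n_steps(total):.2e} steps")
```

Under this assumed timestep, the Folding@home aggregate corresponds to four orders of magnitude more steps than the 10 μs MDGRAPE-4A data set, although it is spread over many independent trajectories rather than a single continuous one.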
Conclusion and Future Perspectives
Science has never been more dynamic than it is in the current pandemic. Thousands of
researchers in academia and the pharmaceutical industry throughout the world paused their
research curiosity and began working on COVID-19, rendering science CORONA-ized. The way the
pandemic resulted in a shift in scientific priorities is unprecedented. Many bold and
creative approaches helped develop several vaccines against COVID-19, and immunization
started globally at a record-breaking speed. However, the deployment of COVID-19 vaccines
should not deflect us from learning the biology of SARS-CoV-2, and the search for an
effective long-term therapeutic strategy should be our top priority.
The COVID-19 pandemic offered researchers the opportunity to explore high-performance
computational methods, including MD simulation, as a technological prospect for biological
discoveries of SARS-CoV-2. With the increasing availability of technical infrastructures and
computational power, MD simulations can be performed and managed even from off-site. Thus,
understanding the biology of the virus has become an integral part of COVID-19 research.
Today, a large number of scientific publications are available that systematically used MD
simulations to provide a comprehensive understanding of the molecular mechanism of
SARS-CoV-2 infection. However, SARS-CoV-2 research will continue for many more years to
come, and as expected, with time, the computing power, resources, and mathematical basis of
simulations will keep increasing. These methods will be exploited by computational
biologists and are likely to emerge as a strong and supplementary pillar for the mechanistic
understanding of COVID-19. The future of COVID-19 research promises many exciting
opportunities for unsolved problems where MD simulation can prove its worth.
We underline a few suggestions and directions where MD simulations may be helpful for
COVID-19 research.
More than 1400
structures (crystals, cryo-EM, and others) of SARS-CoV-2 proteins have been deposited
in the Protein Data Bank in the past 18 months. However, these structures have been
solved with great speed and under immense pressure at the time of crisis. Thus, it is
highly possible that the structures can contain errors. Even minor errors in structure
calculation may severely compromise the process of structure-based drug design as the
potential inhibitor can be misinterpreted as biologically and pharmaceutically
relevant. The Coronavirus Structural Task Force is continuously working to evaluate,
improve, and remodel the structures (https://insidecorona.net/). Additional analyses of these structures through
computer simulations may provide significant biological insights into their
conformational dynamics and functional
interactions.
A few simulation studies
have identified allosteric sites and cryptic pockets in certain SARS-CoV-2 proteins.
However, the full set of allosteric inhibitor binding sites and cryptic pockets across the
SARS-CoV-2 proteins remains to be explored. More efforts are thus required for the purpose of
drug design.
In the future, many
antivirals will be developed. It is imperative to exhaustively study their inhibitory
action using MD simulations. Pharmacophore modeling in combination with MD simulations
can help develop better and more potent
antivirals.
As new variants of
SARS-CoV-2 are emerging and will continue to emerge, an understanding of the
structure–function–disease relationship of new variants will be required
for the management of COVID-19 and future disease outbreaks. In such studies, MD
simulations can provide crucial information that will complement the results of
biochemical experiments. The variation in functions of the potential mutants can also
be studied further using MD simulations by characterizing their structure and
dynamics. The MD simulations can also be used to analyze possible residues that, upon
mutation, can render the protein structurally compatible for target-based drug
design.
With the advancement of the
pandemic, SARS-CoV-2 may develop resistance mutations in various proteins as a
response to drug or immune pressure by undergoing positive selection. Using
high-throughput protein design approaches, a few studies have identified potential
residues that could mutate in a short time to develop drug resistance in the
virus.[83,84]
Validation of such data using MD simulations will be helpful to predict mutations that
may emerge in the future. Resistance against various antibiotics is being reported due
to their excessive use to prevent secondary infection in COVID-19 patients. Such
aspects should be investigated by multiple computational methods, including MD
simulations.
Molecular dynamics
simulations in combination with other assistive technologies might be helpful to
assess the contributions of the genetic profile of a patient and viral genome
variability to the differential clinical outcomes of COVID-19 patients.
An initial coarse-grained
molecular model of the SARS-CoV-2 virion has been developed. However, these were early
results. Additional whole-cell simulations of the complete virion and the various
virion components should be performed. Information from holistic models of the virion
and its components can help understand new routes to tackle the virus by exploiting
viral mechanisms involving large-scale features. Additionally, the interplay between
proteins and genetic components will provide information for the development of
defense strategies, which, in turn, will help in preventing damage from similar
pandemics in the future.