Antonio Francés-Monerris1,2, Cécilia Hognon1, Tom Miclot1,3, Cristina García-Iriepa4,5, Isabel Iriepa5,6, Alessio Terenzi3, Stéphanie Grandemange7, Giampaolo Barone3, Marco Marazzi4,5, Antonio Monari1. 1. Université de Lorraine and CNRS, LPCT UMR 7019, F-54000 Nancy, France. 2. Departament de Química Física, Universitat de València, 46100 Burjassot, Spain. 3. Department of Biological, Chemical and Pharmaceutical Sciences and Technologies, Università degli Studi di Palermo, Viale delle Scienze Ed. 17, 90128 Palermo, Italy. 4. Department of Analytical Chemistry, Physical Chemistry and Chemical Engineering, Universidad de Alcalá, Ctra. Madrid-Barcelona, Km 33,600, 28871 Alcalá de Henares, Madrid, Spain. 5. Chemical Research Institute "Andrés M. del Río" (IQAR), Universidad de Alcalá, 28871 Alcalá de Henares, Madrid, Spain. 6. Department of Organic and Inorganic Chemistry, Universidad de Alcalá, Ctra. Madrid-Barcelona, Km 33,600, 28871 Alcalá de Henares, Madrid, Spain. 7. Université de Lorraine and CNRS, CRAN UMR 7039, F-5400 Nancy, France.
Abstract
The emergence in late 2019 of the coronavirus SARS-CoV-2 has resulted in the breakthrough of the COVID-19 pandemic that is presently affecting a growing number of countries. The development of the pandemic has also prompted an unprecedented effort of the scientific community to understand the molecular bases of the virus infection and to propose rational drug design strategies able to alleviate the serious COVID-19 morbidity. In this context, a strong synergy between the structural biophysics and molecular modeling and simulation communities has emerged, resolving at the atomistic level the crucial protein apparatus of the virus and revealing the dynamic aspects of key viral processes. In this Review, we focus on how in silico studies have contributed to the understanding of the SARS-CoV-2 infection mechanism and the proposal of novel and original agents to inhibit the viral key functioning. This Review deals with the SARS-CoV-2 spike protein, including the mode of action that this structural protein uses to entry human cells, as well as with nonstructural viral proteins, focusing the attention on the most studied proteases and also proposing alternative mechanisms involving some of its domains, such as the SARS unique domain. We demonstrate that molecular modeling and simulation represent an effective approach to gather information on key biological processes and thus guide rational molecular design strategies.
The emergence in late 2019 of the coronavirusSARS-CoV-2 has resulted in the breakthrough of the COVID-19 pandemic that is presently affecting a growing number of countries. The development of the pandemic has also prompted an unprecedented effort of the scientific community to understand the molecular bases of the virus infection and to propose rational drug design strategies able to alleviate the serious COVID-19 morbidity. In this context, a strong synergy between the structural biophysics and molecular modeling and simulation communities has emerged, resolving at the atomistic level the crucial protein apparatus of the virus and revealing the dynamic aspects of key viral processes. In this Review, we focus on how in silico studies have contributed to the understanding of the SARS-CoV-2 infection mechanism and the proposal of novel and original agents to inhibit the viral key functioning. This Review deals with the SARS-CoV-2spike protein, including the mode of action that this structural protein uses to entry human cells, as well as with nonstructural viral proteins, focusing the attention on the most studied proteases and also proposing alternative mechanisms involving some of its domains, such as the SARS unique domain. We demonstrate that molecular modeling and simulation represent an effective approach to gather information on key biological processes and thus guide rational molecular design strategies.
A new coronavirus,
which emerged in China at the end of 2019 and
was soon named SARS-CoV-2, was recognized as the causative agent of
severe acute respiratory syndrome (SARS) that in some cases evolved
into severe pneumonia.[1−3] Later, it was also recognized that SARS-CoV-2 was
able to cross the interspecies barrier[4] and was also characterized by a high transmissibility between humans.[5] SARS-CoV-2 has been recognized as the agent responsible
for the coronavirus disease 19 (COVID-19), which, after harshly striking
Asia and later Europe and the America, has presently evolved into
a widespread pandemic.Although the mortalityratio of COVID-19
is relatively low,[6] <2% in younger patients,
it increases to 40–60%
in subjects older than 70. The high contagion rate of COVID-19 is
also favored by the fact that some patients are totally asymptomatic[7,8] and hence actively participate in the virus diffusion. Because of
the lack of a vaccine or an efficient antiviral treatment, almost
all countries have introduced social distancing measures to palliate
the virus diffusion that are significantly affecting both economic
and social well-being.[9]The scientific
community has been particularly active in answering
the call and taking the challenge of characterizing the main mechanisms
related to the COVID-19 diffusion, as will be clearly demonstrated
in the present Review. In addition to the modeling of the macroscale
epidemic diffusion and of the effects of social distancing on its
spread,[10−12] a particular focus has also been devoted to the elucidation
of the molecular mechanisms related to the SARS-CoV-2, including the
infection of the host cell and its reproduction. Indeed, such a characterization
is propaedeutic to rational drug design and repurposing strategies
that should result in the rapid development of efficient antiviral
agents. Much effort has been put into the resolution of the key protein
structure of the SARS-CoV-2 capsid as well as of the adducts leading
to the interaction with human receptors, mostly performed either via
X-ray diffraction or via cryo-electron microscopy (CryoEM).[13−16]Some classes of proteins are particularly important in dictating
the global behavior of the virus, its life cycle, and, ultimately,
its pathogenicity and infectivity rate.[17,18] The recognition
of the host cell and its entry are mediated by the SARS-CoV-2spike
(S) protein.[19,20] The latter is a trimeric multidomain
glycoprotein that is able to specifically recognize the human angiotensin
converting enzyme 2 (ACE2) via its receptor binding domain (RBD).[21] The conformational changes induced by the RBD/ACE2
interactions ultimately result in the viral and cellular lipid membrane
fusion and hence in cell infection. Consequently, the RBD and ACE2
represent ideal targets to counteract the SARS-CoV-2 infection. Furthermore,
considering that ACE2 is present in lung cells and is involved in
the regulation of blood pressure, ACE2 impairment by the coronavirus
is one of the possible causes of severe respiratory stress and possible
complications in hypertensepatients.After cell entry, the
viral RNA is translated into nonfunctional
global polyproteins that should be precisely processed to yield the
separate final structural proteins (as the S protein) and nonstructural
proteins (Nsp’s). This fundamental step is assured by proteases,[22] which are themselves Nsp’s, and because
of the strategic role of this step in assuring the maturation and
replication of the virus, it also represents a key therapeutic target
to inhibit SARS-CoV-2 propagation. More specifically, two main proteases
have been identified in SARS-CoV-2, the 3-chymotrypsin-like protease
(3CLpro) and the papain-like protease (PLpro).[23,24] Whereas the catalytic activity of both proteases is similar, important
structural differences should be pointed out, with their active form
being either dimeric or monomeric. Furthermore, some proteases may
also interact with ubiquitine-like sites, becoming involved in altering
the regulation of the cellular signaling, which hampers the response
of the host immune system.The capacity of thiscoronavirus
in eluding the innate immune response
of the host is one of the reasons for the high pathogenicity and transmissibility.
This capacity can be partially ascribed to the role played by large
Nsp’s, which are recognized to interact with the ribosome,
altering the translation of the host cell.[25,26] Nsp’s are generally multidomain proteins, and it has been
pointed out that a specific domain present in SARS viruses, named
the SARS unique domain (SUD), contributes to eluding the immune response
by interacting with host mRNA, either in its native conformation or
in a guanine quadruplex (G-quadruplex) arrangement.[27,28] Potential drugs able to interfere with such mechanisms could be
instrumental in boosting the capacity of the host cell to eliminate
the viral material.From the previous brief and largely nonexhaustive
summary, it is
clear that possible drug targets are diverse, from both a structural
and a functional point of view. In this context, the possibilities
offered by up-to-date molecular modeling and simulation approaches
are of great interest and value. Indeed, high-throughput molecular
simulations, and, in particular, all-atom molecular dynamics (MD),
allow us to represent complex phenomena taking place between complex
systems at an atomic-scale resolution. They have proven to be extremely
efficient in modeling the interactions and the related conformational
reorganization between proteins, nucleic acids, and lipid membranes.[29−31] Molecular simulations have allowed us to resolve the complex processes
related to, among others, enzymatic catalysis[32−35] and DNA lesion production and
repair[36−40] and to unravel the key mechanisms of passive and active membrane
transporters.[41−46] These successes have also been made possible by the impressive development
of computational algorithms, including enhanced sampling and free-energy
methods,[47] which nowadays allow us to simulate
the behavior of systems of hundreds of thousands of atoms up to the
microsecond time scale.[31,48]Furthermore,
large-scale simulations based on the molecular modeling
approach, providing further insight into the structure and functions
of SARS-CoV-2 and hence offering an original strategy to counteract
its actions, have recently been reported by Amaro,[49,50] Chodera,[51,52] and collaborators in different
reviews and perspectives.In the present Review, we critically
analyze some of the most important
results obtained by molecular modeling and simulations in understanding
the key functions of SARS-CoV-2 proteins and in proposing possible
novel antiviral agents, from both a methodological and a biological
perspective. This Review also illustrates the success and maturity
of computational approaches in offering original responses to complex
biological problems.In this respect, although we are aware
of the fact that different
large-scale simulation projects are still being actively pursued,
the effort of the scientific community has already produced a large
amount of results and data since the beginning of the pandemic, especially
in molecular simulations. This offers a deeper vision of the viral
basic mechanisms and also suggests strategies for drug development
or repurposing. One of the aims of our Review is hence to provide
an arena to critically analyze and contextualize all of those efforts
and also resituate them in the perspective of the relationship with
the structural biology and pharmaceutical chemistry communities.
SARS-CoV-2 Spike Glycoprotein:
Structure and Dynamics of Coronaviruses Trojan Horse
Following
the identification of the new pathogen as a bat SARS-type
betacoronavirus in January 2020,[53] the
structure of the SARS-CoV-2 S protein complexed with the human zinc
peptidase ACE2 was rapidly resolved at the atomic level by several
research groups.[13,14,19,54−57] Because of its fundamental function,
the S protein has become a promising target to fight the SARS-CoV-2infection. As shown in this section, the ultimate goal is to use this
knowledge to rationally design vaccines, neutralizing antibodies,
and antiviral agents with the general purpose of blocking S-protein
recognition and preventing cellular infection.
Structure, Biology, and
Dynamics of the Spike Glycoprotein
Enveloped viruses such
as SARS-CoV-2 need to fuse with the host
cellular membrane to gain entry into the cell.[58] This step, although thermodynamically favorable, has an
important energetic barrier and therefore is catalyzed by viral fusion
proteins expressed on the viral envelope, namely, the S proteins.
Such proteins also determine the viral host range, that is, the spectrum
of cells they can infect, the tissue tropism, and the induction of
the host immune response.[20]The viral
S transmembrane protein is a class I fusion protein composed of three
segments: an ectodomain, a single-pass transmembrane anchor, and a
short tail inside the virus (Figure A).[13,20] The ectodomain is constituted
by a trimer of S1 subunits, whose main function is to recognize the
host receptor, connected to the viral envelope through an S2 subunit,
in charge of fusing the viral and host membranes. Prior to the host
recognition, the SARS-CoV-2 S protein is in a metastable prefusion
state.[13] When a subunit S1 binds to a host
ACE2 receptor by means of the specific RBD region, the prefusion trimer
of S1 subunits is destabilized, inducing a relatively large conformational
change that leads the system to the postfusion state.[20] Afterward, the S1 subunits are released while the S2 subunit
produces the fusion of the cellular and viral membranes, allowing
the insertion of the viral genome into the host cell.[20] The RBD subunit oscillates between two hinge-like conformations,
namely, “up/open” and “down/closed” (see Figure A).[13] Whereas the former conformation is more flexible[59,60] and is able to bind the ACE2 receptor, in the down state, the RBD
is hidden or buried, is more rigid, and is unable to recognize the
host cell. Interestingly, it has been reported that different from
the former SARS-CoV, the down state prevails in the SARS-CoV-2 RBD,
contributing to hiding the highly immunogenic RBD surface.[61] It has been suggested that this feature contributes
to the evasion of the host immune system surveillance, delaying the
innate immune response and giving rise to longer infective periods
that increase the spread of the contagion.[61] The partial loss of the potential infectivity of SARS-CoV-2 due
to the preference of the RBD down conformation is compensated by a
higher affinity of the RBD up state toward ACE2[13,62,63] and by a preactivation furin-sensitive cleavage
site, which is absent in the previous SARS-CoV, that can facilitate
the crossing of the interspecies barrier.[61] Henderson et al. have reported a fine control of these up/down configurations
and discussed the implications in the development of vaccines and
in the evolution of the SARS-CoV-2 pathogenicity.[64]
Figure 1
(A) Structure of the SARS-CoV-2 spike protein. S1 = receptor binding
subunit, S2 = membrane fusion subunit, RBD = receptor binding domain,
TA = transmembrane anchor, IT = intracellular tail. Inspired by ref (61). (B) Negative-stain electron
microscopy image of an ACE2 receptor bound to a SARS-CoV-2 spike protein
(left) and diagram of the microscopy image (right). Reprinted with
permission from ref (13). Copyright 2020 American Association for the Advancement of Science.
(A) Structure of the SARS-CoV-2spike protein. S1 = receptor binding
subunit, S2 = membrane fusion subunit, RBD = receptor binding domain,
TA = transmembrane anchor, IT = intracellular tail. Inspired by ref (61). (B) Negative-stain electron
microscopy image of an ACE2 receptor bound to a SARS-CoV-2spike protein
(left) and diagram of the microscopy image (right). Reprinted with
permission from ref (13). Copyright 2020 American Association for the Advancement of Science.Recently, Tian and Tao[60] applied a combination
of the Markov state model, transition path theory, and machine learning
methodologies to analyze two independent MD trajectories of the up
and down states performed by Shaw and coworkers.[65] The authors characterized the transition between the two
SARS-CoV-2 RBD states and quantified their relative prevalence, revealing
a stunning 95.5% preference for the down state, in agreement with
experimental observations.[61] Moreover,
multiple paths involving different intermediates leading from the
down to the up state were identified and quantified in terms of probability.[60] The most probable channel is illustrated in Figure A and has been estimated
to have a probability of the 23.7%. On the contrary, Gur et al. have
estimated the free-energy landscape connected within the down-to-up
transition, highlighting the role of salt bridges and solvent accessibility
and identifying an intermediate semiopen state (see Figure B).[66] The existence of the intermediate state also participates in lowering
the global free-energy barriers that should be overcome during the
transition. The energetic penalty of the rate-limiting step has been
estimated to be ∼3kBT. On the contrary, Xu and coworkers[67] have
analyzed the distribution of the bending angle between the S1 and
S2 subunits, evidencing that an angle of at least 52.5° is required
to recognize ACE2.
Figure 2
(A) Most probable transition (23.7%) from a “down”
to an “up” state in the SARS-CoV-2 RBD. The color code
is blue (down) → red (down) → green (up) → orange
(up) → yellow (up). ACE2 is represented in gray. Reprinted
with permission from ref (60). Copyright 2020 Informa UK Limited. (B) Energy landscape
for the down → up transition of SARS-CoV-2 RBD. Reprinted with
permission from ref (66). Copyright 2020 AIP Publishing.
(A) Most probable transition (23.7%) from a “down”
to an “up” state in the SARS-CoV-2 RBD. The color code
is blue (down) → red (down) → green (up) → orange
(up) → yellow (up). ACE2 is represented in gray. Reprinted
with permission from ref (60). Copyright 2020 Informa UK Limited. (B) Energy landscape
for the down → up transition of SARS-CoV-2 RBD. Reprinted with
permission from ref (66). Copyright 2020 AIP Publishing.Like in other viruses, such as HIV, the S protein of SARS-CoV-2
is extensively glycosylated.[68−70] Glycan shields play an important
role in protecting epitopes from antibody neutralization, contributing
to the eluding of the host immune response and diminishing the efficiency
of therapeutic agents. Therefore, a detailed knowledge of the glycosylation
sites and their dynamics is a prerequisite to understand the infection
and to develop vaccines and other treatments. Grant et al.[71] have performed extended MD simulations to study
the role of the S-proteinglycan shield in modulating the antigen
exposition and the accessibility of known antibodies to their corresponding
antigens. The authors estimate that the glycosylation shields ca.
40% of the surface susceptible to being recognized by antibodies and
highlight the RBD region of the S1 subunit as the most accessible
and largest antigenic surface of the whole S protein. This result
is coherent with the necessary recognition of the ACE2 receptor by
the RBD and the consequent formation of the protein complex. Conversely,
it has been also suggested that the RBD domain might be O-glycosilated
at the Thr323 position,[70,72] an occurrence that
could also have implications in the RBD/ACE2 recognition mechanism
and in the design of possible inhibitors of this process. The precise
role of glycosylation is, however, still debated, which is also due
to the inherent difficulties in the experimental structural resolution,
and further studies are required to clarify this possibility. In this
context, a full-length model of the glycosylated SARS-CoV-2 S protein,
in both the open and closed states, was recently built by Amaro and
coworkers[73] to interpret, at an atomistic
level, the role of glycans on the spike protein structure and dynamics
using all-atom MD simulations in the microsecond range.Very
recently, the structure–function relationship of the
fully glycosylated SARS-CoV-2spike protein was unveiled by the decomposition
of unbiased MD simulations. In this way, spike substructures were
identified by an energy criterium to possibly act as antibody binding
sites.[74]
Molecular Basis of the
Human ACE2 Recognition
As recently
reported by Li and coworkers,[61] the binding
mode of SARS-CoV-2 S to ACE2 is similar to the one of the SARS-CoV
S protein, even though the affinity of SARS-CoV-2 RBD is higher.[13,55,62,63,75,76]The
superior affinity of the SARS-CoV-2 RBD for humanACE2 compared with
other related coronaviruses, including SARS-CoV, was also confirmed
by different molecular modeling and simulation studies,[63,67,77−83] whereas other computational studies reported small or negligible
differences,[84−87] in contrast with the latest experimental findings.[13,19,55,61,62] The variability of the results can be explained,
in part, by differences in the specific MD setup, the choice of the
model systems, and the methods of analysis.A great deal of
attention has been devoted to the identification
of the key interactions that drive the RBD/ACE2 recognition in both
SARS-CoV and SARS-CoV-2 viruses. Coarse-grained[88] and biased MD simulations[89] indicate
that the process occurs without any stable intermediate. There is
consensus over the fact that noncovalent interactions, in particular,
an extended network of multiple hydrogen bonds, dominate the recognition.
Other interactions such as π-stacking or water/amino acid bridges
have also been deemed relevant.[63,89] García-Iriepa
et al. have performed all-atom MD simulations on top of the recently
resolved RBD/ACE2 complex,[14] quantifying
the most important hydrogen bonds that sustain the binding of the
viral RBD to the ACE2 peptidase domain (PD).[89] The most important protein–protein interactions involve amino
acids located at the α and β interfaces of the PD, as
shown in detail in Figure A. Whereas this feature is conserved in both SARS-CoV and
SARS-CoV-2,[19] Linial and coworkers demonstrated,
by means of MD simulations, substantial differences between SARS-CoV-2,
SARS-CoV, and the human coronavirusHCoV-NL63.[84] Indeed, SARS-CoV-2 has the largest interaction area involving
the largest number of RBD/ACE2 contacts,[81] whereas SARS-CoV interacts through fewer amino acids but via stronger
hot spots, and HCoV-NL63 RBD recognizes a different region of ACE2,
as schematically shown in Figure B.[84]
Figure 3
(A) Most important amino
acids involved in the SARS-CoV-2 RBD/ACE2
PD complexation. Amino acids color code: red (negatively charged),
blue (positively charged), green (polar), cyan (neutral His ε-protonated).
Reprinted from ref (89). (B) Schematic representations of the ACE2 recognition by the RBD
of different coronaviruses. Reprinted from ref (84). Copyright 2020 by the
authors. Licensee MDPI, Basel, Switzerland. Distributed under the
terms and conditions of the Creative Commons
Attribution (CCBY) license.
(A) Most important amino
acids involved in the SARS-CoV-2 RBD/ACE2PD complexation. Amino acids color code: red (negatively charged),
blue (positively charged), green (polar), cyan (neutral His ε-protonated).
Reprinted from ref (89). (B) Schematic representations of the ACE2 recognition by the RBD
of different coronaviruses. Reprinted from ref (84). Copyright 2020 by the
authors. Licensee MDPI, Basel, Switzerland. Distributed under the
terms and conditions of the Creative Commons
Attribution (CCBY) license.Using MD simulations, Spinello et al.[63] have built Pearson-based matrices to evaluate the cross-correlation
of the ACE2 residues with the RBD of SARS-CoV-2 and SARS-CoV, respectively.
A larger correlation for SARS-CoV-2 has been found, and the pivotal
role of some of its mutated residues (e.g., Asn501 and Phe486) in
increasing the strength of the interaction between the two protein
domains has been evidenced.[63] These results
globally support the hypothesis of the stronger binding of SARS-CoV-2
RBD to ACE2, compensating the dominant down conformations of the S1
subunits of the S protein.[61]
SARS-CoV-2
Mutations and Animal ACE2 Receptors
SARS-CoV-2
mutations potentially leading to multiple viral strains and hence
to the possible emergence of more pathogenic agents are also actively
studied in detail using in silico methodologies.
By analyzing 791 viral genomes of different SARS-CoV-2 strains, Alfaro
and coworkers[59] found that the S2 subunit
fusion is much less affected by mutations than the RBD. This fact
seems to indicate that changes in the protein units directly involved
in the membrane fusion may produce dysfunctional viruses that are
unable to infect host cells. Consequently, the S2 subunit should be
considered as a robust target for potential antiviral drugs given
its high conservation and low mutation tolerance.[59] Notwithstanding, certain residues of the RBD have also
been deemed indispensable for the efficient recognition of humanACE2.[79,90,91] On the contrary, Ou et al.[77] have reported that mutations in the RBD of SARS-CoV-2
observed from strains with different geographical distributions actually
enhance the binding to ACE2, indicating that some RBD polymorphs might
even be more infectious and hence increase the diffusion of the pandemic.The variability of ACE2 and hence the ability of SARS-CoV-2 to
infect other species, cross the interspecies barrier, and develop
zoonosis[92] has been computationally addressed
by simulating animal ACE2 proteins.[92−96] Pach et al.[94] have studied
several mutants of ACE2 belonging to cat, dog, ferret, hamster, mouse,
rat, and red squirrel by means of MD simulations. The authors found
molecular descriptors that could predict the susceptibility of these
animals to being infected, alerting that red squirrels could suffer
from SARS-CoV-2 infection. They also predicted that certain mutations
in dogs, rats, and mice could lead to effective SARS-CoV-2 infections.
These studies are also extremely relevant in the aim of finding or
creating suitable animal models for experimentation.[94] Furthermore, the possible influence of distinct ACE2 polymorphs
in the infectivity and severity of COVID-19 has been also reported.[97−99]
Rational Design of RBD/ACE2 Inhibitors
The capital
function of the viral S protein and the advances achieved in characterizing
the molecular bases of the RBD/ACE2 binding have motivated a considerable
number of experimental and computational studies, aiming at identifying
potential antiviral agents susceptible to inhibiting RBD/ACE2 recognition.[100−102] A commonly followed strategy, necessary also to cope with the societal
and sanitary urgency induced by the pandemic, is to repurpose already
approved drugs with known pharmacokinetics and toxicological profiles
to possibly obtain a significant antiviral effect.[103−107]Several known antivirals, including the famous remdesivir,
have been or are currently being tested against SARS-CoV-2,[108] with some of them undergoing clinical trials.[109−111] Whereas the mode of action usually consists of the inhibition of
the viral RNA polymerase, proteases, or neuraminidases,[112−115] the anti-influenza drug Arbidol[116] has
recently been proposed as an inhibitor of the trimerization at the
S2 subunit of the S protein.[117] On the
contrary, the efficacy of antimalarial drugs chloroquine and hydroxychloroquine,
which have been compassionately administered in severe COVID-19patients,
is being intensely debated.[118−121] Several modes of action have been hypothesized,
although the specific molecular mechanism or mechanisms responsible
for the in vitro inhibition of SARS-CoV-2 remain
unclear.[122] The benefit/risk balance of
the combination of hydroxychloroquine with the antibiotic azithromycin
has been questioned as well.[123−126] Fantini and coworkers[127] used MD simulations to study the molecular mechanism of
action of these combinations of drugs. The authors do not exclude
possible synergistic effects: whereas hydroxychloroquine is directed
toward the ACE2 protein of the host cell, azithromycin interacts with
the SARS-CoV-2 S protein. The binding site, however, is located not
at the RBD but at a ganglioside binding site in the lateral regions
of the S protein.[127,128] Other studies combining machine
learning and molecular docking did not point out remarkable interactions
between azithromycin and the viral S protein, although other glycopeptide
and macrolactam antibiotics showed inhibition potential.[129] Hence, no clear evidence unequivocally identifying
the molecular bases of the proposed therapeutic protocol can, at the
moment, be invoked.Flavonoids are a broad family of natural
compounds with antioxidant,
antibiotic, antinflammatory, anticancer, and antiviral properties.[130,131] Because of their known biological properties, general safety, and
broad availability, recent studies have tackled their potential activity
to block the SARS-CoV-2 RBD.[132−136] Among the different flavonoids, hesperidin (see the structure in Figure A), a compound present
in citrus fruits and other vegetables,[137] has shown the ability to block the RBD/ACE2 interaction (as well
as the main viral protease),[138] even motivating
large-scale extraction processes for massive production.[139]
Figure 4
(A) Chemical structure of different flavonoids. (B) Relevant
binding
hot spots of different flavonoids with human ACE2 including binding
energies (kcal/mol) and histogram of the distances between the RBD
and ACE2-PD centers of mass for (C) diosmin and (D) rutin. Adapted
from ref (89).
(A) Chemical structure of different flavonoids. (B) Relevant
binding
hot spots of different flavonoids with humanACE2 including binding
energies (kcal/mol) and histogram of the distances between the RBD
and ACE2-PD centers of mass for (C) diosmin and (D) rutin. Adapted
from ref (89).García-Iriepa et al. have studied
the potential ability
of the flavonoidsdiosmin, rutin, and naringin (Figure A) to inhibit RBD/ACE2 recognition by means
of both docking and all-atom MD simulations.[89] Molecular docking revealed three relevant binding hot spots on the
humanACE2 protein, the loop and the PDs at interface α and
interface β, with estimated binding affinities of ∼6–9
kcal/mol (Figure B).
The destabilization of the RBD/ACE2 complex was highlighted by MD
simulations, in particular, by measuring the distance between the
center of mass of the RBD and ACE2PD. Despite the promising binding
energies for diosmin, the MD results indicate the poor destabilization
of the RBD/ACE2 complex for thisflavonoid, given the limited distance
distributions with respect to the control (RBD/ACE2 complex in the
absence of drug; see Figure C). In contrast, the effect of rutin is more pronounced, yielding
larger RBD–ACE2 distances for all three binding hot spots (Figure D).[89] These results evidence that even though docking studies
are a first and necessary step for the preliminary assessment of a
potential drug activity against SARS-CoV-2, further analyses, including
extended MD simulations, are required to validate and quantify the
inhibition potentiality.A large number of antibiotics have
also been tested by means of
docking studies against the viral RBD.[140−143] Plicamycin (also known as mithramycin;
see the structure in Figure A) is clinically used as an anticancer agent[144,145] and shows a promising interaction with ACE2, in particular, at interface
β, a region that is directly involved in the recognition of
the viral RBD (see Figures B and 5B,C).[89]
Figure 5
(A)
Chemical structure of plicamycin (mithramycin). (B) Histogram
of the distances between the centers of mass of the RBD and the ACE2
PD measured from extended MD simulations. (C) Representative snapshots
of the two plicamycin conformations in equilibrium at the interface-β.
(D) Potential of mean force (PMF) corresponding to the RBD/ACE2 complexation
in the absence and in the presence of plicamycin. (E) Snapshots of
the complex in the presence of plicamycin at interface-β, at
its free-energy minimum (bound), and as separated structures. Reprinted
from ref (89).
(A)
Chemical structure of plicamycin (mithramycin). (B) Histogram
of the distances between the centers of mass of the RBD and the ACE2PD measured from extended MD simulations. (C) Representative snapshots
of the two plicamycin conformations in equilibrium at the interface-β.
(D) Potential of mean force (PMF) corresponding to the RBD/ACE2 complexation
in the absence and in the presence of plicamycin. (E) Snapshots of
the complex in the presence of plicamycin at interface-β, at
its free-energy minimum (bound), and as separated structures. Reprinted
from ref (89).The RBD/ACE2 complex destabilization was further
quantified by
determining the binding free energy using a combination of metadynamics[146] and adaptative biased force (eABF),[147] that is, the meta-eABF method.[148,149]The results of García-Iriepa et al. revealed
a 2 kcal/mol
destabilization of the RBD/ACE2 complex in the presence of the drug
as compared with the nondrugged situation. Furthermore, the RBD/ACE2
distance at the free-energy minimum is increased by ∼5 Å
with respect to the drug-free system, supporting the mode of action
of plicamycin as a possible antiviral against SARS-CoV-2 (Figure D,E).[89]An alternative strategy to inhibit the
RBD/ACE2 binding has been
recently proposed by Boldrini et al., in this case directed only toward
the human receptor.[150] The authors propose
to hamper the ACE2 maturation at a post-translational level by designing
inhibitors for specific folding intermediates. As a result, the downregulation
of ACE2 would limit the number of receptors susceptible to binding
the SARS-CoV-2 S protein and stopping the infection. The free-energy
profile corresponding to the folding process was computed through
biased MD simulations, revealing two intermediate conformations (early
and late intermediates; see Figure A). Only the late intermediate was found to be druggable,
for which two specific pockets were identified. A database of approved
drugs was tested to identify potential blockers of these two pockets
(Figure B). Among
the 35 hits, the authors identified mefloquine and hydroxychloroquine
as compounds showing possible anti-SARS-CoV-2 activity in
vitro (Figures C,D).[150] Despite the interest of this
strategy for antiviral purposes, care should be taken in examining
the possible side effects due to the decrease amount of active ACE2
receptors.
Figure 6
(A) Free-energy profile corresponding to the ACE2 folding. (B)
Druggable pockets of the late intermediate state, colored in magenta.
(C) Chemical structures of mefloquine and hydroxychloroquine, which
showed potential inhibition of the ACE2 late intermediate. (D) Snapshot
of mefloquine inside the binding pocket of the ACE2 late intermediate.
Reprinted and adapted from ref (150). Distributed under the terms and conditions
of the Creative
Commons Public Domain Dedication (CC0 1.0).
(A) Free-energy profile corresponding to the ACE2 folding. (B)
Druggable pockets of the late intermediate state, colored in magenta.
(C) Chemical structures of mefloquine and hydroxychloroquine, which
showed potential inhibition of the ACE2 late intermediate. (D) Snapshot
of mefloquine inside the binding pocket of the ACE2 late intermediate.
Reprinted and adapted from ref (150). Distributed under the terms and conditions
of the Creative
Commons Public Domain Dedication (CC0 1.0).The design of oligopeptides based on ACE2 sequences and able
to
bind to the RBD has been recently proposed to selectively target the
viral material.[151,152] Oligopeptides present several
advantages with respect to traditional small drugs: (i) They cover
large surfaces of the binding hot spots through multiple contacts,
(ii) the interactions are based on protein–protein recognition
and therefore are highly specific, (iii) because they are based on
humanACE2 sequences, they are unlikely to trigger unwanted immune
responses, and (iv) they can be potentially combined with nanoparticle
carriers.[153] By using extended MD simulations,
Han and Král[151] have studied four
peptides based on the ACE2PD (see Figure A–E), analyzed their stabilities through
the root-mean-square deviation (RMSD) values (Figure F), and assessed their ability to bind SARS-CoV-2
RBD by estimating the binding energies (Figure G). The sequences were designed to mimic
several regions of ACE2, although the majority of crucial amino acids
(up to 15) necessary for the RBD recognition belong to the α1-helix and are present in all prototypes. Globally, all peptides
showed very promising results, exhibiting similar binding energies
as compared with the full ACE2PD.[151] A
similar peptide was modeled by Pentelute and coworkers[152] and synthesized using automated fast-flow peptide
synthesis, whereas its specific binding affinity toward SARS-CoV-2
RBD was quantified experimentally through biolayer interferometry.
The peptide binds the viral RBD with a low 49 nM affinity; several
modifications that enhance the binding capacity have, however, been
proposed on the basis of MD simulations.[154,155]
Figure 7
(A–E)
Representation of inhibitors 1–4 and the control
c (α1-helix shown in red) complexed with the SARS-CoV-2
RBD (blue). (F) Analysis of the stabilities of 1–4 and c through
the root-mean-square deviation (RMSD) values. (G) Analysis of the
ability of 1–4 and c to bind SARS-CoV-2 RBD by estimating the
binding energies. Reprinted from ref (151).
(A–E)
Representation of inhibitors 1–4 and the control
c (α1-helix shown in red) complexed with the SARS-CoV-2
RBD (blue). (F) Analysis of the stabilities of 1–4 and c through
the root-mean-square deviation (RMSD) values. (G) Analysis of the
ability of 1–4 and c to bind SARS-CoV-2 RBD by estimating the
binding energies. Reprinted from ref (151).A strategy previously
used against SARS-CoV relies on the use of
peptides to bind the S2 subunit.[156] Oligopeptides
derived from the S2 heptad repeats 1 and 2 (HR1 and HR2, respectively)
have been proposed. HR1 and HR2 are known to associate to form a six-helix
bundle fusion core that is crucial to allow the cellular/viral membrane
fusion. Thus HR2 analogues able to effectively bind HR1 structures
would block the membrane fusion and therefore halt viral infection.
This strategy has been recently assessed for SARS-CoV-2 by Ling and
coworkers.[157] Indeed, MD simulations reveal
that the HR2-like peptides are able to bind HR1 with a free binding
energy of −33.4 kcal/mol. These results are also coherent with
the observed high conservation of the S2 genome between SARS-CoV and
SARS-CoV-2.Using a computational model of the complex between
the S-protein
of SARS-CoV-2 and the humanACE2 receptor, a supercomputer-based molecular
docking was recently performed by Smith and Smith,[158] revealing that 77 ligands bind strongly to either the S-protein/ACE2
interface or the isolated S protein.
SARS-CoV-2 Proteases
Although
the SARS-CoV-2 genome is rather large, being formed by
an RNA strand of ca. 30 000 nucleobases, it encodes a relatively
low number of proteins, which can be divided into structural and nonstructural
proteins (Figure ).[159] Whereas structural proteins are required to
generate viral particles, 16 nonstructural proteins (Nsp1–16)
are responsible for the diverse virus functionalities, mainly focusing
on replication and survival in the host cell. SARS-CoV-2, such as
other betacoronaviruses, translates, once inside the host cell, two
overlapping polyproteins (pp1a and pp1ab) that are responsible for
encoding all nonstructural and structural proteins. The latter evolve
into mature virions inside intermediate compartments formed by the
host endoplasmic reticulum and the Golgi apparatus before being excreted
to the extracellular region.[160] The viral
replication is initiated by the cleavage of both pp1a and pp1ab polyproteins[161,162] to generate all 16 nonstructural proteins.
Figure 8
SARS-CoV-2 genome structure, including structural
and nonstructural
proteins.
Initially, two
proteases encoded within the polyproteins—Nsp5,
corresponding to 3-chymotrypsin-like protease (3CLpro), and Nsp3,
corresponding to papain-like protease (PLpro)—undergo autocleavage.
Once generated, 3CLpro and PLpro cleave the polyprotein, including
RNA-dependent RNA polymerase (RdRp), helicase (Hel), exonuclease (exoN),
endoribonuclease (NendoU), and an S-adenosyl-methionine-dependent
ribose 2′-O-methyltransferase (2′-O-MT). Because of their positions within the polyprotein,
PLpro is responsible for the formation of Nsp1–3, whereas 3CLpro
accounts for the formation of the other Nsps (Nsp4–16).[163]Although all structural and nonstructural
proteins could be of
potential interest for drug repurposing (see, e.g., the potential
binding mechanism of remdesivir to RdRp)[164] due to their fundamental relevance in the viral replication process,
the following sections are dedicated to the modeling and simulation
of drugs that possibly inhibit 3CLpro and PLpro functions. Furthermore,
it is established that 3CLpro acts as a homodimer, although only one
of the monomers contains the active binding site.[165] On the contrary, PLpro, although proposed by crystallography[23] to be accessible in different aggregation states,
was shown to biologically act only as a monomer.
3CLpro Structure, Domains,
And Catalytic Active Site
As already mentioned, 3CLpro is
responsible for the catalytic cleavage
of at least 11 polyprotein sites, whereas PLpro cleaves only 3 of
them (Figure ). Because 3CLpro cleaves at more sites than
PLpro, the former has been the object of more structural and computational
studies and is a much better characterized target for anticoronaviral
agents.[166−168]SARS-CoV-2 genome structure, including structural
and nonstructural
proteins.To rationalize the design of potential
antiviral agents targeting
3CLpro, it is essential to define the protease structural details
such as its catalytic active site, domains, and quaternary structure.
In this regard, SARS-CoV-23CLpro significantly resembles the analogous
SARS-CoV enzyme, sharing a 96% sequence identity.[24] The active 3CLpro homodimer consists of two protomers,
each of them composed of fewer than 310 residues (Figure a). Each protomer can be divided
in three different domains (Figure b).[165,167,169,170] The dimerization takes place
via the interaction between the N-finger and domain II of the protomers
and is essential to ensure the catalytic activity.[165,169,171] Indeed, the shape of the substrate
binding site is strongly influenced by the dimerization. As for the
catalytic site, it consists of a dyad composed of His41 and Cys145
lying in the cleft between domains I and II and bears strong similarities
with the one described for the 3CLpro of SARS-CoV.[165,167,169,170]
Figure 9
(a) 3CLpro crystal structure (PDB ID 6LU7). Protomers A and B are depicted in blue
and purple, respectively. The inhibitor placed in the binding pocket
of each protomer is depicted in van der Waals representation and in
yellow. (b) Representation of one protomer, highlighting its three
domains, the loop connecting domains II and III, and the N-finger.
(a) 3CLpro crystal structure (PDB ID 6LU7). Protomers A and B are depicted in blue
and purple, respectively. The inhibitor placed in the binding pocket
of each protomer is depicted in van der Waals representation and in
yellow. (b) Representation of one protomer, highlighting its three
domains, the loop connecting domains II and III, and the N-finger.Because of the well-characterized structure and
its appeal as drug
target, numerous potential 3CLpro antiviral agents have already been
proposed.[24,159,172−174] In this regard, docking studies have been
widely used to rapidly screen a large number of lead compounds based
on their affinity with the 3CLpro binding pocket,[175−181] whereas MD has mainly been performed to confirm the stability of
the selected docking poses for each antiviral.[175,182−185] Some simulation studies have been reported that focus on rationalizing
the drug–3CLpro interactions by quantifying the strength of
the binding process via free-energy calculations.[173,174,181,186,187]The most used[173,181−183,188−190] 3CLPro crystal structure is the dimer complexed with an inhibitor
and resolved by Jin et al.[191] (Figure a); the crystal
structure of the apo form has been recently released by Zhang et al.[24]
3CLpro Antivirals Screening through Molecular
Docking
Molecular docking has become an increasingly important
tool to assist
and stimulate drug discovery because it can model the interaction
between a small molecule and a protein at the atomic level, allowing
a fast and computationally cheap prescreening of drug candidates.
It also allows the prediction of the ligand conformation as well as
its position and orientation within the binding sites (usually referred
to as the pose) and binding affinities. The first studies faced the
problem that the crystallographic structure of SARS-CoV-23CLpro was
not available and hence relied on homology modeling;[175,176] the experimental resolution of the 3CLpro structure boosted a huge
increase in molecular docking studies.[176−180,192]When
the binding site is known, selecting a reasonably small docking box
around the hot spot (i.e., focused docking) reduces the computational
time. However, other competitive binding sites can be artificially
neglected when using this strategy. Focused docking has been used
to perform the structure-based virtual screening of a library of 1000
covalent protease inhibitors and 16 United States Food and Drug Administration
(FDA)-approved protease inhibitors. Three compounds, including paritaprevir
and simeprevir (Figure ), were identified as potential 3CLpro inhibitors.[179]
Figure 10
Chemical structure of commercial drugs used
as potential inhibitors
of the SARS-CoV-2 main protease 3CLpro.
Chemical structure of commercial drugs used
as potential inhibitors
of the SARS-CoV-2 main protease 3CLpro.The focused-docking approach has also been used by Peele et al.[178] to test 24 plant compounds with antiviral properties,
22 FDA-approved antiviral drugs, and 16 antimalarial drugs. Lopinavir
(an anti-HIV drug), amodiaquine (an antimalarial drug), and theaflavin
digallate (a plant-based phenol derivative) were selected as potential
SARS-CoV-2 therapeutics (Figure ) due to the good affinity for the active site of 3CLpro.[178]Although the binding site of 3CLpro is
known, blind-docking approaches,
targeting the whole protein, have also been considered. In this regard,
Yu et al.[177] have studied the possible
interactions of ribavirin and remdesivir (antiviral drugs), chloroquine
(antimalarial), and luteolin (flavonoid) with 3CLpro (Figure ). Luteolin was found to bind
the SARS-CoV-23CLpro with high affinity, occupying the same binding
site as the standard N3 inhibitor. Chloroquine was also found to stably
bind 3CLPro, suggesting that protease inhibition could represent a
secondary mechanism of action of this potential drug.[122] González-Paz et al.[192] performed a blind-docking study to predict the binding
of the B1a and B1b forms of the wide-range antiparasitic drug ivermectin
(Figure ) to 3CLpro,
suggesting that ivermectin B1a binds with a higher affinity than ivermectin
B1b. Lobo-Galo et al.[180] used a multiscale
approach in which blind docking was first used to pinpoint the principal
potential binding cavities, most often corresponding to the active
site, and this region was subsequently selected for further focused
docking. The authors screened thiol-reacting FDA-approved drugs including
captopril (antihypertensive drug) and found that disulfiram, used
to treat chronic alcoholism, has promising antiviral properties (Figure ). Macchiagodena
et al.[181] have applied an interesting computational
strategy that includes the combination of molecular docking and ligand
generative adversarial network (LIGANN) methods to generate new ligands
that match the shape and the chemical characteristics of the binding
pocket, thus resulting in a de novo drug design.
On the basis of molecular docking calculations, five noncommercially
available compounds have been identified that exhibit the highest
binding affinity toward 3CLpro (Figure ).
Figure 11
Chemical structure of noncommercial compounds
used as potential
inhibitors of the SARS-CoV-2 main protease 3CLpro.[181]
Chemical structure of noncommercial compounds
used as potential
inhibitors of the SARS-CoV-2 main protease 3CLpro.[181]On the one hand, both focused-
and blind-docking calculations can
be followed by MD simulations (see the next section) or by binding
free-energy calculations to further refine the relevance of the docked
poses. The latter strategy was adopted for carfilzomib (an approved
anticancer drug acting as a proteasome inhibitor) in the interaction
with 3CLpro, in which the binding free energy of the most relevant
docking pose was found to be −13.8 kcal/mol.[143]On the other hand, instead of preselecting molecular
ligand structures
based on the biological function or chemical intuition, docking methods
can be applied as an extensive virtual screening tool, scanning up
to 1.3 billion compounds, as was performed in the identification of
potential 3CLpro inhibitors.[193,194]
Molecular Dynamics
Simulations to Evaluate the Potential Activity
of 3CLpro Antivirals
In addition to the screening provided
by docking, further studies including dynamic sampling are essential
to evaluate the favorable drug–3CLpro interactions and their
stability along the simulation time. Indeed, MD studies were performed
to first assess the molecular bases of the SARS-CoV-23CLpro catalytic
function, including the impact of the enzyme dimerization[195] and the definition of the ligand anchor site
within its pocket.[183]Different families
of drugs targeting 3CLpro have been studied through dynamics simulation
approaches with mainly two aims: (i) identifying the most promising
antiviral drugs based on their interaction with 3CLpro, belonging
to families of compounds whose inhibition activity against SARS-CoV-2
has not yet been proved, and (ii) rationalizing the interaction patterns
and possibly the mechanisms of action for agents whose anti-SARS-CoV-2
activity has already been proposed. The first approach has mainly
concerned studies based on drugs with significant activity against
other viruses. Examples include the natural phytochemicals α-ketoamide(11r),
baicalin, cyanidine 3-glucoside, glabridin, and hypericin.[196−199] A comprehensive MD study of rutin (see Figure A) has also been performed to understand
its interaction inside the 3CLpro binding pocket, allowing the proposition
of rutin analogues with an enhanced interaction.[187] Regarding natural products, marine compounds have also
been proposed as 3CLpro inhibitors and have been studied through modeling
approaches, hence identifying the derivatives with the most favorable
and stable interactions.[184] Apart from
natural products, other research groups have focused on FDA-approved
antimicrobial drugs (such as viomycin) or other already known drugs
(such as carfilzomib, eravacycline, valrubicin, lopinavir, elbasvir,
streptomycin, and oftasceine, among others) to showcase, thanks to
MD simulations, those with the highest potential.[188]Regarding the works focused on investigating the
3CLpro inhibitors
that already show promising experimental results against SARS-CoV-2,
the combination of experimental data and modeling results allows us
to determine the binding mechanism and the specific interactions that
enhance the stability of the drug inside the 3CLpro pocket. For instance,
entecavir and nelfinavir revealed their structural differences and
the concomitant influence on the binding affinity and stability inside
the 3CLpro pocket,[183] showing that the
physical–chemical knowledge derived from simulation studies
is quite valuable for future proposals of drugs or derivatives. Another
example is the work of Sk et al.,[186] in
which, taking advantage of the available crystal structures of 3CLpro
in the interaction with two inhibitors (α-ketoamide and Z31792168),
the authors disentangled through MD simulations the molecular origin
(electrostatic, van der Waals, polar, or nonpolar interactions) of
the binding mode providing structural stability to the drug−protein
complex.[186]Most of the published
studies making use of MD simulations evaluate
the stability of the initial binding poses through the analysis of:
(i) the RMSD of both the 3CLpro and the drug, (ii) the root-mean-square
fluctuation (RMSF), and (iii) the radius of gyration (Rg). These three descriptors are essential in evaluating
the pose stability, as the RMSD determines the structural stability,
the RMSF reveals the drug effect on the system flexibility and fluctuations,
and Rg is indicative of the compactness
of the global structure (refs (143, 173, 175, 184, 185, 187, 188, 190, and 200)). In addition, more specific
criteria, such as the development of particular hydrogen bonds and
hydrophobic interactions, are crucial to discriminate between potential
or not efficient drugs because they ultimately determine the position
of the drug with respect to the 3CLpro. Indeed, to show inhibitory
activity, the ligand should be firmly placed inside the binding pocket,
lay close to the catalytic dyad, and possibly interact with the S1,
S2, and S4 units, as shown in Figure .[183,184,186−188]
Figure 12
3CLpro binding site defined by the S1, S2,
and S4 units. The catalytic
dyad composed of His41 and Cys145 is also depicted.
3CLpro binding site defined by the S1, S2,
and S4 units. The catalytic
dyad composed of His41 and Cys145 is also depicted.For instance, Huynh et al.[183] revealed
that the clinically approved entecavir drifts from the starting docking
pose to a novel stable location (Figure b). In the initial conformation, the drug
establishes hydrogen bonds with residues Thr26, His41, and Gly143
(Figure b), whereas
in the final conformation, the drug keeps the interaction with His41
but establishes two novel hydrogen-bonds with Glu166 and Ala189 (Figure b). As a matter
of fact, the final drug location corresponds to a secondary pose already
obtained by the docking study. This work constitutes a glaring example
of the fact that the stability of the docking poses should be, when
possible, confirmed by MD simulations because the explicit consideration
of the solvent and the inherent protein flexibility, as well as the
time evolution, could influence the interactions between the protein
and the drug, changing the rank order of the docking poses.
Figure 13
(a) Green
points depict the center of mass of the drug during the
MD simulation. Hydrogen bonds between the drug and the 3CLpro binding
pocket for (b) the starting structure (most stable docking pose) and
(c) the pose after MD simulation. Figure adapted from ref (183).
(a) Green
points depict the center of mass of the drug during the
MD simulation. Hydrogen bonds between the drug and the 3CLpro binding
pocket for (b) the starting structure (most stable docking pose) and
(c) the pose after MD simulation. Figure adapted from ref (183).Another example is the work published by Mahanta et al. focused
on the study of some FDA-approved drugs for antiviral repurposing.[188] In the case of viomycin, the analysis of the
hydrogen-bond interactions within the 3CLpro binding site reveals
that the drug moves deeply inside the binding pocket, reaching a quite
stable position and developing robust interactions at the end of the
MD simulation. The comparison of the hydrogen-bond interactions of
viomycin placed at the bottom of the pocket with the ones observed
for the already reported reference N3 demonstrates that the binding
of viomycin is significantly stronger.To better quantify the
precise positioning of the drug in the binding
pocket and the balance of noncovalent interactions leading to its
stability, different analyses can be performed. In particular, the
position of the drug and its extension inside the cavity can be analyzed
through the evaluation of solvent-accessible surface area (SASA) and,
based on this concept, through the contact area between the drug and
the 3CLpro binding pocket. As described by Huynh et al.,[183] the comparison of the contact area computed
along the MD simulation for two different drugs allowed us to conclude
that one of them does not completely fill the 3CLpro binding pocket
due to its small size, hence indicating a weaker interaction. In other
cases, the calculation of the SASA along the MD simulation has allowed
differentiation of the expansion or the shrinking of 3CLpro when binding
to different drugs.[182,186,200]Huynh et al. studied the behavior of rutin (Figure A) as a possible 3CLpro inhibitor.[187] After analyzing the MD simulation, it was concluded
that hydrophobic interactions between the drug and binding site, essential
for drug stabilization, were missing for rutin due to the large number
of hydroxyl groups. The authors used this knowledge to design a more
hydrophobic analogue of rutin, in which two methyl groups replaced
two hydroxyl groups in each sugar moiety. This analogue showed much
stronger hydrophobic interactions with the binding site, placing the
drug deeply into the 3CLpro pocket.When performing molecular
simulations involving many different
drugs, the principal component analysis (PCA) is useful to classify
the compounds in different groups depending on the evolution of the
drug–protein interaction along the simulation time.[182,186] For instance, the PCA based on the MD simulation of five phytochemicals
revealed that the three most stable drugs (α-ketoamide(11r),
cyanidin 3-glucoside, and baicalin) share similar dynamic and interaction
patterns inside the 3CLpro binding site.[182]Going beyond equilibrium MD, free-energy methods can be used
to
obtain the binding free energy of the drug–protein complex.
However, free-energy methods are diverse, and, as usual, high accuracy
is invariably accompanied by a high computational cost. As an example,
Sk et al.[186] have estimated the binding
free energy of 3CLpro with two drugs (α-ketoamide and Z31792168),
resorting to the highly approximated but computationally inexpensive
molecular mechanics Poisson–Boltzmann surface area (MM-PBSA)[201−208] strategy. The decomposition of the total free energy in its different
energetic components increases the details of the description. By
analyzing the results presented in Figure a, the authors concluded that the binding
is mostly driven by van der Waals (ΔEVdW) and electrostatic (ΔEelec) interactions
and nonpolar solvation free energy (ΔGnp), indicating ΔEVdW as
the leading contribution. Conversely, the polar solvation free energy
(ΔGpol) and the configurational
entropy (−TΔS) destabilize
the interaction.
Figure 14
(a) Total binding free energy (ΔGbind) and the van der Waals (ΔEVdW),
electrostatic (ΔEelec) interactions,
polar solvation free energy (ΔGpol), nonpolar solvation free energy (ΔGnp), and the configurational entropy (−TΔS) components for both drugs under study.
(b) Contribution of each residue to the total binding free energy
for both drugs, α-ketoamide and Z31792168, in the left and right
panels, respectively. Figure adapted with permission from ref (186). Copyright 2020 Informa
UK Limited.
(a) Total binding free energy (ΔGbind) and the van der Waals (ΔEVdW),
electrostatic (ΔEelec) interactions,
polar solvation free energy (ΔGpol), nonpolar solvation free energy (ΔGnp), and the configurational entropy (−TΔS) components for both drugs under study.
(b) Contribution of each residue to the total binding free energy
for both drugs, α-ketoamide and Z31792168, in the left and right
panels, respectively. Figure adapted with permission from ref (186). Copyright 2020 Informa
UK Limited.The authors pushed the analysis
to understand the contribution
of some specific amino acids on the binding free energy by using the
molecular mechanics/generalized Born surface area (MM-GBSA) approach,[209−213] which provides a per-residue decomposition of the binding free energy.
Their results (Figure b,c) evidence that for α-ketoamide, the number of residues
significantly contributing to the binding free energy is larger than
that for Z31792168, which is in line with the more negative total
binding free energy computed for the former drug (Figure a). Similar binding free-energy
analyses have been reported by Wang et al.,[143] in this case focusing on neutral and charged clinically approved
drugs, such as carfilzomib, lopinavir, elbasvir, and streptomycin,
among others, finally selecting carfilzomib as the most efficient
inhibitor.MD simulations coupled to a regression model correlating
the binding
energy and molecular descriptors have also been proposed to establish
the 3CLpro-inhibiting capabilities of the long-debated chloroquine
and hydroxychloroquine drugs. The interactions with some active-site
residues (Cys145, His41) show that aminoquinoline analogs should result
in more suitable choices.[214]
Hybrid Quantum
Mechanics/Molecular Mechanics Studies of 3CLpro
If the chemical
reactivity plays an important role in protein–substrate
systems, as is the case for catalytic activity, then the region of
the molecular system undergoing chemical reactions must be described
by quantum mechanics (QM) methods, surrounded by the rest of the macromolecular
environment, treated by molecular mechanics (MM), to get insights
into the reaction free-energy profile. This results in the so-called
QM/MM approaches, which have been widely adopted with different flavors
in biochemistry and biology in the last several years.[215,216]In the context of the properties of SARS-CoV-23CLpro, QM/MM
studies revealed the possibility to discriminate between reactive
and nonreactive enzyme–substrate complexes.[217] Also, these multiscale simulation methods allowed it to
be firmly established that the free-energy landscape is associated
not only with the substrate binding but also, more importantly, with
different steps of the proteolysis reaction, as described by the corresponding
transition states.[218] Moreover, thishelped
in elucidating how SARS-CoV-23CLpro differs in the mechanism of action
compared with other cysteine proteases.[219]
Papain-like Protease Catalytic Site and Ubiquitin-like Domains:
A Multifunctional Protein
The papain-like protease (PLpro)
is recognized as an attractive target for antivirals,[23] although it is much less studied as compared with 3CLpro.
Like 3CLpro, PLpro is primarily a cysteine protease (Figure c). PLpro recognizes the tetrapeptide
Leu-X-Gly-Gly, a motif that is found at the Nsp 1/2, Nsp 2/3, and
Nsp 3/4 borders.[220,221] Even though PLpro and 3CLpro
share the same primary function, some in vitro studies
have outlined that PLpro has two additional proteolytic functions:
the removal of ubiquitin (Ub) and the removal of a Ub-like protein
named interferon-stimulated gene product 15 (ISG15).[161,162,222−224] These two additional functions are available due to the presence
of two Ub and Ub-like binding subsites on the protein surface (SUb1
and SUb2 in Figure b,d). Hence, the PLpro function is not only restricted to the initiation
of the viral replication process through the cysteine protease catalytic
activity, as deubiquitinating and deISGylating activities also compete
with the response of the immune system.[225−227] Indeed, the infected host cell can stimulate, through ubiquitination
and ISGylation, the production of interferon-stimulated gene products
as cytokines and chemokines, known for their antiviral properties.[228,229] Indeed, ISGylation can fight the virus through the sequestration
and degradation of viral proteins.[230−232] Hence, the activity
of PLpro can substantially reduce the secondary immune response, conferring
to SARS-type viruses an additional survival mechanism.
Figure 15
(a) Papain-like
protease (PLpro, PDB ID: 6W9C). The structures
of (b) SUb1, in blue, (c) the active site, in red, and (d) SUb2, in
yellow, are highlighted.
(a) Papain-like
protease (PLpro, PDB ID: 6W9C). The structures
of (b) SUb1, in blue, (c) the active site, in red, and (d) SUb2, in
yellow, are highlighted.Structurally, PLpro
offers some advantages for the in silico design of
anti-SARS-CoV-2 drugs. This enzyme is active as a monomer,
and hence no information on aggregation states is required. Moreover,
a monomer implies a reduced number of atoms compared with a dimer
or a trimer, allowing longer simulation times to be reached, which
are usually required to efficiently sample flexible protein structures
and their interactions with the proposed drugs. Also, the presence
of three different binding domains as possible antiviral targets (the
cysteine protease active site, SUb1 and SUb2) makes PLpro a highly
attractive multifunctional target.Despite these favorable characteristics,
some challenges need to
be faced when proposing PLpro inhibitory mechanisms: The protease
substrate is composed of a conventional catalytic triad (Cys112–His273–Asp287)[23] that is partially solvent-exposed, with side
chains hampering any contact with the catalytic triad. Indeed, a nearby
gate has to be accessed by the polypeptide undergoing catalytic cleavage,
identified as Leu-X-Gly-Gly subsites (Figure a). On the basis of the prescreening of
a library of >50 000 compounds, also followed by synthetic
optimization, it was indeed found that drugs can be designed to close
such a gate by inducing a loop closure that shuts down catalysis at
the active site.[233]SUb1 and SUb2
binding subsites are much less studied and hence
are less characterized. Therefore, a deeper understanding of these
additional PLpro proteolytic functions is envisaged to improve the
specificity of the inhibitor design. From a functional point of view,
whereas the deubiquitinating activity is more established, the deISGylating
activity is not completely understood,[234,235] although
its effects on the immune response during a viral infection are documented.[236] From a structural point of view, when looking
at the SARS-CoV-2PLpro’s X-ray structure, a majority of hydrophobic
exposed side chains can be identified at both the SUb1 and SUb2 surfaces.
Similar to the previous SARS-CoVPLpro, the presence of SUb1 as a
C-terminal domain and SUb2 as a N-terminal domain is confirmed, with
both domains being distant from the catalytic site and both being
active for deubiquitination and deISGylation.[222] The flexibility and thus specificity of SUb1 and SUb2 subdomains
was demonstrated in mouse hepatitis virus (MHV) PLpro by altering
the binding of Ub and ISG15 through the change of a single amino acid
(Asp to Ala), in turn leading to pathogenic modifications.[234−236] Comparing the pp1ab polyprotein of SARS-CoV and SARS-CoV-2, a moderate
83% sequence similarity is observed in the PLpro region, hence raising
the question of whether SARS-CoVPLpro antivirals can be directly
used to inhibit SARS-CoV-2PLpro. In more detail, focusing on SUb1,
differences in six amino acids are found (SARS-CoV residues in parentheses),
Ser170(Thr), Tyr171(His), Tyr216(Leu), Gln195(Lys), Thr225(Val), and
Lys232(Gln), suggesting electrostatic and steric modifications.[163]
PLpro Inhibitors of the Catalytic Active
Site
The vast
majority of the in silico search for potential antivirals
against the action of PLpro is dedicated to the inhibition of the
catalytic active site. This is mainly due to the fact that 3CLpro
and PLpro share the same basic features of a cysteine protease. Hence,
the same antiviral compound could, in principle, interact with the
same type of protein subdomain. Nevertheless, we know that 3CLpro
and PLpro have different inhibiting subsites as entry gates for the
cleavable polypeptides, so this represents a challenging goal. From
a methodological point of view, such multitarget studies were usually
performed by applying molecular-docking techniques followed, in most
of the cases, by virtual screening based on the calculated binding
energy. By using these approaches, luteolin was proposed to bind with
high affinity to 3CLpro (ca. −5.5 kcal/mol) and even more to
PLpro (ca. −7.0 kcal/mol) due to the formation, in the latter
case, of three hydrogen bonds with Asn180, Arg105, and Phe185.[177]Evidently, the low computational cost
offered by docking can be of high interest when comparing a large
set of ligands, including compounds proposed by experimentalists or
previously used to treat other infections. For example, Hosseini et
al. proposed a set of promising inhibitors showing a possibly stronger
effect with respect to the known and highly debated remdesivir, lopinavir,
and ritonavir (acting against 3CLpro).[237,238] The authors
previewed a higher inhibition effect by docking PLpro with paritaprevir,
glecaprevir, velpatasvir, amaryl, trypan blue, raltegravir, ledipasvir,
and simeprevir (all of them within a really narrow 0.5 kcal/mol energy
difference). This long list also shows how docking techniques are
proficient for proposing alternative compounds, although they lack
in the identification of a clearly emerging lead compound. A similar
approach was employed by Wu et al., considering an even larger set
of drugs,[239] divided according to their
use: Ribavirin, valganciclovir, and thymidine (antiviral drugs); chloramphenicol,
cefamandole, and tigecycline (antibacterial drugs); chlorphenesin
carbamate (muscle relaxant drug); and levodropropizine (antitussive
drug) were all found to have high binding affinity to PLpro.In a similar study, the focus was instead placed on the different
activities of the drugs, and the most suitable target was finally
proposed depending on the type of interaction with the protease.[240] In this case, ritonavir, lopinavir, and darunavir
(anti-HIV drugs) were considered, and it was finally suggested that
darunavir should be more selective for recognizing PLpro instead of
3CLpro. Nevertheless, additional docking studies suggest that darunavir
could also be a suitable choice to bind 3CLpro,[241,242] highlighting the relevance of this particular anti-HIV drug as a
potential anti-COVID-19 clinical drug.Different from molecular
docking, bioinformatic criteria, using
network proximity analyses of drug targets, can help to identify candidate
molecules, also including potential drug combinations, via the exploration
of not only one target but also the complete human interactome. As
an example, it was proposed that a combination of mercaptopurine and
melatonin can constitute a combination therapy for SARS-CoV-2 by targeting
PLpro, ACE2, c-Jun signaling, and anti-inflammatory pathways in a
synergistic way.[240]On the opposite
side of the spectrum, to have high-value chemical
insights about reliable potential antivirals, it is necessary to increase
the complexity of the molecular modeling technique. MM-GBSA was used
by Huang et al. to assess the interaction of omeprazole, methicillin,
and tolazamide with PLpro.[243] Omeprazole
was proposed as the best candidate, as strong interactions in inhibiting
the Leu-X-Gly-Gly region were identified (Figure a). In particular, a synergy between polar
and hydrophobic interactions was highlighted: hydrogen bonds with
Ser212, Glu214, and Lys217 and π stacking of omeprazole’s
phenyl group with Tyr213 (Figure b).
Figure 16
(a) Omeprazole, methicillin, and tolazamide PLpro binding
site.
(b) Insights into the interactions between PLpro and omeprazole. Reprinted
from ref (243). Distributed
under the terms and conditions of the Creative Commons
Attribution (CCBY) license.
(a) Omeprazole, methicillin, and tolazamidePLpro binding
site.
(b) Insights into the interactions between PLpro and omeprazole. Reprinted
from ref (243). Distributed
under the terms and conditions of the Creative Commons
Attribution (CCBY) license.The same MM-GBSA method was also performed on a different set of
drugs, and it was found that elbasvir binds stably to RdRp, Hel, and
PLpro.[244]Definitively much more
accurate, from a physicochemical point of
view, are fully atomistic enhanced sampling nonequilibrium simulations
such as alchemical transformations, which were employed to measure
the dissociation constant of hydroxychloroquine from 3CLpro, PLpro,
and RdRp. It was concluded that the drug could act as a mild inhibitor
for all three targets and especially for PLpro.[245] The proposed drug interacts with the Leu-X-Gly-Gly inhibition
region close to the catalytic site. In more detail, the chosen computational
approach is based on the Hamiltonian replica exchange method (HREM)
coupled to nonequilibrium alchemical MD simulations (i.e., HREM/NE
methodology).[246,247]
PLpro Inhibitors of the
Ubiquitin-like Domains
Very
few studies focus on the possible inhibition of SUb1 and SUb2 domains,
which also happen to be the PLpro region showing the largest sequence
differences when comparing SARS-CoV and SARS-CoV-2, hence increasing
the difficulty of the task. Some structural insights of the interaction
of SARS-CoV-2PLpro with Lys48-linked diubiquitin (Lys48-Ub2) and ISG15 were proposed by MD studies.[248] Through extensive simulations (1 μs), it was found that mISG15
binds more tightly to the same PLpro subsite than Lys48-Ub2. On the basis of this result, the authors proposed GRL-0617 to be
the inhibitor, which was already identified as a noncovalent inhibitor
of PLpro in previous SARS viruses.[248] This
assumption was based on the observation that the conserved Tyr268
could bind to GRL-0617, as in previously identified PLpros, blocking
the entry of the ISG15 C-terminus while approaching the subsite. Interestingly,
GRL-0617 does not inhibit PLpro’s involved in Middle East respiratory
syndrome (MERS) infections because Tyr268 is replaced by a Thr residue.
Other SARS-CoV-2 Targets
In addition to the previously mentioned SARS-CoV-2 targets, common
to diverse coronaviruses, four other biomolecular targets have been
proposed to inhibit SARS-CoV-2. These are (i) the SUD,[25,26,239,249−252] (ii) the RdRp,[15,16,239,253,254] (iii) the Hel,[239,255,256] and (iv) the N-terminal RNA binding domain of the nucleocapsid protein.[257,258] Interestingly, the SUD, RdRp, and Hel all belong to coronavirus
nonstructural proteins. Apart from the molecular-docking approaches
dealing with RdRp[244,253,259−261] and Hel,[239] little
interest has been dedicated to MD simulations,[256] and hence we expect those targets to become the objects
of further studies.A different role can be assigned to the
SUD, which is able to interact
with oligo(G) nucleic acids of the host, like G-quadruplexes, to hamper
the defensive response of the host, thus favoring viral infection
of human cells.[25,250,251] Moreover, the SUD has recently been reported to promote the stabilization
of the E3 ubiquitin ligaseRCHY1, which degrades the antiviral factor
p53.[252]The key role of the SUD in
coronavirus activity has motivated Hognon
et al.[262] to study its binding with a RNA
G-quadruplex to explain the role of the noncanonical nucleic acid
structure in the dimerization process. The investigation was performed
using a combination of equilibrium and enhanced sampling classic MD
simulations together with the evaluation of the 2D free-energy profile
(Figure ).[262] Two stable interaction modes between the SUD
and a model RNA G-quadruplex have been evidenced, in which the oligonucleotide
resides either at the interface between the SUD units (dimeric mode, Figure b) or close to
one of the monomers (monomeric mode, Figure c). The 2D free-energy profile has shown
not only that the dimeric mode is slightly favored, leading to a binding
free energy of ∼6 kcal/mol, but also that the presence of the
RNA avoids the opening of the SUD units to lead to an open inactive
conformation (Figure d). Overall, these results highlight the importance of the G-quadruplex
interaction to stabilize the dimeric conformation of SUD and pave
the way for the rational design of efficient therapeutic agents aiming
at perturbing this interaction, hence enhancing the host defenses
against the virus.
Figure 17
(a) 2D free-energy profile for the interaction between
a SUD dimer
(SUDA, SUDB) and an RNA G4 structure, including
(b) dimeric and (c) monomeric binding modes. (d) SUD conformation.
Reprinted from ref (262).
(a) 2D free-energy profile for the interaction between
a SUD dimer
(SUDA, SUDB) and an RNA G4 structure, including
(b) dimeric and (c) monomeric binding modes. (d) SUD conformation.
Reprinted from ref (262).
Conclusions
Molecular modeling and
simulation techniques have achieved a high
degree of maturity that allows their use to answer key biological
questions, especially concerning rational drug design development.
This assumption has been proven particularly true in the case of the
global fight against the COVID-19 pandemic. Indeed, modeling and simulation
have complemented experimental studies, offering a clear picture of
the molecular bases of the main SARS-CoV-2 protein functions and highlighting
possible targets to reach their inhibition. The use of multiscale
modeling, in which the level of theory used is carefully tailored
to the physical–chemical property that should be determined
and which can be progressively increased, has allowed large libraries
of existing compounds to be screened. This combined approach is fundamental
in speeding up the quest for efficient antiviral treatments that are
effectively capable of counteracting the high SARS-CoV-2 transmissibility.
Furthermore, key protein targets may be considered, related to either
to the cell infection (S protein) or the viral material maturation
(proteases). The role of other nonstructural proteins and, in particular,
their effect on eluding the host immune response, can also be efficiently
tackled by molecular modeling and simulation techniques. If molecular
docking was largely popular and mainstream for drug-design purposes,
then the necessity to include long-time-scale MD simulations, also
coupled to enhanced sampling techniques to obtain free-energy profiles,
would have proven its fundamental role.In the present Review,
we have focused on classical-based molecular
simulations. However, QM or QM/MM approaches are expected to be of
great utility in providing information on the specific chemical mechanisms
of the protease enzymes or in the development of covalent-based inhibitors.
Bioinformatic procedures were also considered for their capacity to
estimate multi-target-based or synergic therapeutic protocols. The
computational techniques reviewed here are therefore crucial actors
of the fight that the scientific community has undertaken against
SARS-CoV-2, especially when applied in tight collaboration with structural,
molecular, and cellular biology researchers. This has been made possible
by the great advancement in the methodological development experienced
in the last several years, which has allowed us to treat complex systems
and come closer to the real biological conditions while maintaining
an atomistic resolution. It has also been facilitated by the mobilization
of the computer centers of many countries and research agencies, which
has provided an unprecedented computational power to the effort. With
this Review, we have aimed at collecting and rationalizing the huge
amount of data produced. Because the field is extremely dynamic and
constantly evolving, the completeness of this Review cannot be granted;
however, we are confident that we have identified some of the most
important and relevant studies from either a methodological or an
applicative point of view. Clearly, in answering the call against
SARS-CoV-2, molecular modeling and simulation have demonstrated once
more their status and their role as a fundamental and complementary
tool to preview, analyze, and rationalize complex biological phenomena
happening in complex environments. As such, they have reinforced their
key position in the scientific panorama and in their capacity to tackle
fundamental societal issues.
Authors: Yue Ma-Lauer; Javier Carbajo-Lozoya; Marco Y Hein; Marcel A Müller; Wen Deng; Jian Lei; Benjamin Meyer; Yuri Kusov; Brigitte von Brunn; Dev Raj Bairad; Sabine Hünten; Christian Drosten; Heiko Hermeking; Heinrich Leonhardt; Matthias Mann; Rolf Hilgenfeld; Albrecht von Brunn Journal: Proc Natl Acad Sci U S A Date: 2016-08-12 Impact factor: 11.205
Authors: Daniel Wrapp; Nianshuang Wang; Kizzmekia S Corbett; Jory A Goldsmith; Ching-Lin Hsieh; Olubukola Abiona; Barney S Graham; Jason S McLellan Journal: Science Date: 2020-02-19 Impact factor: 47.728