Dibyendu Mondal1, Arieh Warshel1. 1. Department of Chemistry, University of Southern California, 3620 McClintock Avenue, Los Angeles, California 90089, United States.
Abstract
The development of reliable ways of predicting the binding free energies of covalent inhibitors is a challenge for computer-aided drug design. Such development is important, for example, in the fight against the SARS-CoV-2 virus, in which covalent inhibitors can provide a promising tool for blocking Mpro, the main protease of the virus. This work develops a reliable and practical protocol for evaluating the binding free energy of covalent inhibitors. Our protocol presents a major advance over other approaches that do not consider the chemical contribution of the binding free energy. Our strategy combines the empirical valence bond method for evaluating the reaction energy profile and the PDLD/S-LRA/β method for evaluating the noncovalent part of the binding process. This protocol has been used in the calculations of the binding free energy of an α-ketoamide inhibitor of Mpro. Encouragingly, our approach reproduces the observed binding free energy. Our study of covalent inhibitors of cysteine proteases indicates that in the choice of an effective warhead it is crucial to focus on the exothermicity of the point on the free energy surface of a peptide cleavage that connects the acylation and deacylation steps. Overall, we believe that our approach should provide a powerful and effective method for in silico design of covalent drugs.
The development of reliable ways of predicting the binding free energies of covalent inhibitors is a challenge for computer-aided drug design. Such development is important, for example, in the fight against the SARS-CoV-2 virus, in which covalent inhibitors can provide a promising tool for blocking Mpro, the main protease of the virus. This work develops a reliable and practical protocol for evaluating the binding free energy of covalent inhibitors. Our protocol presents a major advance over other approaches that do not consider the chemical contribution of the binding free energy. Our strategy combines the empirical valence bond method for evaluating the reaction energy profile and the PDLD/S-LRA/β method for evaluating the noncovalent part of the binding process. This protocol has been used in the calculations of the binding free energy of an α-ketoamide inhibitor of Mpro. Encouragingly, our approach reproduces the observed binding free energy. Our study of covalent inhibitors of cysteine proteases indicates that in the choice of an effective warhead it is crucial to focus on the exothermicity of the point on the free energy surface of a peptide cleavage that connects the acylation and deacylation steps. Overall, we believe that our approach should provide a powerful and effective method for in silico design of covalent drugs.
From December 2019, the whole world has been facing the problem of a highly contagious
pulmonary disease, coronavirus disease 2019 (COVID-19).[1] Almost 41
million people in the world have been infected by this virus so far. The first case of this
global pandemic was reported in the city of Wuhan, China.[2] The
coronavirus strain severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2)[3] is responsible for this global pandemic. So far, no vaccine or antiviral drug
has been approved to prevent the spread of the SARS-CoV-2 system. Many proteins in
SARS-CoV-2 have been targeted in the design of new drugs or the repurposing of known
drugs,[4] and the main protease of SARS-CoV-2 (SARS-CoV-2Mpro, also called 2CLpro)[5] is one of those.
SARS-CoV-2Mpro is a cysteine protease (CP) that takes part in the viral
replication process. This protein cleaves the polyprotein pp1a and pp1ab (translated from
the viral RNA) at 16 different positions to generate important structural (spike, envelope,
membrane, and nucleocapsid proteins) as well as nonstructural proteins (NSPs).[6] Thus, hindering the normal action of Mpro can stop the spread of
SARS-CoV-2. SARS-CoV-2Mpro has a unique recognition sequence
[Leu-Gln↓(Ser, Ala, Gly)], and the cleavage site (denoted by ↓) is between the
Gln and the next small amino acid (Ser, Ala, or Gly).[7] No human proteases
have this cleavage specificity, and as a result, inhibitors for SARS-CoV-2Mpro
are less likely to be toxic. This makes the Mpro an excellent target for drug
design. Some crystal structures[7−10] of inhibitor-bound SARS-CoV-2Mpro have been determined
recently, and have immensely helped in the identificaton of important protein residues near
the inhibitors. Most of those published crystal structures contain covalent inhibitors.Generally, covalent inhibitors are more potent than their noncovalent analogues, because
they form covalent bonds with the proteins. In fact, there are many examples of covalent
inhibitors, particularly for protease enzymes.[11] For example, very
recently a few potential broad-spectrum covalent inhibitors against alphacoronavirus,
betacoronavirus, and enterovirus were reported.[12] These inhibitors bind
specifically to the main proteases of those viruses. Unfortunately, most of these designs of
covalent inhibitors are solely based on experimental studies, and computational research is
yet to play a significant role. Reasonably accurate computational
methods[13,14] are
available for obtaining relative binding free energies of noncovalent inhibitors, but the
main hurdle in developing computational approaches for designing covalent inhibitors is the
simulation of the formation of the covalent bond. Unlike noncovalent inhibitors, the process
of binding of a covalent inhibitor depends not only on the correct structural
complementarity between the protein and the inhibitor but also the appropriate chemical
reactivity of the inhibitor and the protein environment that stabilizes the covalent
complex. Thus, designing good covalent inhibitors requires understanding the energy
contributions of different steps in the covalent complex formation, which includes both the
noncovalent binding free energy and the reaction free energies. In the past few years,
several interesting computational studies have been reported,[15−18] where free energy
perturbation (FEP)-based alchemical transformations were used in calculating the relative
binding free energies of various covalent inhibitors. While in most of these works the
noncovalent and covalent states were considered, the authors of ref (17) used only the covalent state in their calculations. As
pointed out in ref (19), the choice of considering
just the covalent state is reasonable only when the contribution of the covalent state to
the total binding free energy is at least −5.5 kcal/mol greater than that of the
noncovalent state. Unfortunately, knowing a priori the contributions from
the covalent and noncovalent states is not possible. Thus, even for relative covalent
binding free energy calculations, both contributions should be calculated. The situation
gets even further complicated when one tries to calculate the absolute covalent binding free
energies, because in this case the possibility of error cancelation (as we might expect in
relative free energy calculations) is negligible. One might still be able to calculate the
absolute binding free energies by considering one inhibitor as a reference and following the
thermodynamic cycle used in ref (19) to avoid the
expensive quantum mechanics (QM)-based calculations. On the contrary, a complete mechanistic
understanding of the covalent inhibition process is not possible from such analysis. In
fact, sometimes the inhibition is correlated with the transition state binding energy.
Furthermore, comparison of different covalent warheads requires explicit evaluation of the
chemical bonding process. Thus, it seems to us that there is a need for a protocol that is
computationally less expensive but accurate enough [in terms of both the convergence and the
reliability of the quantum mechanics/molecular mechanics (QM/MM) approach] in calculating
both noncovalent binding free energies and reaction free energies.We have previously combined the semimicroscopic version of the protein dipole Langevin
dipole method in the linear response approximation with a scaled non-electrostatic term
(PDLD/S-LRA and PDLD/S-LRA/β)[20] and the empirical valence bond
(EVB)[21] method to investigate the drug resistance of the hepatitis C
virus (HCV).[22] In that project, the PDLD/S-LRA method was used to
calculate the noncovalent binding free energies of the substrate and inhibitors whereas EVB
was employed to investigate the peptide hydrolysis mechanism. Thus, it is likely that we can
combine these methods also to calculate absolute covalent binding free energies of
inhibitor–protein complexes.Considering the current pandemic situation, it is also very timely to try to use computer
simulations to gain a detailed atomistic understanding of the inhibitor binding to
SARS-CoV-2Mpro. Such simulations can accelerate the design of new drugs to
combat COVID-19 or other similar viruses that may be developed in the future.In this paper, we have used the case of an α-ketoamide inhibitor (a reversible
covalent inhibitor of Mpro) to explore the mechanism of inhibition of
Mpro. Recently, the mechanism of proteolysis of the wild type substrate
(polypeptide chain) by SARS-CoV-2Mpro has been explored
computationally.[23,24]
These studies are very informative for designing new drugs, but understanding the mechanism
of inhibition of SARS-CoV-2Mpro by a covalent inhibitor would surely help more
in the inhibitor design process.The main protease of SARS-CoV-2 contains a cysteine (Cys145) and a histidine (His41) as a
catalytic dyad [numbering based on the protomer chain A of the Protein Data Bank (PDB) entry
6Y2G(7)]. There are
some proposals that the dyads of cysteine protease systems are already in the
thiolate–imidazolium ion pair state in the apo form[25] of the
proteins, and thus, upon substrate (or inhibitor) binding, the activated Cys can attack the
most electrophilic center of the substrate (or inhibitor). An alternative proposal is that
the Cys-His catalytic dyad remains in neutral form in the apo form of the protein and upon
substrate (or inhibitor) binding the histidine residue activates the cysteine by abstracting
the proton from its Sγ atom.[26,27] The activated cysteine then attacks the substrate (or inhibitor). In ref
(23), a third mechanistic proposal was reported,
where after substrate (or inhibitor) binding the proton transfer can occur concomitant with
the next nucleophilic attack step.[23] All of the proposed mechanistic
options mentioned above were considered in this work to investigate the inhibition process
and are depicted in Figure . In schemes A and B,
the nucleophilic attack (NA) occurs after the first proton transfer step (PT1) is completed,
and those two steps occur in a concerted manner in scheme C. The last proton transfer step
is the same for all of the schemes. The difference between schemes A and B is in the
ordering of the PT1 and the inhibitor binding process (see Figure ). To calculate the change in free energy for different steps in
Figure , the PDLD/S-LRA/β or EVB method
was utilized (depending upon the step). Furthermore, our calculated results have been
compared to the experimental results and the most exothermic step in the entire binding
process was also identified. This result can help in designing new covalent inhibitors for
SARS-CoV-2Mpro, especially by focusing on increasing the exothermicity of that
step.
Figure 1
Various mechanistic schemes of formation of a covalent complex between a peptidomimetic
inhibitor and a cysteine protease. The inhibitor is represented in generic form as
RR′CO, where R and R′ can be any aliphatic or aromatic chain. The reacting
molecular fragments are represented in boxes. The green, red, and blue arrows denote
schemes A–C, respectively. The difference among schemes A–C is discussed
in detail in the text.
Various mechanistic schemes of formation of a covalent complex between a peptidomimetic
inhibitor and a cysteine protease. The inhibitor is represented in generic form as
RR′CO, where R and R′ can be any aliphatic or aromatic chain. The reacting
molecular fragments are represented in boxes. The green, red, and blue arrows denote
schemes A–C, respectively. The difference among schemes A–C is discussed
in detail in the text.
Computational Methods
Preparation of the Simulation System
The catalytically active form of Mpro is a homodimer (Figure
A). Therefore, a homodimer of the main protease (PDB entry
6Y2G)[7] was
taken in all of our free energy calculations. Chain A of the homodimer was used as the
main simulation system. The inhibitor in PDB entry 6Y2G (named 13b in ref (7)) (see Figure B) was used as the α-ketoamide inhibitor in our study. The covalent bond
between the inhibitor and the sulfhydryl group of Cys145 was removed before starting any
simulation. All of the nonprotein atoms except the inhibitor were removed from the PDB
file. The protein was solvated using MOLARIS-XG[28] to generate a water
sphere based on the surface constraint all-atom solvent model (SCAAS).[29] The simulation system was then energy minimized to remove any bad contacts in the
protein system while keeping the coordinates of the inhibitor frozen. The partial charges
of the inhibitor were calculated at the B3LYP/6-31+G** level of theory using Gaussian
09.[30] The charges obtained from the QM calculations were fitted using
restrained electrostatic potential (RESP)[31] procedure to obtain the
RESP-based charges. In all of the simulations, the polarizable force field ENZYMIX[32] was used unless otherwise mentioned.
Figure 2
Structures of dimeric SARS-CoV-2 Mpro and its α-ketoamide inhibitor
13b. (A) Protomers A and B are shown as orange and blue ribbons,
respectively. The inhibitor and catalytic residues in protomer A are shown as purple
and green sticks, respectively. (B) The most electrophilic center of 13b
is highlighted with a black arrow.
Structures of dimeric SARS-CoV-2Mpro and its α-ketoamide inhibitor
13b. (A) Protomers A and B are shown as orange and blue ribbons,
respectively. The inhibitor and catalytic residues in protomer A are shown as purple
and green sticks, respectively. (B) The most electrophilic center of 13b
is highlighted with a black arrow.
Empirical Valence Bond Simulations
In the EVB simulations, the reacting atoms are defined as region I atoms, while the rest
of the protein atoms (up to a user-defined radius) are defined as region II. In some of
our EVB simulations in which a few atoms of the inhibitor were included in region I, the
geometric center of the inhibitor was considered as the center of the simulation system.
On the contrary, in the simulations in which the atoms of the inhibitor were not part of
region I, the geometric center of all of the protein atoms included in region I was used
as the center. The protein–inhibitor system was immersed in a 22 Å sphere of
water molecules. The water molecules at the boundary of the solvent sphere were subject to
polarization and radial restraints by using the surface-constrained all-atom solvent
(SCAAS) model. This simulation sphere was further surrounded by a 2 Å spherical shell
of Langevin dipoles and then a bulk continuum. The long-range electrostatic effect was
treated using the local reaction field (LRF) method.[33] All protein
atoms beyond the 22 Å sphere were fixed at their initial coordinate positions as in
the crystal structure. The electrostatic interaction from outside of the sphere was not
included in the energy calculations. The partial charges of all region I atoms were
calculated at the B3LYP/6-31+G** level of theory using Gaussian 09, and the partial
charges and all other EVB parameters are provided in the Supporting Information.Before the free energy surface (FES) was calculated using the EVB approach, the
simulation system was equilibrated thoroughly. The temperature of the simulation system
was increased slowly from 1 to 300 K in increments of 40 K (except from 280 to 300 K,
where the increment was 20 K) for a total simulation time of 200 ps. During heating, the
EVB region I atoms were constrained at 50 kcal/mol energy. This constraint was then
released in six steps to 0.3 kcal/mol in an additional 100 ps. Finally, three different
starting geometries were generated from an equilibration simulation of 300 ps. It is worth
mentioning that the equilibration of a system is considerably fast with spherical boundary
conditions. Additionally, we observed minimal fluctuation of the root-mean-square
deviation (RMSD) of the positions of the heavy atoms in region II during the last 300 ps
equilibrium simulations. In the EVB approach, the free energy surface is calculated using
the free energy perturbation/umbrella sampling (FEP/US) method, and for that, we ran
simulations of 51 frames with each frame simulated for 10 ps. For each reaction, the
simulations of the reference reaction (a reaction in water without any protein atoms from
region II) were also performed to obtain the necessary EVB parameters (gas phase shift and
coupling constants). The definition of these parameters can be found in the Supporting Information. The reported energies in this work were obtained
from the averaging of three independent EVB simulations.
PDLD/S-LRA/β Simulations
The simulation setup for the noncovalent binding energy was the same as that for the EVB
simulations, except in the PDLD/S-LRA/β simulations, the whole inhibitor was
included in region I and all atoms (including atoms in region I) in the simulation were
represented with the ENZYMIX force field. Five different simulations (starting with
different initial structures) were performed for each noncovalent binding free energy
calculation. The presented results were averages of those simulation results. In every
PDLD/S-LRA/β simulation, the linear response approximation (LRA) calculation was
performed on 10 different protein configurations. A brief description of the PDLD/S-LRA/
β method can be found in the Supporting Information.
Results and Discussion
A generic binding process for a covalent inhibitor can be described formally as a two-step
process, where the first step is formation of a noncovalent protein–inhibitor complex
and the second step is the chemical reaction that forms a covalent protein–inhibitor
complex (see Figure ). As we can see from Figure , the chemical reaction can consist of multiple
steps. Thus, the absolute binding free energy of a covalent complex can be a combination of
many free energy terms. To compare the calculated covalent binding free energies with the
corresponding experimental results, we need reliable calculations of each of those steps. In
the following sections, we will discuss the calculations of the free energy changes for
different steps, and finally, we will compare our calculated results with experimental
results of inhibitor 13b.
Figure 3
General free energy profile that explains the formation of a covalent protein
(P)–inhibitor (I) complex. P···I, [P---I]‡, and
P---I denote the noncovalent, activated complex(es) during chemical reaction(s) and the
covalent binding state, respectively. ΔGnoncov,
ΔGchem , and ΔGcov
represent the noncovalent (P···I) binding energy, reaction free energy,
and total covalent (P---I) binding free energy, respectively.
General free energy profile that explains the formation of a covalent protein
(P)–inhibitor (I) complex. P···I, [P---I]‡, and
P---I denote the noncovalent, activated complex(es) during chemical reaction(s) and the
covalent binding state, respectively. ΔGnoncov,
ΔGchem , and ΔGcov
represent the noncovalent (P···I) binding energy, reaction free energy,
and total covalent (P---I) binding free energy, respectively.
Noncovalent Binding Free Energy (ΔGnoncov)
Calculations
Mpro with different ionization combinations was used to calculate the
ΔGnoncov of two types of noncovalent complexes. In one
case, we ionized all ionizable residues within 20 Å of the geometric center of the
inhibitor as well as His41 (doubly protonated) and Cys145 (negatively charged), whereas in
another case, the His41-Cys145 pair was kept in its neutral form. We have used the Monte
Carlo Proton Transfer (MCPT)[34] method to assess the most probable
ionization state of the titratable residues. The neutral form corresponds to the situation
in which the inhibitor binding occurs before the proton transfer between Cys145 and His41.
In this way, we have tested the hypothesis related to the ordering of the first proton
transfer and inhibitor binding (schemes A and B in Figure ). The calculated ΔGnoncov values
are listed in Table . Interestingly, for
inhibitor 13b in both cases we obtained similar binding free energies. These
results probably suggest that the ionization of Cys145 and His41 is not important for the
initial formation of the inhibitor–protein noncovalent complex. Crystal structures
of the dimeric complexes of the mutated (C145A[35,36] or H41A[37]) Mpro of
SARS-CoV-2 with bound C-terminal and N-terminal autocleavage substrates can be found in
the literature. A bound substrate in those mutated (C145A or H41A) proteins indicates that
the electrostatic influence of Cys145 and His41 might be nominal during the formation of
the noncovalent complex. On the contrary, a more surprising part is such low values of
ΔGnoncov for a large α-ketoamide inhibitor. Here
we mention that the structures that were used to represent the noncovalent states were
derived from coordinates of the covalent hemiketal form of the inhibitors. It is possible
that in those structures the protein preorganization (which is mostly for stabilizing the
transition state of covalent bond formation) is not sufficiently conducive to stabilize
the noncovalent states. In other words, there is a possibility that the inhibitor binds in
a mode slightly different than what has been simulated. At any rate, if we consider the
chemical reaction step(s) along with this step, we could understand how strongly the
exothermicity of the reactions drives the covalent complex formation despite the small
ΔGnoncov contributions.
Table 1
Calculated Binding Free Energies (ΔGnoncov) for
the Noncovalent SARS-CoV-2 Mpro–13b Complexes
system
ΔGnoncov
(kcal/mol)
13b (before proton transfer)
–3.5 ± 0.5
13b (after proton transfer)
–3.6 ± 0.5
Calculating the Reaction Free Energy of the First Proton Transfer (PT1) Step
The investigation of the mechanism of covalent bond formation was started by evaluating
the reaction free energy (ΔGPT1) of the PT1 step. The
PT1 reaction was simulated in the apo and holo forms of the protein. It should be noted
that the calculations for the respective PT1 reactions in water (reference reactions) were
also performed in the presence or absence of the inhibitor. The
ΔGPT1,obs of 7.0 kcal/mol in water (in the presence of
the inhibitor) was taken from ref (38) to compare
it with our calculated ΔGPT1 in water. In this case, it
was assumed that the effect of the slightly different warheads of the inhibitors (reported
in ref (38) and in this work) on the PT reaction is
very similar. The ΔGPT1,obs in water (in the absence of
the inhibitor) was calculated from the pKa values of
ethanethiol (pKa = 10.6) and imidazole
(pKa = 6.9). Here, the pKa of
ethanethiol instead of cysteine was taken, because in the case of cysteine, there is a
main chain zwitterionic effect. Thus, the value of
ΔGPT1,obs in water (in absence of the ligand) is 1.38
× (10.6 – 6.9) = 5.1 kcal/mol. We used this value to compare our calculated
EVB profile (in absence of the ligand) as well as to fit EVB parameters. The PT1 profiles
obtained for the reactions in protein were different in the apo and holo forms of the
protein (see Table ). Table
shows that the His-Cys ion pair state is less favorable in
the holo form of the protein than in the apo form, probably because the ion pair that
forms at the end of the PT1 step is more stabilized in more polar environment like in the
apo form (with a vacant inhibitor binding pocket) than in the holo form. Our results also
match with those of ref (24), in which the
proteolysis mechanism of SARS-CoV-2 was studied using multiscale DFT/MM simulation methods
(see Table ).
Table 2
Calculated Reaction Free Energies (ΔGPT1) of
the First Proton Transfer (PT1) Step in Water and SARS-CoV-2 Mpro
system
ΔGPT1 (kcal/mol)
ΔGPT1 (kcal/mol)a
scheme A
water
5.1
–
protein (apo)
2.9 ± 0.2
2.9
scheme B
water
7.5
–
protein (holo)
7.3 ± 1.7
4.8
The calculated reaction free energy from ref (24). Please note that in ref (24)
a peptide substrate was used.
The calculated reaction free energy from ref (24). Please note that in ref (24)
a peptide substrate was used.
Calculation of the Activation and Reaction Free Energies of the Nucleophilic Attack
(NA) Step
In the nucleophilic attack step, the activated anionic Sγ atom of Cys145 attacks
the α-keto group of the ketoamides and forms an anionic tetrahedral complex. In this
case, the information about the observed activation (ΔGNA,obs⧧) and reaction free
energy (ΔGNA,obs) of the reference water reaction was
taken from the free energy profiles reported in ref (38). It is worth mentioning that in ref (38) an amide system was used to study the free energy profile, whereas we have
a ketoamide group. Thus, we also performed a QM-based energy calculation to estimate
ΔGNA,obs in water. From the QM calculations, we
obtained a ΔGNA,obs value of 12.8 kcal/mol (after adding
the thermal corrections). The QM-calculated ΔGNA,obs is
close to the value reported in ref (38), and thus,
we have used the exact PES as in ref (38) to
parametrize our EVB simulations for the NA step. The calculated
ΔGNA⧧ and ΔGNA are reported in Table .
Table 3
Calculated Activation (ΔGNA⧧) and Reaction
(ΔGNA) Free Energies of the Nucleophilic Attack (NA)
Step in Water and SARS-CoV-2 Mpro
system
ΔGNA⧧ (kcal/mol)
ΔGNA (kcal/mol)
scheme A
water
17.7
12.6
protein
13.5 ± 2.3
2.9 ± 2.2
scheme B
water
17.6
12.4
protein
12.5 ± 0.9
–0.5 ± 1.7
In Table , two different values of
ΔGNA and ΔGNA⧧ are reported, as we have explored two
different stepwise mechanistic schemes (A and B). In scheme A, the PT1 step happens before
the inhibitor binding, and we did not include His145 when we studied the NA step in the
water environment (because this would help us to account for the effect of the protonated
histidine residue in the protein reaction only). On the contrary, for scheme B, we
performed a continuation of a run from the previous PT1 run. Thus, the His145 residue was
included in water and protein reactions for the NA step. Thus, the effect of His145 was
incorporated slightly differently in the simulations of these schemes (A and B). Although
the representations of the simulation systems are slightly different, the total free
energy change of the three steps (ΔGnoncov +
ΔGPT1 + ΔGNA) should
converge for both schemes. The calculated free energy changes after the first three steps
are 3.3 and 2.2 kcal/mol for schemes A and B, respectively. Thus, the process after the
first three steps is endothermic even in the protein. This indicates that the major
exothermicity observed due to the formation of the covalent inhibitor–protein
complex should come from the last proton transfer step.At this point, we discuss the third mechanistic scheme considered in this study. In ref
(23), the author proposed that the first proton
transfer occurs concomitantly with the nucleophilic attack step. For our PT1 free energy
surface calculations, it was also observed that in the presence of the inhibitor the
Cys-His ion pair (product of PT1) was more unstable compared to that when the inhibitor
was absent. Thus, it is quite possible that the nucleophilic attack happens almost
immediately after the formation of the less stable ion pair in the holo form of the
protein. Therefore, a concerted PT1 and NA step is a legitimate option to check.
Calculation of Activation and Reaction Free Energies of the Concerted PT1 and NA
Step
The results for the concerted step are presented in Table . To parametrize the EVB simulation, here we also performed EVB
simulations of the reference reaction in water. For the reference reaction, it was assumed
that the observed activation barrier (ΔGPT1–NA,obs⧧) and reaction free energy
(ΔGPT1–NA,obs) of the concerted reaction are
the same as those of the stepwise reaction discussed in ref (38), and values of 24 and 19 kcal/mol, respectively, were used for the free
energy changes. The calculated barrier of the concerted reaction (in scheme C) is 13.6
kcal/mol (Table ), whereas for the stepwise
process, the combined barriers (including PT1 and NA) are 16.4 and 19.8 kcal/mol for
schemes A and B, respectively (Tables and 3). Thus, we can conclude that scheme C provides a more plausible mechanism.
On the contrary, the total changes in reaction free energies
(ΔGnoncov +
ΔGPT1–NA or
ΔGnoncov + ΔGPT1 +
ΔGNA) for the formation of the anionic tetrahedral
complex are comparable in all three schemes. This confirms the consistency in our
simulations.
Table 4
Calculated Activation (ΔGPT1–NA⧧) and Reaction
(ΔGPT1–NA) Free Energies of the Concerted
PT1 and NA Step in Water and SARS-CoV-2 Mpro
system
ΔGPT1–NA⧧ (kcal/mol)
ΔGPT1–NA
(kcal/mol)
scheme C
water
24.4
18.6
protein
13.6 ± 0.2
6.1 ± 0.8
Calculation of the Reaction Free Energy of the Second Proton Transfer (PT2)
Step
The last proton transfer step is common for all three mechanistic schemes. To obtain the
observed reaction free energy (ΔGPT2,obs) for the
reaction in water, here we have also used the information about the
pKa’s of the reacting groups. The obtained
ΔGPT2,obs in water was −5.6 kcal/mol
considering the pKa of imidazole (6.9) and the
pKa of the tetrahedral intermediate[38]
(∼11.0). The calculated results for the reaction in the protein and water are
listed in Table .
Table 5
Calculated Reaction Free Energies (ΔGPT2) of
the Second Proton Transfer Step (PT2) in Water and SARS-CoV-2 Mpro
system
ΔGPT2 (kcal/mol)
water
–5.6
protein
–11.4 ± 0.9
The results in Table suggest that the proton
affinity of the tetrahedral complex is very important (along with the electrophilicity of
the reacting group of the inhibitor) in designing an inhibitor that is better than that of
13b, because the PT2 step contributes to the maximum exothermicity during
covalent bond formation. Recently, two mechanistic studies[39,40] of the inhibition of
Mpro have been reported, where a peptidyl Michael acceptor N3 was used as the
covalent inhibitor. In both works, the last proton transfer step has been reported to be
highly exothermic, which is in agreement with our study. Additionally, the reported
activation free energy of the covalent bond formation step in ref (39) is close to what we have obtained here. In another
study, Arafet et al.[41] reported that the protonated hemiketal
intermediate of halomethyl ketone inhibitors of cruzain (a cysteine protease) was highly
stable compared to that of the anion forms. Although in their study the second proton
transfer preceded or happened concurrently with the nucleophilic attack, it produced the
same conclusion about the contribution of the second proton transfer step to the overall
exothermicity, as we have found here.
Calculation of the Binding Free Energy of the Covalent SARS-CoV-2 Inhibitor
Complex
Finally, we can add all of the calculated free energy contributions to obtain the binding
free energy (ΔGcov) of the covalent complex between
Mpro and inhibitor 13b. The experimental IC50 value
for 13b is 0.67 μM.[7] Therefore, the experimental
binding free energy at 300 K can be estimated as −8.5 kcal/mol. The total free
energy change for each reaction scheme is reported in Table , and the overall free energy profiles for the formation of the
covalent complex between SARS-CoV-2Mpro and inhibitor 13b
(following different mechanistic schemes) are depicted in Figure . It can be seen from Table that the calculated ΔGcov is close to the
experimental ΔGcov.
Table 6
Calculated (ΔGcovcal) and Experimental (ΔGcovexp) Binding Free Energies of
the Covalent SARS-CoV-2 Mpro–Inhibitor 13b
Complex
scheme
ΔGcovcal (kcal/mol)
ΔGcovexp (kcal/mol)a
A
–9.2
–8.5
B
–8.1
C
–8.9
ΔGcovexp is calculated from the IC50 value reported in ref (7).
Figure 4
Free energy surface (FES) of formation of the covalent protein ligand (SARS-CoV-2
Mpro–13b) complex. The boxes near the surfaces
represent the reaction coordinate at the corresponding position of the FES.
ΔGcovexp is calculated from the IC50 value reported in ref (7).Free energy surface (FES) of formation of the covalent protein ligand (SARS-CoV-2Mpro–13b) complex. The boxes near the surfaces
represent the reaction coordinate at the corresponding position of the FES.However, it is worth mentioning that the IC50 value is not the best metric for
evaluating the accuracy of a protocol for absolute binding free energy calculations.
Despite that, due to the lack of an experimentally reported Ki
(inhibition constant) value for the α-ketoamide inhibitor 13b, we had
to use the IC50 value, an indirect metric of binding affinity, for our study.
It is known that the relationship between Ki and
IC50 depends on the experimental condition and on the binding affinity of the
substrate (KM). In general, the Ki
value is equal or less than the IC50 value;[42] thus, the
actual binding affinity of 13b should be even lower than −8.5
kcal/mol. On the contrary, 13b is known to be a reversible covalent
inhibitor. For similar reversible α-ketoamide inhibitors of calpains and other
cysteine proteases, the reported Ki values[43] were around 15 nM and the corresponding binding free energy will be −11
kcal/mol. Therefore, the expected value may be 3–4 kcal/mol more negative than what
we can obtain from the IC50 value of 13b. At any rate, we think
our estimated absolute binding free energy of the Mpro–13b
covalent complex is not significantly over- or underestimated and we can also account for
the fact that 13b is not an irreversible inhibitor.The reversibility of a reaction depends generally on the barrier of the reverse reaction.
For two reactions with comparable activation barriers, the reaction that is more
exothermic is also more irreversible. Thus, the calculated reaction free energies of two
reactions can be used to determine whether a covalent inhibitor is reversible.[44] On the contrary, reactions with high reverse reaction barriers are
irreversible even if the exothermicity is small. It is known that reversible covalent
inhibitors can have a residence time (τ = 10 h)[44.5] which is
equivalent to a reverse reaction barrier of around 23.5 kcal/mol. We have obtained a
reverse reaction barrier of around 22 kcal/mol combining the NA and PT2 steps, which
supports the idea that the protonated tetrahedral complex of 13b can undergo
a reverse reaction to form the noncovalent complex. At this point, we caution that the
time dependence of the binding process has not been provided in the experimental work and
thus it is not completely clear that our assumption about the thermodynamic equilibrium is
fully justified.
Conclusions
The great potential of covalently bound drugs has been demonstrated in recent
years.[45−48] Still, most current approaches to in
silico design of such inhibitors are not fully consistent for absolute binding
free energy calculations, because they do not evaluate the chemical contribution of the
binding energy. We developed a general way to calculate the binding free energies of
covalent drugs, including all of the relevant energy contributions. That is, our strategy is
based on using the EVB to evaluate the chemical contribution for covalent binding and using
the PDLD/S-LRA/β method to determine the noncovalent contribution. This approach is
examined by calculating the binding free energy of an α-ketoamide inhibitor of the
main protease of SARS-CoV-2 (Mpro.). Our calculations considered multiple
reaction pathways for inhibitor–protease complex formation and produced a very
reasonable estimate of the observed binding free energy. This demonstrated the potential of
our strategy in general drug design and particularly in looking for effective inhibitors for
SARS-CoV-2.
Authors: Luis Heriberto Vázquez-Mendoza; Humberto L Mendoza-Figueroa; Juan Benjamín García-Vázquez; José Correa-Basurto; Jazmín García-Machorro Journal: Int J Mol Sci Date: 2022-04-03 Impact factor: 5.923
Authors: Guillem Macip; Pol Garcia-Segura; Júlia Mestres-Truyol; Bryan Saldivar-Espinoza; María José Ojeda-Montes; Aleix Gimeno; Adrià Cereto-Massagué; Santiago Garcia-Vallvé; Gerard Pujadas Journal: Med Res Rev Date: 2021-10-26 Impact factor: 12.388