We here investigate the mechanism of SARS-CoV-2 3CL protease inhibition by one of the most promising families of inhibitors, those containing an aldehyde group as a warhead. These compounds are covalent inhibitors that inactivate the protease, forming a stable hemithioacetal complex. Inhibitor 11a is a potent inhibitor that has been already tested in vitro and in animals. Using a combination of classical and QM/MM simulations, we determined the binding mode of the inhibitor into the active site and the preferred rotameric state of the catalytic histidine. In the noncovalent complex, the aldehyde group is accommodated into the oxyanion hole formed by the NH main-chain groups of residues 143 to 145. In this pose, P1-P3 groups of the inhibitor mimic the interactions established by the natural peptide substrate. The reaction is initiated with the formation of the catalytic dyad ion pair after a proton transfer from Cys145 to His41. From this activated state, covalent inhibition proceeds with the nucleophilic attack of the deprotonated Sγ atom of Cys145 to the aldehyde carbon atom and a water-mediated proton transfer from the Nε atom of His41 to the aldehyde oxygen atom. Our proposed reaction transition-state structure is validated by comparison with X-ray data of recently reported inhibitors, while the activation free energy obtained from our simulations agrees with the experimentally derived value, supporting the validity of our findings. Our study stresses the interplay between the conformational dynamics of the inhibitor and the protein with the inhibition mechanism and the importance of including conformational diversity for accurate predictions about the inhibition of the main protease of SARS-CoV-2. The conclusions derived from our work can also be used to rationalize the behavior of other recently proposed inhibitor compounds, including aldehydes and ketones with high inhibitory potency.
We here investigate the mechanism of SARS-CoV-2 3CL protease inhibition by one of the most promising families of inhibitors, those containing an aldehyde group as a warhead. These compounds are covalent inhibitors that inactivate the protease, forming a stable hemithioacetal complex. Inhibitor 11a is a potent inhibitor that has been already tested in vitro and in animals. Using a combination of classical and QM/MM simulations, we determined the binding mode of the inhibitor into the active site and the preferred rotameric state of the catalytic histidine. In the noncovalent complex, the aldehyde group is accommodated into the oxyanion hole formed by the NH main-chain groups of residues 143 to 145. In this pose, P1-P3 groups of the inhibitor mimic the interactions established by the natural peptide substrate. The reaction is initiated with the formation of the catalytic dyad ion pair after a proton transfer from Cys145 to His41. From this activated state, covalent inhibition proceeds with the nucleophilic attack of the deprotonated Sγ atom of Cys145 to the aldehydecarbon atom and a water-mediated proton transfer from the Nε atom of His41 to the aldehyde oxygen atom. Our proposed reaction transition-state structure is validated by comparison with X-ray data of recently reported inhibitors, while the activation free energy obtained from our simulations agrees with the experimentally derived value, supporting the validity of our findings. Our study stresses the interplay between the conformational dynamics of the inhibitor and the protein with the inhibition mechanism and the importance of including conformational diversity for accurate predictions about the inhibition of the main protease of SARS-CoV-2. The conclusions derived from our work can also be used to rationalize the behavior of other recently proposed inhibitor compounds, including aldehydes and ketones with high inhibitory potency.
The impact and rapid
expansion of COVID-19, caused by the coronavirusSARS-CoV-2, has urged the research to find therapeutic remedies.[1] There are, in principle, two main strategies
to be used against this disease: the development of vaccines and antiviral
drugs. While vaccines have the advantage of preventing the disease,
antivirals could be beneficial not only to fight against SARS-CoV-2
but also for other related coronaviruses, including those that could
infect human beings in a near future. While some antiviral drugs developed
to fight against other viruses, such as remdesivir, lopinavir, or
favipiravir, have been already tested to treat COVID-19, there are
no clear evidences of their efficiency.[2,3] Therefore,
there is an urgent need to develop new antiviral drugs that are both
effective and selective for SARS-CoV-2.One of the strategies
for the development of antiviral drugs is
to target the inhibition of one of those enzymes that are essential
for the vital cycle of the virus. When SARS-CoV-2 infects a cell,
it utilizes the transcription machinery to translate the viral genome
into two long polyproteins. These long chains must be cleaved at specific
sites to produce the nonstructural proteins that the virus needs for
replication and transcription of its genome. This key function is
performed by two proteases: the 3C-like (3CL) or main protease and
the Papain-like (PL) protease. The former cleaves the polyprotein
at 11 positions targeting for the Gln-(Ser/Ala/Gly) peptide bond,
a sequence preference that is not used by human proteases.[4] This characteristic makes the 3CL protease an
excellent candidate as a drug target because those compounds designed
to bind and block the active site of this protein have less chances
to interact with proteases of the host.[5] The 3CL protease is a cysteine protease and its active site is formed
by a catalytic dyad, His41 and Cys145, in charge of the hydrolysis
of peptide bonds. This process takes place in two main steps:[6,7] acylation and deacylation. During the acylation step, a peptide
fragment is released while the other forms an acyl–enzyme complex
by means of a covalent bond with the Sγ atom of the catalytic
cysteine. During deacylation, the second fragment is released by the
action of a water molecule, recovering the enzyme for a new catalytic
cycle.Until now, several families of inhibitors have been proposed
and
tested in vitro against the 3CL protease of SARS-CoV-2, including
Michael acceptors,[8] α-ketoamides,[9] aldehyde derivatives,[10,11] and ketones.[12] These compounds first
bind to the active site of the protease forming a noncovalent complex
(E–I) and then react with the thiol group of the catalytic
cysteine to form a stable covalent acyl–enzyme complex (E–I),
see Figure a. The
design and improvement of these compounds is usually guided by the
information provided by the X-ray structure of the covalent complex.
However, as we will stress in this work, this information can be incomplete
if the inhibitors and/or the enzyme present conformational diversity.
This diversity could play a critical role in a proper understanding
of the general inhibition process.
Figure 1
Inhibitors of the SARS-CoV-2 3CL protease
presenting an aldehyde
group as a warhead. (1a) Aldehyde derivatives bind to
the active site forming the noncovalent E–I complex and then
react with the thiol group of the catalytic cysteine to yield a hemithioacetal
(the E–I covalent complex). (1b) Overlap of the
X-ray structures of the (S)-hemithioacetal complexes
formed between the protease and two aldehyde derivatives, GC373, where carbon atoms are shown in brown color, and 11a, with carbons atoms in light blue. The PDB codes are 6WTT and 6LZE, respectively.
Inhibitors of the SARS-CoV-2 3CL protease
presenting an aldehyde
group as a warhead. (1a) Aldehyde derivatives bind to
the active site forming the noncovalent E–I complex and then
react with the thiol group of the catalytic cysteine to yield a hemithioacetal
(the E–I covalent complex). (1b) Overlap of the
X-ray structures of the (S)-hemithioacetal complexes
formed between the protease and two aldehyde derivatives, GC373, where carbon atoms are shown in brown color, and 11a, with carbons atoms in light blue. The PDB codes are 6WTT and 6LZE, respectively.The aldehyde group seems a promising warhead for
the development
of antiviral drugs to fight COVID-19, as it can react with the catalytic
cysteine to form hemithioacetal complexes (the E–I complex).[13] Several aldehyde derivatives have been shown
to have large inhibitory properties against the 3CL protease of SARS-CoV-2
during in vitro assays.[14] Among these compounds,
two of them are potent nanomolar inhibitors that have already been
tested in animals: GC373 (or its prodrug GC376) and 11a (see Figure a). The former was developed for the treatment of feline
infectious peritonitis, a disease caused by a coronavirus and it has
been shown to be also an inhibitor of SARS-CoV-2 main protease.[11] The structurally similar 11a was
initially designed analyzing the substrate-binding pocket of the ortholog
main protease of SARS-CoV.[10] The structures
of their hemithioacetal products have been deposited in the Protein
Data Bank with codes 6WTK and 6WTT (for GC373)[14] and 6LZE (for 11a).[10] In both cases, the inhibitor adopts
a similar pose in the active site of the main protease. The distance
between the Sγ atom of the catalytic cysteine and the aldehydecarbon atom is of about 1.8 °, corresponding to the formation
of an S–C covalent bond. The 6LZE and 6WTK PDB[15] structures
shown in Figure b
correspond to the (S) configuration of the hemithioacetal,
where the hydroxyl oxygen is pointing to the oxyanion hole formed
by main-chain NH groups of Cys145, Ser144, and Gly143.[10] The main difference between the two structures
showed in Figure b
is the rotameric state of the catalytic histidine. In 6WTK, the Nε atom
of His41 is pointing toward the binding site, while in 6LZE, the Nδ atom
is the one pointing to the inhibitor. These two configurations will
be hereafter denoted as ε- and δ-rotamers, respectively.
The rotameric state of His41 can be relevant for the reaction mechanism
because the Nε atom in the ε-rotamer is better oriented
to serve as a proton donor to the aldehydic oxygen atom of the inhibitor.Computational simulations of SARS-CoV-2 3CL protease have been
devoted to study its reactivity with peptide substrates, including
the acylation and deacylation steps.[16,17] Regarding
covalent inhibition, QM/MM methods have been also used to analyze
the reaction with irreversible Michael acceptors[18,19] and α-ketoamide inhibitors.[20] This
last study provides a complete evaluation of the binding free energy
of the covalent inhibitor as the sum of the noncovalent binding and
the reaction steps. In this work, we use classical and hybrid QM/MM
molecular dynamics simulations to explore the inactivation mechanism
of SARS-CoV-2 3CL protease by an aldehyde derivative, 11a. We provide atomistic details of the inhibition process, including
the description of the interactions established in the noncovalent
complex (E–I) and the reaction mechanism leading to the covalent
product (the hemithioacetal complex or E–I). Our simulations
stress on the interplay between conformational changes of the protein
and/or the substrate and the possible reaction mechanisms. The most
stable configuration for His41 both in the apo form and in the noncovalent
complex formed with the inhibitor is the ε-rotamer. From this
complex, the reaction proceeds via proton transfer from Cys145 to
His41 that gives rise to a catalytic dyad ion pair (IP). This is the
first step needed to initiate the reactions catalyzed by the 3CL protease,
as observed both in experimental[7] and computational
studies.[16,17] The process continues with the nucleophilic
attack of the Sγ atoms on the aldehydic carbon atom and the
proton transfer from the catalytic histidine to the aldehyde oxygen
atom through a water molecule to form the (S)-hemithioacetal.
This mechanism is confirmed by comparison with the structures of recently
proposed inhibitor PF-00835231[12] that mimic
the transition state of protease inhibition by aldehydes. Our simulations
are also useful to rationalize the behavior of other proposed inhibitors,
considering possible conformational changes of the inhibitor warhead
in the active site and how the reaction mechanism can change depending
on the conformation of the inhibitor. Other reaction mechanisms, involving
different proton transfer routes and/or leading to the (R) form of the product, can be feasible if other protein and/or inhibitor
conformations are populated. All in all, the simulations presented
here reveal the detailed reaction mechanism for the inhibition of
SARS-CoV-2 3CL protease with a promising family of inhibitors, the
aldehyde derivates, and also with related compounds, such as ketones.
Methodology
Classical
Molecular Dynamics Simulations
To build the
Michaelis complex (MC), the PDB accession code 6LZE structure was used.
This is the crystal structure of COVID-19 main protease forming a
covalent complex with the 11a inhibitor (with a resolution
of 1.5 Å).[10]11a was
parameterized following the nonstandard residue parameterization procedure
implemented in Amber using the Antechamber program[21] from the AmberTools18[22] package.
Atomic charges were obtained using the restrained electrostatic potential
(RESP) method[23] at the HF/6-31G* level.Tleap tool from abertools[22] was used
to build the system, with the ff14SB forcefield[24] to describe the canonical amino acids. PROPKA3.0[25] was used to determine the most likely protonation
state of every residue at pH 7.4. Na+ ions were added to
neutralize the charge of the protein-inhibitor complex. The protein-inhibitor
complex was solvated using a box of TIP3P water molecules in such
a way that any protein-inhibitor atom is not closer than 12 Å
to the limits of the simulation box. A minimization was made using
500 steps of the steepest descent method followed by the conjugate
gradient method until the root mean square of the gradient was below
10–3 kcal·mol–1·Å–1. Heating was performed using a linear heating ramp,
rising the temperature from 0 to 300 K along 120 ps followed by 20
ps simulation at 300 K. During this process, the positions of the
heavy atoms of the protein backbone were restrained using a harmonic
potential with a force constant of 20 kcal·mol–1·Å–2. Then, the system was equilibrated
in the NPT ensemble (300 K and 1 bar). For this equilibration,
the force constants for the positional restraint were reduced from
15 to 0 kcal·mol–1·Å–2. So that, every 1.25 ns, the force constant was decreased by 3 units.
Then, after 6.25 ns, the positional restraint was completely removed.
Finally, the system was run restraint-free for another 1.25 ns. In
order to get enough sampling, 4 replicas of 1 μs of the noncovalent
enzyme inhibitor complex with the catalytic dyad residues (Cys145
and His41) in the neutral state and with these residues charged forming
an IP. In these simulations, the time step was 2 fs using SHAKE[26] to constraint bonds involving hydrogen atoms.
Long-range electrostatic interactions were described using the particle
mesh Ewald method,[27,28] while the cut-off radius to evaluate
the short-range interactions was 10 Å. Pressure and temperature
were controlled using a Berendsen barostat and Langevin thermostat,
respectively. For all classical molecular dynamic simulations, AMBER19
GPU version of PMEMD[29,30] was employed.To study
the free-energy profiles associated to conformational
changes of the catalytic histidine and of the aldehyde inhibitor,
classical potentials of mean force were traced using umbrella sampling.[31] For the change in the rotameric state of His41,
the dihedral angle formed by the Cα–Cβ–Cγ–Nδ
atoms of the residue was used as a distinguished coordinate. The dihedral
angle was evolved using increments of 5° with a total of 41 windows.
For the rotation of the aldehyde group of the inhibitor, the dihedral
angle formed by the N–C–Caldh–Oaldh was selected as the distinguished coordinate. In this
case, the dihedral angle was evolved using increments of 10°
during 21 simulation windows. In both cases, each simulation window
was minimized under the harmonic restraint with a force constant of
100 kcal·mol–1·rad–2 using 2000 steps of the steepest descent method followed by the
conjugate gradient method until the root mean square of the gradient
was below 10–3 kcal·mol–1·Å–1. Afterward, a total of 52 ns of
classical MD was performed for each window, the first 2 ns was run
for relaxation followed by 50 ns of production. Therefore, the total
simulation time for the production stage was longer than 2 μs
for each free-energy profile corresponding to His41 rotation and longer
than 1 μs for the rotation of the aldehyde group of the inhibitor.
The free-energy profiles were integrated using the weighted histogram
analysis method (WHAM).[32]
QM/MM Calculations
The adaptative string method (ASM)[33] developed in our group was used to explore the
free-energy landscape associated to the chemical reaction. This methodology
has the advantage to avoid the oversimplified picture of what a reaction
mechanism is; in a real system, a reaction mechanism involves many
reaction coordinates, and the free-energy pathway depends on all of
them. Using the ASM, we can find the minimal free-energy pathway (MFEP)
in a multidimensional free-energy surface not just by evaluating the
change in energy related to the variation of a couple of distances,
angles, or dihedrals (as made in most of computational studies) but
including all the needed degrees of freedom without implying an additional
computational expense.The mechanistic proposals were explored
using 96 replicas of the system (string nodes) to connect the reactant
and product structures along the MFEP in a space of arbitrary dimensionality
defined by the collective variables (CVs) shown in Scheme . Using QM/MM MD simulations,
nodes are moved according to their free-energy gradient and redistributed
equidistantly along the string, avoiding them to fall to the global
minima (reactants and products). This procedure is continued until
the string converges to the MFEP displaying an RMSD below 0.1 amu1/2·Å or at least 2 ps. Replica exchange between
nodes was attempted every 50 steps to ensure convergence. After convergence,
a path-CV (denoted as s) measuring the advancement
of the system along the MFEP from reactants to products is defined
and employed to trace the reaction free-energy profile. 10 ps of QM/MM
simulations were run for every node, and WHAM[32] was selected as the integration method. To ensure a probability
density distribution of the reaction coordinate as homogeneous as
possible, the values of the force constants employed to bias the ASM
simulations were determined on-the-fly.[33] The QM region was described using a B3LYP functional[34,35] with a 6-31+G* basis set and D3 dispersion corrections.[36] This is a good choice to describe both the acylation
of a peptide substrate[16] and an inhibitor[18] by the SARS-CoV-2 protease with activation energy
results in excellent agreement with the experiment. A systematic study
on the cysteine–histidine proton transfer also pointed out
to the B3LYP functional as the most adequate.[37] All QM/MM calculations were performed using a modified version of
Amber18[22,38] coupled to Gaussian16[39] for density functional theory calculations. The cutoff
radii used for all the QM/MM interactions was of 15 Å. For mechanistic
determinations, the QM region included the side chains of the catalytic
dyad (His41 and Cys145), the water molecule involved in the reaction
mechanism, and the backbone atoms of residues P1 and P1′ in
the 11a inhibitor. Any other atom was described as an
MM level as explained in the classical molecular dynamic section.
The integration time step in QM/MM simulations was of 1 fs. Because
hydrogen atoms are involved in the reaction mechanism, we checked
that the string method converged to the same MFEP when a time step
of 0.5 fs was used (see Figure S1).
Scheme 1
Representation of the CVs Used in the Exploration of the Reaction
Mechanism
For the study of the proton
transfer within the catalytic dyad,
that is, from Cys145 to His41, the antisymmetric combination of the
distances of the proton to the donor and acceptor atoms [d(Sγ–H)-d(Nε–H)] was employed to
trace the free-energy profile using umbrella sampling.[31] The integration was carried out using the WHAM
method. In this case, only the side chains of the two involved residues
were included in the QM region (described at the B3LYPD3/6-31+G* level
of theory with D3 corrections). A total of 40 windows evenly spaced
each at 0.06 Å were used to cover the whole range of the reaction
coordinate. A force constant of 600 kcal·mol–1Å–2 was employed to drive the reaction coordinate.
Remaining details of the simulations were as described previously.
Results and Discussion
Rotameric State of the Catalytic Histidine
Our simulations
of the noncovalent complex formed by 11a were carried
out using the X-ray coordinates of the hemithioacetal complex (PDB
file 6LZE) as
a starting point. In agreement with the X-ray structure, we simulated
the inhibitor in the active sites of the two protomers (A and B) of
the dimeric enzyme. The X-ray structure was modified lengthening the
Sγ-C distance between the enzyme and the inhibitor. The catalytic
dyad (Cys145 and His41) was modeled in the neutral state, which is
the most stable form for the Michaelis and E–I complexes.[16,18,37]As explained before, in
the 6LZE X-ray
structure, the catalytic histidine is found in the δ-rotameric
state. With the help of Pocketome,[40] we
explored 91 different X-ray structures of the 3CL main protease in
the apo form containing several covalent and noncovalent inhibitors
(see Table S1). In 87.5% of the active
sites, including all those corresponding to the apo enzyme and most
of the inhibited enzymes, His41 was found in the ε-rotameric
state, while less than 20% presented the δ-rotameric state.
We modeled the noncovalent complex of the 3CL protease with 11a with the two His41 rotamers (see Figure a,b) and ran 1 μs long MD simulations.
In principle, both rotameric states could be productive as far as
the two states His41 and Cys145 form a strong hydrogen bond, which
can eventually lead to a proton transfer and the formation of the
catalytic dyad IP. Figure c shows that the probability distribution of Nε(His41)-Sγ(Cys145)
distances obtained from MD simulations of the 11a noncovalent
complex with the two rotameric states are almost indistinguishable
and the most probable distance (3.4 Å) corresponded to a hydrogen-bonded
dyad. In order to determine the relative stability of the two rotameric
states, we traced the free-energy profile associated to the rotation
of His41 around the Cβ–Cγ bond in the apo form
and in the noncovalent complex (E–I) with 11a.
The free-energy profiles obtained after 2 μs of classical MD
simulations are shown in Figure d. According to these results, both in the apo form
and in the presence of the 11a inhibitor, the ε-rotamer
is more stable than the δ-rotamer by 2.2 and 3.2 kcal·mol–1, respectively. This result agrees with the observation
that the ε-rotamer predominates in the X-ray structures of the
3CL protease, as explained above. However, the free-energy difference
between the two rotameric states is not too high, in particular, for
the apo form, which could explain that the δ-rotamer is found
in 12.5% of the active sites of the SARS-CoV-2 3CL protease, as reported
in Table S1.
Figure 2
Rotameric state of the
catalytic His41. (2a) Representation
of the noncovalent complex of 11a (carbon atoms in green)
with SARS-CoV-2 3CL protease with His41 in the ε-rotameric state,
(2b) representation of the complex presenting the δ-rotameric
state of His41, and (2c) probability densities of the
distances from the Cys145-Sγ atom to the Nε atom of His41for
the ε- (blue) and δ-rotameric state (red). (2d) Free-energy profile associated to the rotation around the Cγ–Cβ
bond of His41, from the ε-rotamer (left) to the δ-rotamer
(right) in the apo enzyme (red) and in the noncovalent complex with 11a (blue).
Rotameric state of the
catalytic His41. (2a) Representation
of the noncovalent complex of 11a (carbon atoms in green)
with SARS-CoV-2 3CL protease with His41 in the ε-rotameric state,
(2b) representation of the complex presenting the δ-rotameric
state of His41, and (2c) probability densities of the
distances from the Cys145-Sγ atom to the Nε atom of His41for
the ε- (blue) and δ-rotameric state (red). (2d) Free-energy profile associated to the rotation around the Cγ–Cβ
bond of His41, from the ε-rotamer (left) to the δ-rotamer
(right) in the apo enzyme (red) and in the noncovalent complex with 11a (blue).
Noncovalent Enzyme–Inhibitor
Complex
According
to the results of the previous section, we selected the ε-rotameric
state for all subsequent simulations of the complex formed between
the 3CL protease of SARS-CoV-2 and 11a. The resulting
MD simulations of the noncovalent E–I complex (4 replicas of
1 μs each) were stable in all cases (see RMSD time evolutions
in Figure S2), showing a binding pose consistent
with the X-ray structure (see Figure a). In this pose, P1–P3 sites of the inhibitor
present an interaction pattern similar to that of a peptide substrate
with the sequence -Val-Leu-Gln|Ser- (the vertical line indicates the
scissile bond).[16]Figure b compares the fraction of hydrogen bonds
between inhibitor/peptide sites and enzymatic residues during the
MD simulations of the inhibitor and of the peptide substrate. The
γ-lactam ring at the P1 position is frequently found in inhibitors
of SARS-CoV-2 and SARS-CoV main proteases because this moiety is expected
to mimic Gln at the P1 position of the peptide substrate, which is
a requirement of these enzymes. This γ-lactam ring can form
hydrogen bonds with His163, Glu166, and Phe140, displaying an interaction
pattern similar to that of Gln-P1. The cyclohexyl group at the P2
position of the inhibitor stacks with the imidazole ring of His41
and also presents interactions with other nearby residues, such as
Met165. Inhibitors of the SARS-CoV-2 main protease present hydrophobic
groups at the P2 position, similarly to the side chain of Leu-P2 in
the peptide substrate. Finally, the indole group at P3 is exposed
to the solvent and stabilized by hydrogen-bond interactions with main-chain
atoms of Glu166.
Figure 3
Noncovalent complex formed between the aldehyde derivative 11a and the 3CL protease of SARS-CoV-2. (3a)
Binding pose of the inhibitor in the active site of the protease,
showing the location of the catalytic dyad and the oxyanion hole.
Note that the aldehyde oxygen points to the oxyanion hole. (3b) Fraction of hydrogen-bond contacts between residues of 11a and a peptide substrate[16] and
those of the protease. A hydrogen-bond contact is counted when the
donor–acceptor distance is <3.8 Å and the hydrogen-bond
angle is >120°. (3c) Probability densities of
the
distances from the Cys145-Sγ atom to the C carbon atom of the
substrate, in blue, and from the Nε atom of His41 to the aldehyde
oxygen atom, in red.
Noncovalent complex formed between the aldehyde derivative 11a and the 3CL protease of SARS-CoV-2. (3a)
Binding pose of the inhibitor in the active site of the protease,
showing the location of the catalytic dyad and the oxyanion hole.
Note that the aldehyde oxygen points to the oxyanion hole. (3b) Fraction of hydrogen-bond contacts between residues of 11a and a peptide substrate[16] and
those of the protease. A hydrogen-bond contact is counted when the
donor–acceptor distance is <3.8 Å and the hydrogen-bond
angle is >120°. (3c) Probability densities of
the
distances from the Cys145-Sγ atom to the C carbon atom of the
substrate, in blue, and from the Nε atom of His41 to the aldehydeoxygen atom, in red.The chemical step in
SARS-CoV-2 3CL protease requires a proton
transfer from Cys145 to His41 to form a catalytic dyad IP.[16,37] As presented above, in the case of the noncovalent E–I complex,
these two residues are kept at hydrogen-bond distances during a significant
fraction of the simulation (see Figure c for one replica; results for other replicas are presented
in Figure S3). After this proton transfer,
the covalent inhibition of the enzyme should proceed with the formation
of the corresponding hemithioacetal (the E–I complex) that
requires the nucleophilic attack of the Sγ atom to the aldehydecarbon atom and the protonation of the oxygen atom. Figure c shows the distribution of
distances between the Cys145 Sγ atom and the aldehydecarbon
atom. The distribution shows a bimodal shape, with two peaks centered
at 3.3 and 5.1 Å that correspond to the trans and gauche conformations of the Cys145 side chain,
respectively. Thus, in this pose, the aldehydecarbon atom of the
inhibitor can be found already at short enough distances to suffer
the nucleophilic attack by the Sγ atom of Cys145. The position
of the aldehyde oxygen atom is stabilized by means of hydrogen-bond
interactions with the main-chain NH groups of Cys145 (2.4 ± 0.3
Å), Ser144 (2.8 ± 0.4 Å), and Gly143 (2.3 ± 0.3
Å), as seen in Figure a. These three residues form the oxyanion hole, placed in
a U-turn of the loop connecting β10 and β11 and that closes
one of the sides of the active site. These interactions are also observed
in the X-ray structure of the (S)-hemithioacetal
product (see Figure b).[10] In this pose, the aldehyde oxygen
atom is placed far from the Nε atom of His41, as confirmed by
the probability distribution of Nε–O distances, peaked
at 6.3 Å (see Figure c). This separation precludes a direct proton transfer between
these two atoms after the formation of the IP. However, as discussed
below, once the IP is formed, a water molecule can be accommodated
in between the catalytic histidine and the aldehyde group of the inhibitor,
facilitating a water-mediated proton transfer from His41 to the aldehydeoxygen atom to form the hemithioacetal.The proposed mechanism,
with a water-mediated proton transfer,
could also explain the reactivity of other aldehyde inhibitors to
form the (S)-hemithioacetal, which is the enantiomer
more frequently observed in the X-ray structures (see Table S1). X-ray structures of other (S)-hemithioacetal complexes show a similar pose for all
of them, with the aldehyde oxygen atom pointing to the oxyanion hole
and thus far enough from the catalytic histidine for a direct proton
transfer. However, there is one case where the (R) enantiomer of the product has been observed: the 6WTT X-ray structure
contains a (R)-hemithioacetal in one of the three
protomers of the asymmetrical unit and the (S) configuration
in the other two.[14] The (R) enantiomeric form can be obtained if the aldehyde group of the
inhibitor rotates around the C–C(O) bond before the nucleophilic
attack by Cys145. In that case, the aldehyde oxygen atom would point
toward the catalytic His41 instead of toward the oxyanion hole and
the distance to the Nε atom would become small enough to allow
a direct proton transfer between them (see Figure a). While this binding pose of the inhibitor
simplifies the reaction mechanism, it is not the most stable for 11a, as demonstrated by the free-energy profile obtained for
the rotation of the aldehyde group around the C–C(O) bond in
the active site. First, in order to test the reliability of our parametrization
to describe this rotation of the substrate, we performed gas-phase
B3LYPD3/6-31+G* single-point energy calculations on minimum energy
structures extracted from a dihedral potential energy scan. In the
gas phase, the pro-(R) structure was found to be
1.1 kcal·mol–1 more stable than the pro-(S) because of an intramolecular interaction between the
aldehyde oxygen and the amide NH group, while at the MM level, the
energy difference was found to be very similar, 1.6 kcal·mol–1. This calculation confirms the adequacy of the forcefield
to represent the isomerization process. Then, we obtained the isomerization
free-energy profile in the enzyme (see Figure b), as described in the Methodology section. According to this, the pro-(S) conformation is 5.5 kcal·mol–1 more stable
than the pro-(R) conformer in the enzyme because
in the first case, the aldehyde oxygen atom is accommodated in the
oxyanion hole, while in the second one, the aldehyde oxygen atom points
toward His41. It must be noticed that rotation of the aldehyde group
of the inhibitor in the enzymatic active site could be facilitated
after the formation of the IP, when the Nε atom of His41 is
protonated. This rotation would explain that the (R)-hemithioacetal product can also be formed, as discussed above.
Note that the preference for one or another orientation of the oxygen
atom can be modulated through changes in the chemical structure of
the inhibitor, for example, substituting the aldehydehydrogen atom
by other groups able to interact more favorably with the oxyanion
hole. This seems to be the case of boceprevir, a α-ketoamide
inhibitor where the aldehydehydrogen is substituted by an acetamide
group. In this case, the X-ray structure (PDB code 7BRP)[41] shows that the acetamide group is accommodated in the oxyanion
hole while the ketoneoxygen atom points to the catalytic His41, facilitating
a reaction mechanism with a direct proton transfer from the catalytic
histidine to the carbonyl oxygen atom.
Figure 4
Orientation of the aldehyde
group in the noncovalent complex. (4a) Representation
of the E–I complex with inhibitor 11a in the pro-(R) conformation. (4b) Classical free-energy
profile for the rotation of the aldehyde
group from the pro-(S) conformer (left) to the pro-(R) conformer (right).
Orientation of the aldehyde
group in the noncovalent complex. (4a) Representation
of the E–I complex with inhibitor 11a in the pro-(R) conformation. (4b) Classical free-energy
profile for the rotation of the aldehyde
group from the pro-(S) conformer (left) to the pro-(R) conformer (right).
Formation of the (S)-Hemithioacetal Complex
Chemical transformations in the active site of 3CL protease are
initiated with the proton transfer from Cys145 to His41 to form an
ionized catalytic dyad (the IP) from which the reaction proceeds.
We traced the QM/MM free-energy profile for the proton transfer from
Cys145 to His41 using as a distinguished coordinate the antisymmetric
combination of proton distances to the donor and acceptor atoms [d(Sγ–H)-d(Nε–H), see Figure a]. Starting from the E–I
complex, the free-energy cost of forming the IP is 9.3 kcal·mol–1, a value slightly larger than that obtained when
a peptide substrate is present in the active site (4.8 kcal·mol–1).[16] The value obtained
for 11a is similar to those found for the formation of
the IP under the presence of other inhibitors: 7.3 kcal·mol–1 with an α-ketoamide inhibitor[20] and 10.3 with a Michael acceptor.[18] These values are systematically larger than the free-energy cost
evaluated for the apo enzyme, which was found to be 2.9 kcal·mol–1 in two different studies,[16,20] indicating that desolvation of the active site upon ligand binding
can destabilize the IP form. Diminution of this energy penalty through
ligand design could be a promising strategy to improve inhibition
binding and kinetics.
Figure 5
Proton transfer from Cys145 to His41 in SARS-CoV-2 3CL
protease.
(5a) Free-energy profile for the transformation from
the neutral catalytic dyad (left) to the IP (IP, right) in the E–I
complex with 11a (gray line) and in the complex with
the peptide substrate[16] (blue line). The
reaction coordinate (RC) is the antisymmetric combination d(Sγ–H)-d(Nε–H). (5b) IP structure with 11a showing the presence of a water
molecule in between the protonated histidine and the inhibitor aldehyde
group.
Proton transfer from Cys145 to His41 in SARS-CoV-2 3CL
protease.
(5a) Free-energy profile for the transformation from
the neutral catalytic dyad (left) to the IP (IP, right) in the E–I
complex with 11a (gray line) and in the complex with
the peptide substrate[16] (blue line). The
reaction coordinate (RC) is the antisymmetric combination d(Sγ–H)-d(Nε–H). (5b) IP structure with 11a showing the presence of a water
molecule in between the protonated histidine and the inhibitor aldehyde
group.After the formation of the IP,
the inhibition reaction must proceed
with the attack of the activated nucleophile (the Sγ atom of
Cys145) on the aldehydecarbon atom and the proton transfer from the
Nε atom of His41 to the aldehyde oxygen atom. As explained before,
the distance between these two atoms in the IP is too large for a
direct proton transfer. However, MD simulations show that a water
molecule can be placed in between His41 and the aldehyde group, attracted
by the large dipole moment associated to the IP (see Figure b). This water molecule can
act as a proton relay, accepting a proton from His41 and donating
a proton to the aldehyde oxygen atom to form the hydroxyl group of
the hemithioacetal.Starting from the structure presented in Figure b, we obtained the
corresponding QM/MM MFEP
for the formation of the (S)-hemithioacetal product,
the enantiomer experimentally observed for inhibitor 11a. Figure a shows
the B3LYPD3/6-31+G*/MM free-energy profile associated to the reaction
from IP to the covalent E–I complex, while Figure b shows the evolution of key
distances along this MFEP (see also Scheme ). The reaction begins with the approach
of the Sγ atom to the aldehydecarbon atom, reducing the Sγ-C
distance from 2.95 to 2.18 Å at the first transition state (TS1,
shown in Figure S4). Then, the water molecule
is accommodated in between the proton donor His41 and the proton acceptor
aldehyde oxygen. The rate-limiting TS is TS2 (see Figure c) where the formation of the
S–C bond (presenting a distance of 1.94 Å) is accompanied
by lengthening of the aldehyde double bond (from 1.22 to 1.38 Å),
due to the charge transfer from Cys145 to the aldehyde oxygen atom
to form an oxyanion. This charge transfer triggers a concerted proton
transfer from the water molecule to the oxyanion and from the protonated
His41 to the water molecule. These coupled proton transfer events
are not completely synchronous because at the TS, the distance of
the transferred proton to the aldehyde oxygen is 1.20 Å, while
the distance from the proton transferred from His41 to the oxygenwater molecule is slightly longer, 1.38 Å. The process continues
downhill up to the products (see Figure d), completing the formation of the Sγ–C
bond (1.85 Å) and the lengthening of the aldehyde bond distance
up to a typical value for a single C–O bond (1.42 Å),
and the proton transfers from the water molecule to the hemithioacetal
and from His41 to the water molecule (see Figure d). The small barrier observed at the end
of the reaction path corresponds to the rotation of the hemithioacetal
hydroxyl group to fit into the oxyanion hole.
Figure 6
Formation of the (S)-hemithioacetal product from
the IP. (6a) B3LYPD3/6-31+G*/MM free-energy profile along
the path-CV for the formation of the covalent E–I complex from
the IP. (6b) Evolution of the selected CVs along the
MFEP. The color code corresponds to Scheme . (6c) Representation of the
rate-determining TS (TS2). The values of the distances correspond
(in Å) to the coordinates of the MFEP where the TS is located.
(6d) Representation of the (S)-hemithioacetal
complex. (6e) Overlap of the TS2 structure (balls and
sticks with carbon atoms in orange) with the X-ray structure 6XHM containing a hydroxymethylketone
inhibitor PF-00835231 (licorice with carbon atoms in light blue).
Note that the hydroxyl group of the inhibitor is placed at a position
equivalent to the water molecule in the TS of the inhibition by 11a. (6f). Overlap of the product structure obtained
from our simulations (balls and sticks with carbon atoms in orange)
with the X-ray structure 6LZE (licorice with carbon atoms in light blue).
Formation of the (S)-hemithioacetal product from
the IP. (6a) B3LYPD3/6-31+G*/MM free-energy profile along
the path-CV for the formation of the covalent E–I complex from
the IP. (6b) Evolution of the selected CVs along the
MFEP. The color code corresponds to Scheme . (6c) Representation of the
rate-determining TS (TS2). The values of the distances correspond
(in Å) to the coordinates of the MFEP where the TS is located.
(6d) Representation of the (S)-hemithioacetal
complex. (6e) Overlap of the TS2 structure (balls and
sticks with carbon atoms in orange) with the X-ray structure 6XHM containing a hydroxymethylketone
inhibitor PF-00835231 (licorice with carbon atoms in light blue).
Note that the hydroxyl group of the inhibitor is placed at a position
equivalent to the water molecule in the TS of the inhibition by 11a. (6f). Overlap of the product structure obtained
from our simulations (balls and sticks with carbon atoms in orange)
with the X-ray structure 6LZE (licorice with carbon atoms in light blue).The plausibility of the proposed water-mediated
mechanism can be
confirmed comparing the structure obtained for the reaction TS with
the X-ray structure of the protein with the PF-00835231 inhibitor
(PDB code 6XHM).[12] This recently proposed inhibitor
of the SARS-CoV-2 main protease presents a ketone group as a warhead
and the aldehydehydrogen atom has been substituted by a hydroxymethyl
group. As observed in Figure e, the hydroxyl moiety of the PF-00835231 inhibitor in the 6XHM structure occupies
exactly the same position as the bridging water molecule in our TS.
Then, the PF-00835231 inhibitor would be a perfect transition-state
analogue of protease inhibition by aldehydes. On the one hand, this
structural agreement confirms the role of the water molecule in the
inhibition process by aldehydes and, on the other hand, suggests that
in the case of the hydroxymethylketone inhibitor, the proton relay
role of the water molecule could be played by the hydroxyl group.
Finally, as a further confirmation of our mechanism, the structure
obtained for the final product nicely overlaps with that corresponding
to the (S)-hemithioacetal complex in the X-ray structure 6LZE (see Figure f).Combining the free-energy
profiles presented in Figures and 6, we can obtain a full picture
of the transformation from the noncovalent
E–I complex formed between the SARS-CoV-2 protease and the
inhibitor 11a to the covalent E–I complex, the
(S)-hemithioacetal. A representation of the complete
free-energy profile is given in Figure . The total reaction free energy can be obtained combining
the free-energy cost of forming the IP and the free-energy change
from the IP to the reaction product, E–I. The first term was
estimated to be 9.3 kcal·mol–1 (see Figure a), while the free-energy
difference between E–I and IP (Figure a) is −12.1 kcal·mol–1. Then, our simulations predict that the free energy of the covalently
bonded E–I complex relative to the noncovalent E–I complex
is −2.8 kcal·mol–1. This reaction free
energy is significantly smaller, in absolute value, than the value
obtained for the N3 inhibitor (−15.4 kcal·mol–1),[18] an irreversible inhibitor
of the 3CL protease.[8] Thus, according to
our simulations, inhibition of SARS-CoV-2 3CL protease with 11a would be significantly more reversible than that with N3. In fact, aldehyde derivative inhibitors have been proposed
to act via a reversible formation of the hemithioacetal,[11] which is now confirmed by our simulations. It
must be noticed that reversibility can be a desired feature of cysteine
protease inhibitors, in particular, when extended therapy periods
are required.[42]
Figure 7
Representation of the
complete free-energy profile for the inhibition
of 3CL SARS-CoV-2 protease with 11a, including IP formation
and the formation of the covalent E–I complex (product P in
the figure).
Representation of the
complete free-energy profile for the inhibition
of 3CL SARS-CoV-2 protease with 11a, including IP formation
and the formation of the covalent E–I complex (product P in
the figure).Regarding the kinetics of the
process, the first-order rate constant
for covalent inhibition of the main protease by 11a is
determined by the free-energy difference between the rate-determining
states,[43] the highest energy TS (TS2) and
the noncovalent E–I complex. Accordingly, our free-energy simulations
provide the activation free energy as the sum of two contributions:
the free-energy cost of forming the reactive IP from E–I (9.3
kcal·mol–1, Figure a) plus the free energy of the TS2 relative
to IP (9.2 kcal·mol–1, Figure a). This gives in a total activation free
energy of 18.5 kcal·mol–1, as shown in Figure . While the inhibition
rate constant of SARS-CoV-2 protease by 11a has not been
experimentally determined, the rate constant for the inhibition of
the protease with GC376 (the prodrug of the aldehyde
inhibitor GC373) has been measured to be 2.45 ×
10–3 s–1 at 30 °C.[14] According to transition state theory, this is
equivalent to an activation free energy of 21.1 kcal·mol–1. In spite of the differences between both inhibitors, 11a and GC373, they present the same warhead
and the same group at the P1 position (see Figure a), suggesting that the activation free energy
derived for GC373 could be a reasonable reference for 11a and thus for our simulations. The good agreement between
the experimental observations made for GC373 (21.1 kcal·mol–1) and our theoretical predictions for 11a (18.5 kcal·mol–1), together with the structural
evidences discussed above, strongly supports our mechanistic proposal
for the inhibition of SARS-CoV-2 3CL protease not only by 11a but also by other aldehyde and ketone derivatives.
Conclusions
We have used a combination of classical MM and hybrid QM/MM molecular
dynamics simulations to explore the inhibition mechanism of the SARS-CoV-2
3CL protease by an aldehyde derivative, 11a, selected
as an example of this promising family of inhibitors. Starting from
the X-ray structure of the hemithioacetal complex, we have explored
the binding mode of the 11a inhibitor in the noncovalent
E–I complex with the protease. In agreement with X-ray observations,
our MD simulations show that there are two possible rotamers for the
catalytic histidine, depending on the internal rotation around the
Cβ–Cγ bond. Both for the apo form and for the noncovalent
complex formed between the 3CL protease and the 11a inhibitor,
we found that the rotamer where the Nε atom of His41 lies closer
to the inhibitor is more stable. Regarding the inhibitor, the orientation
of the aldehyde group is determined by the interaction of the oxygen
atom with the oxyanion hole, favoring the formation of the (S)-enantiomer of the product. This configuration of the
protein and the inhibitor facilitates a reaction mechanism where,
after the formation of the catalytic dyad IP, a proton is transferred
from His41 to the aldehyde oxygen atom through a water molecule. Other
reaction mechanisms, involving different proton transfer routes and/or
leading to the (R) form of the product, can be feasible
if other conformations are populated. This conclusion is supported
by the analysis of the X-ray structures available for the inhibited
3CL protease.Our simulations of the noncovalent complex with 11a show that P1–P3 groups of the inhibitor establish
well-defined
interactions in the active site that closely mimic those of the peptide
substrate. The aldehyde group is placed close to Cys145, facilitating
the nucleophilic attack of the Sγ atom on the aldehydecarbon
atom while the oxygen atom is stabilized by interactions with NH main-chain
groups that constitute the oxyanion hole. The reactivity of the SARS-CoV-2
main protease is triggered after the proton transfer from Cys145 to
His41, which are kept at hydrogen-bond distances in the E–I
complex. After IP formation, the inhibition reaction proceeds by means
of the nucleophilic attack of the activated nucleophile, the unprotonated
Cys145, and a water-mediated proton transfer from the protonated His41,
yielding the (S)-hemithioacetal product. In our QM/MM
minimum free-energy profile, the activation free energy is 18.5 kcal·mol–1, in good agreement with the value experimentally
derived for a similar aldehyde inhibitor (21.1 kcal·mol–1). In addition, our reaction profile indicates that the process is
moderately exergonic, in concordance with the proposed reversibility
for the protease inhibition by aldehydes.This study illustrates
the importance of molecular simulations
to unravel the reaction mechanisms and to guide the design of possible
inhibitors. The binding pose observed in X-ray structures can be not
enough to figure out the mechanistic details of the inhibition process
when conformational rearrangements in the protein, the solvent, and/or
the inhibitor are required for the process to take place. In addition,
apparently small modifications in the chemical structure of the inhibitor
can change the binding pose, offering different mechanistic ways for
the inactivation of the enzyme. The consideration of the mutual coupling
observed between the active site, solvent, and inhibitor dynamics
along the reaction path is an undeniable challenge for the design
of selective and efficient inhibitors that can be approached by means
of adequate simulation methods.
Authors: Carlos A Ramos-Guzmán; José Luis Velázquez-Libera; J Javier Ruiz-Pernía; Iñaki Tuñón Journal: J Chem Theory Comput Date: 2022-05-13 Impact factor: 6.578
Authors: Guillem Macip; Pol Garcia-Segura; Júlia Mestres-Truyol; Bryan Saldivar-Espinoza; María José Ojeda-Montes; Aleix Gimeno; Adrià Cereto-Massagué; Santiago Garcia-Vallvé; Gerard Pujadas Journal: Med Res Rev Date: 2021-10-26 Impact factor: 12.388