Literature DB >> 34192090

Multiscale Simulations of SARS-CoV-2 3CL Protease Inhibition with Aldehyde Derivatives. Role of Protein and Inhibitor Conformational Changes in the Reaction Mechanism.

Carlos A Ramos-Guzmán¹, J Javier Ruiz-Pernía¹, Iñaki Tuñón¹.

Abstract

We here investigate the mechanism of SARS-CoV-2 3CL protease inhibition by one of the most promising families of inhibitors, those containing an aldehyde group as a warhead. These compounds are covalent inhibitors that inactivate the protease, forming a stable hemithioacetal complex. Inhibitor 11a is a potent inhibitor that has been already tested in vitro and in animals. Using a combination of classical and QM/MM simulations, we determined the binding mode of the inhibitor into the active site and the preferred rotameric state of the catalytic histidine. In the noncovalent complex, the aldehyde group is accommodated into the oxyanion hole formed by the NH main-chain groups of residues 143 to 145. In this pose, P1-P3 groups of the inhibitor mimic the interactions established by the natural peptide substrate. The reaction is initiated with the formation of the catalytic dyad ion pair after a proton transfer from Cys145 to His41. From this activated state, covalent inhibition proceeds with the nucleophilic attack of the deprotonated Sγ atom of Cys145 to the aldehyde carbon atom and a water-mediated proton transfer from the Nε atom of His41 to the aldehyde oxygen atom. Our proposed reaction transition-state structure is validated by comparison with X-ray data of recently reported inhibitors, while the activation free energy obtained from our simulations agrees with the experimentally derived value, supporting the validity of our findings. Our study stresses the interplay between the conformational dynamics of the inhibitor and the protein with the inhibition mechanism and the importance of including conformational diversity for accurate predictions about the inhibition of the main protease of SARS-CoV-2. The conclusions derived from our work can also be used to rationalize the behavior of other recently proposed inhibitor compounds, including aldehydes and ketones with high inhibitory potency.

Entities: Chemical Disease Gene Species

Year: 2021 PMID： 34192090 PMCID： PMC8008790 DOI： 10.1021/acscatal.0c05522

Source DB: PubMed Journal: ACS Catal Impact factor: 13.084

Introduction

The impact and rapid expansion of COVID-19, caused by the coronavirus SARS-CoV-2, has urged the research to find therapeutic remedies.[1] There are, in principle, two main strategies to be used against this disease: the development of vaccines and antiviral drugs. While vaccines have the advantage of preventing the disease, antivirals could be beneficial not only to fight against SARS-CoV-2 but also for other related coronaviruses, including those that could infect human beings in a near future. While some antiviral drugs developed to fight against other viruses, such as remdesivir, lopinavir, or favipiravir, have been already tested to treat COVID-19, there are no clear evidences of their efficiency.[2,3] Therefore, there is an urgent need to develop new antiviral drugs that are both effective and selective for SARS-CoV-2. One of the strategies for the development of antiviral drugs is to target the inhibition of one of those enzymes that are essential for the vital cycle of the virus. When SARS-CoV-2 infects a cell, it utilizes the transcription machinery to translate the viral genome into two long polyproteins. These long chains must be cleaved at specific sites to produce the nonstructural proteins that the virus needs for replication and transcription of its genome. This key function is performed by two proteases: the 3C-like (3CL) or main protease and the Papain-like (PL) protease. The former cleaves the polyprotein at 11 positions targeting for the Gln-(Ser/Ala/Gly) peptide bond, a sequence preference that is not used by human proteases.[4] This characteristic makes the 3CL protease an excellent candidate as a drug target because those compounds designed to bind and block the active site of this protein have less chances to interact with proteases of the host.[5] The 3CL protease is a cysteine protease and its active site is formed by a catalytic dyad, His41 and Cys145, in charge of the hydrolysis of peptide bonds. This process takes place in two main steps:[6,7] acylation and deacylation. During the acylation step, a peptide fragment is released while the other forms an acyl–enzyme complex by means of a covalent bond with the Sγ atom of the catalytic cysteine. During deacylation, the second fragment is released by the action of a water molecule, recovering the enzyme for a new catalytic cycle. Until now, several families of inhibitors have been proposed and tested in vitro against the 3CL protease of SARS-CoV-2, including Michael acceptors,[8] α-ketoamides,[9] aldehyde derivatives,[10,11] and ketones.[12] These compounds first bind to the active site of the protease forming a noncovalent complex (E–I) and then react with the thiol group of the catalytic cysteine to form a stable covalent acyl–enzyme complex (E–I), see Figure a. The design and improvement of these compounds is usually guided by the information provided by the X-ray structure of the covalent complex. However, as we will stress in this work, this information can be incomplete if the inhibitors and/or the enzyme present conformational diversity. This diversity could play a critical role in a proper understanding of the general inhibition process.

Figure 1

Inhibitors of the SARS-CoV-2 3CL protease presenting an aldehyde group as a warhead. (1a) Aldehyde derivatives bind to the active site forming the noncovalent E–I complex and then react with the thiol group of the catalytic cysteine to yield a hemithioacetal (the E–I covalent complex). (1b) Overlap of the X-ray structures of the (S)-hemithioacetal complexes formed between the protease and two aldehyde derivatives, GC373, where carbon atoms are shown in brown color, and 11a, with carbons atoms in light blue. The PDB codes are 6WTT and 6LZE, respectively. The aldehyde group seems a promising warhead for the development of antiviral drugs to fight COVID-19, as it can react with the catalytic cysteine to form hemithioacetal complexes (the E–I complex).[13] Several aldehyde derivatives have been shown to have large inhibitory properties against the 3CL protease of SARS-CoV-2 during in vitro assays.[14] Among these compounds, two of them are potent nanomolar inhibitors that have already been tested in animals: GC373 (or its prodrug GC376) and 11a (see Figure a). The former was developed for the treatment of feline infectious peritonitis, a disease caused by a coronavirus and it has been shown to be also an inhibitor of SARS-CoV-2 main protease.[11] The structurally similar 11a was initially designed analyzing the substrate-binding pocket of the ortholog main protease of SARS-CoV.[10] The structures of their hemithioacetal products have been deposited in the Protein Data Bank with codes 6WTK and 6WTT (for GC373)[14] and 6LZE (for 11a).[10] In both cases, the inhibitor adopts a similar pose in the active site of the main protease. The distance between the Sγ atom of the catalytic cysteine and the aldehyde carbon atom is of about 1.8 °, corresponding to the formation of an S–C covalent bond. The 6LZE and 6WTK PDB[15] structures shown in Figure b correspond to the (S) configuration of the hemithioacetal, where the hydroxyl oxygen is pointing to the oxyanion hole formed by main-chain NH groups of Cys145, Ser144, and Gly143.[10] The main difference between the two structures showed in Figure b is the rotameric state of the catalytic histidine. In 6WTK, the Nε atom of His41 is pointing toward the binding site, while in 6LZE, the Nδ atom is the one pointing to the inhibitor. These two configurations will be hereafter denoted as ε- and δ-rotamers, respectively. The rotameric state of His41 can be relevant for the reaction mechanism because the Nε atom in the ε-rotamer is better oriented to serve as a proton donor to the aldehydic oxygen atom of the inhibitor. Computational simulations of SARS-CoV-2 3CL protease have been devoted to study its reactivity with peptide substrates, including the acylation and deacylation steps.[16,17] Regarding covalent inhibition, QM/MM methods have been also used to analyze the reaction with irreversible Michael acceptors[18,19] and α-ketoamide inhibitors.[20] This last study provides a complete evaluation of the binding free energy of the covalent inhibitor as the sum of the noncovalent binding and the reaction steps. In this work, we use classical and hybrid QM/MM molecular dynamics simulations to explore the inactivation mechanism of SARS-CoV-2 3CL protease by an aldehyde derivative, 11a. We provide atomistic details of the inhibition process, including the description of the interactions established in the noncovalent complex (E–I) and the reaction mechanism leading to the covalent product (the hemithioacetal complex or E–I). Our simulations stress on the interplay between conformational changes of the protein and/or the substrate and the possible reaction mechanisms. The most stable configuration for His41 both in the apo form and in the noncovalent complex formed with the inhibitor is the ε-rotamer. From this complex, the reaction proceeds via proton transfer from Cys145 to His41 that gives rise to a catalytic dyad ion pair (IP). This is the first step needed to initiate the reactions catalyzed by the 3CL protease, as observed both in experimental[7] and computational studies.[16,17] The process continues with the nucleophilic attack of the Sγ atoms on the aldehydic carbon atom and the proton transfer from the catalytic histidine to the aldehyde oxygen atom through a water molecule to form the (S)-hemithioacetal. This mechanism is confirmed by comparison with the structures of recently proposed inhibitor PF-00835231[12] that mimic the transition state of protease inhibition by aldehydes. Our simulations are also useful to rationalize the behavior of other proposed inhibitors, considering possible conformational changes of the inhibitor warhead in the active site and how the reaction mechanism can change depending on the conformation of the inhibitor. Other reaction mechanisms, involving different proton transfer routes and/or leading to the (R) form of the product, can be feasible if other protein and/or inhibitor conformations are populated. All in all, the simulations presented here reveal the detailed reaction mechanism for the inhibition of SARS-CoV-2 3CL protease with a promising family of inhibitors, the aldehyde derivates, and also with related compounds, such as ketones.

Methodology

Classical Molecular Dynamics Simulations

To build the Michaelis complex (MC), the PDB accession code 6LZE structure was used. This is the crystal structure of COVID-19 main protease forming a covalent complex with the 11a inhibitor (with a resolution of 1.5 Å).[10]11a was parameterized following the nonstandard residue parameterization procedure implemented in Amber using the Antechamber program[21] from the AmberTools18[22] package. Atomic charges were obtained using the restrained electrostatic potential (RESP) method[23] at the HF/6-31G* level. Tleap tool from abertools[22] was used to build the system, with the ff14SB forcefield[24] to describe the canonical amino acids. PROPKA3.0[25] was used to determine the most likely protonation state of every residue at pH 7.4. Na+ ions were added to neutralize the charge of the protein-inhibitor complex. The protein-inhibitor complex was solvated using a box of TIP3P water molecules in such a way that any protein-inhibitor atom is not closer than 12 Å to the limits of the simulation box. A minimization was made using 500 steps of the steepest descent method followed by the conjugate gradient method until the root mean square of the gradient was below 10–3 kcal·mol–1·Å–1. Heating was performed using a linear heating ramp, rising the temperature from 0 to 300 K along 120 ps followed by 20 ps simulation at 300 K. During this process, the positions of the heavy atoms of the protein backbone were restrained using a harmonic potential with a force constant of 20 kcal·mol–1·Å–2. Then, the system was equilibrated in the NPT ensemble (300 K and 1 bar). For this equilibration, the force constants for the positional restraint were reduced from 15 to 0 kcal·mol–1·Å–2. So that, every 1.25 ns, the force constant was decreased by 3 units. Then, after 6.25 ns, the positional restraint was completely removed. Finally, the system was run restraint-free for another 1.25 ns. In order to get enough sampling, 4 replicas of 1 μs of the noncovalent enzyme inhibitor complex with the catalytic dyad residues (Cys145 and His41) in the neutral state and with these residues charged forming an IP. In these simulations, the time step was 2 fs using SHAKE[26] to constraint bonds involving hydrogen atoms. Long-range electrostatic interactions were described using the particle mesh Ewald method,[27,28] while the cut-off radius to evaluate the short-range interactions was 10 Å. Pressure and temperature were controlled using a Berendsen barostat and Langevin thermostat, respectively. For all classical molecular dynamic simulations, AMBER19 GPU version of PMEMD[29,30] was employed. To study the free-energy profiles associated to conformational changes of the catalytic histidine and of the aldehyde inhibitor, classical potentials of mean force were traced using umbrella sampling.[31] For the change in the rotameric state of His41, the dihedral angle formed by the Cα–Cβ–Cγ–Nδ atoms of the residue was used as a distinguished coordinate. The dihedral angle was evolved using increments of 5° with a total of 41 windows. For the rotation of the aldehyde group of the inhibitor, the dihedral angle formed by the N–C–Caldh–Oaldh was selected as the distinguished coordinate. In this case, the dihedral angle was evolved using increments of 10° during 21 simulation windows. In both cases, each simulation window was minimized under the harmonic restraint with a force constant of 100 kcal·mol–1·rad–2 using 2000 steps of the steepest descent method followed by the conjugate gradient method until the root mean square of the gradient was below 10–3 kcal·mol–1·Å–1. Afterward, a total of 52 ns of classical MD was performed for each window, the first 2 ns was run for relaxation followed by 50 ns of production. Therefore, the total simulation time for the production stage was longer than 2 μs for each free-energy profile corresponding to His41 rotation and longer than 1 μs for the rotation of the aldehyde group of the inhibitor. The free-energy profiles were integrated using the weighted histogram analysis method (WHAM).[32]

QM/MM Calculations

The adaptative string method (ASM)[33] developed in our group was used to explore the free-energy landscape associated to the chemical reaction. This methodology has the advantage to avoid the oversimplified picture of what a reaction mechanism is; in a real system, a reaction mechanism involves many reaction coordinates, and the free-energy pathway depends on all of them. Using the ASM, we can find the minimal free-energy pathway (MFEP) in a multidimensional free-energy surface not just by evaluating the change in energy related to the variation of a couple of distances, angles, or dihedrals (as made in most of computational studies) but including all the needed degrees of freedom without implying an additional computational expense. The mechanistic proposals were explored using 96 replicas of the system (string nodes) to connect the reactant and product structures along the MFEP in a space of arbitrary dimensionality defined by the collective variables (CVs) shown in Scheme . Using QM/MM MD simulations, nodes are moved according to their free-energy gradient and redistributed equidistantly along the string, avoiding them to fall to the global minima (reactants and products). This procedure is continued until the string converges to the MFEP displaying an RMSD below 0.1 amu1/2·Å or at least 2 ps. Replica exchange between nodes was attempted every 50 steps to ensure convergence. After convergence, a path-CV (denoted as s) measuring the advancement of the system along the MFEP from reactants to products is defined and employed to trace the reaction free-energy profile. 10 ps of QM/MM simulations were run for every node, and WHAM[32] was selected as the integration method. To ensure a probability density distribution of the reaction coordinate as homogeneous as possible, the values of the force constants employed to bias the ASM simulations were determined on-the-fly.[33] The QM region was described using a B3LYP functional[34,35] with a 6-31+G* basis set and D3 dispersion corrections.[36] This is a good choice to describe both the acylation of a peptide substrate[16] and an inhibitor[18] by the SARS-CoV-2 protease with activation energy results in excellent agreement with the experiment. A systematic study on the cysteine–histidine proton transfer also pointed out to the B3LYP functional as the most adequate.[37] All QM/MM calculations were performed using a modified version of Amber18[22,38] coupled to Gaussian16[39] for density functional theory calculations. The cutoff radii used for all the QM/MM interactions was of 15 Å. For mechanistic determinations, the QM region included the side chains of the catalytic dyad (His41 and Cys145), the water molecule involved in the reaction mechanism, and the backbone atoms of residues P1 and P1′ in the 11a inhibitor. Any other atom was described as an MM level as explained in the classical molecular dynamic section. The integration time step in QM/MM simulations was of 1 fs. Because hydrogen atoms are involved in the reaction mechanism, we checked that the string method converged to the same MFEP when a time step of 0.5 fs was used (see Figure S1).

Scheme 1

Representation of the CVs Used in the Exploration of the Reaction Mechanism

For the study of the proton transfer within the catalytic dyad, that is, from Cys145 to His41, the antisymmetric combination of the distances of the proton to the donor and acceptor atoms [d(Sγ–H)-d(Nε–H)] was employed to trace the free-energy profile using umbrella sampling.[31] The integration was carried out using the WHAM method. In this case, only the side chains of the two involved residues were included in the QM region (described at the B3LYPD3/6-31+G* level of theory with D3 corrections). A total of 40 windows evenly spaced each at 0.06 Å were used to cover the whole range of the reaction coordinate. A force constant of 600 kcal·mol–1Å–2 was employed to drive the reaction coordinate. Remaining details of the simulations were as described previously.

Results and Discussion

Rotameric State of the Catalytic Histidine

Our simulations of the noncovalent complex formed by 11a were carried out using the X-ray coordinates of the hemithioacetal complex (PDB file 6LZE) as a starting point. In agreement with the X-ray structure, we simulated the inhibitor in the active sites of the two protomers (A and B) of the dimeric enzyme. The X-ray structure was modified lengthening the Sγ-C distance between the enzyme and the inhibitor. The catalytic dyad (Cys145 and His41) was modeled in the neutral state, which is the most stable form for the Michaelis and E–I complexes.[16,18,37] As explained before, in the 6LZE X-ray structure, the catalytic histidine is found in the δ-rotameric state. With the help of Pocketome,[40] we explored 91 different X-ray structures of the 3CL main protease in the apo form containing several covalent and noncovalent inhibitors (see Table S1). In 87.5% of the active sites, including all those corresponding to the apo enzyme and most of the inhibited enzymes, His41 was found in the ε-rotameric state, while less than 20% presented the δ-rotameric state. We modeled the noncovalent complex of the 3CL protease with 11a with the two His41 rotamers (see Figure a,b) and ran 1 μs long MD simulations. In principle, both rotameric states could be productive as far as the two states His41 and Cys145 form a strong hydrogen bond, which can eventually lead to a proton transfer and the formation of the catalytic dyad IP. Figure c shows that the probability distribution of Nε(His41)-Sγ(Cys145) distances obtained from MD simulations of the 11a noncovalent complex with the two rotameric states are almost indistinguishable and the most probable distance (3.4 Å) corresponded to a hydrogen-bonded dyad. In order to determine the relative stability of the two rotameric states, we traced the free-energy profile associated to the rotation of His41 around the Cβ–Cγ bond in the apo form and in the noncovalent complex (E–I) with 11a. The free-energy profiles obtained after 2 μs of classical MD simulations are shown in Figure d. According to these results, both in the apo form and in the presence of the 11a inhibitor, the ε-rotamer is more stable than the δ-rotamer by 2.2 and 3.2 kcal·mol–1, respectively. This result agrees with the observation that the ε-rotamer predominates in the X-ray structures of the 3CL protease, as explained above. However, the free-energy difference between the two rotameric states is not too high, in particular, for the apo form, which could explain that the δ-rotamer is found in 12.5% of the active sites of the SARS-CoV-2 3CL protease, as reported in Table S1.

Figure 2

Rotameric state of the catalytic His41. (2a) Representation of the noncovalent complex of 11a (carbon atoms in green) with SARS-CoV-2 3CL protease with His41 in the ε-rotameric state, (2b) representation of the complex presenting the δ-rotameric state of His41, and (2c) probability densities of the distances from the Cys145-Sγ atom to the Nε atom of His41for the ε- (blue) and δ-rotameric state (red). (2d) Free-energy profile associated to the rotation around the Cγ–Cβ bond of His41, from the ε-rotamer (left) to the δ-rotamer (right) in the apo enzyme (red) and in the noncovalent complex with 11a (blue).

Noncovalent Enzyme–Inhibitor Complex

According to the results of the previous section, we selected the ε-rotameric state for all subsequent simulations of the complex formed between the 3CL protease of SARS-CoV-2 and 11a. The resulting MD simulations of the noncovalent E–I complex (4 replicas of 1 μs each) were stable in all cases (see RMSD time evolutions in Figure S2), showing a binding pose consistent with the X-ray structure (see Figure a). In this pose, P1–P3 sites of the inhibitor present an interaction pattern similar to that of a peptide substrate with the sequence -Val-Leu-Gln|Ser- (the vertical line indicates the scissile bond).[16]Figure b compares the fraction of hydrogen bonds between inhibitor/peptide sites and enzymatic residues during the MD simulations of the inhibitor and of the peptide substrate. The γ-lactam ring at the P1 position is frequently found in inhibitors of SARS-CoV-2 and SARS-CoV main proteases because this moiety is expected to mimic Gln at the P1 position of the peptide substrate, which is a requirement of these enzymes. This γ-lactam ring can form hydrogen bonds with His163, Glu166, and Phe140, displaying an interaction pattern similar to that of Gln-P1. The cyclohexyl group at the P2 position of the inhibitor stacks with the imidazole ring of His41 and also presents interactions with other nearby residues, such as Met165. Inhibitors of the SARS-CoV-2 main protease present hydrophobic groups at the P2 position, similarly to the side chain of Leu-P2 in the peptide substrate. Finally, the indole group at P3 is exposed to the solvent and stabilized by hydrogen-bond interactions with main-chain atoms of Glu166.

Figure 3

Noncovalent complex formed between the aldehyde derivative 11a and the 3CL protease of SARS-CoV-2. (3a) Binding pose of the inhibitor in the active site of the protease, showing the location of the catalytic dyad and the oxyanion hole. Note that the aldehyde oxygen points to the oxyanion hole. (3b) Fraction of hydrogen-bond contacts between residues of 11a and a peptide substrate[16] and those of the protease. A hydrogen-bond contact is counted when the donor–acceptor distance is <3.8 Å and the hydrogen-bond angle is >120°. (3c) Probability densities of the distances from the Cys145-Sγ atom to the C carbon atom of the substrate, in blue, and from the Nε atom of His41 to the aldehyde oxygen atom, in red. The chemical step in SARS-CoV-2 3CL protease requires a proton transfer from Cys145 to His41 to form a catalytic dyad IP.[16,37] As presented above, in the case of the noncovalent E–I complex, these two residues are kept at hydrogen-bond distances during a significant fraction of the simulation (see Figure c for one replica; results for other replicas are presented in Figure S3). After this proton transfer, the covalent inhibition of the enzyme should proceed with the formation of the corresponding hemithioacetal (the E–I complex) that requires the nucleophilic attack of the Sγ atom to the aldehyde carbon atom and the protonation of the oxygen atom. Figure c shows the distribution of distances between the Cys145 Sγ atom and the aldehyde carbon atom. The distribution shows a bimodal shape, with two peaks centered at 3.3 and 5.1 Å that correspond to the trans and gauche conformations of the Cys145 side chain, respectively. Thus, in this pose, the aldehyde carbon atom of the inhibitor can be found already at short enough distances to suffer the nucleophilic attack by the Sγ atom of Cys145. The position of the aldehyde oxygen atom is stabilized by means of hydrogen-bond interactions with the main-chain NH groups of Cys145 (2.4 ± 0.3 Å), Ser144 (2.8 ± 0.4 Å), and Gly143 (2.3 ± 0.3 Å), as seen in Figure a. These three residues form the oxyanion hole, placed in a U-turn of the loop connecting β10 and β11 and that closes one of the sides of the active site. These interactions are also observed in the X-ray structure of the (S)-hemithioacetal product (see Figure b).[10] In this pose, the aldehyde oxygen atom is placed far from the Nε atom of His41, as confirmed by the probability distribution of Nε–O distances, peaked at 6.3 Å (see Figure c). This separation precludes a direct proton transfer between these two atoms after the formation of the IP. However, as discussed below, once the IP is formed, a water molecule can be accommodated in between the catalytic histidine and the aldehyde group of the inhibitor, facilitating a water-mediated proton transfer from His41 to the aldehyde oxygen atom to form the hemithioacetal. The proposed mechanism, with a water-mediated proton transfer, could also explain the reactivity of other aldehyde inhibitors to form the (S)-hemithioacetal, which is the enantiomer more frequently observed in the X-ray structures (see Table S1). X-ray structures of other (S)-hemithioacetal complexes show a similar pose for all of them, with the aldehyde oxygen atom pointing to the oxyanion hole and thus far enough from the catalytic histidine for a direct proton transfer. However, there is one case where the (R) enantiomer of the product has been observed: the 6WTT X-ray structure contains a (R)-hemithioacetal in one of the three protomers of the asymmetrical unit and the (S) configuration in the other two.[14] The (R) enantiomeric form can be obtained if the aldehyde group of the inhibitor rotates around the C–C(O) bond before the nucleophilic attack by Cys145. In that case, the aldehyde oxygen atom would point toward the catalytic His41 instead of toward the oxyanion hole and the distance to the Nε atom would become small enough to allow a direct proton transfer between them (see Figure a). While this binding pose of the inhibitor simplifies the reaction mechanism, it is not the most stable for 11a, as demonstrated by the free-energy profile obtained for the rotation of the aldehyde group around the C–C(O) bond in the active site. First, in order to test the reliability of our parametrization to describe this rotation of the substrate, we performed gas-phase B3LYPD3/6-31+G* single-point energy calculations on minimum energy structures extracted from a dihedral potential energy scan. In the gas phase, the pro-(R) structure was found to be 1.1 kcal·mol–1 more stable than the pro-(S) because of an intramolecular interaction between the aldehyde oxygen and the amide NH group, while at the MM level, the energy difference was found to be very similar, 1.6 kcal·mol–1. This calculation confirms the adequacy of the forcefield to represent the isomerization process. Then, we obtained the isomerization free-energy profile in the enzyme (see Figure b), as described in the Methodology section. According to this, the pro-(S) conformation is 5.5 kcal·mol–1 more stable than the pro-(R) conformer in the enzyme because in the first case, the aldehyde oxygen atom is accommodated in the oxyanion hole, while in the second one, the aldehyde oxygen atom points toward His41. It must be noticed that rotation of the aldehyde group of the inhibitor in the enzymatic active site could be facilitated after the formation of the IP, when the Nε atom of His41 is protonated. This rotation would explain that the (R)-hemithioacetal product can also be formed, as discussed above. Note that the preference for one or another orientation of the oxygen atom can be modulated through changes in the chemical structure of the inhibitor, for example, substituting the aldehyde hydrogen atom by other groups able to interact more favorably with the oxyanion hole. This seems to be the case of boceprevir, a α-ketoamide inhibitor where the aldehyde hydrogen is substituted by an acetamide group. In this case, the X-ray structure (PDB code 7BRP)[41] shows that the acetamide group is accommodated in the oxyanion hole while the ketone oxygen atom points to the catalytic His41, facilitating a reaction mechanism with a direct proton transfer from the catalytic histidine to the carbonyl oxygen atom.

Figure 4

Orientation of the aldehyde group in the noncovalent complex. (4a) Representation of the E–I complex with inhibitor 11a in the pro-(R) conformation. (4b) Classical free-energy profile for the rotation of the aldehyde group from the pro-(S) conformer (left) to the pro-(R) conformer (right).

Formation of the (S)-Hemithioacetal Complex

Chemical transformations in the active site of 3CL protease are initiated with the proton transfer from Cys145 to His41 to form an ionized catalytic dyad (the IP) from which the reaction proceeds. We traced the QM/MM free-energy profile for the proton transfer from Cys145 to His41 using as a distinguished coordinate the antisymmetric combination of proton distances to the donor and acceptor atoms [d(Sγ–H)-d(Nε–H), see Figure a]. Starting from the E–I complex, the free-energy cost of forming the IP is 9.3 kcal·mol–1, a value slightly larger than that obtained when a peptide substrate is present in the active site (4.8 kcal·mol–1).[16] The value obtained for 11a is similar to those found for the formation of the IP under the presence of other inhibitors: 7.3 kcal·mol–1 with an α-ketoamide inhibitor[20] and 10.3 with a Michael acceptor.[18] These values are systematically larger than the free-energy cost evaluated for the apo enzyme, which was found to be 2.9 kcal·mol–1 in two different studies,[16,20] indicating that desolvation of the active site upon ligand binding can destabilize the IP form. Diminution of this energy penalty through ligand design could be a promising strategy to improve inhibition binding and kinetics.

Figure 5

Proton transfer from Cys145 to His41 in SARS-CoV-2 3CL protease. (5a) Free-energy profile for the transformation from the neutral catalytic dyad (left) to the IP (IP, right) in the E–I complex with 11a (gray line) and in the complex with the peptide substrate[16] (blue line). The reaction coordinate (RC) is the antisymmetric combination d(Sγ–H)-d(Nε–H). (5b) IP structure with 11a showing the presence of a water molecule in between the protonated histidine and the inhibitor aldehyde group. After the formation of the IP, the inhibition reaction must proceed with the attack of the activated nucleophile (the Sγ atom of Cys145) on the aldehyde carbon atom and the proton transfer from the Nε atom of His41 to the aldehyde oxygen atom. As explained before, the distance between these two atoms in the IP is too large for a direct proton transfer. However, MD simulations show that a water molecule can be placed in between His41 and the aldehyde group, attracted by the large dipole moment associated to the IP (see Figure b). This water molecule can act as a proton relay, accepting a proton from His41 and donating a proton to the aldehyde oxygen atom to form the hydroxyl group of the hemithioacetal. Starting from the structure presented in Figure b, we obtained the corresponding QM/MM MFEP for the formation of the (S)-hemithioacetal product, the enantiomer experimentally observed for inhibitor 11a. Figure a shows the B3LYPD3/6-31+G*/MM free-energy profile associated to the reaction from IP to the covalent E–I complex, while Figure b shows the evolution of key distances along this MFEP (see also Scheme ). The reaction begins with the approach of the Sγ atom to the aldehyde carbon atom, reducing the Sγ-C distance from 2.95 to 2.18 Å at the first transition state (TS1, shown in Figure S4). Then, the water molecule is accommodated in between the proton donor His41 and the proton acceptor aldehyde oxygen. The rate-limiting TS is TS2 (see Figure c) where the formation of the S–C bond (presenting a distance of 1.94 Å) is accompanied by lengthening of the aldehyde double bond (from 1.22 to 1.38 Å), due to the charge transfer from Cys145 to the aldehyde oxygen atom to form an oxyanion. This charge transfer triggers a concerted proton transfer from the water molecule to the oxyanion and from the protonated His41 to the water molecule. These coupled proton transfer events are not completely synchronous because at the TS, the distance of the transferred proton to the aldehyde oxygen is 1.20 Å, while the distance from the proton transferred from His41 to the oxygen water molecule is slightly longer, 1.38 Å. The process continues downhill up to the products (see Figure d), completing the formation of the Sγ–C bond (1.85 Å) and the lengthening of the aldehyde bond distance up to a typical value for a single C–O bond (1.42 Å), and the proton transfers from the water molecule to the hemithioacetal and from His41 to the water molecule (see Figure d). The small barrier observed at the end of the reaction path corresponds to the rotation of the hemithioacetal hydroxyl group to fit into the oxyanion hole.

Figure 6

Formation of the (S)-hemithioacetal product from the IP. (6a) B3LYPD3/6-31+G*/MM free-energy profile along the path-CV for the formation of the covalent E–I complex from the IP. (6b) Evolution of the selected CVs along the MFEP. The color code corresponds to Scheme . (6c) Representation of the rate-determining TS (TS2). The values of the distances correspond (in Å) to the coordinates of the MFEP where the TS is located. (6d) Representation of the (S)-hemithioacetal complex. (6e) Overlap of the TS2 structure (balls and sticks with carbon atoms in orange) with the X-ray structure 6XHM containing a hydroxymethylketone inhibitor PF-00835231 (licorice with carbon atoms in light blue). Note that the hydroxyl group of the inhibitor is placed at a position equivalent to the water molecule in the TS of the inhibition by 11a. (6f). Overlap of the product structure obtained from our simulations (balls and sticks with carbon atoms in orange) with the X-ray structure 6LZE (licorice with carbon atoms in light blue). The plausibility of the proposed water-mediated mechanism can be confirmed comparing the structure obtained for the reaction TS with the X-ray structure of the protein with the PF-00835231 inhibitor (PDB code 6XHM).[12] This recently proposed inhibitor of the SARS-CoV-2 main protease presents a ketone group as a warhead and the aldehyde hydrogen atom has been substituted by a hydroxymethyl group. As observed in Figure e, the hydroxyl moiety of the PF-00835231 inhibitor in the 6XHM structure occupies exactly the same position as the bridging water molecule in our TS. Then, the PF-00835231 inhibitor would be a perfect transition-state analogue of protease inhibition by aldehydes. On the one hand, this structural agreement confirms the role of the water molecule in the inhibition process by aldehydes and, on the other hand, suggests that in the case of the hydroxymethylketone inhibitor, the proton relay role of the water molecule could be played by the hydroxyl group. Finally, as a further confirmation of our mechanism, the structure obtained for the final product nicely overlaps with that corresponding to the (S)-hemithioacetal complex in the X-ray structure 6LZE (see Figure f). Combining the free-energy profiles presented in Figures and 6, we can obtain a full picture of the transformation from the noncovalent E–I complex formed between the SARS-CoV-2 protease and the inhibitor 11a to the covalent E–I complex, the (S)-hemithioacetal. A representation of the complete free-energy profile is given in Figure . The total reaction free energy can be obtained combining the free-energy cost of forming the IP and the free-energy change from the IP to the reaction product, E–I. The first term was estimated to be 9.3 kcal·mol–1 (see Figure a), while the free-energy difference between E–I and IP (Figure a) is −12.1 kcal·mol–1. Then, our simulations predict that the free energy of the covalently bonded E–I complex relative to the noncovalent E–I complex is −2.8 kcal·mol–1. This reaction free energy is significantly smaller, in absolute value, than the value obtained for the N3 inhibitor (−15.4 kcal·mol–1),[18] an irreversible inhibitor of the 3CL protease.[8] Thus, according to our simulations, inhibition of SARS-CoV-2 3CL protease with 11a would be significantly more reversible than that with N3. In fact, aldehyde derivative inhibitors have been proposed to act via a reversible formation of the hemithioacetal,[11] which is now confirmed by our simulations. It must be noticed that reversibility can be a desired feature of cysteine protease inhibitors, in particular, when extended therapy periods are required.[42]

Figure 7

Representation of the complete free-energy profile for the inhibition of 3CL SARS-CoV-2 protease with 11a, including IP formation and the formation of the covalent E–I complex (product P in the figure). Regarding the kinetics of the process, the first-order rate constant for covalent inhibition of the main protease by 11a is determined by the free-energy difference between the rate-determining states,[43] the highest energy TS (TS2) and the noncovalent E–I complex. Accordingly, our free-energy simulations provide the activation free energy as the sum of two contributions: the free-energy cost of forming the reactive IP from E–I (9.3 kcal·mol–1, Figure a) plus the free energy of the TS2 relative to IP (9.2 kcal·mol–1, Figure a). This gives in a total activation free energy of 18.5 kcal·mol–1, as shown in Figure . While the inhibition rate constant of SARS-CoV-2 protease by 11a has not been experimentally determined, the rate constant for the inhibition of the protease with GC376 (the prodrug of the aldehyde inhibitor GC373) has been measured to be 2.45 × 10–3 s–1 at 30 °C.[14] According to transition state theory, this is equivalent to an activation free energy of 21.1 kcal·mol–1. In spite of the differences between both inhibitors, 11a and GC373, they present the same warhead and the same group at the P1 position (see Figure a), suggesting that the activation free energy derived for GC373 could be a reasonable reference for 11a and thus for our simulations. The good agreement between the experimental observations made for GC373 (21.1 kcal·mol–1) and our theoretical predictions for 11a (18.5 kcal·mol–1), together with the structural evidences discussed above, strongly supports our mechanistic proposal for the inhibition of SARS-CoV-2 3CL protease not only by 11a but also by other aldehyde and ketone derivatives.

Conclusions

We have used a combination of classical MM and hybrid QM/MM molecular dynamics simulations to explore the inhibition mechanism of the SARS-CoV-2 3CL protease by an aldehyde derivative, 11a, selected as an example of this promising family of inhibitors. Starting from the X-ray structure of the hemithioacetal complex, we have explored the binding mode of the 11a inhibitor in the noncovalent E–I complex with the protease. In agreement with X-ray observations, our MD simulations show that there are two possible rotamers for the catalytic histidine, depending on the internal rotation around the Cβ–Cγ bond. Both for the apo form and for the noncovalent complex formed between the 3CL protease and the 11a inhibitor, we found that the rotamer where the Nε atom of His41 lies closer to the inhibitor is more stable. Regarding the inhibitor, the orientation of the aldehyde group is determined by the interaction of the oxygen atom with the oxyanion hole, favoring the formation of the (S)-enantiomer of the product. This configuration of the protein and the inhibitor facilitates a reaction mechanism where, after the formation of the catalytic dyad IP, a proton is transferred from His41 to the aldehyde oxygen atom through a water molecule. Other reaction mechanisms, involving different proton transfer routes and/or leading to the (R) form of the product, can be feasible if other conformations are populated. This conclusion is supported by the analysis of the X-ray structures available for the inhibited 3CL protease. Our simulations of the noncovalent complex with 11a show that P1–P3 groups of the inhibitor establish well-defined interactions in the active site that closely mimic those of the peptide substrate. The aldehyde group is placed close to Cys145, facilitating the nucleophilic attack of the Sγ atom on the aldehyde carbon atom while the oxygen atom is stabilized by interactions with NH main-chain groups that constitute the oxyanion hole. The reactivity of the SARS-CoV-2 main protease is triggered after the proton transfer from Cys145 to His41, which are kept at hydrogen-bond distances in the E–I complex. After IP formation, the inhibition reaction proceeds by means of the nucleophilic attack of the activated nucleophile, the unprotonated Cys145, and a water-mediated proton transfer from the protonated His41, yielding the (S)-hemithioacetal product. In our QM/MM minimum free-energy profile, the activation free energy is 18.5 kcal·mol–1, in good agreement with the value experimentally derived for a similar aldehyde inhibitor (21.1 kcal·mol–1). In addition, our reaction profile indicates that the process is moderately exergonic, in concordance with the proposed reversibility for the protease inhibition by aldehydes. This study illustrates the importance of molecular simulations to unravel the reaction mechanisms and to guide the design of possible inhibitors. The binding pose observed in X-ray structures can be not enough to figure out the mechanistic details of the inhibition process when conformational rearrangements in the protein, the solvent, and/or the inhibitor are required for the process to take place. In addition, apparently small modifications in the chemical structure of the inhibitor can change the binding pose, offering different mechanistic ways for the inactivation of the enzyme. The consideration of the mutual coupling observed between the active site, solvent, and inhibitor dynamics along the reaction path is an undeniable challenge for the design of selective and efficient inhibitors that can be approached by means of adequate simulation methods.

11 in total

Review 1. Methodology-Centered Review of Molecular Modeling, Simulation, and Prediction of SARS-CoV-2.

Authors: Kaifu Gao; Rui Wang; Jiahui Chen; Limei Cheng; Jaclyn Frishcosy; Yuta Huzumi; Yuchi Qiu; Tom Schluckbier; Xiaoqi Wei; Guo-Wei Wei
Journal: Chem Rev Date: 2022-05-20 Impact factor: 72.087

Review 2. The SARS-CoV-2 main protease (M^pro): Structure, function, and emerging therapies for COVID-19.

Authors: Qing Hu; Yuan Xiong; Guang-Hao Zhu; Ya-Ni Zhang; Yi-Wen Zhang; Ping Huang; Guang-Bo Ge
Journal: MedComm (2020) Date: 2022-07-14

3. An intermolecular salt bridge linking substrate binding and P1 substrate specificity switch of arterivirus 3C-like proteases.

Authors: Qian Chen; Junwei Zhou; Zhixiang Yang; Jiahui Guo; Zimin Liu; Xinyi Sun; Qingshi Jiang; Liurong Fang; Dang Wang; Shaobo Xiao
Journal: Comput Struct Biotechnol J Date: 2022-06-30 Impact factor: 6.155

Review 4. Progress on SARS-CoV-2 3CLpro Inhibitors: Inspiration from SARS-CoV 3CLpro Peptidomimetics and Small-Molecule Anti-Inflammatory Compounds.

Authors: Jiajie Zhu; Haiyan Zhang; Qinghong Lin; Jingting Lyu; Lu Lu; Hanxi Chen; Xuning Zhang; Yanjun Zhang; Keda Chen
Journal: Drug Des Devel Ther Date: 2022-04-08 Impact factor: 4.319

5. Key dimer interface residues impact the catalytic activity of 3CLpro, the main protease of SARS-CoV-2.

Authors: Juliana C Ferreira; Samar Fadl; Wael M Rabeh
Journal: J Biol Chem Date: 2022-05-11 Impact factor: 5.486

6. Testing Affordable Strategies for the Computational Study of Reactivity in Cysteine Proteases: The Case of SARS-CoV-2 3CL Protease Inhibition.

Authors: Carlos A Ramos-Guzmán; José Luis Velázquez-Libera; J Javier Ruiz-Pernía; Iñaki Tuñón
Journal: J Chem Theory Comput Date: 2022-05-13 Impact factor: 6.578

7. Insights into the binding and covalent inhibition mechanism of PF-07321332 to SARS-CoV-2 M^pro.

Authors: Son Tung Ngo; Trung Hai Nguyen; Nguyen Thanh Tung; Binh Khanh Mai
Journal: RSC Adv Date: 2022-01-28 Impact factor: 3.361

8. Antiviral activities of natural compounds and ionic liquids to inhibit the Mpro of SARS-CoV-2: a computational approach.

Authors: Kandhan Palanisamy; S M Esther Rubavathy; Muthuramalingam Prakash; Ramasamy Thilagavathi; Maryam S Hosseini-Zare; Chelliah Selvam
Journal: RSC Adv Date: 2022-01-28 Impact factor: 3.361

9. Inhibition Mechanism of SARS-CoV-2 Main Protease with Ketone-Based Inhibitors Unveiled by Multiscale Simulations: Insights for Improved Designs*.

Authors: Carlos A Ramos-Guzmán; J Javier Ruiz-Pernía; Iñaki Tuñón
Journal: Angew Chem Int Ed Engl Date: 2021-11-03 Impact factor: 15.336

Review 10. Haste makes waste: A critical review of docking-based virtual screening in drug repurposing for SARS-CoV-2 main protease (M-pro) inhibition.

Authors: Guillem Macip; Pol Garcia-Segura; Júlia Mestres-Truyol; Bryan Saldivar-Espinoza; María José Ojeda-Montes; Aleix Gimeno; Adrià Cereto-Massagué; Santiago Garcia-Vallvé; Gerard Pujadas
Journal: Med Res Rev Date: 2021-10-26 Impact factor: 12.388