Literature DB >> 35116157

Unbinding ligands from SARS-CoV-2 Mpro via umbrella sampling simulations.

Nguyen Minh Tam1,2, Trung Hai Nguyen2,3, Vu Thi Ngan4, Nguyen Thanh Tung5,6, Son Tung Ngo2,3.   

Abstract

The umbrella sampling (US) simulation is demonstrated to be an efficient approach for determining the unbinding pathway and binding affinity to the SARS-CoV-2 Mpro of small molecule inhibitors. The accuracy of US is in the same range as the linear interaction energy (LIE) and fast pulling of ligand (FPL) methods. In detail, the correlation coefficient between US and experiments does not differ from FPL and is slightly smaller than LIE. The root mean square error of US simulations is smaller than that of LIE. Moreover, US is better than FPL and poorer than LIE in classifying SARS-CoV-2 Mpro inhibitors owing to the reciever operating characteristic-area under the curve analysis. Furthermore, the US simulations also provide detailed insights on unbinding pathways of ligands from the binding cleft of SARS-CoV-2 Mpro. The residues Cys44, Thr45, Ser46, Leu141, Asn142, Gly143, Glu166, Leu167, Pro168, Ala191, Gln192 and Ala193 probably play an important role in the ligand dissociation. Therefore, substitutions at these points may change the mechanism of binding of inhibitors to SARS-CoV-2 Mpro.
© 2022 The Authors.

Entities:  

Keywords:  SARS-CoV-2 Mpro; SMD; free energy; umbrella sampling; unbinding pathway

Year:  2022        PMID: 35116157      PMCID: PMC8790385          DOI: 10.1098/rsos.211480

Source DB:  PubMed          Journal:  R Soc Open Sci        ISSN: 2054-5703            Impact factor:   2.963


Introduction

SARS-CoV-2 effectuates the COVID-19 pandemic. The first human infection with severe acute respiratory syndromes was reported in Wuhan, Hubei Province, China in December 2019 [1-4]. Since then, it has caused millions of deaths worldwide. Bats have been suggested to be the carrier of SARS-CoV-2 before it jumped to and quickly transfected among humans [5]. The spreading rate is extremely high because it can be airborne [6]. In just 1 year, more than 130 million people have been infected despite international effort to contain the spread of the virus. Therefore, it is of great urgency to develop an effective therapy for community health. The approval of remdesivir as an anti-viral treatment for COVID-19 [7] may be considered as a rather controversial decision [8] because it showed poor efficacy in clinical trials [9,10]. Moreover, although two vaccines for SARS-CoV-2 have recently been approved for emergency use by the US Food and Drug Administration [11], using the vaccines to create COVID-19 herd immunity may not be possible [12]. Furthermore, especially, recent reports on new variants of SARS-CoV-2 appearing in the UK (B1.1.7) and South Africa (B1.351) can escape the neutralizing antibodies [13,14]. The main protease (Mpro) of SARS-CoV-2, a 3-chymotrypsin-like protease (3CLpro), is characterized as a pivotal enzyme associated with viral proliferation and replication [15,16]. The protease cleaves the polyproteins to polypeptides corresponding to the formation of 13 non-structural proteins (nsp) [17]. Therefore, the SARS-CoV-2 Mpro is indicated as the ultra-potent target for drug design against SARS-CoV-2 because inhibiting enzymatic activity of the protease can stop the virus from replicating. The SARS-CoV-2 Mpro is a homodimer enzyme comprising of two monomers, each monomer consists of 306 residues separating into three domains I, II and III (cf figure 1a) [18]. In particular, the substrate binding site of the SARS-CoV-2 Mpro contains two residues, His41 and Cys145 located in the cleft formed by domains I and II [19,20]. Strong binding inhibitors to the protease normally adopt rigid contacts to these residues [21]. Furthermore, the residues Thr26, Ser46, Asn142, Gly143, His164, Glu166 and Gln189 were found to be crucial for the ligand-binding process [21]. Furthermore, although the two monomers interact with each other via domain II and crucially associate with the catalytic activity since the active site conformation strongly depends on the dimerization [18], the ligand-binding affinity of ligands to the Mpro in silico could reliably obtained from calculations performed with only the monomer instead of the dimer [22].
Figure 1

(a) The monomeric SARS-CoV-2 Mpro + Narlaprevir (PDB ID: 7JYC), in which, the protease was highlighted with three domains I, II and III; (b) starting shapes of SARS-CoV-2 Mpro + 11b in SMD/US simulations.

(a) The monomeric SARS-CoV-2 Mpro + Narlaprevir (PDB ID: 7JYC), in which, the protease was highlighted with three domains I, II and III; (b) starting shapes of SARS-CoV-2 Mpro + 11b in SMD/US simulations. Computer-aided drug design (CADD) has emerged as a promising approach for screening and optimizing potential inhibitors for the biological activity of an enzyme [23-25]. In CADD, many computational approaches have been employed to estimate the binding free energy, ΔGbind, of ligands to proteins [26-48], which is associated with the experimental inhibition constant/half-maximal inhibitory concentration metrics [23,49-51]. Determination of the ΔGbind thus is a very important issue in CADD [52,53]. In molecular dynamics (MD) simulations, a computational free energy approach is normally selected based on the balance between the computation accuracy and the consumed central processing unit resources [54-58]. Popular calculation approaches include end-to-end schemes such as the linear interaction energy (LIE) [59], molecular mechanism/Poisson-Boltzmann surface area (MM/PBSA) [34], the free energy perturbation [41], and thermodynamic integration [60] methods. These methods directly provide the binding free energy. However, the binding/unbinding pathways also provide crucial insights into the binding process of ligands to proteins. Therefore, in this work, a combination of the fast pulling of ligand (FPL) [36] and umbrella sampling (US) [61,62] simulations were employed to unbind a ligand from SARS-CoV-2 Mpro. During the simulations, the ΔGbind value and details of unbinding pathway are characterized. In particular, the metric ΔGbind is calculated as the difference in the potential of mean force (PMF) between bound and unbound states of the complexes. The PMF value is obtained by the weighted histogram analysis method (WHAM) [63] calculations. The unbinding pathway is able to estimate via the collective-variable free energy landscape (FEL) [64]. The representative structures of the complexes can be determined using the clustering method [65]. The obtained results revealed that the combination of FPL/US simulations was useful for accurately evaluating potential inhibitors for SARS-CoV-2 Mpro.

Material and methods

Structure of receptor and ligand

Twenty four three-dimensional structures of the solvated SARS-CoV-2 Mpro + inhibitor were reported in the previous work [22], in which the structure of SARS-CoV-2 Mpro was downloaded from the protein data bank with identification of 6Y2F [66] and structure of ligands downloaded from the PubChem database [67]. Details of the ligands are reported in the electronic supplementary material, table S1.

Atomistic simulations

GROMACS v. 5.1.5 [68] was used to carry out atomistic simulations. As mentioned above, the initial conformations of complexes (cf. figure 1b) were obtained from previous work [22]. Force field parameters for the protein + ions, water molecules and ligand were taken from the Amber99SB-ILDN force field [69], TIP3P water model [70] and general Amber force field [71], respectively. The MD parameters were chosen according to the previous work [21]. Particularly, the integration was carried out with a timestep of 2 fs. Cut-off for non-bonded interaction was set at a distance of 0.9 nm. The electrostatic interactions were computed using the Particle Mesh Ewald method [72], whereas the van der Waals interactions were calculated with the cut-off scheme. The solvated complexes were relaxed using energy minimization, constant number of particles, volume and temperature (NVT) and constant number of particles, volume and pressure (NPT) simulations before the steered-MD was applied.

Fast pulling of ligand simulations

The NPT-equilibrated complex conformation was used as a starting structure of FPL calculations. During an FPL simulation the centre of mass of the ligand was pulled along the reaction coordinate, ξ, via an external-harmonic force. In particular, the spring constant of pulling force is k = 600 kJ mol−1 nm−2 and the pulling velocity is v = 0.005 nm ps−1. An FPL calculation was during an interval of 0.7 ns. The pulling parameters were referred to in the previous works [36,73]. The systemic coordinate, pulling force and pulling work were recorded every 0.1 ps.

Umbrella sampling simulations

The optimized FPL trajectory, whereas the deviation of the pulling work to the mean is the smallest value, was used to generate US windows. The displacement of ligands between neighbouring windows is ca 0.1 nm. The PMF was estimated from US simulations with 21 windows and length of 2 ns per window. It should be noted that NPT simulations were performed with length of 0.1 ns to evade any staring bias. Moreover, the C atoms were positionally restrained via a weak harmonic force to prevent the enzymic reorientation and translation during both FPL and US simulations. It helped reduce the computational effort during US simulations because we didn't have to thoroughly sample the whole protein conformational space which may be huge. The approach is reasonable because the protein conformation does not change significantly upon ligand binding.

Analysed tools

The two-dimensional FEL was constructed using the number of contacts within 0.45 nm between protein-ligand and ligand displacement as two reaction coordinates [64,65]. Representative structures of complexes localizing in the FEL minima were predicted using the clustering methods [65]. The WHAM [74] was used to estimate the PMF. The computed error was estimated using a bootstrapping method [75]. The Scikit-Learn library [76] was used to compute the receiver operating characteristic–area under the curve (ROC-AUC). In particular, the set of ligands were sorted and split into two groups according to their experimental binding free energy, ΔGEXP. Namely, strong binders are ligands having ΔGEXP < −7.60 kcal mol−1 whereas weak binders are those with ΔGEXP ≥ −7.60. The Python code for estimating errors using the bootstraping method and for calculating ROC-AUC is included in the electronic supplementary material.

Results

We can roughly classify proteins into two groups based on the geomorphology of their active sites. The first group includes proteins having a ‘close’ active site which requires protein structural change to allow a ligand to bind/unbind. The second one consists of proteins having an ‘open’ active site permitting its ligand to move in/out without protein conformational change. SARS-CoV-2 Mpro belongs to the second group with an ‘open’ active site that we can use pathway-based binding affinity approaches, such as FPL [36], US [62] and non-equilibrium molecular dynamics (NEMD) [39,40,77] simulations, to evaluate the ligand-binding affinity besides end-to-end free energy approaches. FPL was shown to be suitable for rapidly estimating the binding affinity of inhibitors to SARS-CoV-2 Mpro [22,58,73]. However, the obtained values cannot directly estimate the binding free energy, ΔG. The approach also cannot evaluate the unbinding pathway of ligands owing to the limited sampling that was generated. ΔG and details of the unbinding pathway can be calculated via NEMD, but the approach requires huge computing resources. US simulations, an enhanced sampling method, emerges as a potential technique to complete this task force. Initial conformations used for US simulations were generated by FPL simulations with a window spacing of ca 0.1 nm. During FPL, the ligand displacement was recorded as shown in figure 2. Inhibitors remain in the binding cavity until the pulling force reaches a maximum value, called rupture force. Inhibitors then constantly dissociate from the binding cleft. In particular, more details about FPL results are described in the electronic supplementary material, tables S2–S4. The results obtained in this study are consistent with the recent works [21,22].
Figure 2

Displacement of ligand 11b during FPL calculations.

Displacement of ligand 11b during FPL calculations. Coordinates of solvated complexes were recorded during FPL simulations every 0.1 ps that the US windows were then extracted over the optimized FPL trajectory. The PMF values along the reaction coordinate, ξ, (cf. figure 3; electronic supplementary material, table S5) were calculated using the WHAM [74], whereas the US histograms are reported in the electronic supplementary material, table S6. The shape of the free energy profile is very consistent with previous works [48,78]. In particular, the PMF features a deep minimum at short distance and a plateau region at large distance where the non-bonded interactions between protein and ligand vanish. The observation is in good agreement with the FEL analysis as mentioned below. A free energy profile is given in figure 3 and the values of calculated binding free energy are given in table 1.
Figure 3

(a) The free energy profile via US simulations during the unbinding process of inhibitor 11b out of SARS-CoV-2 Mpro; (b) pulling force via SMD simulations during the unbinding process of inhibitor 11b out of SARS-CoV-2 Mpro.

Table 1

US binding free energy in comparison with experiments.

N0nameΔGUSΔGEXPa
17J−6.08 ± 0.61−8.69
211a−7.81 ± 0.91−9.96
311b−8.07 ± 1.29−10.13
411r−8.63 ± 0.78−9.23
513a−6.22 ± 0.58−7.70
613b−4.56 ± 0.66−8.45
7calpain inhibitor I−4.51 ± 1.01−6.94
8calpain inhibitor II−4.88 ± 0.50−8.23
9calpain inhibitor XII−5.20 ± 0.70−8.69
10calpeptin−5.43 ± 1.11−6.81
11candesartan cilexetil−5.61 ± 0.80−7.60
12carmofur−1.89 ± 1.07−7.86
13chloroquine−2.94 ± 0.43−7.41
14dipyridamole−5.60 ± 0.81−8.52
15disulfiram−2.68 ± 0.43−6.89
16GC-373−3.74 ± 0.78−8.76
17hydroxychloroquine−1.80 ± 0.65−7.58
18MG-115−3.14 ± 0.71−7.53
19MG-132−2.57 ± 0.88−7.41
20narlaprevir−2.41 ± 1.13−6.40
21omeprazole−3.61 ± 1.23−6.60
22oxytetracycline−7.29 ± 0.75−7.18
23PX-12−2.45 ± 0.70−6.39
24shikonin−2.96 ± 0.62−6.58

aThe experimental binding affinities were calculated from the IC50 value, [19,20,66,79–82] assuming that IC50 equals the inhibition constant k. The unit is kcal mol−1.

(a) The free energy profile via US simulations during the unbinding process of inhibitor 11b out of SARS-CoV-2 Mpro; (b) pulling force via SMD simulations during the unbinding process of inhibitor 11b out of SARS-CoV-2 Mpro. US binding free energy in comparison with experiments. aThe experimental binding affinities were calculated from the IC50 value, [19,20,66,79-82] assuming that IC50 equals the inhibition constant k. The unit is kcal mol−1. The calculated binding free energy ranges from −8.63 to −1.80 kcal mol−1, with the mean value of . The calculated mean underestimates the experimental mean, which is of . The computed error bars were estimated by the standard error of the mean. The underestimation of US calculations was similar to the results when the US approach was applied to the cathepsin K (CTSK) system [62]. The inaccuracy of the force field in describing the interaction between biomolecules [83,84], which included the SARS-CoV-2 Mpro, inhibitor, water molecule, and counter ion, may cause the computed binding free energies to deviate from experimental values. The US calculations adopted an appropriate Pearson correlation with (figure 4), which is similar to the FPL results, , but is lower than that of the LIE, [21,22]. Moreover, the accuracy of computational approaches was also investigated via root mean square error (RMSE) calculation. The obtained result of US is of . The is slightly smaller than that from LIE calculation, and is significantly smaller than that from MM/PBSA, [21]. Furthermore, because the initial complexes obtained via the docking approach is probably far from the native structure, the US accuracy is not very high. Therefore, the systems need to relax in the MD simulations for long enough for the system to overcome local minima and reach equilibrium states [21]. The accuracy of the FPL approach was increased according to the strategy [21]. Otherwise, increasing the US window interval would probably resolve the issues since the complex in the bound state can reach equilibrium states. The issues should be carefully investigated in further work.
Figure 4

Linear correlation between and . The computing error was estimated using the bootstrapping method.

Linear correlation between and . The computing error was estimated using the bootstrapping method. In order to characterize the insights into the unbinding pathway of ligand dissociation, the collective-variable FEL was constructed. Two coordinates are the number of protein-ligand contacts at a cut-off distance of 0.45 nm and the ligand displacement. The two-dimensional FEL is given in figure 5 and the electronic supplementary material, table S7. Consistent with the PMF curve (figure 3a), the number of contacts between protein-ligand rapidly decreases when the ligand is displaced from 1.1 to 2.5 nm. Along the dissociation pathway, numerous local minima were observed. The representative structures of SARS-CoV-2 Mpro + inhibitor, which corresponded to these minima, were characterized using a data clustering algorithm with all-atom root mean square deviation chosen as pair distance and a cut-off of 0.2 nm. A representation of these conformations is described in figure 5. In particular, the SARS-CoV-2 Mpro + 11b formed six minima, which were denoted as B, D1, D2, D3, D4 and D5. Structure B responds to the bound state of the complex, while structures D1–4 correspond to the transition states during the dissociated process of the ligand. Shape D5 responds to the completed dissociation state of the protein-ligand complex. Besides, the two-dimensional interaction diagram of SARS-CoV-2 Mpro + 11b is described in the electronic supplementary material, figure S1 mentioning more details of critical residues controlling the binding process of the compound 11b. The residues Cys44, Thr45, Ser46, Leu141, Asn142, Gly143, Glu166, Leu167, Pro168, Ala191, Gln192 and Ala193 probably are important elements over the ligand dissociation. These residues contribute a large number of contacts to the inhibitor over the dissociated process. The potential mutations at these points would probably alter the ligand binding pathway to SARS-CoV-2 Mpro, especially, the change in binding kinetics appeared.
Figure 5

The collective-variable FEL and representative structures of SARS-CoV-2 Mpro + 11b over US simulations.

The collective-variable FEL and representative structures of SARS-CoV-2 Mpro + 11b over US simulations. In addition, ROC-AUC is commonly used to assess the predictive power of a binary classifier in separating a dataset into two classes. If we separate the set of ligands into two sets according to their experimental binding free energy as being strong binders (low binding free energy) and weak binders (high binding free energy) using some arbitrary cut-off, we can use ROC-AUC, to assess the ability of our computed free energy in distinguishing strong binders against weak binders. More specifically, if we randomly select a ligand from the strong binder set and a ligand from the weak binder set, ROC-AUC gives us the probability that the selected strong binder will have lower calculated binding free energy than the weak binder. Therefore, ROC-AUC for SARS-CoV-2 Mpro was calculated. The obtained result is , indicating that US is weaker than LIE,, and better than FPL, , in classifying SARS-CoV-2 Mpro inhibitors [21].

Conclusion

We have demonstrated that the US approach provides a reliable estimate of binding free energies when compared with experimental data [19,20,66,79-82]. The correlation between US binding free energy and experiments is rather high with a value of , which is in the same range of correlation with FPL [22]. Moreover, the obtained RMSE of US simulations, , is slightly smaller than that of LIE [21]. Furthermore, US is better than FPL and poorer than LIE in classifying SARS-CoV-2 Mpro inhibitors owing to ROC-AUC analysis, . Clustering conformations found on local minima of the two-dimensional FEL revealed important insights into the unbinding pathway of the ligand from SARS-CoV-2 Mpro binding cavity. The obtained results imply that the residues Cys44, Thr45, Ser46, Leu141, Asn142, Gly143, Glu166, Leu167, Pro168, Ala191, Gln192 and Ala193 probably play an important role in the ligand dissociation. Therefore, substitutions appearing at these points possibly change the ligand binding mechanism of SARS-CoV-2 Mpro.
  58 in total

1.  Accurate Binding Free Energy Predictions in Fragment Optimization.

Authors:  Thomas B Steinbrecher; Markus Dahlgren; Daniel Cappel; Teng Lin; Lingle Wang; Goran Krilov; Robert Abel; Richard Friesner; Woody Sherman
Journal:  J Chem Inf Model       Date:  2015-10-21       Impact factor: 4.956

2.  Free-energy landscape, principal component analysis, and structural clustering to identify representative conformations from molecular dynamics simulations: the myoglobin case.

Authors:  Elena Papaleo; Paolo Mereghetti; Piercarlo Fantucci; Rita Grandori; Luca De Gioia
Journal:  J Mol Graph Model       Date:  2009-02-06       Impact factor: 2.518

3.  Improving the LIE Method for Binding Free Energy Calculations of Protein-Ligand Complexes.

Authors:  Williams E Miranda; Sergei Yu Noskov; Pedro A Valiente
Journal:  J Chem Inf Model       Date:  2015-08-18       Impact factor: 4.956

4.  Using the fast fourier transform in binding free energy calculations.

Authors:  Trung Hai Nguyen; Huan-Xiang Zhou; David D L Minh
Journal:  J Comput Chem       Date:  2017-12-22       Impact factor: 3.376

5.  Computational protein-ligand docking and virtual drug screening with the AutoDock suite.

Authors:  Stefano Forli; Ruth Huey; Michael E Pique; Michel F Sanner; David S Goodsell; Arthur J Olson
Journal:  Nat Protoc       Date:  2016-04-14       Impact factor: 13.491

6.  Large scale relative protein ligand binding affinities using non-equilibrium alchemy.

Authors:  Vytautas Gapsys; Laura Pérez-Benito; Matteo Aldeghi; Daniel Seeliger; Herman van Vlijmen; Gary Tresadern; Bert L de Groot
Journal:  Chem Sci       Date:  2019-12-02       Impact factor: 9.825

7.  Antibody resistance of SARS-CoV-2 variants B.1.351 and B.1.1.7.

Authors:  Pengfei Wang; Manoj S Nair; Lihong Liu; Sho Iketani; Yang Luo; Yicheng Guo; Maple Wang; Jian Yu; Baoshan Zhang; Peter D Kwong; Barney S Graham; John R Mascola; Jennifer Y Chang; Michael T Yin; Magdalena Sobieszczyk; Christos A Kyratsous; Lawrence Shapiro; Zizhang Sheng; Yaoxing Huang; David D Ho
Journal:  Nature       Date:  2021-03-08       Impact factor: 69.504

8.  First Case of 2019 Novel Coronavirus in the United States.

Authors:  Michelle L Holshue; Chas DeBolt; Scott Lindquist; Kathy H Lofy; John Wiesman; Hollianne Bruce; Christopher Spitters; Keith Ericson; Sara Wilkerson; Ahmet Tural; George Diaz; Amanda Cohn; LeAnne Fox; Anita Patel; Susan I Gerber; Lindsay Kim; Suxiang Tong; Xiaoyan Lu; Steve Lindstrom; Mark A Pallansch; William C Weldon; Holly M Biggs; Timothy M Uyeki; Satish K Pillai
Journal:  N Engl J Med       Date:  2020-01-31       Impact factor: 91.245

9.  SARS-CoV-2 variants B.1.351 and P.1 escape from neutralizing antibodies.

Authors:  Markus Hoffmann; Prerna Arora; Rüdiger Groß; Alina Seidel; Bojan F Hörnich; Alexander S Hahn; Nadine Krüger; Luise Graichen; Heike Hofmann-Winkler; Amy Kempf; Martin S Winkler; Sebastian Schulz; Hans-Martin Jäck; Bernd Jahrsdörfer; Hubert Schrezenmeier; Martin Müller; Alexander Kleger; Jan Münch; Stefan Pöhlmann
Journal:  Cell       Date:  2021-03-20       Impact factor: 41.582

10.  Identify potent SARS-CoV-2 main protease inhibitors via accelerated free energy perturbation-based virtual screening of existing drugs.

Authors:  Zhe Li; Xin Li; Yi-You Huang; Yaoxing Wu; Runduo Liu; Lingli Zhou; Yuxi Lin; Deyan Wu; Lei Zhang; Hao Liu; Ximing Xu; Kunqian Yu; Yuxia Zhang; Jun Cui; Chang-Guo Zhan; Xin Wang; Hai-Bin Luo
Journal:  Proc Natl Acad Sci U S A       Date:  2020-10-13       Impact factor: 11.205

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.