Literature DB >> 36114175

Water regulates the residence time of Benzamidine in Trypsin.

Narjes Ansari1, Valerio Rizzi1, Michele Parrinello2.   

Abstract

The process of ligand-protein unbinding is crucial in biophysics. Water is an essential part of any biological system and yet, many aspects of its role remain elusive. Here, we simulate with state-of-the-art enhanced sampling techniques the binding of Benzamidine to Trypsin which is a much studied and paradigmatic ligand-protein system. We use machine learning methods to determine efficient collective coordinates for the complex non-local network of water. These coordinates are used to perform On-the-fly Probability Enhanced Sampling simulations, which we adapt to calculate also the ligand residence time. Our results, both static and dynamic, are in good agreement with experiments. We find that the presence of a water molecule located at the bottom of the binding pocket allows via a network of hydrogen bonds the ligand to be released into the solution. On a finer scale, even when unbinding is allowed, another water molecule further modulates the exit time.
© 2022. The Author(s).

Entities:  

Mesh:

Substances:

Year:  2022        PMID: 36114175      PMCID: PMC9481606          DOI: 10.1038/s41467-022-33104-3

Source DB:  PubMed          Journal:  Nat Commun        ISSN: 2041-1723            Impact factor:   17.694


Introduction

The process of ligand-protein unbinding is crucial in biophysics and its full understanding would enhance not only our knowledge but also benefit the design of new drugs. In particular, two quantities are of great relevance, the ligand binding free energy and the inverse of the residence time koff[1]. Being able to compute reliably these two quantities would be of great help. Here we describe a strategy that makes it possible to calculate accurately both quantities. We demonstrate this assertion with a state-of-the-art simulation of the Trypsin-Benzamidine system[2-13] and we find that the unbinding process can be more complex than what one could have anticipated. Although Trypsin-Benzamidine is one of the simplest cases of protein-ligand systems, high-resolution crystallographic experiments have recently demonstrated the presence of an extensive water structure in the binding cavity[14]. Our study reveals that water has not only a structural role but a dynamical one, since it regulates the unbinding process via a complex rearrangement of hydrogen bonds (HBs). In particular, the presence of a water molecule at a specific position in the binding cavity allows the unbinding process to take place. As the ligand begins to leave its binding pose, the number of water molecules in the binding cavity increases, leading to a finer regulation in which water determines two possible escape pathways with a koff differing by one order of magnitude. Several technical advances have made this study possible. Ligand unbinding is a rare event that takes place on a timescale of milliseconds and thus its study requires the use of an enhanced sampling method. Here, we profit from the flexibility and efficiency of the On-the-fly Probability Enhanced Sampling (OPES) method[15], that is the latest evolution of Metadynamics[16,17]. Like umbrella sampling[18] and other enhanced sampling methods[19], OPES is based on the use of a set of collective variables (CVs) that are functions s(R) of the atomic coordinates R and describe the slow modes of the system. A good choice of s(R) is essential to obtain converged results in an affordable time. To determine good CVs we use two machine-learning-based tools that we recently developed, Deep Linear Discriminant Analysis (Deep-LDA)[20] and Deep Time-lagged Independent Component Analysis (Deep-TICA)[21]. In building these CVs, we shall pay great attention to the role of water and make use of the experience gained in ref. 22. Standard OPES is very efficient in calculating static properties such as binding free energies, but, in doing so, it alters the natural dynamics of the system so that a sensitive quantity like koff cannot be easily extracted. Nevertheless, it has been pointed out that if an enhanced sampling method can be engineered such that no bias is added to the transition region, the value of koff can still be computed[23-27]. Here we show that by an appropriate setting of the input parameters, OPES can be made to satisfy this condition. We call this approach OPES flooding (OPESf)[28] since it is inspired by the flooding approach[26] and use it to calculate the ligand residence time.

Results

Enhanced sampling technique for estimating free energies: OPES

To accelerate the occurrence of binding and unbinding events, we use OPES[15]. This method allows the system to overcome kinetic barriers by transforming the original CV probability distribution P(s) into a smoother and thus simpler to sample one Ptg(s). In the variant of OPES that we use here, we transform the s probability distribution , where β = 1/kBT is the inverse temperature, V(R) the interaction potential and Z = ∫e−dR the partition function, into the smoother well-tempered distribution Ptg(s) ∝ [P(s)]1/ [17], where the parameter γ > 1 regulates the broadening of the target distribution. In standard metadynamics this transformation is achieved by iteratively building a bias potential V(s), while in OPES one instead reconstructs the probability distribution P(s) on-the-fly. At iteration step n, the probability distribution P(s) is estimated aswhere G(s, s) are multivariate Gaussian kernels evaluated at every step k, the weights are computed from the bias at step k − 1. In turn, the bias iswhere Z is a factor that measures the configuration space thus far explored and ϵ is a very important parameter that controls the maximum bias that can be deposited in the system.

Machine learning-based CVs: Deep-LDA and Deep-TICA

As in our previous work[22], we use the machine learning-based Deep-LDA method[20] to design effective CVs to be used in conjunction with OPES. The method is based on the time-honored Linear Discriminant Analysis technique[29] employed in classification applications. Our aim is to build a CV that is able to distinguish between two sets of data. In our case, data come from unbiased simulations of the system in the bound (B) and unbound (U) states. In standard LDA, one optimizes Fisher’s ratio to obtain the linear combination s(R) = wd(R) of descriptors d(R) that best separates the two states. Fisher’s ratio is written in terms of the scatter matrix and the within matrix w = B + U, where B, U indicate the average descriptors values and B, U the descriptors variance matrices in the two states. The vector w that maximizes this ratio is the direction that optimally discriminates the states and provides the best-separated projection of the data in the one-dimensional s space[30]. In Deep-LDA, the set of Nd descriptors d is fed into a neural network (NN) that is trained by applying the LDA criterion to the last hidden layer h of the network. In analogy with LDA, a projection of the Nh components of the last hidden layer produces the Deep-LDA CV s = wh. This CV, being by construction a non-linear combination of the original input descriptors d, is more expressive than a simple linear combination and has been successfully applied to solve a number of problems[22,31,32]. As s tends to produce sharp distributions, we apply the cubic transformation s = s + s3 [22,33] to make the CV more easily applicable to enhanced simulations. While often successful in driving a system forth and back between different states, a Deep-LDA CV does not encode any information on the transition state. This information could have been obtained if one had access to a long dynamical trajectory in which many state-to-state transitions did occur. In this case, a way of building a good CV, would have been to use the Time-lagged Independent Component Analysis[34,35], in which one looks for the most slowly decorrelating modes. These slow modes can be found using the variational principle[36]. If the modes are expressed as a linear combination of descriptors, the variational principle leads to a generalized eigenvalue equation. As in Deep-LDA, one can apply TICA not only to a linear combination of descriptors but also to the last hidden layer of a NN. This greatly improves the variational flexibility of the solution and thus its quality. In its original formulation, this approach was meant to be applied to unbiased trajectories. However, McCarty and Parrinello[37] using a linear approach and later Bonati et al.[21] using a non-linear one (Deep-TICA) have shown how to extract useful CVs from biased simulations. The resulting Deep-TICA eigenfunctions encode the slow modes of the biased simulation used for training. Here, we are going to apply Deep-TICA to the set of Nd descriptors on a converged OPES trajectory where the Deep-LDA CV was biased.

Enhanced sampling technique for estimating residence times: OPES flooding

As discussed in the introduction, to calculate rates from a biased simulations no bias must be deposited in the transition region. In such a case, the physical residence time t is related to the physical simulation time tMD by[23,24]where V(s) is the instantaneous bias potential and the acceleration factor is computed as an average along the simulation. To ensure that the condition under which Eq. (3) is satisfied, we introduce a variation of OPES that we call OPESf[28] that is inspired by the variational flooding method[26]. The condition for OPESf to allow calculating physical rates are easily satisfied. The maximum amount of bias deposited can be controlled by the choice of the ϵ parameter in Eq. (2), making sure that it is lower than the free energy barrier. Furthermore via the parameter EXCLUDED_REGION, one can prevent OPESf from depositing bias in a preassigned region of configuration space. Since, from an initial free energy surface (FES) estimate one can roughly estimate both the height and the location of the transition, the setup of a subsequent OPESf rate calculation is relatively simple and straightforward[28].

Designing water CVs

In our OPES simulations we shall use as CVs the distance z from the binding site and a CV that encodes water behavior. Water is known to play a non-trivial role in ligand binding[38-46], therefore we want to use the power of machine learning techniques to capture and encode its behavior in a CV, generalizing the strategy that we have proposed for a smaller host-guest system in ref. 22. The choice of an effective set of descriptors d is critical as it must capture the solvation in different situations that are relevant to the binding-unbinding process, i.e., the ligand itself, the binding position, the binding pocket, and the binding path. Some of the descriptors can be identified following physical intuition. For instance, to describe ligand’s solvation we focus on its charged tail and use the coordination number of water around the Carbon atom of the amidine group ({G}) (see Fig. 2c). Regarding the binding position, we analogously evaluate the coordination number between water and the Carbon atom of the carboxylate group in residue Asp189 ({H}). These choices are similar in spirit to other solvation variables that have been devised in the past[22,47-51].
Fig. 2

Identification of long-lived water molecules.

a Graphical representation of the four steps strategy used to identify the long-lived hydration spots. (1) α-Carbon atoms of the residues located on a protein's outer surface, represented by red dots. (2) Convex hull surface built from the α-Carbon atoms shown in (1). (3) Distribution of the long-lived water molecules inside the surface. (4) Centers of the water distribution from step (3) obtained with a clustering algorithm. b Convex hull surface built around the binding pose of Trypsin. c Position of the 16 hydration spots Vi, shown by spheres. The hydration spots V5−V12 coincide with the position of the reservoir water molecules of ref. 14 and the dark red spheres lie on the binding path. The yellow sphere on V3 shows the position of the key W1 water molecule[14] that forms a hydrogen bond with the ligand. Black arrows indicate the four possible paths of entrance/exit for water molecules in the binding pocket.

However, we follow a different approach to describe water behavior around the binding pocket and along the binding path. The S1 binding pocket of apo Trypsin form (see Fig. 1) is known from high resolution experiments[14,52] to be characterized by the presence of a set of deeply buried water molecules. These water molecules have a long residence time and form what has been called in ref. 14 the reservoir. To encode their role in the descriptor set, we propose a general method that can be applied to describe trapped water molecules in biological systems. This method consists of four steps: (1) selecting the α-Carbon atoms of the relevant part of the protein that encloses water (label 1 of Fig. 2a), (2) building the convex hull surface with the coordinates from step 1 (label 2 of Fig. 2a), (3) collecting the position of the water molecules that lie within the convex hull for longer than a pre-defined lifetime, and (4) clustering the collected data using a K-means clustering method to determine areas of high water density and their centers (label 4 of Fig. 2a). We call those points hydration spots ({V}) and use them as centers around which we calculate water coordination. The method is implemented in a Python script[53]. Following ref. 54, we also investigated the role of the ionic density and found that it does not play a relevant role in this system (see Supplementary Information (SI)).
Fig. 1

The Trypsin-Benzamidine system.

A cartoon representation of Trypsin structure with the ligand Benzamidine and the Funnel restraint geometry. Oxygen, Nitrogen, Hydrogen, and Carbon atoms are colored in red, blue, white, and yellow, respectively. The protein is colored in green. A gray cone and cylinder represent the funnel restraint. The (un)binding path of the ligand is aligned along the z-axis. Relevant to binding residue Asp189 is highlighted.

The Trypsin-Benzamidine system.

A cartoon representation of Trypsin structure with the ligand Benzamidine and the Funnel restraint geometry. Oxygen, Nitrogen, Hydrogen, and Carbon atoms are colored in red, blue, white, and yellow, respectively. The protein is colored in green. A gray cone and cylinder represent the funnel restraint. The (un)binding path of the ligand is aligned along the z-axis. Relevant to binding residue Asp189 is highlighted.

Identification of long-lived water molecules.

a Graphical representation of the four steps strategy used to identify the long-lived hydration spots. (1) α-Carbon atoms of the residues located on a protein's outer surface, represented by red dots. (2) Convex hull surface built from the α-Carbon atoms shown in (1). (3) Distribution of the long-lived water molecules inside the surface. (4) Centers of the water distribution from step (3) obtained with a clustering algorithm. b Convex hull surface built around the binding pose of Trypsin. c Position of the 16 hydration spots Vi, shown by spheres. The hydration spots V5−V12 coincide with the position of the reservoir water molecules of ref. 14 and the dark red spheres lie on the binding path. The yellow sphere on V3 shows the position of the key W1 water molecule[14] that forms a hydrogen bond with the ligand. Black arrows indicate the four possible paths of entrance/exit for water molecules in the binding pocket. In Trypsin, we restrict the analysis to the region around the S1 binding pose where we build the convex hull. That area encloses both the deeply buried water molecules and the binding path of the ligand (see Fig. 2b). From a number of unbiased trajectories (both in states B and U) we collect the positions of the water molecules that have a residence time inside the convex hull longer than 100 ps. Then, using the clustering method introduced in step 4, we identify 16 Vs (see Fig. 2c) positions. The number of clustering centers is chosen so that the relative distance between the Vs lies in a range of 2–3 Å. We find that there is a correspondence between the V5–V12 centers and the position of the reservoir water molecules reported in ref. 14. Furthermore, V3 lies at the position of the long-lived water molecule that stabilizes binding and that in ref. 14 is called W1. We use as descriptors all the water coordination on ({G}, {H}, {V}) to generate Deep-LDA sw and Deep-TICA st water CVs.

Static properties: Binding free energy, enthalpy, and entropy

To calculate the binding free energy, we perform an OPES simulation using 32 walkers, biasing z and the Deep-LDA water CV sw, for a total simulation time of 3.2 μs. More details about the simulation are provided in the SI. Convergence is achieved as several binding and unbinding events occur in a quasi-static regime of the bias (see Supplementary Fig. 2). For comparison, in the SI, we present an analogous simulation in which we bias only z. In that case, the sampling is much worse quality and convergence is not reached (see Supplementary Figs. 3–5). In Fig. 3a, we show the FES projection on z, calculated as a block average with an error bar that is within the line width of the plot. We calculate the free energy difference between the B and U states using the funnel correction in Eq. (4) obtaining a value of 6.36 ± 0.07 kcal/mol, in an embarrassing and certainly fortuitous agreement with the experimental value of 6.36 kcal/mol[55,56].
Fig. 3

Mechanistic interpretation of the binding process.

a 1D free energy surface of Trypsin-Benzamidine reconstructed using reweighting along z with the statistical uncertainty within the line width. b 2D free energy landscape along z and sw CVs. c–f Representative configurations of the relevant states are shown with a focus on the solvation pattern around the ligand and the binding pose.

Mechanistic interpretation of the binding process.

a 1D free energy surface of Trypsin-Benzamidine reconstructed using reweighting along z with the statistical uncertainty within the line width. b 2D free energy landscape along z and sw CVs. c–f Representative configurations of the relevant states are shown with a focus on the solvation pattern around the ligand and the binding pose. The high level of accuracy that the combination of OPES and good quality CVs, makes it possible to estimate separately enthalpy ΔU and entropy ΔS of binding. This is achieved by converging binding free energy calculations in a range of temperatures and using the relationship ΔF = ΔU − TΔS. In Table 1 we report our estimate for these thermodynamic quantities and observe that they are also in good agreement with experiments. Further details about these simulations are provided in the SI.
Table 1

Our simulation estimates of the free energy, enthalpy, and entropy of binding at T = 300 K and corresponding experimental data[55,56]

ΔFΔUTΔS
This work6.36 ± 0.074.12 ± 1.562.23 ± 1.56
EXP.6.364.521.84
Our simulation estimates of the free energy, enthalpy, and entropy of binding at T = 300 K and corresponding experimental data[55,56] In Fig. 3b, we show the two-dimensional FES of binding projected on z and sw, along with a number of representative snapshots of the different states and their typical water arrangement. State B is the global minimum of the FES and corresponds to the protein-ligand crystallographic structure (e.g., PDB 3atl[57]). In line with experiments[14], the W1 water molecule connects the ligand to Trypsin residues Tyr228, and Ser190, forming a total of 3 HBs. Furthermore, in the reservoir region below residue Asp189 we observe on average the presence of 5 water molecules, in agreement with the X-ray structure[14]. State B1 is a less stable binding pose where the ligand is in the same configuration as B, but the reservoir contains on average one more water molecule. In state I, the reservoir region contains another extra water molecule that tends to bridge the amidine group of the ligand and the binding site of Asp189, thus weakening binding. In the fully dissociated state U, the binding cavity returns to its apo form in which the 5 water of the reservoir are in the experimental structure of the holo form. The space previously occupied by the ligand is replaced by about 4–5 water molecules. The fact that in the apo form the reservoir structure of the water is preserved underlines once more the relevance of this structure (see Fig. 3f). Note that, as the ligand progresses towards the U state, the number of water molecules in the binding site increases, underlining the role of water throughout the whole unbinding process. In spite of the limit of being trained on data coming exclusively from B and U, the Deep-LDA CV in combination with OPES is able to capture this non-trivial water behavior and converge the FES by virtue of the ability of OPES to deal with non-optimal CVs[58]. Nevertheless, sw is not able to resolve the diverse water arrangements in states such as B and B1, as we shall see below. In order to capture the finer details of the role of water in the binding pose, the intermediate states and the path to unbinding, we use Deep-TICA to determine a new water CV that we call st. Since Deep-TICA is trained on trajectories coming from biased simulations, the resulting CV is informed on the water modes that Deep-LDA is not able to resolve. Using the data generated in the Deep-LDA-driven simulation, we project in Fig. 4a, b the FES along different pairs of CVs: z, st and sw, st, respectively. We find that st resolves the original B state into states B and B’, with state B being about 12 kJ/mol more stable than B’ (see Supplementary Fig. 8). From these two states, two different unbinding pathways depart. We denote the states belonging to the less stable branch with a prime. Figure 4d, e shows typical configurations of the ligand and the surrounding water molecules in state B and B’. The number of water molecules in the reservoir is the same, but the water arrangement is different. In B, water molecules are part of an extended HB network that includes residue Asp189, while in state B’ this network is less structured.
Fig. 4

Deep-TICA analyis.

2D FES along st and a z, b sw, c V9. In d and e, representative configurations of the ligand and its solvation pattern in states B and B’, respectively. The location of V9 is highlighted with a semi-transparent pink sphere.

Deep-TICA analyis.

2D FES along st and a z, b sw, c V9. In d and e, representative configurations of the ligand and its solvation pattern in states B and B’, respectively. The location of V9 is highlighted with a semi-transparent pink sphere. One striking difference between B and B’ is that in state B there is one water molecule trapped deeply in the binding cavity, in an intermediate position between residues Tyr182, Lys219 and Gln219 which corresponds to descriptor V9 (see the semi transparent sphere in Fig. 4d). The FES projection along V9 and st in Fig. 4c, confirms that V9 discriminates well between B and B’. Further analysis reveals that state B presents a slightly larger volume of the reservoir region than B’ (see Supplementary Fig. 7), which, in turn, possibly facilitates the formation of an extended HB network. A study of the relative importance of the water descriptors on the NN CVs can be also found in the SI (see Supplementary Fig. 11). The presence of a water molecule in V9 discriminates well between B and B’, as can be seen, if we project the FES along V9 and st. Thus this molecule stabilizes the cavity HB network that is such an important feature of this protein.

Dynamic properties: Ligand residence time

We compute the ligand-unbinding rate by employing the OPES flooding technique[28] on simulations that start from state B. In this approach a choice of CVs that captures well the complexity of the path from state B to state U is essential. The Deep-LDA coordinate built only on the knowledge of the B and U states is not adequate to the purpose. In fact, being able to fill all the different metastable bound states is crucial for promoting transitions to the U state without remaining stuck in intermediate states. For this reason, besides z we use Deep-TICA CV st. By construction, st is able to extract the slow modes of the system and as a consequence to drive it out of deep basins towards the transition state. The use of OPESf for the calculation of rates requires that we define an excluded region to avoid depositing bias in the transition state region. An analysis of the FES in Fig. 4a suggests to prevent depositing bias in the region z > 6 Å. We run a total of 55 ligand unbinding simulations and, by using Eq. (3), we determine for each simulation a physical ligand residence time t. The distribution of transition times of a rare event dominated by a single barrier is expected to be Poissonian where τ is the characteristic time of the associated homogeneous process[59]. We fit all our data to such a model and find that the quality of the fit is poor (see Supplementary Fig. 13), as indicated by its low p-value. It is known that complex biological processes present multiple timescales and are not expected to necessarily follow such a simple model[1,60,61]. This led us to further analyze the unbinding trajectories and to identify the presence of two possible unbinding mechanisms: a faster and a slower one. The key difference between the two is related in the water network arrangement in the binding pocket (see Fig. 5). All unbinding events start with the water reservoir acquiring an extra molecule and the system reaching state B1 (see Fig. 5b). Then two possibilities occur. In the faster case, the presence of a water molecule in the vicinity of W1 weakens the bond between the ligand and W1 (see Fig. 5c), which leads to another water molecule to bind to residue Asp189 (see Fig. 5d) and finally brings about the ligand-unbinding event. In the slower mechanism, W1 is stable and, as the reservoir increases its water content (see Fig. 5e), eventually one water molecule bridges the ligand and Asp189 (see Fig. 5f), driving the system towards unbinding. As reported in Table 2, both mechanisms fit well the homogeneous model and present a discrepancy in the resulting τ of about one order of magnitude. In particular, the slower mechanism τ that takes place in about 60% of the simulations has a koff = 1/τ = 687 s−1 that is in close agreement with the experimental value of koff = 600 ± 300 s−1 [62].
Fig. 5

Two ligand unbinding mechanisms.

Representative configurations of states a B and b B1. c, e The pre-intermediate step of fast and slow mechanisms, respectively. d, f The intermediate step of the faster (IF) and the slower (IS) unbinding mechanisms. g Intermediate I1 that presents a number of water molecules between ligand and the binding pose. The yellow sphere highlights the location of the W1 water molecule. The green sphere indicates the position of the water molecule that bridges the ligand and the Asp189 residue. The protein is shown as a semi-transparent ribbon.

Table 2

Ligand residence time

τ (10−3 s)koff (102 s−1)p-valueμ (10−3 s)σ (10−3 s)Number of events
All data0.6415.60.061.101.4955
Faster path0.1855.40.620.230.2923
Slower path1.456.870.631.581.5932
EXP.6.00 ± 3.00

τ is the characteristic time from an exponential fit[59], koff = 1/τ, and p-value measures the quality of the fit.

μ and σ are the average and the standard deviation of the data, respectively. We show results from all our data and from data split into the two observed mechanisms. The experimental results are taken from ref. 62.

Two ligand unbinding mechanisms.

Representative configurations of states a B and b B1. c, e The pre-intermediate step of fast and slow mechanisms, respectively. d, f The intermediate step of the faster (IF) and the slower (IS) unbinding mechanisms. g Intermediate I1 that presents a number of water molecules between ligand and the binding pose. The yellow sphere highlights the location of the W1 water molecule. The green sphere indicates the position of the water molecule that bridges the ligand and the Asp189 residue. The protein is shown as a semi-transparent ribbon. Ligand residence time τ is the characteristic time from an exponential fit[59], koff = 1/τ, and p-value measures the quality of the fit. μ and σ are the average and the standard deviation of the data, respectively. We show results from all our data and from data split into the two observed mechanisms. The experimental results are taken from ref. 62. To further validate the pathways that we observe, we perform a set of OPESf simulations with an enlarged bias-free region z > 4 Å. These simulations ensure that the steps that lead from state B1 to state I occur in an unbiased regime (see Supplementary Fig. 14). We perform simulations where both z and st are biased and where only z is biased. In both cases, we observe pathways belonging to the slow and the fast mechanism, in a similar proportion as the one from the full unbinding simulations. More details are in the SI.

Discussion

Our work shows clearly that water plays a major role in the activation of Trypsin and modulates the release process of Benzamidine. Through the use of water-focused machine-learned CVs, we estimate static and dynamic properties in excellent agreement with experiments and we analyze the results with a level of resolution that lets us identify the importance of each individual water molecules. We observe how the presence of water in key positions affects the binding stability and determines unbinding mechanisms with significantly different ligand residence times. In conclusion, one might say that the set of reservoir water molecules, present both in the apo and the holo form, are an essential part of the enzyme, its structure, and its activity. We believe that this is not just specific to the Trypsin-Benzamidine system and we argue that water would play such a non-trivial role at least in all the cases where the ligand has to negotiate its way out of a water-rich enzymatic cavity.

Methods

We perform all simulations with the molecular dynamics code GROMACS 2020.4[63] patched with PLUMED 2.8[64] and the Pytorch library 1.4[20,65]. We use TIP3P water, while the protein is described by the Amber-14SB force field and the ligand interactions are taken from the Amber GAFF library[66]. We employ Ewald summation[67] for long-range electrostatic interactions with a cutoff of 10 Å for both the Coulomb and the van der Waals interactions. All simulations are run in the NPT ensemble with a timestep of 2 fs. We use the Parrinello-Rahman barostat[68] and the stochastic velocity rescaling thermostat[69], both with a coupling constant of 1 ps. For more details, see Supplementary Methods. We train a Deep-LDA CV by running unbiased simulations on states B and U for 60 ns and evaluating the water coordination on a descriptor set d made by the 18 components ({G}, {H}, {V}). For training, we take as state B the initial configuration used in ref. 70. We use a NN made of 4 layers with 2.5 × 10−5 learning rate. To train the Deep-TICA CV, we take the converged OPES trajectory used for calculating the binding free energy and feed the descriptors set d to a NN made of 4 layers with 1.0 × 10−4 learning rate and 0.07 lag time. Further details can be found in the SI. One of the difficulties in ligand-unbinding problem arises from the fact that once the ligand leaves the protein, it has to explore a large conformational space. To tackle this issue, we use the Funnel restraint proposed in ref. 71. In this method, the space available to the ligand in the unbound state is limited by confining it to a cylindrical volume above the binding site (see Fig. 1) which introduces an entropic restraint in the U state. To calculate the absolute free energy of binding, one needs to apply a correction to the apparent free energy W(z) that comes from the simulation:where C0 = 1/1660 Å−3 is the standard concentration, z is the distance between the center of mass of the ligand and the binding site along the funnel’s axis, W(z) is the free energy along the funnel axis and WU is its reference value in state U. Supplementary information Peer Review File
  55 in total

1.  Sampling protein motion and solvent effect during ligand binding.

Authors:  Vittorio Limongelli; Luciana Marinelli; Sandro Cosconati; Concettina La Motta; Stefania Sartini; Laura Mugnaini; Federico Da Settimo; Ettore Novellino; Michele Parrinello
Journal:  Proc Natl Acad Sci U S A       Date:  2012-01-11       Impact factor: 11.205

2.  Variationally Optimized Free-Energy Flooding for Rate Calculation.

Authors:  James McCarty; Omar Valsson; Pratyush Tiwary; Michele Parrinello
Journal:  Phys Rev Lett       Date:  2015-08-10       Impact factor: 9.161

3.  Substrate binding mechanism of HIV-1 protease from explicit-solvent atomistic simulations.

Authors:  Fabio Pietrucci; Fabrizio Marinelli; Paolo Carloni; Alessandro Laio
Journal:  J Am Chem Soc       Date:  2009-08-26       Impact factor: 15.419

4.  Data-Driven Collective Variables for Enhanced Sampling.

Authors:  Luigi Bonati; Valerio Rizzi; Michele Parrinello
Journal:  J Phys Chem Lett       Date:  2020-04-02       Impact factor: 6.475

5.  Rethinking Metadynamics: From Bias Potentials to Probability Distributions.

Authors:  Michele Invernizzi; Michele Parrinello
Journal:  J Phys Chem Lett       Date:  2020-03-23       Impact factor: 6.475

6.  Deep learning the slow modes for rare events sampling.

Authors:  Luigi Bonati; GiovanniMaria Piccini; Michele Parrinello
Journal:  Proc Natl Acad Sci U S A       Date:  2021-11-02       Impact factor: 11.205

7.  ff14SB: Improving the Accuracy of Protein Side Chain and Backbone Parameters from ff99SB.

Authors:  James A Maier; Carmenza Martinez; Koushik Kasavajhala; Lauren Wickstrom; Kevin E Hauser; Carlos Simmerling
Journal:  J Chem Theory Comput       Date:  2015-07-23       Impact factor: 6.006

8.  Intriguing role of water in protein-ligand binding studied by neutron crystallography on trypsin complexes.

Authors:  Johannes Schiebel; Roberto Gaspari; Tobias Wulsdorf; Khang Ngo; Christian Sohn; Tobias E Schrader; Andrea Cavalli; Andreas Ostermann; Andreas Heine; Gerhard Klebe
Journal:  Nat Commun       Date:  2018-09-03       Impact factor: 14.919

9.  Multisecond ligand dissociation dynamics from atomistic simulations.

Authors:  Steffen Wolf; Benjamin Lickert; Simon Bray; Gerhard Stock
Journal:  Nat Commun       Date:  2020-06-10       Impact factor: 14.919

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.