Literature DB >> 28979936

Allosteric Communication Networks in Proteins Revealed through Pocket Crosstalk Analysis.

Giuseppina La Sala¹, Sergio Decherchi^2,3, Marco De Vivo^1,4, Walter Rocchia².

Abstract

The detection and characterization of binding pockets and allosteric communication in proteins is crucial for studying biological regulation and performing drug design. Nowadays, ever-longer molecular dynamics (MD) simulations are routinely used to investigate the spatiotemporal evolution of proteins. Yet, there is no computational tool that can automatically detect all the pockets and potential allosteric communication networks along these extended MD simulations. Here, we use a novel and fully automated algorithm that examines pocket formation, dynamics, and allosteric communication embedded in microsecond-long MD simulations of three pharmaceutically relevant proteins, namely, PNP, A2A, and Abl kinase. This dynamic analysis uses pocket crosstalk, defined as the temporal exchange of atoms between adjacent pockets, along the MD trajectories as a fingerprint of hidden allosteric communication networks. Importantly, this study indicates that dynamic pocket crosstalk analysis provides new mechanistic understandings on allosteric communication networks, enriching the available experimental data. Thus, our results suggest the prospective use of this unprecedented dynamic analysis to characterize transient binding pockets for structure-based drug design.

Entities: Chemical Disease Gene Mutation Species

Year: 2017 PMID： 28979936 PMCID： PMC5620967 DOI： 10.1021/acscentsci.7b00211

Source DB: PubMed Journal: ACS Cent Sci ISSN： 2374-7943 Impact factor: 14.553

Introduction

Binding pockets are often crucial for modulating the function of biomolecules, such as those in protein enzymes and ion channels. For example, small molecule drugs exert their beneficial action by binding to a functional pocket of the protein target(s).[1] Detecting and characterizing these functional binding pockets is therefore of paramount importance for biochemistry and drug discovery.[2] In this regard, molecular dynamics (MD) is a useful tool for studying the appearance, evolution, and structural modifications of binding pockets in large biomolecules, along trajectories of hundreds of nanoseconds to even a few milliseconds.[3,4] MD can also describe the plasticity of those superficial and shallow transient cavities,[5] which are often involved in protein function because they interact with a small substrate or another partner protein.[6,7] As MD trajectories of large structural ensembles increase in length, they create massive data files. These files can be hundreds of gigabytes in size and are expected to reach tens of terabytes in the near future.[8] Therefore, there is a major need for algorithms that can automatically extract the embedded information from these massive data sets and produce intelligible reports on the spatiotemporal evolution of the targeted protein, including its potentially druggable binding pockets.[9] There are already a number of algorithms that can detect protein binding pockets in static structures.[10,11] Some of these rely on the Voronoi diagrams[12] (e.g., MolAxis,[13] MOLE[14]), grids[15] (e.g., POCKET,[16] PocketFinder,[17] POVME[18,19]), and molecular surfaces and probes (e.g., HOLE,[20] SURFNET[21]). Other algorithms analyze ensembles of structures, but usually require a preliminary structural alignment (e.g., MDpocket,[22] PocketAnalizerPCA,[23] Epock, Trj_cavity,[24] and TRAPP[25]). In this case, the resulting information may depend on the specific reference structure used for the alignment. Atom-based algorithms (e.g., PROVAR[26] and EPOSBP[27]) avoid the alignment step. Nevertheless, most of these methods are limited to analyzing a priori defined pocket(s) of interest only. Another key aspect is that pockets can sometime take part in protein allosteric signaling.[2,28] Indeed, a number of theoretical approaches already exist to investigate allosteric signaling, such as bioinformatics methods that rely on the analysis of protein sequences under the assumption that evolutionarily conserved residues are likely to have a functional role.[29] Vibrational motions of proteins examined through normal-mode analysis (NMA) can also offer insights into potential allosteric mechanisms. In this case, low frequency modes define functionally relevant movements often triggered by the binding of an allosteric effector.[30,31] Alternatively, allosteric signaling is often investigated through protein conformational ensembles generated via molecular dynamics (MD). These conformational ensembles are mapped into a graph-based representation, which is composed of interconnected nodes. The degree of a node’s interdependence, which reflects correlation of motions of distant allosteric parts of the protein, can be calculated, for example, via a mutual information analysis[32,33] or using the analysis of atomic positional fluctuations.[34−39] Here, we present an original algorithm for efficiently analyzing extended MD trajectories. Differently from all previous methods, this algorithm detects the formation and spatiotemporal evolution of all the protein pockets. In addition, it monitors pocket crosstalk, defined as the temporal exchange of atoms between adjacent pockets, which we propose as a means to identify allosteric signaling (see Theory). In particular, our algorithm automatically executes (a) an alignment-independent identification of all the pockets on the protein surface; (b) a quantification and visualization of the volume and surface area of all the pockets found in the protein, for each structural frame of the MD trajectory; (c) a report of crosstalk between pockets, described as merging and splitting events; and (d) a detection of allosteric signal transmission networks across the protein surface, defined as interconnected pocket motions (see Theory). We have applied this new algorithm to long MD simulations of a few selected pharmaceutically relevant targets: (i) the purine nucleoside phosphorylase (PNP) enzyme, (ii) the adenosinic receptor (A2A) in POPC membrane, and (iii) the Abelson (Abl) kinase. The algorithm produced a detailed analysis of the structure–dynamics–function relationships, which it automatically extracted from these MD simulations. Our results demonstrate the power of this dynamical analysis, particularly as a prospective tool for characterizing binding pockets in structure-based drug design.

Theory

Outline of the Method

The algorithm identifies pockets and tracks their evolution over time along an MD trajectory. NanoShaper 0.7 (freely available at www.electrostaticszone.eu),[40] is used as a preliminary utility to detect the pockets on the protein surface on individual frames, with particular attention given to those that could be druggable sites. In a single frame of a protein of ∼2000 atoms, the execution time for detecting the binding pockets is in the range of 2–6 seconds on a standard workstation. The main part of the method is then dedicated to characterizing the pocket dynamics along the MD trajectory through the assignment of a unique and dynamics-consistent identifier for every pocket. The spatiotemporal evolution of each pocket is thus monitored over the entire MD trajectory. Both algorithms are described in the following sections. This method is called Pocketron and is a module implemented in the BiKi Life Sciences software suite (www.bikitechnologies.com).

Static Pocket Detection

The static pocket detection algorithm is based on the concept of the solvent excluded surface (SES),[41] or Connolly–Richards surface,[42] which is defined as the surface obtained by rolling a spherical probe over the van der Waals surface of the molecular system. The analytical computation of the SES is done via the Alpha Shapes theory, as described by Decherchi and Rocchia.[40] Then, pockets are identified by calculating the volumetric difference between the regions enclosed by the SESs, obtained with two different probe radii (Supporting Figure 4). The smaller rolling spherical probe has a radius of 1.4 Å, which corresponds to the spherical approximation of a water molecule. Conversely, the larger rolling spherical probe has a default radius of 3 Å. The size values can be modified at will. However, we found that these specific values, together with a subsequent filter selecting only pockets with a volume of at least 34.5 Å3 (∼3 water molecules), provide a reasonable identification of potential binding sites. This information can be used to analyze the entire protein surface, list the identified pockets, and store their calculated volume and the list of contributing atoms.

Dynamical Pocket-Tracking Algorithm and Pocket Crosstalk Detection

The algorithm tracks the atoms forming each pocket along an MD trajectory. These data constitute the fundamental basis for how the algorithm monitors the pocket dynamics. The algorithm monitors all the atom-based events occurring along an MD trajectory, without requiring any structural alignment or prior knowledge of the region to be searched. It thus tracks the exchange of atoms between adjacent pockets, which is our indicator of “pocket crosstalk”. With “exchange of atoms”, we refer to the fact that the same atom may belong to different pockets at different times (i.e., frames) during the simulation. This analysis considers “merging” and “splitting” events to be significant. These events are calculated via simple operations on the atom sets that form each pocket. More specifically, a “merge” event occurs when atoms belonging to the same pocket in the current frame have belonged to separate pockets in a previous frame. Similarly, a “split” event occurs when atoms of a single pocket divide into two or more distinct pockets. To guarantee consistency in pocket tracking, the algorithm initially stores the list of pockets found in the first frame, as defined by their constituent atoms, and assigns a unique pocket identifier (pID) to each stored pocket. Then, for every additional frame, the new set of pockets is computed and compared with the stored ones. When there is a mismatch with respect to the stored pockets, the current pocket is added as a new entry in the stored pocket’s list and assigned a new pID. The matching algorithm relies on the Jaccard index to measure the degree of similarity between two pockets. For example, let A and B be the sets of atoms belonging to two pockets. Then, the Jaccard index between them is defined aswhere the modulus symbol indicates the cardinality of the atom sets. If A and B are identical (i.e., have the same constituent atoms), then the index reaches its maximum value of 1. If only a fraction of the atoms are shared between A and B, then the index provides the number of matching atoms (numerator) divided by the cardinality of the union of the atom sets. If there are no matching atoms, the index is null. Pocket identification is therefore performed with the similarity matrix M. Each entry M(i,k) is the Jaccard index between the atoms of the ith stored pocket and those of the kth pocket (detected in the current frame). For each frame, each pocket is compared to all of the stored ones. If M(i,k) = 0 ∀i, then the kth pocket is considered to be a new pocket and is stored. Otherwise, the kth pocket is connected to the stored entry that maximizes the Jaccard index. In mathematical terms, p is the ith stored pocket and (t) is the kth detected pocket on the frame at time t; therefore, the M matrix is the following:where K is the total number of detected pockets in the current frame and N is the total number of stored pockets so far. As already mentioned, the pocket ID assignment is executed by finding the maximum entry for every column of M:where the index w is corresponding to the stored pocket to which k is assigned. If M(w,k) = 0, a new entry is created and stored in the pocket list with its own new pID. Similarly, to detect merging and splitting events, the algorithm builds the matrix F, so that F(k,j) is the Jaccard index of the kth pocket at instant t and the jth pocket found in the previous frame at instant t – Δt. Then, the F matrix iswhere J is now the total number of pockets at the t – Δt instant (i.e., the previous frame presented to the algorithm). Traversing this matrix allows the easy detection of merge and split events. Indeed, if we let “NZ” be the operator that returns all the nonzero entries of a row/column vector, and let “(i,:) ” and “(:,i)” be the operators that return all the elements of the ith row/column, respectively, then we havewhere Merge((t)) is the set containing all the indexes of the pockets detected at the time (t – Δt) that shared some atoms with pocket k. If this set is empty, then no merge event involved (t). Similarly, Split((t – Δt)) is the set containing all the indexes of the pockets detected at the at time t that shared some atoms with the pocket labeled as j at time (t – Δt). If this set is empty, then no split event involved (t – Δt). On a technical note, the entries of the F matrix involve only those pockets detected at the t and t – Δt instants. However, once the merging and splitting events have been detected, the M matrix is once again used to connect these pockets to the stored ones and to make the tracking consistent, as described above.

Pocket Crosstalk Analysis and Allosteric Signaling

With the above-mentioned ability to track pockets over time, the volume and surface area of all pockets are stored for each frame. Thus, the algorithm generates the full history of the volume, area, splitting, and merging events of each pocket, along the MD trajectory. It also generates the list of contributing atoms and residues for all the unique pockets detected during the MD run. This allows the estimation of the probability (estimated as a relative frequency) that each residue belongs to a given pocket. Moreover, for each pocket, all the merging and splitting events with other pockets are stored. Indeed, one of the final results of the tracking process is a square matrix called N, whose (i,j)th entry represents the merging probability between two pockets, expressed as the number of times they merge over the total number of frames. Similarly, the square matrix N expresses the corresponding statistics for the splitting events. Empirically, we found that the corresponding entries of N and N very often coincide in magnitude, indicating that the splitting and merging events often occur at the same time. For this reason, we define a matrix of aggregate statistics N, which contains the maximum value of the corresponding entries in N and N (Figure ). The nature of the information stored in N calls for effective graphical methods to represent the pocket dynamics, particularly since N is usually very sparse. The algorithm translates the merging and splitting frequency matrix N into a 3D network graph, where the nodes represent pockets, and the edges indicate communication between two pockets or, in other words, how often two pockets exchange atoms. In detail, the position of each node is the geometric center of the atoms that form the corresponding pocket. The color of the node indicates its persistency over the simulation time. The red color indicates high persistency whereas blue indicates low persistency. The thickness of the edge is directly proportional to the frequency of the merging and splitting events between the two connected pockets, according to the corresponding entry in the N matrix. The dimension of the sphere represents the pocket volume (Figure and Figure ). Together with the merging and splitting of adjacent pockets, the N matrix returns indirectly also the long-range crosstalk network connecting distant pockets. That is, a crosstalk network can connect distant pockets through a sequence of neighboring pockets that exchange their atoms during the MD run. In this way, a 3D representation of the crosstalk network can be observed, which can be used to identify putative allosteric signaling network.

Figure 1

(A) Representation of the merging and splitting matrix F, calculated using all the detected pockets. The matrix F allows retrieving information on merging and splitting events. For each frame along the MD trajectory, pockets at time t are compared with pockets at time t – Δt, using the Jaccard index. In this example, at time t, the pockets 1, 2, 3, and 4 (in rows) have been detected and stored. At this point, the Jaccard index is computed with all pockets detected in the previous frame at times t – Δt, i.e., with pockets 1, 2, and 3 (in columns). Moving from t – Δt to t, this example shows that pocket 1 split into two pockets, forming the new pocket 4. Concomitantly, pockets 2 and 3 merged, forming a larger pocket that is still identified as pocket 3, according to its Jaccard index. (B) Schematic example of the conversion of the aggregate merging/splitting statistics N into an undirected network graph. In the matrix N, the off-diagonal red numbers indicate the frequency of the merging and splitting events, which is then reflected by the size of the edge connecting two pockets.

Figure 6

Networks of the most persistent pockets found in the KDin, T315I-KDin, Myr/KDin, and Myr/T315I-KDin trajectories. Each pocket (i.e., network node) is represented as a sphere, with the different colors indicating the pocket’s persistency. The pockets are connected via black lines (i.e., network edges). The width of each edge is proportional to the communication frequency. The networks connect the ATP and the myristate binding sites in all systems except T315I-KDin. We performed our analysis considering only pockets having a persistency of at least 20% and above, along the simulation time.

Results

Protein Binding Pocket Detection and Dynamics

First, we tested the ability of our new algorithm to identify and monitor protein pocket dynamics. We used ∼700 ns long MD simulations of the purine nucleoside phosphorylase (PNP), a homotrimeric enzyme that is involved in purine metabolism and T-cell function.[43] PNP inhibition is a strategy for treating T-cell-mediated diseases, such as leukemia and lymphoma.[44,45] The algorithm detected a total of 22 pockets that exist for more than 30% of the overall simulations describing the dynamic docking of DADMe-ImmH into PNP. Among the most persistent ones, our algorithm identified three large pockets, namely, pocket ID (pID) 4, pID 9, and pID 12. The persistency is ∼79% for pID 4 and ∼96% for both pID 9 and pID 12. These three pockets lie at the edge between the adjacent PNP monomers (Figure and Supporting Figure 1). Each of them corresponds to the known orthosteric binding site that is targeted by the endogenous substrate as well as by known PNP inhibitors such as Immucillin-H[46] and DADMe-ImmH.[47] These pockets have an average volume of 367, 894, and 977 Å3, respectively. These volumes are larger than those observed in the holo PNP crystal, where they are 332, 327, and 409 Å3, respectively. This suggests an elevated plasticity of the orthosteric pockets. Notably, the reduced volume of pID 4 is explained by the existence of a nearby stable pocket, pID 13, which has a persistency of 94% and an average volume of 313 Å3 (Supporting Figure 1). These two pockets are merged for ∼26% of the simulation time, resulting in a single larger binding pocket with an average volume of 677 Å3, which is comparable with the other two orthosteric pockets pID 9 and pID 12. Furthermore, our algorithm detected two smaller pockets, pID 3 and pID 14, located in proximity of the orthosteric binding pockets pID 4 and pID 12 (Figure ). These pockets are present for 71% and 73% of the simulation time and have an average volume of 133 Å3 and 144 Å3, respectively, in agreement with their value in the PNP X-ray structure (203 Å3 and 232 Å3, respectively). Interestingly, these pockets have been shown to constitute a prebinding site where the inhibitor DADMe-ImmH transiently binds before accessing one of the orthosteric sites.[48] Notably, this prebinding site was not found in proximity of the orthosteric pocket pID 9. Likely, this is because pID 9 already embeds the prebinding site during our simulations, as suggested by its larger volume compared to pID 4 and pID 12.

Figure 2

(A) Localization of the main pockets computed for the PNP X-ray structure 3K8O. On the right, the orthosteric ligand DADMe-ImmH located in the orthosteric pocket (in orange), as in the X-ray structure, and in the prebinding pocket (in yellow), as found in our MD simulations. (B) Volume over time of the three orthosteric sites pID 4, pID 9, and pID 12. The volumes have been smoothed employing a Gaussian filter. Another pocket, namely, pID 22, is detected at the center of the trimerization interface, with an average volume of 398 Å3 (248 Å3 in the X-ray structure) and a time persistency of 88%. Intriguingly, this pocket shows a marked crosstalk pattern with the orthosteric binding site of each monomer during the MD simulations (Supporting Figure 2). This suggests a possible signaling transmission network between the different monomers, mediated by this common interface pocket. This may explain the negative cooperativity between PNP subunits observed by Schramm and co-workers[49,50] (see Discussion).

Identification of Interaction Patterns between Pockets in Proteins

Our algorithm detects pocket interactions by observing pocket splitting and merging during an MD simulation (see Theory). Here, we demonstrate this feature on two proteins with multiple binding pockets, which are well-characterized with biochemical and structural data (see below). These proteins are (1) the adenosinic receptor A2A and (2) the Abelson (Abl) tyrosine kinase. The adenosinic receptor A2A is a G-protein coupled receptor (GPCR) with a recognized key role in several pathophysiological processes.[51] It is a promising target for pain, depression,[52] and neurological diseases such as Parkinson’s disease.[53] The Abelson (Abl) tyrosine kinase is involved in cell growth and survival.[54] It is a validated target for treating several types of cancer.[55]

Adenosinic Receptor A2A

The analysis performed on 100 ns long A2A trajectories revealed 13 pockets that were present for more than 30% of the simulations. Of these, pID 15 and pID 22 coincide with the structurally characterized orthosteric and allosteric binding sites, respectively.[56,57] In particular, pID 15 is located on the extracellular side of the receptor and is targeted by both agonist and antagonist drugs, such as adenosine[58] and ZM241385.[59] In contrast, pID 22 is located in the core of the transmembrane bundle and is normally occupied by a sodium ion, which can be displaced by allosteric drugs, such as amiloride and HMA (Figure ).[60,61] Our analysis revealed that, although pID 15 and pID 22 were well-separated during most of the MD simulations, they sometimes communicated via merging and splitting events. This means that these pockets share a set of residues at their interface. These residues are Val 84, Ala 88, Phe 242, and Trp 246 (Figure A). They are involved in a crosstalk between these pockets. Interestingly, both MD simulations and experimental studies have demonstrated a ligand-dependent A2A signaling, which involves an allosteric effect via these two pockets.[60,62,63]

Figure 3

(A) Time persistency of residues that define the orthosteric pocket (blue stem, pID 15) and the allosteric pocket (red stem, pID 22). (B) Volume plot of pID 15 and pID 22 in selected frames and representation of merging and splitting events. For example, at frame 844 (out of 1000 frames analyzed), the larger orthosteric pocket pID 15 joins the allosteric pID 22 (Figure B). Interestingly, when these two pockets are merged, they allow a small channel to form (Figure and Figure ). As suggested by thermal stability studies[60] and MD simulations,[64] this channel permits the passage of a sodium ion from the extracellular side to the allosteric pocket. The residues at the edge between the two pockets may therefore act as a gate, mediating this channel’s opening and closing, and thus modulating the entrance of the sodium ion. This would also account for the conformation flexibility of Trp 246, known as the “toggle switch” residue, which is associated with the activation mechanism of GPCRs.[56,59,65,66] Notably, the conformational flexibility of residues that lie between two pockets has been shown to be crucial also in other enzymes, such as those processing lipids (e.g., fatty acid amide hydrolase[67,68] and monoacylglycerol lipase[7]).

Figure 4

Pocket crosstalk analysis reveals that when pID 15 merges with the allosteric pID 22, a small channel is formed (Figure ). We used adiabatic biased simulations to characterize the passage of a sodium ion through the transient channel detected by our algorithm (see Supporting Information). (A) The green spheres indicate the pathway followed by the sodium ion along the simulations, toward the extracellular site. The two conserved residues Trp246 and Asp52 are shown. In panel B we show the narrowest section of the channel, with the gating Trp246 partially flipped so as to allow ion passage. We used adiabatic bias simulations to test the passage of the sodium ion through the transiently formed channel detected via pocket crosstalk analysis (Figure ); although qualitative, our results indicate that this channel can allow the sodium ion to access the protein through a continuous and sterically accessible pathway (Figure ). While the calculation of the associated thermodynamics would require more extended simulations, it is interesting to note that this pathway well resembles the one shown by Yuan et al., identified along 9.6 μs of plain MD simulation.[69]

Abelson (Abl) Tyrosine Kinase

Here, we analyzed the MD simulations of the kinase domain (KD) of Abl, both in the catalytically active DFG-in (KDin) and in the catalytically inactive DFG-out (KDout) conformations (see Methods for details).[70] Among the most persistent pockets in KDin, our algorithm successfully identified the large ATP site (pID 5, see Supporting Figure 3).[71] In addition to pID 5, we found another nearby pocket located between the DFG motif and the αC helix (pID 3, see Supporting Figure 3), which is known to be an allosteric site located in the KD.[72] Interestingly, pID 5 and pID 3 present a marked crosstalk in our MD simulations, sharing a large set of residues, which are listed in Supporting Table 1. For ∼63% of the simulation, they remain separate, with volumes of 439 Å3 and 167 Å3, respectively. However, for ∼35% of the simulations, the two pockets merge into a single larger pocket with an average volume of 497 Å3 (Figure ).

Figure 5

Representation of the dynamical behavior of the ATP pockets during MD simulations of the KDin (a, b) and KDout systems (c–f). In KDin, the ATP pocket pID 5 coexists with the nearby pID 3 for 63.0% of the simulations (a), while pID 5 is the only pocket for 35% of the simulations (b). In KDout, the ATP pocket coexists with pID 3 and pID 28 for 11% of the MD simulations (c). It is the only emerging pocket for 38% of the simulations (d), and it coexists with pID 3 for 20% of the simulations (e) and with pID 28 for 23% of the simulations (f). In KDout, our algorithm again identified the ATP pocket pID 5 and the nearby allosteric pocket pID 3 (Supporting Figure 3), with an average volume of 339 Å3 and 222 Å3, respectively. However, the algorithm identified also an additional pocket, pID 28, which is located between the P-loop and the β3-αC-helix loop (Supporting Figure 3), and which has a smaller volume of 150 Å3. Notably, the region containing this new pocket is targeted by a new class of diphenylamine-derived allosteric inhibitors of MEK 1 and MEK 2 kinases.[73] Here too, the three pockets (pID 3, pID 5, and pID 28) share a set of residues (listed in Supporting Table 1), suggesting a communication network. Interestingly, in this case, these three pockets exist as individual entities for only ∼11% of the overall simulated time. This is because merging events are more frequent than in the catalytically active KDin system (Figure ). According to our analysis, the DFG-in conformation limits the crosstalk between the ATP pocket and the nearby pocket pID 3 in the catalytically active KDin system. In contrast, in KDout, the DFG-out conformation favors communication with the surrounding pockets pID 3 and pID 28. Indeed, the DFG-out conformation has been demonstrated to increase the flexibility of the ATP binding site, which leads to a loss of KD activity.[74,75] Thus, our analysis suggests that the specific DFG conformation (in or out) affects the formation of transient pockets and their communication network. These, in turn, modulate the ATP binding site’s overall shape and the resulting KD activity.

Allosteric Signal Transmission Networks Inferred from Pocket Crosstalk Analysis

Here, we tested the capability of our algorithm to reveal allosteric signal propagation pathways in proteins, starting from an MD-generated equilibrium ensemble of structures. As a paradigmatic example, we investigated the Abl kinase, which is one of the most characterized systems in terms of molecular determinants for protein allostery.[74−77] We performed a comparative analysis of four different Abl systems: (1) the wild-type KDin form, (2) KD bound to myristate (Myr/KDin), (3) the KD apo form of the T315I mutation (T315I-KDin), and (4) system T315I-KDin bound to myristate (Myr/T315I-KDin). In all systems, our analysis detected both the orthosteric ATP pocket and the allosteric myristate pocket, which is located in the C-lobe of the KD.[78] The analysis also returned an intensive crosstalk between pockets distributed on the protein surface. This was detected by looking at the off-diagonal elements of the square matrix N (see Theory). Notably, this crosstalk network connects the distal orthosteric ATP and the allosteric myristate binding sites (Figure ), which is known to be central for the allosteric signal propagation in Abl.[79−82] In particular, in both KDin and Myr/KDin, the ATP and the myristate pockets are connected via a long-range crosstalk network composed of a number (∼3 to 5) of small transient pockets (average volume ∼90 Å3, and persistency between 20% and 70%). These pockets are mainly located in the C-lobe and mostly involve the αG and αI helices (Figure ). Notably, the location of the pockets corresponds to that of the new allosteric pockets reported by Shan et al. in their study on binding pathways in Src kinase.[83] Moreover, in the MD simulations of Abl with the drug-resistant T315I point mutation (T315I-KDin system), the ATP ↔ myristate long-range crosstalk network is actually disrupted (Figure ). The perturbation of this communication pathway in the T315I-KDin system might explain the dysregulation of the T315I kinase form.[84,85] Remarkably, however, the ATP ↔ myristate pockets’ long-range communication network is restored in our simulations when the myristate binds to the T315I Abl mutated form (i.e., the Myr/T315I-KDin system). This is also observed in the wild-type system and is in line with other experimental and computational studies showing that myristate mimesis elicits a structural rearrangement on the ATP pocket and influences the binding affinity of orthosteric inhibitors (see Supporting Information).[77,79] This allosteric communication network could also explain the results of HX MS experiments, which further detailed the presence of an allosteric-signal-propagation pathway from myristate to the ATP pockets, passing through KD structural elements such as the αI-helix.[79] Networks of the most persistent pockets found in the KDin, T315I-KDin, Myr/KDin, and Myr/T315I-KDin trajectories. Each pocket (i.e., network node) is represented as a sphere, with the different colors indicating the pocket’s persistency. The pockets are connected via black lines (i.e., network edges). The width of each edge is proportional to the communication frequency. The networks connect the ATP and the myristate binding sites in all systems except T315I-KDin. We performed our analysis considering only pockets having a persistency of at least 20% and above, along the simulation time. As a following step, we aimed at connecting such a mechanism for allosteric signaling with useful information for ligand design. Toward this end, we performed enhanced sampling MD simulations to test whether this network for allosteric signaling affects ligand binding at the orthosteric site. Thus, we performed a set of scaled-MD simulations[86] on the T315I mutant of Abl, either in complex with dasatinib bound at the orthosteric site (i.e., Das/T315I-KDin) or in complex with both dasatinib and the myristate, at the distal myristate binding pocket (i.e., Das-Myr/T315I-KDin). Scaled-MD simulations can be used to accelerate the unbinding process by reducing the atomic interactions in the potential energy of the system. While facilitating the unbinding from the pocket during the MD run, scaled-MD simulations allow also a qualitative and relative estimation of the ligand residence time.[86,87] Here, we used scaled-MD simulations to estimate the residence time of dasatinib in Das-Myr/T315I-KDin and Das/T315I-KDin systems, which only differ in the presence of the myristate (Figure ). In these simulations, dasatinib showed an average scaled residence time ∼2 times longer when unbinding from Das-Myr/T315I-KDin compared to when it unbinds from Das/T315I-KDin (see Supporting Table 2). Converted back to nonscaled residence times, (see ref (86) for further details) dasatinib in the two systems returns a ratio value of more than 4. Although only qualitative, this result further indicates that the presence of the myristate, which coincides with the formation of the crosstalk signaling network connecting the orthosteric ATP and the allosteric myristate pockets (Figure ), stabilizes the complex with dasatinib. Notably, biochemical assays confirm that dasatinib is active against the T315I Abl form only in the presence of an allosteric binder.[79] Moreover, the residence time of a drug often correlates with its binding affinity, due to the relationship between the rate of dissociation (koff) and the thermodynamic dissociation constant (Kd). Hence, a longer residence time suggests a greater affinity of the drug for its target. Taken together, these results verify binding cooperativity effects in the Abl system, in which the presence of an allosteric binder at the myristate pocket is shown to impact on ligand binding at the orthosteric site. This example suggests that the information generated by dynamic pocket crosstalk analysis can help identify suitable conformational states for ligand binding thermodynamics obtained through subsequent calculations.

Discussion

In this work, we provide new mechanistic insights on pocket dynamics and allosteric communication in three pharmaceutically relevant proteins, namely, the purine nucleoside phosphorylase (PNP) enzyme, the adenosinic receptor (A2A), and the Abelson (Abl) kinase. Importantly, these findings are obtained through the use of a novel algorithm that analyzes pocket crosstalk events along the dynamics of a given biomolecular system. In this study, we demonstrate that this method can effectively reveal allosteric communication networks embedded in the ever-longer available MD trajectories. In addition, this new method characterizes all binding pockets through site-centered descriptors, which can be used in machine-learning-based virtual screening protocols with enhanced predictivity.[3] First, the algorithm identified all three experimentally known orthosteric binding sites in the PNP homotrimeric enzyme (pID 4, pID 9, and pID 12 in Figure ).[44] As expected, these are the largest and most persistent pockets. The dynamical analysis also identified smaller pockets in proximity of these three sites. Notably, these auxiliary pockets were involved in a prebinding state of DADMe-ImmH, a PNP inhibitor,[47] which was found to transiently interact with this small pocket before entering the main orthosteric site.[48] Analysis of the volume variation over time pointed to an elevated plasticity of the orthosteric binding sites. In fact, their average volume is higher than the crystallographic corresponding value. This suggests that the pockets undergo a structural rearrangement, which likely facilitates the binding of inhibitors observed during the simulations.[48] Our analysis also revealed a marked pattern of communication between each orthosteric pocket and its nearby prebinding site, reflected by a high frequency of merge and split events. Interestingly, this evidence is in line with structural studies, which show that two of the binding site’s structural motifs (the His 257 helix and His 64 loop) undergo a major conformational alteration upon the binding of the inhibitor. This alteration affects the binding pocket’s size and shape.[49] In addition, the high flexibility of this structural region in PNP was recently used to demonstrate that MD can be effectively deployed to improve virtual screening results.[3] Our algorithm also identified a long-range communication pathway connecting the orthosteric pockets and another pocket at the center of the trimerization interface (pID 22). This pathway suggests a possible allosteric connection between the three orthosteric pockets that passes through pID 22, at the trimerization interface. Intriguingly, this allosteric signal may help explain the negative cooperation between PNP subunits, as observed by Schramm and collaborators.[49,50] In H/D exchange experiments, dynamical coupling has also been observed between the orthosteric sites and the pocket located at the trimerization interface.[88] This further corroborates our hypothesis, according to which pID 22 could be the hub of the allosteric signal transmission between PNP monomers. We then analyzed the MD trajectory of A2A, simulated in POPC membrane. This analysis captured the experimentally known pockets, which are the orthosteric binding site, located close to the extracellular side, and the underlying sodium allosteric site (Figure ).[56,57,59] Our analysis also showed that, during the simulation, the two pockets communicated through a set of residues located at the interface of the two sites (i.e., Val 84, Ala 88, Phe 242, and Trp 246). This result is in line with earlier experimental and computational studies, which hypothesized the presence of an allosteric signal between the two cavities, which is crucial for modulating A2A function.[60,64,89] In this regard, we found that the conformational flexibility of Trp 246, also known as the “toggle switch”, appears to be crucial for the formation of a small channel during a two-pocket merging event. Via this channel, the Na+ ion may access the inner part of the protein from the extracellular space, as experimentally reported (Figure and Figure ).[56,60,90] Merging and splitting events were also shown in our MD simulations of the Abl kinase. Here, we focused on the orthosteric ATP site, comparing the results obtained from analyzing the catalytically active DFG-in kinase domain and the catalytically inactive DFG-out kinase domain (KDin and KDout, respectively). Together with the ATP site, our algorithm correctly detected two additional smaller adjacent pockets. One is an allosteric pocket often exploited by type II kinase inhibitors, while the other is present only in KDout, lying between the P-loop and the β3-αC-helix loop (Figure ). Notably, this region is targeted by a new class of MEK1 and 2 kinase inhibitors.[73] This pocket also accommodates the piperazine–phenyl–pyrimidine moiety of SCH772984 in both ERK1 and ERK2 kinases.[91] This confirms the ability of our algorithm to identify small hidden cavities that can be exploited in the design of new selective inhibitors. Notably, our analysis captured even subtle protein conformational changes, such as split and merge events between the large ATP site and the two nearby subpockets. These events are more frequent in the inactive KDout than in the active KDin form. This intensive crosstalk reflects the high plasticity of the ATP binding site in KDout, as suggested by previous computational studies,[74,75] which results in a diminished catalytic activity. Lastly, the method was able to reveal a long-range connection between the orthosteric ATP binding site and the allosteric myristate pocket. This has frequently been reported in the literature as a crucial allosteric mechanism for kinase function.[79−82] Our analysis reveals a pocket crosstalk network, which passes through a set of small pockets located close to the αG and αI helices of the C-lobe, in a region that can be targeted by allosteric kinase inhibitors, as suggested by a recent computational study (Figure ).[83] Computational and experimental studies[76,77,79] suggest that the integrity of this communication network coincides with a proper functioning of Abl. Targeting this pocket communication network may thus offer a new way for modulating the activity of the protein, as exemplified here for the binding of dasatinib to Abl, and its allosteric modulation.[79] Indeed, both HX MS experiments and MD simulations indicated that the αG and αI helices are involved in the allosteric signal propagation in Abl.[79] Our analysis detected a perturbation of this pocket communication network in the dysregulated T315I Abl form only, which occurs at the level of the αG and αI helices of the C-lobe (Figure ). That is, we found that the T315I point mutation in the absence of the myristate interrupts this pocket crosstalk network, which might explain the dysregulation of the T315I Abl form.[84] Ultimately, within the sampling limitations of the input trajectories, this new algorithm’s dynamical analysis detects interaction networks, which involve rearrangements that may reveal new binding sites on the protein surface.[92] It can also point to possible mechanisms for allosteric signal propagation, spotting protein configurations that may be used as target structures for ligand binding thermodynamics through computations for structure-based drug design.[93]

Conclusions

In this work, we provide new mechanistic understandings about pocket formation and allosteric communication in three relevant drug discovery targets, namely, PNP, A2A, and Abl kinase. Importantly, these findings are obtained through the use of an original and fully automated method for analyzing pocket crosstalk along microsecond-long MD simulations, which are performed to examine the spatiotemporal evolution of proteins. We demonstrate that this unprecedented dynamical analysis can reveal otherwise hidden connections between pockets, which may also underlie allosteric communication networks in proteins, as discussed for the biomolecular systems here investigated. Ultimately, we propose dynamic pocket crosstalk analysis for a more detailed understanding of the structural dynamics of proteins and biological regulation through allosteric communication, suggesting a prospective use of this method in structure-based drug design.

Methods

Structural Models

In the present work, we analyzed the MD simulations of three different systems, namely, purine nucleoside phosphorylase (PNP), the adenosinic receptor (A2A), and the Abelson (Abl) kinase. The homotrimeric construct of PNP was modeled using 3K8O X-ray structures and was simulated in the presence of nine DADME-ImmH ligands and phosphate ions, retrieved from 1RSZ and 1RR6(94) PDB structures, respectively. The A2A receptor was mostly built using the 3UZC X-ray structure,[95] while the 4EIY X-ray structure[56] was used as a template for the missing ECL2 in 3UZC. The apo protein was embedded in POPC membrane. For Abl, we built five different model systems. Two included the apo wild-type KD alone in both the DFG-in (KDin) and DFG-out conformations (KDout). Two comprised the KD mutated at the gatekeeper residue T315 (located at the ATP binding site) with an isoleucine residue. These mutated forms are either in the apo form (T315I-KDin) or in complex with myristate (Myr/T315I-KDin). The last system is the wild-type KD in complex with myristate (Myr/KDin). The KDout system was built starting from the 1OPL chain B X-ray structure[96] after removal of the SH2 domain. The remaining systems were modeled starting from the 2F4J X-ray structure.[97] For further details, see the Supporting Information.

MD Simulations

PNP was parametrized using the Amber ff99SB-ILDN force field,[98] while the General Amber Force Field (GAFF)[99] was used to parametrize the ligands and the ions. Ligand partial charges were fitted using the RESP procedure via Antechamber.[100] The system was immersed in a TIP3P water box[101] and comprised ∼100,000 atoms. After 350 ns of equilibration, which led the system to a temperature of 300 K and to a pressure of 1 bar, we ran the production in NVT ensemble. For the analysis, we collected ∼700 ns of simulation stored in ∼1800 frames. A2A was parametrized using the Amber ff99SB-ILDN force field.[98] The protein and the POPC membrane were immersed in a TIP3P water box[101] reaching a total of 65,802 atoms. Here too, the system was first equilibrated to reach a temperature of 300 K and a pressure of 1 bar, while the production was run for 100 ns in NPT ensemble. We extracted ∼1000 frames from the trajectory for analysis. For adiabatic bias simulations, we used Plumed2.[102] The reaction coordinate was the distance between the sodium ion and the geometric center of the alpha carbons of Leu267, Asp175, and Gln163 (pdb 3EML). The line connecting this geometric center and the sodium ion is quasi parallel to the principal axis of the GPCR and thus is a reasonable steering coordinate. The spring constant was set to 500 kJ/mol/ Å2. The five Abl kinase systems were parametrized using the Amber ff99SB force field.[103] For Myr/T315I-KDin and Myr/KDin models, the myristate was parametrized using the General Amber Force Field (GAFF)[99] after computing the point charges at the HF/6-31G* level of theory. The systems were solvated in TIP3P water and comprised ∼45000 atoms. After 5 ns of equilibration, the systems reached a temperature of 300 K and a pressure of 1 bar, and we then ran the production in NPT ensemble for a simulation time that ranged from ∼0.8 μs to ∼2.5 μs. Here, our algorithm was used to analyze ∼16,000 to ∼50,000 frames. See the Supporting Information for more details.

Algorithm Parameters

For each analysis, we set the small and large probe radii to 1.4 and 3 Å, respectively. To avoid unnecessary noise due to the detection of very small pockets, we set the minimum volume of a detectable pocket to the equivalent of 5 water molecules for PNP and A2A, and 4 water molecules for Abl kinase. Finally, for tracking purposes, we removed all the ligands, ions, membrane, and water molecules from each trajectory, taking into account the sole protein.

96 in total

1. Lid domain plasticity and lipid flexibility modulate enzyme specificity in human monoacylglycerol lipase.

Authors: Laura Riccardi; Jose M Arencibia; Luca Bono; Andrea Armirotti; Stefania Girotto; Marco De Vivo
Journal: Biochim Biophys Acta Mol Cell Biol Lipids Date: 2017-01-12 Impact factor: 4.698

2. Automatic atom type and bond type perception in molecular mechanical calculations.

Authors: Junmei Wang; Wei Wang; Peter A Kollman; David A Case
Journal: J Mol Graph Model Date: 2006-02-03 Impact factor: 2.518

3. Complexes of HIV-1 integrase with HAT proteins: multiscale models, dynamics, and hypotheses on allosteric sites of inhibition.

Authors: Armida Di Fenza; Walter Rocchia; Valentina Tozzini
Journal: Proteins Date: 2009-09

4. Catalytic site conformations in human PNP by 19F-NMR and crystallography.

Authors: Javier Suarez; Antti M Haapalainen; Sean M Cahill; Meng-Chiao Ho; Funing Yan; Steven C Almo; Vern L Schramm
Journal: Chem Biol Date: 2013-02-21

5. Hot spots for allosteric regulation on protein surfaces.

Authors: Kimberly A Reynolds; Richard N McLaughlin; Rama Ranganathan
Journal: Cell Date: 2011-12-23 Impact factor: 41.582

6. Immucillin-H binding to purine nucleoside phosphorylase reduces dynamic solvent exchange.

Authors: F Wang; R W Miles; G Kicska; E Nieves; V L Schramm; R H Angeletti
Journal: Protein Sci Date: 2000-09 Impact factor: 6.725

Review 7. Cycling, stressed-out and nervous: cellular functions of c-Abl.

Authors: R A Van Etten
Journal: Trends Cell Biol Date: 1999-05 Impact factor: 20.808

8. Structural basis for the autoinhibition of c-Abl tyrosine kinase.

Authors: Bhushan Nagar; Oliver Hantschel; Matthew A Young; Klaus Scheffzek; Darren Veach; William Bornmann; Bayard Clarkson; Giulio Superti-Furga; John Kuriyan
Journal: Cell Date: 2003-03-21 Impact factor: 41.582

9. Exploring residue component contributions to dynamical network models of allostery.

Authors: Adam T Vanwart; John Eargle; Zaida Luthey-Schulten; Rommie E Amaro
Journal: J Chem Theory Comput Date: 2012-07-05 Impact factor: 6.006

10. Improved side-chain torsion potentials for the Amber ff99SB protein force field.

Authors: Kresten Lindorff-Larsen; Stefano Piana; Kim Palmo; Paul Maragakis; John L Klepeis; Ron O Dror; David E Shaw
Journal: Proteins Date: 2010-06

14 in total

1. Chokepoints in Mechanical Coupling Associated with Allosteric Proteins: The Pyruvate Kinase Example.

Authors: Lewis E Johnson; Bojana Ginovska; Aron W Fenton; Simone Raugei
Journal: Biophys J Date: 2019-04-02 Impact factor: 4.033

2. Microsecond dynamics in proteins by two-dimensional ESR: Predictions.

Authors: Pranav Gupta; Zhichun Liang; Jack H Freed
Journal: J Chem Phys Date: 2020-06-07 Impact factor: 3.488

Review 3. Investigating Cryptic Binding Sites by Molecular Dynamics Simulations.

Authors: Antonija Kuzmanic; Gregory R Bowman; Jordi Juarez-Jimenez; Julien Michel; Francesco L Gervasio
Journal: Acc Chem Res Date: 2020-03-05 Impact factor: 22.384

4. Sparse group selection and analysis of function-related residue for protein-state recognition.

Authors: Fangyun Bai; Kin Ming Puk; Jin Liu; Hongyu Zhou; Peng Tao; Wenyong Zhou; Shouyi Wang
Journal: J Comput Chem Date: 2022-06-03 Impact factor: 3.672

5. Probabilistic Pocket Druggability Prediction via One-Class Learning.

Authors: Riccardo Aguti; Erika Gardini; Martina Bertazzo; Sergio Decherchi; Andrea Cavalli
Journal: Front Pharmacol Date: 2022-06-29 Impact factor: 5.988

Review 6. Thermodynamics and Kinetics of Drug-Target Binding by Molecular Simulation.

Authors: Sergio Decherchi; Andrea Cavalli
Journal: Chem Rev Date: 2020-10-02 Impact factor: 60.622

7. Will the Real Cryptic Pocket Please Stand Out?

Authors: Rommie E Amaro
Journal: Biophys J Date: 2019-01-26 Impact factor: 4.033

8. Chemical Modulation of the 1-(Piperidin-4-yl)-1,3-dihydro-2H-benzo[d]imidazole-2-one Scaffold as a Novel NLRP3 Inhibitor.

Authors: Simone Gastaldi; Valentina Boscaro; Eleonora Gianquinto; Christina F Sandall; Marta Giorgis; Elisabetta Marini; Federica Blua; Margherita Gallicchio; Francesca Spyrakis; Justin A MacDonald; Massimo Bertinaria
Journal: Molecules Date: 2021-06-29 Impact factor: 4.411

Review 9. PI3K inhibitors: review and new strategies.

Authors: Mingzhen Zhang; Hyunbum Jang; Ruth Nussinov
Journal: Chem Sci Date: 2020-05-19 Impact factor: 9.825

Review 10. Allosteric Regulation at the Crossroads of New Technologies: Multiscale Modeling, Networks, and Machine Learning.

Authors: Gennady M Verkhivker; Steve Agajanian; Guang Hu; Peng Tao
Journal: Front Mol Biosci Date: 2020-07-09