The non-receptor tyrosine kinase proline-rich tyrosine kinase 2 (Pyk2) is a critical mediator of signaling from cell surface growth factor and adhesion receptors to cell migration, proliferation, and survival. Emerging evidence indicates that signaling by Pyk2 regulates hematopoietic cell response, bone density, neuronal degeneration, angiogenesis, and cancer. These physiological and pathological roles of Pyk2 warrant it as a valuable therapeutic target for invasive cancers, osteoporosis, Alzheimer's disease, and inflammatory cellular response. Despite its potential as a therapeutic target, no potent and selective inhibitor of Pyk2 is available at present. As a first step toward discovering specific potential inhibitors of Pyk2, we used an in silico high-throughput screening approach. A virtual library of six million lead-like compounds was docked against four different high-resolution Pyk2 kinase domain crystal structures and further selected for predicted potency and ligand efficiency. Ligand selectivity for Pyk2 over focal adhesion kinase (FAK) was evaluated by comparative docking of ligands and measurement of binding free energy so as to obtain 40 potential candidates. Finally, the structural flexibility of a subset of the docking complexes was evaluated by molecular dynamics simulation, followed by intermolecular interaction analysis. These compounds may be considered as promising leads for further development of highly selective Pyk2 inhibitors.
The non-receptor tyrosine kinase proline-rich tyrosine kinase 2 (Pyk2) is a critical mediator of signaling from cell surface growth factor and adhesion receptors to cell migration, proliferation, and survival. Emerging evidence indicates that signaling by Pyk2 regulates hematopoietic cell response, bone density, neuronal degeneration, angiogenesis, and cancer. These physiological and pathological roles of Pyk2 warrant it as a valuable therapeutic target for invasive cancers, osteoporosis, Alzheimer's disease, and inflammatory cellular response. Despite its potential as a therapeutic target, no potent and selective inhibitor of Pyk2 is available at present. As a first step toward discovering specific potential inhibitors of Pyk2, we used an in silico high-throughput screening approach. A virtual library of six million lead-like compounds was docked against four different high-resolution Pyk2 kinase domain crystal structures and further selected for predicted potency and ligand efficiency. Ligand selectivity for Pyk2 over focal adhesion kinase (FAK) was evaluated by comparative docking of ligands and measurement of binding free energy so as to obtain 40 potential candidates. Finally, the structural flexibility of a subset of the docking complexes was evaluated by molecular dynamics simulation, followed by intermolecular interaction analysis. These compounds may be considered as promising leads for further development of highly selective Pyk2 inhibitors.
The focal adhesion kinase (FAK) and its homologous FAK-related proline-rich tyrosine kinase 2 (Pyk2) define a distinct family of non-receptor tyrosine kinases that coordinate adhesion and cytoskeletal dynamics with survival and growth signaling. FAK and Pyk2 exhibit ~48% amino acid sequence identity, common phosphorylation sites, and a similar domain structure, which includes an N-terminal four-point one, ezrin, radixin, and moesin (FERM) domain, a kinase domain, three proline-rich regions, and a C-terminal focal adhesion-targeting (FAT) domain. Following integrin or growth factor stimulation, Pyk2 and FAK are autophosphorylated on a tyrosine residue (Y402 and Y397, respectively), which provides a critical binding site for Src kinase. Following its binding, Src phosphorylates additional tyrosines on Pyk2 or FAK, which are important for full activation of the kinases and for their binding to downstream signaling proteins.1 Although FAK is expressed in most cells, Pyk2 exhibits a more restricted expression pattern with the strongest expression in the central nervous system and in hematopoietic cells.2 FAK is a major intracellular signaling component of integrin-mediated cell adhesion and plays a role in signaling pathways mediated by growth factor receptors. PYK2, however, is activated by a variety of extracellular cues, including agonists of G protein-coupled receptors, increase in intracellular calcium concentration, inflammatory cytokines, and stress signals, as well as integrin-mediated cell adhesion.2–4 The differential expression and localization patterns of FAK versus Pyk2 might limit their functional redundancy and may suggest distinct and possibly antagonistic roles in cells.Pyk2 appears to be important for the organization of cytoskeletal components in a polarized manner during directional motility in response to chemotactic gradients in macrophages5 and in B cells6 and for integrin-mediated degranulation response of neutrophils during infection.7 Deletion of Pyk2 in mice leads to increase in bone mass due to enhanced differentiation and activity of osteoprogenitor cells8 as well as impairment in osteoclast function.9 Pyk2 was also found to regulate skin wound healing by controlling epidermal keratinocyte migration via a pathway that requires the expression and function of matrix metalloproteinases.10 The gene encoding Pyk2 was recently found as one of 11 new susceptibility loci for late-onset Alzheimer’s disease11 and as an in vivo marker and modulator of tau toxicity.12 Upregulation of Pyk2 expression has been noted in several humantumors, including glioma, hepatocellular carcinoma, nonsmall cell lung carcinoma, prostate cancer,13 and early and advanced breast cancers.14 The different physiological and pathological roles of Pyk2 present it as a high-value therapeutic target for inflammatory cellular responses, osteoporosis, Alzheimer’s disease, and invasive cancers.An increase in Pyk2 expression accompanied by a compensatory function for Pyk2 upon FAK loss has been described in FAK-deficient mouse embryonic fibroblasts (MEFs),4,15 following conditional deletion of FAK in macrophages;16 in the in vivo regulation of bone architecture and density;17 and in blood vessel formation during angiogenesis.18 The ability of Pyk2 to adopt a compensatory role in integrin-mediated signaling pathways suggests that strategies that will selectively target FAK might lack efficacy due to Pyk2 compensation, particularly in applications directed toward blocking inflammatory response, osteoporosis, and tumor angiogenesis.Clinical translation of kinase inhibitors has mainly focused on competitive inhibition of the catalytic domain. While classical kinase inhibitors bind directly to the ATP binding site, this approach has been challenged by the significant sequence and structure conservation of the catalytic domains among different protein kinases. With the exception of cancer therapeutics, where inhibition of multiple kinase targets might gain additional therapeutic benefits, minimizing off-target activity is most often desired in drug design. Therefore, an alternative approach has been the use of bipartite inhibitors that target both the adenosine triphosphate (ATP)-binding site and a less conserved adjacent hydrophobic region formed by the altered conformation of the activation loop containing the DFG (Asp–Phe–Gly) motif. Inhibitors of this group stabilize the inactive conformation of the kinase (DFG-out conformation) and offer a potential of improved selectivity.Despite significant efforts to develop a potent and selective inhibitor for Pyk2 over the last several years, the available inhibitors for Pyk2 lack the potency, selectivity, or have impaired pharmacokinetics,19,20 and no selective inhibitor has yet proceeded beyond pre-clinical trials, including Pfizer’s lead compound (S)-14a from the series of sulfoximine-substituted molecules. In spite of its increased selectivity for Pyk2 over FAK, N-methylsulfoximine(S)-14a lacks sufficient metabolic stability and is characterized by high in vivo clearance in rats.20In this study, we used an in silico high-throughput screening approach in order to identify potential Pyk2 kinase inhibitors from a large library of small molecules. Results of these studies uncovered potential lead-like molecules that were subjected to comparative docking and free binding energy calculations, followed by molecular dynamics (MD) simulations in order to identify compounds that are predicted to have affinity and selectivity for Pyk2. A final set of potential candidates with predicted selectivity for Pyk2 was assembled. These candidates may be considered as lead compounds that could be further developed into potent and selective Pyk2 inhibitors.
Methods
Database selection and ligand preparation
We used the commercially available ZINC12 database (http://zinc.docking.org/)21 for virtual screening, which contains more than 35 million purchasable compounds in ready-to-dock, three-dimensional formats. The lead-like subset of compounds, containing six million compounds, was selected on the basis of their properties to allow further optimization. The lead-like criteria properties are molecular weight between 250 and 350 g/mol, predicted partition constant (xLogP) ≤3.5, and number of rotatable bonds (RBs) ≤7. The subset was processed by Raccoon22 utility where missing polar hydrogen atoms were added.
Target preparation and grid generation
Four high-resolution crystal structures of humanPyk2 kinase domain with distinct conformations of the active site and activation loop were selected for our high-throughput in silico screening: 1) apo-Pyk2 (Protein Data Bank [PDB] ID: 3FZO); 2) Pyk2 in complex with the inhibitor PF-431396 (PDB ID: 3FZR); 3) Pyk2 in complex with the inhibitor PF-4618433 (PDB ID: 3FZT);19 and 4) Pyk2 in complex with N-methylsulfonamide 2a (PDB ID: 3H3C).20 The resolutions of the target PDB structures 3FZO, 3FZR, 3FZT, and 3H3C were 2.2, 2.7, 1.95, and 2.0 Å, respectively. Prior to docking simulations, missing side chains and loops were filled using Prime23 and termini were capped. Hydrogen atoms were added, bond orders were assigned, and ions were removed using the Protein Preparation Wizard.24 Crystallographic water molecules were removed as they contained less than three hydrogen bonds after sampling their orientations at pH 7 using PROPKA25 and optimizing their hydrogen bonds. Following this step, the holo form of the structures was minimized by a restrained energy minimization using the OPLS2005 force field and default root-mean-square deviation (RMSD) constraint of 0.3 Å. Refined structures were imported to AutoDockTools26 where Gasteiger charges were assigned to all atoms and non-polar hydrogen atoms were merged. For each target, three-dimensional affinity grids of dimensions up to 30×18×26 Å were centered around the active site with 1.0 Å spacing.
In silico high-throughput screening
AutoDock Vina27 was used for the docking simulation. The settings of exhaustiveness for finding the global minimums were defined as 4, and only the best ranked poses were retrieved. The screening calculations ran on a double-threaded 480 CPU 2.67 GHz Xeon Linux cluster machine.
Efficiency metrics scoring
To promote hits with favorable affinity and pharmacokinetics, several indices were utilized. The mean accumulated binding (MAB) index was developed to assess the contribution of multiple conformational binding of a molecule without impairing other valid molecules that are predicted to bind fewer structures.
where ΔGn is the calculated free energy of a bound molecule to a specific conformation in kcal/mol and N denotes the total number of binding molecules. In this sense, binding is defined as any Vina docking score within the top 10,000 molecules of the corresponding structure. The binding efficiency index (BEI) quantifies the efficiency of the binding affinity based on a molecular weight scale.28Surface-binding efficiency index (SEI) quantifies the efficiency of the binding affinity based on total polar surface area (TPSA).28Lipophilic Ligand Efficiency (LLE) estimates the specificity of a molecule in binding to the target relative to the calculated partition constant (xLogP).29Pareto score optimizes multiple objectives, BEI versus SEI (BvS) simultaneously in a trade-off approach.28
where BEIr and SEIr are the ranking of the respective index. Cheminformatic analysis was performed using MATLAB.
Visual inspection
The top-ranked molecules containing undesirable chemical groups and substructures,30 including toxic and highly metabolized molecules, were filtered via visual inspection. Molecules containing more than four aliphatic or aromatic rings were filtered as too many rings pose a liability in drug design.31,32 In addition, compounds containing more than two merged aromatic rings were excluded to avoid highly planar structures.33,34
Selectivity evaluation and screening
The 240 top filtered molecules were submitted for a second screen in AutoDock Vina and Glide35 to collect the compounds that are selective for Pyk2 compared to FAK. The crystal structures of Pyk2 (PDB ID: 3FZT and 3H3C) and FAK (PDB ID: 1MP8) were prepared as described earlier, including the grid generation for docking in Vina. As a preparation step for docking in Glide, the molecules were prepared using LigPrep Wizard36 and grids with similar box size were generated in the Receptor Grid Generation of Schrodinger.37To assess the docking score more reliably, exhaustiveness in Vina was increased to 8 and the extra-precision (XP) mode was used in Glide. The resulting poses predicted in Glide XP for Pyk2 and FAK were rescored and compared using Prime MM-GBSA (molecular mechanics/generalized born surface area). Prime MM-GBSA is a physics-based method that calculates the force field energies of the bound and unbound states of the protein–ligand complex.38 The van der Waals surface-based surface generalized born 2.0 implicit solvation model was used with the OPLS2005 force field, and residue flexibility was defined throughout the structure. The MM-GBSA binding free energy is defined as follows:39,40ΔEmm is the difference in energy between the complex structure and the sum of the energies of the unbound ligand and protein. ΔGsol is the difference in the solvation energy of the complex and the sum of the solvation energies for the unbound ligand and protein. ΔGsa is the difference in the surface area energy for the complex and the sum of the surface area energies for the unbound ligand and protein.40 Entropy terms related to the ligand and protein are not incorporated, due to the expensive computational demand and no apparent improvement in many cases.41,42
MD simulation
The final drug candidates were submitted to MD simulation performed using the Desmond package43 and OPLS2005 force field. The docking models of the most selective ligands as calculated by MM-GBSA were used as the initial coordinates for the MD calculation. The TIP3 explicit water model was used to define the solvent.44 The simulation solvent cell box with periodic boundary conditions was set to orthorhombic shape with a buffer distance of 10 Å to each dimension. The system was neutralized by placing counter ions of 6Na+ and 3Cl− for the structures of Pyk2 and FAK, respectively, and background salt was added to the solvent with a concentration of 0.15 M.45 Prior to the production of molecular dynamic simulations, the system was equilibrated using the default Desmond relaxation protocol. The final state of the system was set to 300 K and 1 atm,39 and the equations of motions were integrated with a 2 fs time step.46 The simulations were performed in the NPT (constant number of atoms, pressure, temperature) ensemble using Nosé–Hoover thermostat with 1 ps relaxation time and Martyana–Tobias–Klein barostat with a 2 ps relaxation time.47 The smooth particle mesh Ewald (PME) algorithm48 was used to deal with long-range electrostatic interactions with a cutoff radius of 9 Å.49 The production of 20 ns long MD simulations was performed on each complex system, and dynamic trajectories were analyzed in Desmonds’ Maestro simulation analysis tools and MATLAB. Quantitative analysis of the ligand–protein molecular interactions was calculated over the MD simulation. The distances as defined in Desmond for hydrogen bonds were <2.5 Å. General hydrophobic interaction distances were within 3.6 Å, and π–cation and π–π distances were within 4.5 Å. Ionic interaction distances were within 3.7 Å, and hydrogen bonding via water bridge molecule distance was <2.7 Å.
Results
In order to identify potential inhibitors of Pyk2 kinase domain, an in silico high-throughput screening approach was used. To enhance the sampling of the target conformational space, four different Pyk2 kinase domain crystallographic structures with unique conformations were processed and prepared for in silico screening. The four crystal structures represent different conformations that were induced by ligands that have different selectivity profiles: 1) 3FZO is the apo, unbound state;19 2) 3FZR adopts an active conformation state (DFG-in) and contains an ATP-mimetic inhibitor, PF-431396, with a low selectivity to Pyk2;19 3) 3FZT adopts an inactive conformation (DFG-out) and contains an allosteric inhibitor, PF-4618433, selective to Pyk2;19 and 4) 3H3C adopts an active form (DFG-in) and contains the inhibitor N-methylsulfonamide 2a, which is selective to Pyk2.20 The two-dimensional interaction plots of the ligands are presented in Figure S1. A library of more than six million purchasable lead-like molecules from the ZINC database was screened against the binding pockets of each of the four structures (active site grids). The primary criterion used to select for initial set of molecules was binding free energy values (ΔG) calculated by AutoDock Vina. A histogram of the affinity scores for 10,000 molecules for each of the structures is plotted in Figure 1 and depicts the highest scores for 3FZT and 3H3C, followed by 3FZR and 3FZO in a semi-logarithmic scale.
Figure 1
Histogram of binding affinities for the top 10,000 compounds docked to Pyk2.
Note: Shown are semi-logarithmic plots of the binding affinities (ΔG) calculated using AutoDock Vina of the top 10,000 compounds docked to Pyk2 for PDB IDs (A) 3FZO, (B) 3FZR, (C) 3FZT, and (D) 3H3C.
To compare the affinities of the screened compounds to affinities of existing Pyk2 inhibitors, we docked the known inhibitors into their corresponding structures using AutoDock Vina. The calculated binding energies of the reference compounds ranged from −8.7 to −9.3 kcal/mol, which was similar to the range engaged by the top 10,000 molecules for the apo structure 3FZO and was higher than all the top compounds of the rest of the three structures. Therefore, the cutoff for our docking studies was set to include the top 10,000 molecules with the highest affinity for each of the four structures and considered for further evaluation, composing a pool of 40,000 docked modes.To insure proper target preparation, we calculated the RMSD between the crystallographic poses and the docked poses of PDB IDs 3FZR, 3FZT, and 3H3C (Figure S2). The RMSD values of the reconstituted pose ranged from 0.41 to 2.57 Å, supporting the target preparation for docking protocols. To support the goodness of predictive docking screens, enrichment experiments were carried out. First, we collected a dataset comprising 124,000 decoys from the Directory of Useful Decoys (DUD) database50 and 18 actives known from the literature20 via Konstanz Information Miner (KNIME). In addition, we collected another dataset containing 800 decoys generated by the DUD-enhanced (DUD-E) server51 based on the 18 actives. These two datasets were then screened against all four Pyk2 target structures using AutoDock Vina. The docking enrichment plots for four protein targets are shown in Figure S3. The docking enrichment plots show that the percentage of true ligands found by docking, at any given percentage, of the docking-ranked database is almost always greater compared to being chosen by random selection.
Compound processing and ranking
We analyzed the retrieved compounds by combining the four sets of compounds to a single pool of compounds and scored via affinity and efficiency-based metrics: binding energy (ΔG), MAB, BEI, SEI, and LLE. The top 1,000 molecules for each index were ranked by the corresponding value and are presented as scatter plots in Figure 2. In an attempt to identify top hits, which are suitable for drug development, and to remove false positives, we used several sub-structural filters by manual review to remove unwanted groups.30 Owing to the potential detrimental effect of polycyclic compounds with a high ring count,31 molecules were additionally filtered out and restricted to a maximal limit of aromatic and aliphatic rings.
Figure 2
Index scores of top 1,000 compounds docked to Pyk2.
Notes: (A) Scatter plots of the calculated index score of the top 1,000 compounds ranked by value. The calculated indices correspond to binding affinity (ΔG), MAB, BEI, SEI, LLE, and plot of SEI score as a function of BEI score (BvS). Data points are plotted in blue, and the top 10 filtered compounds are displayed by red circles. (B) Pie charts demonstrating the distribution of the top compounds to the PDB target with the highest scores for the six indices.
Abbreviations: Pyk2, proline-rich tyrosine kinase 2; MAB, mean accumulated binding; BEI, binding efficiency index; SEI, surface-binding efficiency index; LLE, lipophilic ligand efficiency; BvS, BEI versus SEI.
An additional approach for compound selection is the Pareto-based ranking scheme. The Pareto-based ranking scheme takes into account multiple objectives for the optimization/selection process52 and combines their separate contribution in the final ranking. The multiple objectives, BEI and SEI, were optimized by scoring each compound with the sum of molecules dominating it in any of the indices. The top 10 filtered molecules retrieved for every index and the mapping of BvS indices for the top 1,000 molecules are presented in Figure 2A. Pareto ranking is a strategy particularly well suited for competitive multi-objective problems. In many cases, the BEI and SEI objectives are correlated; however, this is not always true,53 and in this study, BEI and SEI were considered as partially competitive multi-object problems.To visualize target distribution of compounds that displayed the best docking scores and efficiency values for every parameter, a chart summarizing the distribution of the top 1,000 compounds binding to a specific structure is depicted in Figure 2B, where binding is defined as any docking score within the top 10,000 molecules of the corresponding structure. The resulting distribution revealed that 3FZT that possessed a DFG-out conformation was a dominating structure for the majority of the compounds.
Selection of the highest-ranked compounds
From each index, 40 molecules were independently collected for further examination. The two-dimensional structures and molecular weights of the top 5 filtered candidates with the highest ranking per index are shown in Table 1, and their physicochemical properties along with the calculated indices are shown in Table 2. The total 240 compounds were evaluated for appearance in the literature using KNIME54 nodes, which facilitate querying and retrieving data from the ChEMBL55 bioactivity database via RESTful Web Services.56 These molecules were not detected as known inhibitors for Pyk2.
Table 1
Top 5 filtered predicted hits ranked by six indices
Note: Gray shadings represent the scoring index by which the compounds were enriched.
Abbreviations: Pyk2, proline-rich tyrosine kinase 2; Mw, molecular weight; TPSA, total polar surface area; MAB, mean accumulated binding; BEI, binding efficiency index; SEI, surface-binding efficiency index; LLE, lipophilic ligand efficiency; BvS, BEI versus SEI; ZINC, Zinc is not commercial.
Table 2
Molecular properties and index scores of the five highest ranked filtered compounds for Pyk2
Score
ZINC ID
Mw
xLogP
TPSA
No of structures
Affinity
MAB
BEI
SEI
LLE
BvS
Affinity
97306129
349
3.25
72
3
−11.8
−11.8
0.024
0.12
5.18
–
04497319
348
3.13
49
4
−11.4
−10.4
0.023
0.17
5.01
–
60976516
349
3.49
76
3
−11.3
−11.3
0.023
0.11
4.58
–
22506733
345
3.4
61
3
−11.3
−10.5
0.023
0.13
4.67
–
89569030
311
2.24
61
3
−11.3
−11.3
0.026
0.13
5.83
1,132
MAB
91775214
347
2.35
63
3
−11.5
−11.5
0.024
0.13
5.86
–
77010348
336
1.58
90
3
−11.3
−11.3
0.024
0.09
6.49
–
77197683
348
2.67
61
3
−11.3
−11.3
0.026
0.13
5.83
–
77977063
320
3.3
49
4
−11.3
−11.3
0.025
0.16
4.77
1,204
19855164
348
3.49
24
4
−11.3
−11.3
0.023
0.34
4.58
–
BEI
83664860
257
3.28
32
3
−10.3
−10.3
0.029
0.23
4.08
249
43646021
266
3.32
32
3
−10.6
−10.6
0.028
0.24
4.25
247
96153562
262
2.91
56
3
−10.4
−10.4
0.028
0.13
4.52
724
05942802
263
3.16
29
3
−10.4
−10.4
0.028
0.26
4.27
234
32603600
261
2.91
62
3
−10.2
−10.2
0.028
0.12
4.38
905
SEI
21364194
296
1.00
4
4
−10.3
−10.5
0.025
1.84
6.36
–
04933001
343
0.86
9
3
−10.3
−10.3
0.021
0.82
6.50
–
43670548
300
3.28
16
3
−10.5
−10.5
0.025
0.47
4.22
992
55292595
321
3.38
16
4
−10.3
−10
0.023
0.46
3.98
–
02810101
338
1.95
20
3
−10.6
−10.6
0.022
0.38
5.62
–
LLE
07744353
345
−1.44
50
3
−10.2
−10.2
0.021
0.15
8.73
–
77585305
348
−1.03
44
3
−10.6
−10.6
0.022
0.17
8.60
–
29392239
343
−1.00
24
3
−10.5
−10.5
0.022
0.31
8.50
–
29347469
333
−1.43
50
2
−9.8
−9.8
0.021
0.14
8.43
–
57785087
329
−0.97
62
3
−10.3
−10.3
0.022
0.12
8.33
–
BvS
72128349
261
3.32
29
2
−10.8
−10.8
0.030
0.27
4.49
175
83596759
269
3.41
21
2
−10.4
−10.4
0.028
0.35
4.02
227
41384833
267
3.33
33
2
−10.2
−10.2
0.027
0.22
3.96
392
82802913
278
2.32
32
3
−10.4
−10.4
0.027
0.23
5.11
466
21996132
269
3.23
39
3
−10.3
−10.3
0.027
0.19
4.13
493
Note: Gray shadings represent the scoring index by which the compounds were enriched.
Abbreviations: Pyk2, proline-rich tyrosine kinase 2; Mw, molecular weight; TPSA, total polar surface area; MAB, mean accumulated binding; BEI, binding efficiency index; SEI, surface-binding efficiency index; LLE, lipophilic ligand efficiency; BvS, BEI versus SEI; ZINC, Zinc is not commercial.
Selectivity prediction by calculation of binding free energies
To evaluate the potential specificity for Pyk2 over FAK among the top retrieved predicted filtered hits, two different approaches were used. To predict the difference in binding association with the targets, we performed a second docking screen with a higher precision using increased exhaustiveness in Vina and Glide XP modes and the generated docked mode of Glide was utilized as an input to additional binding free energy scoring using Prime MM-GBSA. By implementing more precise docking methodologies and combining MM-GBSA, which outperforms docking in predicting the binding affinities to experimental data,57 we can predict more reliably the differences in ligand-binding efficiencies of Pyk2 versus FAK. The correlation coefficient between calculated and experimental binding of Prime MM-GBSA, based on a diverse set of 855 complexes, was reported to be R2=0.63.58 Since the predominant Pyk2 structure that bound the compounds contains a DFG-out conformation, a high-resolution crystal structure of FAK domain that possesses a DFG-out conformation was selected (PDB ID: 1MP8).59 The FAK structure was imported and prepared for docking and was then used as a reference for estimating the selectivity of the top 240 compounds that were selected as mentioned earlier. For selectivity prediction, both the DFG-in and DFG-out conformations were used (PDB ID: 3FZT and 3H3C), as the predicted Vina scores of cognate ligands for both conformations were similar. Only 40 compounds that resulted in favorable binding to Pyk2, as calculated by the difference in at least 2 out of 3 binding energies using Vina, Glide, and Prime MM-GBSA software, were further processed. Using these methods, we selected 20 potential inhibitors ranked by MM-GBSA for 3FZT (ZINC06232011, ZINC02529497, ZINC01646132, ZINC15952140, ZINC18700196, ZINC00173518, ZINC00217347, ZINC35514633, ZINC04975487, ZINC07503677, ZINC03358424, ZINC05078298, ZINC09550062, ZINC03421151, ZINC65845103, ZINC02644767, ZINC03341864, ZINC00269705, ZINC10556478, and ZINC06648526) and 20 potential inhibitors ranked by MM-GBSA for 3H3C (ZINC06232011, ZINC02529497, ZINC18700196, ZINC01646132, ZINC15952140, ZINC05244105, ZINC71894482, ZINC08104814, ZINC00094214, ZINC58514284, ZINC04842554, ZINC00470121, ZINC64790378, ZINC67630577, ZINC00004724, ZINC25251328, ZINC00782941, ZINC82137153, ZINC05626761, and ZINC00617844). Notably, the top 5 predicted selective candidates for 3H3C were among the top 8 selective compounds for 3FZT. Results showed that ZINC06232011 was predicted to be the most selective ligand based on MM-GBSA scoring with Pyk2 (PDB ID 3FZT) ΔG of −35.37 kcal/mol and FAK ΔG of −5.50 kcal/mol, which results in ΔΔG of −29.87 kcal/mol. The top eight candidates (PDB ID 3FZT) were ranked by the energy difference MM-GBSA ΔΔG, and their two-dimensional structures with calculated binding energies are presented in Table 3. Although MM-GBSA was the main method to rank the compounds, it should be pointed out that the procedure used a continuum model of the solvent and this approximation can strongly affect the calculated binding energies, which may result in unreliable results.57
Table 3
Candidates docking and MM-GBSA binding energy of Pyk2 and FAK ligand complexes
Structure
Vina Pyk2 ΔG
Vina FAK ΔG
Vina ΔΔG
Glide Pyk2 ΔG
Glide FAK ΔG
Glide ΔΔG
Pyk2 MM-GBSA ΔG
FAK MM-GBSA ΔG
MM-GBSA ΔΔG
−9.2
−9.9
−0.7
−7.6
−7.1
−0.5
−35.37
−5.50
−29.87
−8.7
−10.2
−1.5
−9.5
−6.4
−3.1
−22.77
5.97
−28.74
−7.9
−9.8
−1.9
−7.9
−6.2
−1.7
−23.42
2.62
−26.04
−9.6
−11.3
−1.7
−7.1
−5.8
−1.3
−32.77
−7.71
−25.06
−8.7
−9.8
−1.1
−6.8
−5.5
−1.3
−34.35
−11.86
−22.49
−7.9
−10.4
−2.5
−8.7
−5.2
−3.5
−47.05
−26.94
−20.10
−7.8
−10.6
−2.8
−9.0
−5.45
−3.6
−21.09
−1.80
−19.29
−9.8
−10.4
−0.6
−10.7
−7.7
−3.0
−52.77
−33.73
−19.04
Abbreviations: Pyk2, proline-rich tyrosine kinase 2; FAK, focal adhesion kinase; MM-GBSA, molecular mechanics/generalized Born surface area; ZINC, Zinc is not commercial.
To support the predictive goodness of this selectivity assay, we compared the predicted binding energy and interaction profile of the native control compounds of PDB IDs 3FZR (PF-431396), 3FZT (PF-4618433), and 3H3C (N-methylsulfonamide 2a) with those obtained using experimental data (Table S1). The predicted binding energies agreed with experimental activity assays, and the interaction profile was similar. 3FZR displayed the smallest predicted selectivity (largest ΔΔG), followed by 3H3C and 3FZT. By consensus, the predicted ΔΔG ranked the cognate ligands according to experimental data and thus substantiated our techniques for finding potential selective Pyk2 inhibitors.
Analysis of the identified molecules
The binding poses of the top candidate compounds bound to Pyk2 (3FZT) as predicted by MM-GBSA are given in Figure 3, and two-dimensional interaction plots are presented in Figure S4. Docking pose analysis revealed one hydrogen bond between Tyr505 and ZINC06232011, ZINC01646132, and ZINC00217347, in which the last two form π–π interactions with Phe568. Also observed were two hydrogen bonds of ZINC02529497 with Asp567 and with Glu474, respectively, as well as a cation–π interaction with Arg572. Compounds ZINC159521402, ZINC00173518 and ZINC97378786 were involved in a similar interaction forming two hydrogen bonds with Glu474 and one hydrogen bond to Asp657, while the last compound also formed π–π interaction with His547. Interestingly, ZINC18700196 was located furthest away from the ATP-binding site and formed a total of four hydrogen bonds with residues Lys457, Asp567, and Arg572, while still involved in π–π interaction with Phe436 and two cation–π interactions with Arg572. Molecular descriptors of physicochemical properties, ligand efficiency scores, and bound structures with the predicted highest binding affinity are presented in Table S2.
Figure 3
Binding poses of the eight candidates in the Pyk2 (PDB ID: 3FZT) binding site.
Notes: Shown are the predicted interactions formed by the compounds (A) ZINC06232011, (B) ZINC02529497, (C) ZINC01646132, (D) ZINC15952140, (E) ZINC18700196, (F) ZINC00173518, (G) ZINC00217347, and (H) ZINC97378786 in the active site. The compounds are represented in cyan sticks. The Pyk2 structure is shown as a green ribbon diagram with exception to the activation loop containing the DFG-motif, which is shown in purple sticks. The yellow dashed lines represent hydrogen bonds, and blue dashed lines denote hydrophobic interactions. The binding poses were obtained by Prime MM-GBSA.
Abbreviations: Pyk2, proline-rich tyrosine kinase 2; PDB, Protein Data Bank; MM-GBSA, molecular mechanics/generalized Born surface area; DFG, Asp-Phe-Gly.
For selectivity prediction, both the DFG-in and DFG-out conformations were used. The predicted Vina scores of cognate ligands for the DFG-out and DFG-in were similar and differed by 1.0–1.5 kcal/mol (which is lower than Vina’s standard error of 2.85 kcal/mol).27 Thus, we decided to use both DFG-in (PDB ID 3FZT) and DFG-out (PDB ID3H3C) conformations.An alternative way to interpret the contribution of each scoring profile is to visualize the ranking of the compound instead of its scoring value. The information is displayed in Figure 4 by radar plots, where the value of each property corresponds to the ranking of the score; closer to the center indicates a property with a good result, while far from the center fails to compete with the rest of the compounds.
Figure 4
Radar plot scores of the top 8 eight candidates for Pyk2 (PDB ID: 3FZT).
Notes: Each radial axis represents the compound ranking in the index scoring profile of (A) ZINC06232011, (B) ZINC02529497, (C) ZINC01646132, (D) ZINC15952140, (E) ZINC18700196, (F) ZINC00173518, (G) ZINC00217347, and (H) ZINC97378786. The cutoff value above which the rankings are omitted was set to 1,000.
Abbreviations: Pyk2, proline-rich tyrosine kinase 2; PDB, Protein Data Bank; SEI, surface-binding efficiency index; LLE, lipophilic ligand efficiency; BvS, BEI versus SEI; MAB, mean accumulated binding; BEI, binding efficiency index.
To take into account structural flexibility, the behavior of a subset of the predicted complexes of Pyk2 and FAK was compared by MD simulation. The top 8 Pyk2DFG-out candidates were incorporated in Desmond, and MD simulation was performed in explicit aqueous solution for 20 ns for each complex (Figure 5A). To explore the dynamic stability, RMSD of protein–ligand complexes of Pyk2 (3FZT) and FAK (1MP8) against their initial structure was generated and analyzed using MATLAB. The backbone RMSDs were stable throughout the simulations, with the exception of compound ZINC02529497, where there was a sudden increase in deviation at 9 ns within the FAK complex (Figure 5B).
Figure 5
RMSDs during MD simulation of Pyk2 (3FZT) and FAK (1MP8) of protein–ligand complexes.
Notes: Plotted are the RMSDs of the protein backbone of (A) Pyk2 and (B) FAK of protein–ligand complexes during 20 ns MD simulation. Similarly shown are the RMSDs of the ligand position in the binding site of (C) Pyk2 and (D) FAK of protein–ligand complexes during the same 20 ns MD simulation. Also shown are the RMSFs of (E) Pyk2 and (F) FAK residues along the 20 ns MD simulation. Note the ligands selectivity for Pyk2 as indicated by low dispersion.
Ligand positional RMSDs were generated to evaluate and compare the binding stability of the lead molecules for each of the targets. Pyk2 complexes demonstrated conformational stability of the ligand in which the RMSD values remained within 0.15 nm (Figure 5C), whereas half of the FAK complexes showed higher RMSD values throughout most of the simulation time (Figure 5D). The computed RMSD values of the ligands ZINC02529497, ZINC01646132, ZINC159521402, and ZINC97378786 in complex with FAK were >0.15 nm and reached 0.25 nm, indicating lower stability.Root-mean-square fluctuations (RMSFs) were generated to evaluate and compare the residual mobility of Pyk2 and FAK while bound to each of the lead compounds. The RMSFs were integrated along the MD simulation time for each protein–ligand complex and were plotted against the residue number (Figure 5E). In all cases, the computed RMSFs of the residues were lower than 0.6 nm and did not produce any abnormal fluctuations, with the exception of ZINC02529497 and ZINC01646132, which produced fluctuations >0.6 nm in FAK (Figure 5F).
Intermolecular interaction analysis
To determine the stability of protein–ligand binding during the trajectory period of the MD simulation, the intermolecular interactions of ligands in complex with Pyk2 and FAK were analyzed. The intermolecular interaction analysis of the complexes was generated in Desmond and processed in MATLAB, and the time-dependent data were integrated for each interaction type to compare the results as illustrated in Figure 6. The analysis revealed that ZINC18700196 was superior in binding to Pyk2 compared to FAK in all interaction types: hydrogen bond, hydrophobic, ionic, π–cation, π–π, and water bridge. ZINC159521402 exhibited a similar profile, excluding redundant ionic and π–π interactions, which are very low for Pyk2 and FAK. The compounds ZINC06232011, ZINC02529497, and ZINC97378786 bound preferably to Pyk2 in most of the interactions, particularly by hydrogen bonds and hydrophobic interactions, while π–π and water bridges had similar or lower values for Pyk2. The candidate molecules ZINC01646132 and ZINC00217347 maintained higher amount of hydrogen bonds to FAK, while ZINC00173518 showed more hydrophobic, ionic, and substantial increase in water bridge interactions to FAK. Overall, the MD simulations of the ligand complexes with Pyk2 display stability under dynamic conditions and the analysis supports the binding energy predictions.
Figure 6
Total number of intermolecular interactions of the candidates.
Notes: Hydrogen bond, hydrophobic, ionic, π–cation, π–π, and water bridge interactions involved in Pyk2 (3FZT; red) and FAK (1MP8; blue) complexes: (A) ZINC06232011, (B) ZINC02529497, (C) ZINC01646132, (D) ZINC15952140, (E) ZINC18700196, (F) ZINC00173518, (G) ZINC00217347, and (H) ZINC97378786. The results were integrated over 20 ns MD simulation and presented as the amount of interactions per 1 ns.
Abbreviations: Pyk2, proline-rich tyrosine kinase 2; FAK, focal adhesion kinase; MD, molecular dynamics; ZINC, Zinc is not commercial.
Discussion
The identification of a proper lead compound for a given molecular target is a critical step in the process of drug discovery. Traditional high-throughput screening of large chemical libraries has been a primary source for identification of novel lead compounds. However, the high cost and low hit rate, which are associated with high-throughput screening, have stimulated the development of computational alternatives and the broad application of the cheaper and faster in silico screening approach.60 With an increasing number of targets being identified by high-throughput genomics and proteomics efforts, in silico screening may provide an excellent complementary approach to the traditional high-throughput screening and will potentially improve the speed and efficiency of drug discovery and development processes.As part of our effort to identify novel candidates for Pyk2 (Figure 7), potency and several efficiency metrics that attempt to define aspects of compound quality have been used. The tendency to focus on enhancing potency rather than practicing a more holistic approach in drug discovery was suggested to be responsible for failing drug discovery projects with molecules burdened by excessive molecular weight and lipophilicity, referred to as molecular obesity.61 In order to avoid biasing toward molecules where the dominating attractor is potency, we allowed only part of the retrieved predicted hits to be beneficially prioritized for potency. Since the best performing single structure is not known in advance and using multiple fixed protein conformation is useful in predicting active compounds by docking calculations,62,63 we virtually screened four different Pyk2 structures and defined MAB to score the contribution of a well docking compound versus multiple conformations. With the increasing popularity of ligand efficiency indices (Figure S5) and increasing involvement in contemporary drug discovery,64 we aimed to balance potency and ADMET (absorption, distribution, metabolism, excretion and toxicity) with BEI, SEI, LLE, and the combination of BEI and SEI. These indices are among the most used and robust metrics,64–66 and we have implemented them computationally with calculating the binding energy as a surrogate for in vitro potency, as has been proposed by Abad-Zapatero.53 To our knowledge, this is the first study to test the efficiency metrics, including BEI, SEI, and LLE, in an in silico high-throughput screening approach that combines selectivity scoring methodologies and incorporates molecular docking, MM-GBSA, and MD.
Figure 7
Overview of the virtual screen workflow.
Notes: Candidate inhibitors of the Pyk2 kinase domain were identified by in silico high-throughput screening. Binding sites of four different Pyk2 crystal structures were generated and prepared for screening using the ZINC lead-like dataset of six million compounds. The compounds were screened against the four structures, and the top 10,000 compounds were retrieved for each conformation. The compounds were scored by six indices, which measure potency and efficacy. For each index, the 1,000 top molecules were filtered by sub-structural features using visual inspection to obtain the top 40 filtered candidates. Ligand selectivity for Pyk2 over FAK was evaluated by the difference in binding energy using two docking algorithms Vina and Glide and by Prime MM-GBSA to obtain 40 potential candidates. The top eight compounds in complex with Pyk2 (PDB ID: 3FZT) and FAK (PDB ID: 1MP8) were subjected to MD analysis.
Abbreviations: Pyk2, proline-rich tyrosine kinase 2; FAK, focal adhesion kinase; PDB, Protein Data Bank; MD, molecular dynamics; ZINC, Zinc is not commercial.
Despite the advances in computational aided drug design and widespread application of docking methods, there are still limitations that affect the accuracy of the predictions. These limitations include unavoidable imprecision of crystal structures67 and the lack of information about the number and free energy of water molecules.57 In addition, the inhibition of kinase activity is not necessarily a direct indicator of binding affinity, and the idealized conditions of the docking simulations that predict binding free energies rarely reflect the conditions in experimental assays.68The top predicted selective candidates exhibit fairly low SEI scores, and the Pareto-based BvS score was mostly dominated by BEI, which is also demonstrated in Figure 2A. Among these candidates, LLE was the least enriched metric. Although the initial compound enrichment in the virtual screen was scored by potency, the affinity and MAB rankings show that most of the top predicted selective candidates were not among the top predicted potent compounds.The high structural similarity of Pyk2 and FAK kinase domains may pose a challenge in discovering novel selective and potent inhibitors. Selective ligands with reduced potency for Pyk2 such as PF-4618433 were observed, and other studies have identified selective compounds but at the expense of potency.69–71 We aimed to avoid focusing mainly on enhancing potency and to fulfill our preconditions by using efficiency metrics; the compounds were subjected to more stringent criteria affecting their affinity. In addition, the affinity of the identified ligands was compromised by small molecular weight, lipophilicity, and polarity.Clinical experience with kinase inhibitors has demonstrated that inhibition of protein tyrosine kinases should not rely exclusively on modulation of catalytic activity due to specificity issues and the unexpected emergence of resistance. Thus, combining kinase inhibition with approaches that inhibit extra-catalytic modules that regulate effector functions of tyrosine kinases would be a welcome asset to the therapeutic arena. In addition to its tyrosine kinase activity, Pyk2 has scaffolding functions in the formation of multi-protein signaling complexes. Therefore, targeting extra-catalytic protein modules that regulate complex assembly may provide a complementary approach for efficient and specific inhibition of Pyk2. Future studies that will characterize Pyk2-mediated signaling complex formation will be necessary in order to achieve this significant goal.Two-dimensional representation of intermolecular interactions of Pyk2 inhibitors with the Pyk2 kinase domain.Notes: Interactions are illustrated for ligands with the following PDB IDs: (A) 3FZR (Pyk2 kinase domain in complex with PF-431396); (B) 3FZT (Pyk2 in complex with PF-4618433); and (C) 3H3C (Pyk2 in complex with N-methylsulfonamide 2a).Abbreviations: Pyk2, proline-rich tyrosine kinase 2; PDB, Protein Data Bank.Predicted binding modes’ comparison of Pyk2 crystallographic ligands.Notes: (A) PDB ID 3FZO: apo state of Pyk2. Overlay of the predicted (turquoise) binding poses by AutoDock Vina versus crystallographic (gray) poses of (B) PDB ID 3FZR: Pyk2 binding pocket with inhibitor stabilizing the DFG-in motif, (C) PDB ID 3FZT: Pyk2 binding pocket with inhibitor stabilizing the DFG-out motif, and (D) PDB ID3H3C: Pyk2 binding pocket with inhibitor stabilizing the DFG-in motif. In the rightmost column, the chemical structures of the bound inhibitors are shown. Note the residues of the DFG motif highlighted in purple in (B–D).Abbreviations: Pyk2, proline-rich tyrosine kinase 2; PDB, Protein Data Bank.Docking enrichment plots for four Pyk2 protein targets using DUD.Notes: The docking-ranked database (x-axis) is plotted against the percentage of docked known ligands found by calculations (y-axis) in (A) 3FZO, (B) 3FZR, (C) 3FZT and (D) 3H3C. The gray line represents the results expected from selecting ligands randomly. The blue and red lines show the docking enrichment against the DUD database and the active-based decoys datasets, respectively. Note that the percentage of true ligands found by docking is greater compared to being chosen randomly.Abbreviations: Pyk2, proline-rich tyrosine kinase 2; DUD, Directory of Useful Decoys.Two-dimensional representation of intermolecular interactions of the candidate ligands of Pyk2 with the Pyk2 kinase domain.Notes: Interactions are illustrated for the following ligands: (A) ZINC06232011, (B) ZINC02529497, (C) ZINC01646132, (D) ZINC15952140, (E) ZINC18700196, (F) ZINC00173518, (G) ZINC00217347, and (H) ZINC97378786.Abbreviations: Pyk2, proline-rich tyrosine kinase 2; ZINC, Zinc is not commercial.Increase in the number of annual publications using ligand-binding efficiency indices.Note: Total annual publications in the years 1968–2015 as obtained by a PubMed search for the topic ligand efficiency.Predicted binding energies of the cognate ligandsAbbreviations: Pyk2, proline-rich tyrosine kinase 2; FAK, focal adhesion kinase.Molecular properties and index scores of the top candidates for Pyk2 (PDB ID: 3FZT)Note:Relates to the Pyk2 structure with the best binding energy.Abbreviations: Pyk2, proline-rich tyrosine kinase 2; PDB, Protein Data Bank; Mw, molecular weight; TPSA, total polar surface area; MAB, mean accumulated binding; BEI, binding efficiency index; SEI, surface-binding efficiency index; LLE, lipophilic ligand efficiency; BvS, BEI versus SEI; ZINC, Zinc is not commercial.
Authors: Matthew P Jacobson; David L Pincus; Chaya S Rapp; Tyler J F Day; Barry Honig; David E Shaw; Richard A Friesner Journal: Proteins Date: 2004-05-01
Authors: Richard A Friesner; Robert B Murphy; Matthew P Repasky; Leah L Frye; Jeremy R Greenwood; Thomas A Halgren; Paul C Sanschagrin; Daniel T Mainz Journal: J Med Chem Date: 2006-10-19 Impact factor: 7.446
Authors: Daniel P Walker; Michael P Zawistoski; Molly A McGlynn; Jian-Cheng Li; Daniel W Kung; Peter C Bonnette; Amy Baumann; Leonard Buckbinder; Janet A Houser; Jason Boer; Anil Mistry; Seungil Han; Li Xing; Angel Guzman-Perez Journal: Bioorg Med Chem Lett Date: 2009-04-24 Impact factor: 2.823
Authors: Yangmi Lim; Ssang-Taek Lim; Alok Tomar; Margaret Gardel; Joie A Bernard-Trifilo; Xiao Lei Chen; Sean A Uryu; Rafaela Canete-Soler; Jinbin Zhai; Hong Lin; William W Schlaepfer; Perihan Nalbant; Gary Bokoch; Dusko Ilic; Clare Waterman-Storer; David D Schlaepfer Journal: J Cell Biol Date: 2008-01-14 Impact factor: 10.539