Keisuke Yanagisawa, Rikuto Kubota, Yasushi Yoshikawa, Masahito Ohue, Yutaka Akiyama.
Abstract
Virtual screening is widely used to search for feasible drug candidates among a huge number of compounds during the early stages of drug design. As compound databases continue to expand to billions of entries or more, there is an urgent need to accelerate docking calculations. Reusing calculation results is one way to do so. In this study, we first propose a virtual screening-oriented docking strategy that combines three factors: compound decomposition, a simplified fragment grid storing the k-best scores, and consideration of flexibility through pregenerated conformers. Candidate compounds contain many common fragments (chemical substructures), so the calculation results for these fragments can be reused among compounds. As a proof of concept of this strategy, we also developed REstretto, a tool that implements the three factors to enable the reuse of calculation results. We demonstrated that the speed and accuracy of REstretto were comparable to those of AutoDock Vina, a well-known free docking tool. Because the implementation of REstretto still has much room for performance improvement, these results show the feasibility of the strategy. The code is available under an MIT license at https://github.com/akiyamalab/restretto.
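The reuse idea in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that each compound is already decomposed into a list of fragment identifiers and that a compound-level value is the sum of per-fragment values; `score_compounds` and `toy_score` are hypothetical names, and REstretto itself stores fragment grids rather than single numbers.

```python
from typing import Callable, Dict, List

def score_compounds(
    compounds: Dict[str, List[str]],
    score_fragment: Callable[[str], float],
) -> Dict[str, float]:
    """Sum per-fragment results per compound, computing each fragment type once."""
    cache: Dict[str, float] = {}   # fragment type -> stored result
    totals: Dict[str, float] = {}
    for name, fragments in compounds.items():
        total = 0.0
        for frag in fragments:
            if frag not in cache:          # expensive step runs only on first sight
                cache[frag] = score_fragment(frag)
            total += cache[frag]           # later compounds reuse the stored result
        totals[name] = total
    return totals

# Toy stand-in for an expensive fragment calculation (assumption, not a real scoring function).
calls: List[str] = []
def toy_score(frag: str) -> float:
    calls.append(frag)
    return -float(len(frag))

# Illustrative fragment identifiers; both compounds share the first fragment.
compounds = {
    "cpd1": ["c1ccccc1", "C(=O)N"],
    "cpd2": ["c1ccccc1", "CO"],
}
totals = score_compounds(compounds, toy_score)
print(totals, "fragment calculations:", len(calls))
```

Four fragment occurrences trigger only three fragment calculations here; the saving grows as more compounds share the same fragment types.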
Year: 2022 PMID: 36061673 PMCID: PMC9435046 DOI: 10.1021/acsomega.2c03470
Source DB: PubMed Journal: ACS Omega ISSN: 2470-1343
Figure 1. Three factors of the fundamental ideas.
Figure 2. Workflow of REstretto, a proof-of-concept implementation of the proposed strategy.
Figure 3. Construction of the simplified fragment grid. (a) Fragment grid containing scores of all rotations. (b) Simplified fragment grid containing the 1-best (k = 1) score among rotations per position.
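The reduction from panel (a) to panel (b) can be sketched with NumPy. This is only an illustration under stated assumptions: the full grid is taken to be an array of shape (rotations, nx, ny, nz), lower scores are assumed to be better, and `simplify_fragment_grid` is a hypothetical name rather than a function from REstretto.

```python
import numpy as np

def simplify_fragment_grid(full_grid: np.ndarray, k: int = 1) -> np.ndarray:
    """Keep the k best (lowest) scores among rotations at every grid position.

    full_grid: shape (n_rotations, nx, ny, nz), one score per rotation and position.
    Returns an array of shape (k, nx, ny, nz), sorted best-first along axis 0.
    """
    n_rotations = full_grid.shape[0]
    if not 1 <= k <= n_rotations:
        raise ValueError("k must be between 1 and the number of rotations")
    # np.partition moves the k smallest values along axis 0 into the first k slots.
    k_smallest = np.partition(full_grid, k - 1, axis=0)[:k]
    # Those k values are not ordered among themselves, so sort them explicitly.
    return np.sort(k_smallest, axis=0)

# Random scores standing in for a real fragment grid (assumption for the demo).
rng = np.random.default_rng(0)
full = rng.normal(size=(60, 4, 4, 4))
simple_k1 = simplify_fragment_grid(full, k=1)
simple_k3 = simplify_fragment_grid(full, k=3)
print(full.shape, "->", simple_k1.shape, "and", simple_k3.shape)
```

With k = 1 the storage per position drops from one value per rotation to a single value, which is the trade-off the caption describes.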
Figure 4. Rough conformer score calculation.
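The caption does not spell out the calculation. One plausible reading, assumed here and not confirmed by the text on this page, is that a conformer placed at a grid position is scored roughly by summing, over its fragments, the 1-best value looked up in each fragment's simplified grid. The function name, the integer offsets, and the toy grids below are all hypothetical.

```python
import numpy as np

def rough_conformer_score(fragment_grids, fragment_offsets, origin_index):
    """Sum the simplified-grid value of every fragment of one placed conformer.

    fragment_grids:   list of arrays (nx, ny, nz), one simplified grid per fragment.
    fragment_offsets: list of integer (di, dj, dk) offsets of each fragment centre
                      from the conformer origin, in grid cells.
    origin_index:     integer (i, j, k) grid cell of the conformer origin.
    """
    total = 0.0
    for grid, offset in zip(fragment_grids, fragment_offsets):
        idx = np.asarray(origin_index) + np.asarray(offset)
        # Reject placements that leave the grid instead of silently wrapping around.
        if np.any(idx < 0) or np.any(idx >= np.asarray(grid.shape)):
            raise IndexError("fragment falls outside the grid")
        total += grid[tuple(idx)]
    return float(total)

# Toy grids: values chosen only so the lookup is easy to follow.
grid_a = np.arange(27, dtype=float).reshape(3, 3, 3)
grid_b = -np.ones((3, 3, 3))
score = rough_conformer_score(
    [grid_a, grid_b],
    [(0, 0, 0), (1, 0, 0)],
    origin_index=(1, 1, 1),
)
print(score)
```

Because each lookup is a single array access, such a rough score is cheap enough to evaluate for many pregenerated conformers and positions before any local optimization.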
Protein Data Bank (PDB) ID and Box Information of Each Target, Determined Using eBoxSize[26]
| target | PDB ID | box center | docking region |
|---|---|---|---|
| AKT1 | | (5.99 Å, 3.01 Å, 17.34 Å) | 14 × 14 × 14 Å³ |
| AMPC | | (80.84 Å, 5.01 Å, 31.29 Å) | 18 × 18 × 18 Å³ |
| CP3A4 | | (36.69 Å, −15.69 Å, 29.69 Å) | 24 × 24 × 24 Å³ |
| CXCR4 | | (20.17 Å, −7.83 Å, 70.62 Å) | 18 × 18 × 18 Å³ |
| GCR | | (39.85 Å, 30.29 Å, 9.33 Å) | 22 × 22 × 22 Å³ |
| HIVPR | | (20.14 Å, −2.72 Å, 18.30 Å) | 20 × 20 × 20 Å³ |
| HIVRT | | (9.53 Å, 12.41 Å, 17.43 Å) | 18 × 18 × 18 Å³ |
| KIF11 | | (17.81 Å, 16.12 Å, 109.27 Å) | 20 × 20 × 20 Å³ |
Number of Compounds, Fragments, and Types of Fragments for Each Target in the DUD-E Diverse Subset
| target | no. of compds | no. of fragments | no. of types of fragments |
|---|---|---|---|
| AKT1 | 16 743 | 97 119 (×5.8) | 8536 (×0.51) |
| AMPC | 2898 | 11 848 (×4.1) | 2843 (×0.98) |
| CP3A4 | 11 970 | 69 612 (×5.8) | 8348 (×0.70) |
| CXCR4 | 3446 | 18 801 (×5.5) | 2848 (×0.83) |
| GCR | 15 258 | 73 662 (×4.8) | 8378 (×0.55) |
| HIVPR | 36 286 | 247 065 (×6.8) | 11 098 (×0.31) |
| HIVRT | 19 229 | 88 673 (×4.6) | 11 129 (×0.58) |
| KIF11 | 6966 | 33 601 (×4.8) | 5415 (×0.78) |
The ratios to the number of compounds are shown in parentheses.
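The parenthesized ratios can be recomputed directly from the counts in the table. The short script below does that and also derives one extra quantity not printed in the table, the average number of occurrences per fragment type (fragments divided by fragment types), as a rough indicator of how much reuse is available; the variable and function names are illustrative.

```python
# (compounds, fragments, fragment types) copied from the table above.
COUNTS = {
    "AKT1": (16743, 97119, 8536),
    "AMPC": (2898, 11848, 2843),
    "CP3A4": (11970, 69612, 8348),
    "CXCR4": (3446, 18801, 2848),
    "GCR": (15258, 73662, 8378),
    "HIVPR": (36286, 247065, 11098),
    "HIVRT": (19229, 88673, 11129),
    "KIF11": (6966, 33601, 5415),
}

def ratios(n_compounds: int, n_fragments: int, n_types: int):
    """Return (fragments per compound, types per compound, occurrences per type)."""
    return (
        n_fragments / n_compounds,
        n_types / n_compounds,
        n_fragments / n_types,
    )

RATIOS = {target: ratios(*counts) for target, counts in COUNTS.items()}

for target, (frag_ratio, type_ratio, reuse) in RATIOS.items():
    print(
        f"{target}: fragments x{frag_ratio:.1f}, types x{type_ratio:.2f}, "
        f"average occurrences per fragment type {reuse:.1f}"
    )
```

For every target the number of fragment types is smaller than the number of compounds, while the number of fragments is several times larger, which is the situation in which stored fragment results pay off.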
Average Values of ROC-AUC, EF1%, and EF10% over the DUD-E Diverse Subset
| metrics | REstretto | AutoDock Vina (ex = 1) | AutoDock Vina (ex = 8) | Glide (HTVS) | Glide (SP) |
|---|---|---|---|---|---|
| ROC-AUC | 0.638 | 0.644 | 0.667 | 0.725 | |
| EF1% | 5.4 | 7.2 | 8.6 | 16.5 | |
| EF10% | 2.6 | 3.1 | 4.3 | ||
The best values between REstretto and AutoDock Vina are shown in bold.
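For readers unfamiliar with the metrics, the sketch below implements the usual textbook definitions of ROC-AUC and the enrichment factor (EF) at a given fraction of the ranked list. It assumes that a lower docking score means a better rank; it is not the evaluation script used for the table, and the function names and toy data are illustrative.

```python
import numpy as np

def roc_auc(scores, is_active) -> float:
    """Probability that a random active is ranked ahead of a random decoy (ties count half)."""
    scores = np.asarray(scores, dtype=float)
    is_active = np.asarray(is_active, dtype=bool)
    actives, decoys = scores[is_active], scores[~is_active]
    # Compare every active with every decoy; lower score = better rank (assumption).
    wins = (actives[:, None] < decoys[None, :]).sum()
    ties = (actives[:, None] == decoys[None, :]).sum()
    return float((wins + 0.5 * ties) / (actives.size * decoys.size))

def enrichment_factor(scores, is_active, fraction: float) -> float:
    """Hit rate in the top `fraction` of the ranking divided by the overall hit rate."""
    scores = np.asarray(scores, dtype=float)
    is_active = np.asarray(is_active, dtype=bool)
    n_top = max(1, int(round(fraction * scores.size)))
    top = np.argsort(scores, kind="stable")[:n_top]   # best-scored compounds first
    return float(is_active[top].mean() / is_active.mean())

# Toy ranking: 10 compounds, the two best-scored ones are the actives.
toy_scores = np.arange(-10.0, 0.0)            # -10, -9, ..., -1
toy_labels = np.array([1, 1, 0, 0, 0, 0, 0, 0, 0, 0], dtype=bool)
auc = roc_auc(toy_scores, toy_labels)
ef20 = enrichment_factor(toy_scores, toy_labels, fraction=0.20)
print(auc, ef20)
```

Under these definitions EF1% uses `fraction=0.01` and EF10% uses `fraction=0.10`; a random ranking gives an EF near 1 and a ROC-AUC near 0.5.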
Execution Times Per Compound with Active and Decoy Compounds in the DUD-E Diverse Subset (CPU core s)
| DUD-E target | REstretto | AutoDock Vina (ex = 1) | AutoDock Vina (ex = 8) | Glide (HTVS) | Glide (SP) |
|---|---|---|---|---|---|
| AKT1 | 48.12 | 619.87 | 0.46 | 12.04 | |
| AMPC | 23.86 | 168.39 | 0.57 | 5.93 | |
| CP3A4 | 41.05 | 304.86 | 1.23 | 21.83 | |
| CXCR4 | 33.48 | 404.03 | 1.82 | 32.91 | |
| GCR | 28.83 | 236.56 | 0.39 | 8.69 | |
| HIVPR | 31.84 | 376.09 | 0.68 | 17.47 | |
| HIVRT | 19.96 | 180.75 | 0.27 | 5.51 | |
| KIF11 | 34.08 | 276.24 | 0.66 | 13.63 | |
| average | 29.04 | 341.89 | 0.63 | 13.84 | |
The fastest time between REstretto and AutoDock Vina for each DUD-E target is shown in bold. Note that AutoDock Vina (ex = 8) was calculated with 12 cores and the execution times of all cores were accumulated.
Breakdown of the Average Execution Time of Each Step of REstretto per Compound State (CPU core s)
| target | (B) structure decomposition | (C) grid generation | (D) rough conformer scoring | (E) local optimization |
|---|---|---|---|---|
| AKT1 | 1.23 (11.4%) | 3.26 (30.2%) | 0.22 (2.0%) | 5.36 (49.7%) |
| AMPC | N/A | N/A | N/A | N/A |
| CP3A4 | 1.31 (5.6%) | 15.00 (63.7%) | 0.77 (3.3%) | 5.88 (25.0%) |
| CXCR4 | N/A | N/A | N/A | N/A |
| GCR | 0.93 (5.0%) | 10.72 (57.8%) | 0.48 (2.6%) | 5.99 (32.3%) |
| HIVPR | 1.73 (8.9%) | 10.14 (52.3%) | 0.50 (2.6%) | 6.30 (32.5%) |
| HIVRT | 0.55 (4.1%) | 7.21 (53.7%) | 0.22 (1.6%) | 5.13 (38.2%) |
| KIF11 | 0.99 (5.6%) | 10.01 (56.6%) | 0.39 (2.2%) | 5.82 (32.9%) |
This shows the calculation time for 10 000 compounds. The breakdown was not applicable (N/A) for the targets AMPC and CXCR4, as their compound sets contained fewer than 10 000 compounds.
|TC|, F, |A|, and |TF| represent the total number of compounds, the set of the types of fragments, the number of atoms in a fragment f, and the total number of fragments among all compounds, respectively.
Figure 5. Example docking poses produced by (a) REstretto and (b) AutoDock Vina. The compound is CHEMBL359864, a known active compound for the target protein AKT1.