Alexey V Sulimov, Dmitry A Zheltkov, Igor V Oferkin, Danil C Kutov, Ekaterina V Katkova, Eugene E Tyrtyshnikov, Vladimir B Sulimov.
Abstract
We present a novel docking algorithm based on the Tensor Train (TT) decomposition and TT-Cross global optimization. The algorithm is applied to the docking problem with a flexible ligand and moveable protein atoms. The energy of the protein-ligand complex is calculated with the MMFF94 force field in vacuum. No grid of precalculated energy potentials of probe ligand atoms in the field of the target protein atoms is used: the energy of any given configuration of the complex is computed directly with the MMFF94 force field, without any fitting parameters. The conformation space is formed by translations and rotations of the ligand as a whole, by the ligand torsions, and by the Cartesian coordinates of the selected target protein atoms. The mobility of protein and ligand atoms is taken into account simultaneously and on an equal footing in the docking process. The algorithm is implemented in the novel parallel docking program SOL-P, and results of its performance on a set of 30 protein-ligand complexes are presented. The dependence of the docking positioning accuracy on the parameters of the docking algorithm and on the number of protein moveable atoms is investigated. It is shown that mobility of the protein atoms improves docking positioning accuracy. The SOL-P program is able to dock a flexible ligand into the active site of a target protein with several dozen protein moveable atoms: the native crystallized ligand pose is correctly found as the global energy minimum in a search space with 157 dimensions using 4700 CPU·h on the Lomonosov supercomputer.
Keywords: Docking; Drug design; Flexible ligand; Protein moveable atoms; Protein-ligand complex; Tensor train
Year: 2017 PMID: 28377797 PMCID: PMC5367798 DOI: 10.1016/j.csbj.2017.02.004
Source DB: PubMed Journal: Comput Struct Biotechnol J ISSN: 2001-0370 Impact factor: 7.271
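The abstract describes the search space as ligand translations and rotations, ligand torsions, and Cartesian coordinates of the selected protein atoms. A minimal sketch of how its dimension could be counted is given below. It assumes three translational and three rotational parameters for the ligand as a whole (the abstract does not state the rotation parameterization), and the function name is hypothetical.

```python
def search_space_dimension(n_torsions, n_moveable_atoms):
    """Count docking degrees of freedom under the assumptions stated above."""
    rigid_body = 3 + 3               # assumed: 3 translations + 3 rotation parameters
    protein = 3 * n_moveable_atoms   # x, y, z for every moveable protein atom
    return rigid_body + n_torsions + protein


# Rigid protein, ligand with 6 torsions.
print(search_space_dimension(6, 0))    # 12

# Hypothetical combination: 7 torsions and 48 moveable protein atoms.
print(search_space_dimension(7, 48))   # 157
```

Under these assumptions, 7 torsions and 48 moveable atoms is one combination that yields the 157 dimensions quoted in the abstract; the abstract itself does not say which complex that figure refers to.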
Fig. 1. Flowgraph of the program complex for low energy local minima search with flexible ligand and moveable target protein atoms. Stage I: the data preparation and TT global energy minima search with the SOL-P program. Stage II: the analysis of binary data with the “non-optimized minima” obtained from the SOL-P program and preparation of the table with the results and the final minima set.
Complexes for testing parameters of the TT-docking algorithm. PDB ID is the ID of the respective protein-ligand complex taken from Protein Data Bank [64].
| Protein name | PDB ID | Number of ligand atoms including hydrogen ones | Number of ligand torsions |
|---|---|---|---|
| Urokinase | | 20 | 2 |
| | | 24 | 6 |
| | | 61 | 17 |
| | | 74 | 19 |
| CHK1 (checkpoint kinase 1) | | 35 | 6 |
| Thrombin | | 64 | 10 |
| ERK2 (extracellular signal-regulated kinase 2) | | 57 | 12 |
Values of INON index (Index of the minimum Near Optimized Native) for three protein-ligand complexes with different numbers of protein moveable atoms. PDB ID is the ID of the respective protein-ligand complex taken from Protein Data Bank [64].
| PDB ID (number of ligand torsions) | Number of protein moveable atoms | INON | ||
|---|---|---|---|---|
| | 0, 6, 15, 27, 35 | 1 | 1 | 1 |
| | 0, 6 | inf | inf | inf |
| | 13 | 1 | inf | 17 |
| | 26 | 1 | 1 | 2 |
| | 48 | 2 | 2 | 1 |
| | 0 | 16 | 24 | 29 |
| | 6 | 17 | 21 | 21 |
| | 13 | 17 | 18 | 19 |
| | 25 | 15 | 15 | 15 |
| | 42 | inf | inf | inf |
Fig. 2. Dependence of computing resources on the number of protein moveable atoms for the native ligand docking by the SOL-P program with different sets of TT-docking parameters. Integer n is the initial grid size.
Validation set of protein-ligand complexes. Numbers of atoms include hydrogen atoms. Entries in the moveable protein atom columns are given as two numbers separated by a slash: the total number of protein moveable atoms and the number of protein moveable hydrogen atoms.
| Protein name | PDB ID | Num. of ligand torsions | Numbers of ligand atoms | Numbers of moveable protein atoms 13–18 | Numbers of moveable protein atoms 25–35 |
|---|---|---|---|---|---|
| Urokinase | | 2 | 20 | 14/8 | 26/17 |
| | | 4 | 34 | 15/8 | 27/17 |
| | | 6 | 24 | 16/8 | 27/14 |
| | | 6 | 46 | 17/11 | 28/20 |
| | | 17 | 61 | 16/8 | 28/15 |
| | | 19 | 74 | 16/9 | 30/18 |
| CHK1 (checkpoint kinase 1) | | 0 | 26 | 15/11 | 29/22 |
| | | 3 | 42 | 15/11 | 26/20 |
| | | 5 | 32 | 13/10 | 25/20 |
| | | 6 | 35 | 15/11 | 31/22 |
| Factor Xa | | 7 | 54 | 14/10 | 30/18 |
| | | 7 | 60 | 13/10 | 29/21 |
| | | 7 | 50 | 13/9 | 26/17 |
| | | 8 | 61 | 17/10 | 31/22 |
| Poly(ADP-ribose) polymerase | | 1 | 24 | 14/7 | 20/10 |
| | | 3 | 33 | 14/12 | 27/18 |
| | | 3 | 20 | 14/9 | 25/15 |
| ERK2 (extracellular signal-regulated kinase 2) | | 8 | 52 | 16/13 | 33/25 |
| | | 12 | 57 | 18/11 | 26/19 |
| Thrombin | | 10 | 64 | 14/6 | 25/15 |
| | | 12 | 71 | 16/12 | 26/19 |
| Trypsin | | 6 | 69 | 15/8 | 25/16 |
| | | 10 | 68 | 13/8 | 27/19 |
| GNC92H2 antibody | | 5 | 44 | 17/16 | 29/26 |
| Apolipoprotein | | 6 | 22 | 13/5 | 30/10 |
| Beta-1,4-xylanase | | 6 | 35 | 15/8 | 29/18 |
| Ricin | | 7 | 29 | 15/9 | 27/18 |
| Neuraminidase | | 10 | 50 | 16/12 | 33/23 |
| Hen egg-white lysozyme | | 11 | 56 | 15/10 | 26/19 |
| HIV-1 protease | | 14 | 70 | 14/9 | 31/21 |
Indexes EN and INON for the SOL-P program with different numbers of protein moveable atoms (0, 13–18 and 25–35) and for the FLM program with rigid proteins.
| Complex id | EN/INON, SOL-P 0 | EN/INON, SOL-P 13–18 | EN/INON, SOL-P 25–35 | EN/INON, FLM |
|---|---|---|---|---|
| 1B9V | inf/344 | 513/353 | inf/333 | inf/inf |
| 1BR5 | 144/45 | 241/23 | inf/29 | inf/309 |
| 1C5Y | 1/1 | 1/1 | 1/1 | 1/1 |
| 1DWC | inf/20 | inf/289 | inf/98 | inf/377 |
| 1EFY | 72/46 | 38/20 | 40/16 | 158/81 |
| 1F5L | 1/1 | 2/1 | 2/1 | 1/1 |
| 1HPV | inf/1 | 6/1 | 2/1 | 98/1 |
| 1I7Z | 1/1 | 1/1 | 1/1 | 1/1 |
| 1J01 | inf/inf | 1/1 | 1/1 | 1/1 |
| 1K1J | inf/inf | inf/inf | inf/19 | 1/4 |
| 1LQD | inf/5 | 1/1 | 1/1 | 1/1 |
| 1LZG | inf/inf | inf/1270 | inf/771 | inf/inf |
| 1MQ6 | inf/inf | 2/2 | inf/3 | 7/4 |
| 1O3P | 13/11 | 13/2 | 2/1 | 16/14 |
| 1PPC | inf/inf | inf/26 | inf/51 | 1/1 |
| 1SQO | 1/1 | 1/1 | 1/1 | 1/1 |
| 1TOM | inf/181 | inf/465 | inf/570 | inf/inf |
| 1VJ9 | inf/26 | inf/29 | inf/23 | 48/1 |
| 1VJA | inf/50 | inf/inf | inf/127 | 41/4 |
| 2P94 | inf/2 | inf/2 | 27/2 | 36/2 |
| 2PAX | 1/1 | 1/1 | 1/1 | 1/1 |
| 3CEN | inf/inf | inf/1 | inf/1 | 94/1 |
| 3KIV | 9/1 | 5/1 | 4/1 | 12/1 |
| 3PAX | 2/1 | 2/1 | 2/1 | 2/1 |
| 4FSW | 6/5 | 6/5 | 6/5 | 8/7 |
| 4FT0 | 21/20 | 20/15 | 15/9 | 32/30 |
| 4FT9 | inf/23 | 35/21 | 40/22 | 46/29 |
| 4FTA | 176/176 | 370/370 | 415/415 | inf/inf |
| 4FV5 | inf/231 | 87/87 | 122/84 | 189/122 |
| 4FV6 | inf/337 | inf/213 | inf/325 | inf/inf |
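The EN/INON cells above can be tallied per program setting, in the spirit of counting complexes by INON value. The following is a minimal sketch, not the authors' analysis code: it parses only five rows copied from the table, and the helper names `parse_pair` and `count_inon_at_most` are hypothetical.

```python
def parse_pair(cell):
    """Split an 'EN/INON' cell into two floats; 'inf' becomes float('inf')."""
    en, inon = cell.split("/")
    return float(en), float(inon)


columns = ["SOL-P 0", "SOL-P 13–18", "SOL-P 25–35", "FLM"]

# Subset of rows copied from the table above (not the full set of 30).
rows = {
    "1C5Y": ["1/1", "1/1", "1/1", "1/1"],
    "1HPV": ["inf/1", "6/1", "2/1", "98/1"],
    "1J01": ["inf/inf", "1/1", "1/1", "1/1"],
    "1O3P": ["13/11", "13/2", "2/1", "16/14"],
    "1LZG": ["inf/inf", "inf/1270", "inf/771", "inf/inf"],
}


def count_inon_at_most(rows, column_index, threshold=1):
    """Count complexes whose INON in the given column is <= threshold."""
    return sum(
        1
        for cells in rows.values()
        if parse_pair(cells[column_index])[1] <= threshold
    )


counts = {name: count_inon_at_most(rows, i) for i, name in enumerate(columns)}
print(counts)
# {'SOL-P 0': 2, 'SOL-P 13–18': 3, 'SOL-P 25–35': 4, 'FLM': 3}
```

These counts apply only to the five-row subset; extending `rows` to all 30 complexes would give the per-column totals for the whole validation set.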
Fig. 3. Numbers of complexes with different values of INON index. PMA indicates the range of protein moveable atoms for the SOL-P program. INON is the index of the minimum whose RMSD from the optimized native ligand position is less than 2 Å; if there are several such minima, the one with the lowest energy (that is, the lowest index) is taken.
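The INON definition in this caption can be sketched in a few lines. The example below is illustrative only: it assumes minima are ranked by ascending energy with 1-based indices, and that "inf" in the tables denotes the case where no minimum lies within the cutoff. The function name and the numerical values are made up for illustration.

```python
import math


def inon(minima, rmsd_cutoff=2.0):
    """Return the 1-based energy rank of the first minimum within the cutoff.

    minima: iterable of (energy, rmsd_to_optimized_native) pairs, in any order.
    Returns math.inf when no minimum has RMSD below the cutoff (assumed
    meaning of 'inf' in the tables).
    """
    ranked = sorted(minima, key=lambda m: m[0])  # lowest energy first
    for index, (_, rmsd) in enumerate(ranked, start=1):
        if rmsd < rmsd_cutoff:
            return index
    return math.inf


# Made-up (energy, RMSD in Å) pairs, not taken from the paper.
example = [(-40.2, 5.1), (-42.7, 3.4), (-39.8, 1.2), (-35.0, 0.9)]
print(inon(example))  # 3
```

Here two minima are within 2 Å of the optimized native pose; the one with the lower energy is third in the energy ranking, so INON is 3.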