Ludwig Krippahl, Pedro Barahona.
Abstract
This paper presents a constraint-based method for improving protein docking results. Efficient constraint propagation cuts over 95% of the search time for finding the configurations with the largest contact surface, provided a contact is specified between two amino acid residues. This makes it possible to scan a large number of potentially correct constraints, lowering the requirements for useful contact predictions. While other approaches are very dependent on accurate contact predictions, ours requires only that at least one correct contact be retained in a set of, for example, one hundred constraints to test. It is this feature that makes it feasible to use readily available sequence data to predict specific potential contacts. Although such prediction is too inaccurate for most purposes, we demonstrate with a Naïve Bayes Classifier that it is accurate enough to more than double the average number of acceptable models retained during the crucial filtering stage of protein docking when combined with our constrained docking algorithm. All software developed in this work is freely available as part of the Open Chemera Library.
Keywords: Constraints; Docking
Year: 2015 PMID: 25722738 PMCID: PMC4340843 DOI: 10.1186/s13015-015-0036-6
Source DB: PubMed Journal: Algorithms Mol Biol ISSN: 1748-7188 Impact factor: 1.405
This table shows the results for the 28 complexes in the test set
| Complex | Unbound | Unbound | Unbound | Unbound | Bound | Bound | Bound | Bound |
|---|---|---|---|---|---|---|---|---|
| 1gla | 0 / 0 | 0 / 0 | 0 / 0 | 0 / 0 | 1 / 1 | 2 / 2 | 2 / 4 | 7 / 8 |
| 1y64 | 0 / 0 | 0 / 0 | 0 / 0 | 0 / 0 | 0 / 1 | 0 / 2 | 0 / 2 | 0 / 3 |
| 1akj | 10 / 3 | 13 / 5 | 15 / 6 | 26 / 13 | 5 / 5 | 8 / 8 | 14 / 12 | 18 / 21 |
| 1s1q | 0 / 0 | 0 / 0 | 0 / 0 | 0 / 0 | 4 / 12 | 5 / 16 | 6 / 30 | 8 / 41 |
| 3bp8 | 4 / 0 | 5 / 2 | 9 / 4 | 27 / 14 | 9 / 8 | 10 / 13 | 12 / 25 | 18 / 42 |
| 1pvh | 0 / 0 | 0 / 0 | 0 / 0 | 0 / 0 | 2 / 1 | 2 / 1 | 3 / 2 | 5 / 3 |
| 2nz8 | 1 / 1 | 1 / 1 | 3 / 2 | 4 / 3 | 23 / 9 | 38 / 10 | 50 / 12 | 70 / 23 |
| 2j0t | 2 / 2 | 3 / 2 | 7 / 3 | 11 / 7 | 10 / 5 | 12 / 5 | 20 / 8 | 30 / 20 |
| 7cei | 0 / 1 | 0 / 1 | 0 / 1 | 0 / 4 | 5 / 1 | 6 / 1 | 7 / 1 | 7 / 1 |
| 1ijk | 0 / 0 | 0 / 0 | 0 / 0 | 1 / 1 | 2 / 2 | 2 / 2 | 3 / 3 | 5 / 5 |
| 2o3b | 0 / 0 | 0 / 0 | 0 / 0 | 10 / 4 | 0 / 0 | 0 / 0 | 0 / 0 | 1 / 1 |
| 1jwh | 0 / 0 | 1 / 0 | 5 / 0 | 9 / 1 | 0 / 0 | 0 / 0 | 0 / 0 | 3 / 0 |
| 1i2m | 0 / 0 | 0 / 0 | 0 / 0 | 3 / 0 | 1 / 1 | 3 / 1 | 6 / 3 | 12 / 6 |
| 2pcc | 2 / 0 | 3 / 0 | 4 / 1 | 8 / 1 | 2 / 0 | 5 / 5 | 8 / 6 | 15 / 13 |
| 1h1v | 0 / 0 | 0 / 0 | 0 / 0 | 0 / 0 | 2 / 2 | 3 / 3 | 5 / 6 | 10 / 8 |
| 1b6c | 0 / 0 | 1 / 1 | 2 / 1 | 2 / 1 | 21 / 25 | 36 / 35 | 62 / 43 | 117 / 61 |
| 2hle | 1 / 0 | 2 / 1 | 3 / 1 | 11 / 3 | 2 / 0 | 4 / 0 | 9 / 2 | 16 / 5 |
| 1bkd | 0 / 0 | 0 / 0 | 0 / 0 | 0 / 0 | 19 / 6 | 35 / 9 | 67 / 15 | 106 / 20 |
| 1he1 | 11 / 6 | 18 / 12 | 30 / 17 | 53 / 34 | 32 / 3 | 42 / 5 | 77 / 9 | 140 / 17 |
| 1m10 | 16 / 5 | 19 / 5 | 27 / 11 | 54 / 17 | 36 / 9 | 54 / 14 | 75 / 17 | 106 / 22 |
| 1i4d | 0 / 0 | 0 / 0 | 0 / 0 | 0 / 0 | 4 / 7 | 5 / 8 | 9 / 11 | 13 / 22 |
| 1azs | 0 / 0 | 0 / 0 | 0 / 0 | 0 / 0 | 16 / 2 | 24 / 2 | 48 / 3 | 58 / 7 |
| 2sni | 1 / 1 | 1 / 2 | 1 / 3 | 3 / 16 | 26 / 14 | 41 / 26 | 70 / 38 | 142 / 82 |
| 1gpw | 0 / 1 | 2 / 1 | 2 / 3 | 13 / 10 | 25 / 14 | 42 / 22 | 73 / 26 | 132 / 41 |
| 1ofu | 3 / 0 | 4 / 0 | 6 / 1 | 10 / 1 | 7 / 0 | 18 / 0 | 38 / 2 | 75 / 4 |
| 1z0k | 1 / 1 | 3 / 1 | 16 / 4 | 39 / 20 | 36 / 17 | 58 / 27 | 92 / 43 | 179 / 85 |
| 1jzd | 0 / 0 | 0 / 0 | 1 / 0 | 2 / 4 | 29 / 7 | 44 / 13 | 67 / 14 | 110 / 27 |
| 1fak | 0 / 0 | 0 / 0 | 0 / 0 | 1 / 0 | 0 / 1 | 1 / 2 | 6 / 4 | 11 / 6 |
| Av. gain | 2.48 | 2.24 | 2.26 | 1.86 | 2.09 | 2.15 | 2.43 | 2.38 |
The top five complexes are those for which no true contact ranked among the 100 highest-ranking predicted contacts, and which therefore had no correct constraint among the 100 used in docking. Unbound and bound docking runs are each split into several columns, one per number of models retained. Each cell shows the number of acceptable models obtained with constraints / without constraints. The bottom row shows the average gain in acceptable models for each column: the total number of acceptable models with constraints divided by the total without constraints, summed over all complexes, including those without correct constraints.
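The gain defined in the caption is a ratio of column totals, not a mean of per-complex ratios, so complexes with 0 acceptable models in both runs leave it unchanged. The following is a minimal sketch of that calculation, not the authors' code: the function name `average_gain` is hypothetical, and the input is the first data column of the table above, copied row by row.

```python
def average_gain(cells):
    """Ratio of column totals for cells written as "with / without".

    `cells` holds one string per complex, e.g. "10 / 3", meaning 10
    acceptable models with constraints and 3 without.
    """
    with_total = 0
    without_total = 0
    for cell in cells:
        with_count, without_count = (int(part) for part in cell.split("/"))
        with_total += with_count
        without_total += without_count
    if without_total == 0:
        raise ValueError("no acceptable models without constraints; gain is undefined")
    return with_total / without_total


# First data column of the table, in row order (1gla ... 1fak).
first_column = [
    "0 / 0", "0 / 0", "10 / 3", "0 / 0", "4 / 0", "0 / 0", "1 / 1",
    "2 / 2", "0 / 1", "0 / 0", "0 / 0", "0 / 0", "0 / 0", "2 / 0",
    "0 / 0", "0 / 0", "1 / 0", "0 / 0", "11 / 6", "16 / 5", "0 / 0",
    "0 / 0", "1 / 1", "0 / 1", "3 / 0", "1 / 1", "0 / 0", "0 / 0",
]

gain = average_gain(first_column)
print(f"{gain:.2f}")  # totals are 52 with and 21 without constraints
```

For this column the totals are 52 and 21, giving a ratio of about 2.48, which matches the first entry of the bottom row. Using totals rather than per-complex ratios avoids division by zero for the many complexes with no acceptable models in the unconstrained run.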