Masahito Ohue, Yuri Matsuzaki, Nobuyuki Uchikoga, Takashi Ishida, Yutaka Akiyama.
Abstract
The elucidation of protein-protein interaction (PPI) networks is important for understanding cellular structure and function and structure-based drug design. However, the development of an effective method to conduct exhaustive PPI screening represents a computational challenge. We have been investigating a protein docking approach based on shape complementarity and physicochemical properties. We describe here the development of the protein-protein docking software package "MEGADOCK" that samples an extremely large number of protein dockings at high speed. MEGADOCK reduces the calculation time required for docking by using several techniques such as a novel scoring function called the real Pairwise Shape Complementarity (rPSC) score. We showed that MEGADOCK is capable of exhaustive PPI screening by completing docking calculations 7.5 times faster than the conventional docking software, ZDOCK, while maintaining an acceptable level of accuracy. When MEGADOCK was applied to a subset of a general benchmark dataset to predict 120 relevant interacting pairs from 120 × 120 = 14,400 combinations of proteins, an F-measure value of 0.231 was obtained. Further, we showed that MEGADOCK can be applied to a large-scale protein-protein interaction-screening problem with accuracy better than random. When our approach is combined with parallel high-performance computing systems, it is now feasible to search and analyze protein-protein interactions while taking into account three-dimensional structures at the interactome scale. MEGADOCK is freely available at http://www.bi.cs.titech.ac.jp/megadock.
Year: 2014 PMID: 23855673 PMCID: PMC4443796 DOI: 10.2174/09298665113209990050
Source DB: PubMed Journal: Protein Pept Lett ISSN: 0929-8665 Impact factor: 1.890
The proposed all-to-all protein-protein interaction prediction method
| i) | All-to-all docking by MEGADOCK that outputs 2,000 × (decoys recorded per rotation) decoys. |
| ii) | Reranking of the high-scoring decoys recorded in process (i) by energy calculation. The number of decoys is reduced to 1,000: those with the lowest energy scores assigned by the reranking. |
| iii) | Clustering according to the structural similarity of the decoys. |
| iv) | Calculation of affinity scores for each protein-pair according to the highest docking score of the decoy included in the clusters that have more than |
| v) | Prediction of protein pairs that have the potential to interact, with |
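Step (iv) above can be illustrated with a minimal sketch. The `Decoy` class, the `affinity_score` function, and the `min_cluster_size` parameter are hypothetical names introduced here for illustration; the cluster-size cutoff is not given in the text above, so the value used in the example is an assumption, not the authors' setting.

```python
from dataclasses import dataclass


@dataclass
class Decoy:
    score: float      # docking score of this decoy
    cluster_id: int   # cluster label assigned in step (iii)


def affinity_score(decoys, min_cluster_size):
    """Return the highest docking score among decoys that belong to clusters
    with more than `min_cluster_size` members, or None if no cluster qualifies.

    `min_cluster_size` is a hypothetical parameter; the source does not state
    the cutoff.
    """
    sizes = {}
    for d in decoys:
        sizes[d.cluster_id] = sizes.get(d.cluster_id, 0) + 1
    eligible = [d.score for d in decoys if sizes[d.cluster_id] > min_cluster_size]
    return max(eligible) if eligible else None


# Toy data: cluster 0 has three decoys, cluster 1 is a singleton, cluster 2 has two.
decoys = [
    Decoy(9.1, 0), Decoy(8.7, 0), Decoy(8.2, 0),
    Decoy(12.4, 1),
    Decoy(7.9, 2), Decoy(7.5, 2),
]
score = affinity_score(decoys, min_cluster_size=1)
```

In this toy example the singleton decoy has the best raw score but is ignored, so the affinity score comes from a populated cluster. Step (v) would then apply a decision rule to these per-pair scores; that rule is not reproduced here because it is not fully specified above.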
The 23 complex structures selected from the ZLAB Benchmark 2.0 dataset for the selected weighted parameter.
| Rigid-body (21) |
| 1AK4, 1AVX, 1AY7, 1B6C, 1CGI, 1D6R, 1E96, 1EAW, 1EWY, 1GCQ, 1GHQ, 1HE1, 1KAC, 1KTZ, 1PPE, 1SBB, 1UDI, 2PCC, 2SIC, 2SNI, 7CEI |
| Medium Difficulty (2) |
| 1ACB, 1GRN |
The 44 complex structures selected from the ZLAB Benchmark 2.0 dataset (small dataset).
| Rigid-body (34) |
| 1AK4, 1AVX, 1AY7, 1B6C, 1BUH, 1BVN, 1CGI, 1D6R, 1DFJ, 1E6E, 1E96, 1EAW, 1EWY, 1F34, 1FC2, 1FQJ, 1GCQ, 1GHQ, 1HE1, 1KAC, 1KTZ, 1KXP, 1KXQ, 1MAH, 1PPE, 1QA9, 1SBB, 1TMQ, 1UDI, 2BTF, 2PCC, 2SIC, 2SNI, 7CEI |
| Medium Difficulty (6) |
| 1ACB, 1GRN, 1HE8, 1I2M, 1M10, 1WQ1 |
| Difficult (4) |
| 1ATN, 1FQ1, 1H1V, 1IBR |
The 120 complex structures selected from the ZLAB Benchmark 4.0 dataset (large dataset).
| Rigid-body (79) |
| 1AK4, 1AVX, 1AY7, 1B6C, 1BUH, 1BVN, 1CGI, 1CLV, 1D6R, 1DFJ, 1E6E, 1E96, 1EAW, 1EFN, 1EWY, 1F34, 1FC2, 1FFW, 1FLE, 1FQJ, 1GCQ, 1GHQ, 1GL1, 1GLA, 1GPW, 1GXD, 1H9D, 1HE1, 1J2J, 1JTG, 1KAC, 1KTZ, 1KXP, 1KXQ, 1MAH, 1N8O, 1OC0, 1OPH, 1OYV, 1PPE, 1PVH, 1QA9, 1R0R, 1S1Q, 1SBB, 1T6B, 1TMQ, 1UDI, 1US7, 1XD3, 1YVB, 1Z0K, 1Z5Y, 1ZHH, 1ZHI, 2A5T, 2A9K, 2ABZ, 2AJF, 2B42, 2BTF, 2FJU, 2G77, 2HLE, 2HQS, 2I25, 2J0T, 2O8V, 2OOB, 2OUL, 2PCC, 2SIC, 2SNI, 2UUY, 2VDB, 3D5S, 3SGQ, 7CEI, BOYV |
| Medium Difficulty (23) |
| 1ACB, 1GRN, 1HE8, 1I2M, 1JIW, 1LFD, 1M10, 1MQ8, 1NW9, 1R6Q, 1SYX, 1WQ1, 1XQS, 2AYO, 2CFH, 2H7V, 2HRK, 2J7P, 2NZ8, 2OZA, 2Z0E, 3CPH, 4CPA |
| Difficult (18) |
| 1ATN, 1BKD, 1F6M, 1FQ1, 1H1V, 1IBR, 1IRA, 1JK9, 1PXV, 1R8S, 1Y64, 1ZLI, 1ZM4, 2C0L, 2I9B, 2IDO, 2O3B, 2OT3 |
Calculation times for MEGADOCK and ZDOCK.
| | MEGADOCK 1.0 | MEGADOCK 2.0 | MEGADOCK 2.1 | ZDOCK 3.0 |
|---|---|---|---|---|
| Average (s.d.) [min] | 13.3 (10.1) | 14.7 (10.8) | 16.6 (11.8) | 124.6 (94.1) |
| Fold increase in speed over ZDOCK 3.0 | 9.37 | 8.48 | 7.51 | (1.0) |
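The speed ratios in the table appear to be the ZDOCK 3.0 mean time divided by each program's mean time. A short check, assuming only the averages listed above (the dictionary name `mean_minutes` is introduced here for illustration):

```python
# Mean calculation times in minutes, copied from the table above.
mean_minutes = {
    "MEGADOCK 1.0": 13.3,
    "MEGADOCK 2.0": 14.7,
    "MEGADOCK 2.1": 16.6,
    "ZDOCK 3.0": 124.6,
}

baseline = mean_minutes["ZDOCK 3.0"]

# Speed ratio relative to ZDOCK 3.0, rounded to two decimals as in the table.
speedup = {name: round(baseline / t, 2) for name, t in mean_minutes.items()}

for name, ratio in speedup.items():
    print(f"{name}: {ratio:.2f}x")
```

The ratio for MEGADOCK 2.1 is the "7.5 times faster" figure quoted in the abstract.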
Calculation times for ZRANK reranking.
| Decoys recorded per rotation | 1 | 2 | 3 | 5 | 10 | 20 |
|---|---|---|---|---|---|---|
| Number of decoys (2,000 × decoys recorded per rotation) | 2,000 | 4,000 | 6,000 | 10,000 | 20,000 | 40,000 |
| Calculation time [min] | 3.1 | 6.2 | 9.2 | 15.2 | 30.4 | 60.8 |
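The reranking times above grow roughly in proportion to the number of decoys. The sketch below checks this with the tabulated values and a least-squares line through the origin; the through-origin fit is a simplification chosen here, not something stated in the source.

```python
# Decoy counts and ZRANK reranking times (minutes), copied from the table above.
decoys = [2000, 4000, 6000, 10000, 20000, 40000]
minutes = [3.1, 6.2, 9.2, 15.2, 30.4, 60.8]

# Cost per 1,000 decoys for each column.
per_thousand = [1000 * m / n for n, m in zip(decoys, minutes)]

# Least-squares slope for a line through the origin: sum(n*m) / sum(n*n).
slope = sum(n * m for n, m in zip(decoys, minutes)) / sum(n * n for n in decoys)

print([round(v, 3) for v in per_thousand])
print(f"fitted cost: {1000 * slope:.2f} min per 1,000 decoys")
```

The per-1,000-decoy cost stays within a narrow band across all columns, which is what a linear cost model would predict.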
Results of 44 × 44 protein-protein interaction predictions.
| | Decoys recorded per rotation | 1 | 2 | 3 | 5 | 10 | 20 |
|---|---|---|---|---|---|---|---|
| Predictions without reranking | Precision | 0.563 | 0.435 | 0.474 | 0.429 | 0.409 | 0.450 |
| | Recall | 0.205 | 0.227 | 0.205 | 0.205 | 0.205 | 0.205 |
| | F-measure | 0.300 | 0.299 | 0.286 | 0.277 | 0.273 | 0.281 |
| Predictions with reranking | Precision | - | 0.375 | 0.447 | 0.320 | 0.347 | 0.318 |
| | Recall | - | 0.409 | 0.386 | 0.364 | 0.386 | 0.318 |
| | F-measure | - | 0.391 | 0.415 | 0.340 | 0.366 | 0.318 |
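The F-measure rows are consistent with the usual harmonic mean of precision and recall. A minimal sketch that recomputes them from the rounded table entries (the function name `f_measure` is introduced here; small gaps are expected because the inputs are already rounded to three decimals):

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F-measure), copied from the 44 x 44 results table.
rows = [
    # without reranking
    (0.563, 0.205, 0.300), (0.435, 0.227, 0.299), (0.474, 0.205, 0.286),
    (0.429, 0.205, 0.277), (0.409, 0.205, 0.273), (0.450, 0.205, 0.281),
    # with reranking
    (0.375, 0.409, 0.391), (0.447, 0.386, 0.415), (0.320, 0.364, 0.340),
    (0.347, 0.386, 0.366), (0.318, 0.318, 0.318),
]

# Largest absolute difference between recomputed and reported values.
max_gap = max(abs(f_measure(p, r) - f) for p, r, f in rows)
print(f"largest gap: {max_gap:.4f}")
```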
Comparison of the results of protein-protein interaction predictions for the ZLAB Benchmark dataset with MEGADOCK and other methods.
| Method | Size | Precision | Recall | Accuracy | F-measure |
|---|---|---|---|---|---|
| MEGADOCK | 120×120 | 0.500 | 0.150 | 0.992 | 0.231 |
| ZDOCK+Clustering [ | 120×120 | 0.310 | 0.225 | 0.989 | 0.261 |
| ZDOCK+AEP [15] | 84×84 | 0.035 | 0.274 | 0.902 | 0.063 |
| ZDOCK+ZRANK+Clustering | 120×120 | 0.474 | 0.225 | 0.991 | 0.301 |
Note: Values for ZDOCK+AEP were taken from reference [15].
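The very high accuracy values reflect class imbalance: in the 120×120 setting only 120 of 14,400 pairs are true interactions, so true negatives dominate. The sketch below reconstructs the confusion counts for the MEGADOCK row from its reported precision and recall. The counts are inferred here, not reported in the source, and the function name `confusion_from_rates` is hypothetical.

```python
def confusion_from_rates(n_pairs, n_true, precision, recall):
    """Infer (TP, FP, FN, TN) from rounded precision and recall.

    Assumes n_true positive pairs among n_pairs candidates; the counts are a
    reconstruction, not values reported in the source.
    """
    tp = round(recall * n_true)          # true positives
    predicted = round(tp / precision)    # total pairs predicted as interacting
    fp = predicted - tp
    fn = n_true - tp
    tn = n_pairs - tp - fp - fn
    return tp, fp, fn, tn


# MEGADOCK row: 120 x 120 pairs, 120 true pairs, precision 0.500, recall 0.150.
tp, fp, fn, tn = confusion_from_rates(120 * 120, 120, 0.500, 0.150)

accuracy = (tp + tn) / (tp + fp + fn + tn)
f_measure = 2 * tp / (2 * tp + fp + fn)

print(tp, fp, fn, tn)
print(round(accuracy, 3), round(f_measure, 3))
```

Under these assumptions the recomputed accuracy and F-measure round to the tabulated 0.992 and 0.231, which suggests the inferred counts are consistent with the reported row.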