Iain H Moal, Mieczyslaw Torchala, Paul A Bates, Juan Fernández-Recio.
Abstract
BACKGROUND: Protein-protein docking, which aims to predict the structure of a protein-protein complex from its unbound components, remains an unresolved challenge in structural bioinformatics. An important step is the ranking of docked poses using a scoring function, for which many methods have been developed. There is a need to explore the differences and commonalities of these methods with each other, as well as with functions developed in the fields of molecular dynamics and homology modelling.
Year: 2013 PMID: 24079540 PMCID: PMC3850738 DOI: 10.1186/1471-2105-14-286
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. The success rates for the highest performing scoring functions. The number of complexes for which an acceptable or better solution could be found in the top 1, top 10 and top 100 solutions was calculated for each scoring function, and the best 40 scoring functions for each measure were selected. Acceptable quality solutions are shown in yellow, medium quality solutions in orange, and high quality solutions in red for the three measures (top 1 left, top 10 middle, top 100 right). The functions are ordered by top 10 success rate.
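The top-N success counts described in this caption can be sketched as below. This is an illustrative sketch only: the data layout (each complex mapped to a list of pose quality labels already sorted by one scoring function), the label names and the function names are assumptions, not the authors' code.

```python
from typing import Dict, List

# Hypothetical quality labels for docked poses, ordered from worst to best.
QUALITY_ORDER = {"incorrect": 0, "acceptable": 1, "medium": 2, "high": 3}


def best_quality_in_top_n(ranked_labels: List[str], n: int) -> str:
    """Return the best quality label among the first n poses of a ranked list."""
    top = ranked_labels[:n]
    if not top:
        return "incorrect"
    return max(top, key=QUALITY_ORDER.__getitem__)


def success_counts(ranked: Dict[str, List[str]], n: int) -> Dict[str, int]:
    """Count complexes whose best top-n pose is acceptable, medium or high."""
    counts = {"acceptable": 0, "medium": 0, "high": 0}
    for labels in ranked.values():
        best = best_quality_in_top_n(labels, n)
        if best != "incorrect":
            counts[best] += 1
    return counts


# Toy data: poses for each complex, already sorted by one scoring function.
ranked_poses = {
    "complex_A": ["incorrect", "medium", "incorrect"],
    "complex_B": ["acceptable", "incorrect", "high"],
    "complex_C": ["incorrect", "incorrect", "incorrect"],
}

top1 = success_counts(ranked_poses, 1)
top3 = success_counts(ranked_poses, 3)
```

Each complex is counted once, under the best quality class found in its top N poses, which matches the stacked yellow, orange and red bars in the figure.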
Figure 2. The success rates for the highest performing scoring functions using the Benchmark 4.0 update. These are the new complexes which were not present in previous versions of the benchmark. The performances are displayed and ordered as in Figure 1.
Figure 3. The cardinalities of symmetric differences for pairs of high performing scoring functions. Matrix indices were determined by complete-linkage clustering of the scoring functions, with dissimilarity defined by the cardinality of the symmetric difference sets. The corresponding dendrogram is shown on the left, with the cophenetic distance given by the U-link height. High cardinalities indicate greater ability for the scoring function pairs to identify near-native poses of acceptable or better quality in the top 10 models for different subsets of complexes.
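The dissimilarity and clustering described in this caption can be sketched with the standard library alone. The sketch assumes each scoring function is represented by the set of complexes for which it places an acceptable or better pose in the top 10; the function names and toy data are hypothetical, and the naive clustering loop stands in for whatever clustering software the authors used.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Set, Tuple

Merge = Tuple[Tuple[str, ...], Tuple[str, ...], int]


def symmetric_difference_sizes(
    successes: Dict[str, Set[str]]
) -> Dict[FrozenSet[str], int]:
    """Map each unordered pair of functions to the size of A ^ B."""
    return {
        frozenset((a, b)): len(successes[a] ^ successes[b])
        for a, b in combinations(sorted(successes), 2)
    }


def complete_linkage(
    names: List[str], dist: Dict[FrozenSet[str], int]
) -> List[Merge]:
    """Naive agglomerative clustering; returns merges as (left, right, height)."""
    clusters = [(name,) for name in sorted(names)]
    merges: List[Merge] = []
    while len(clusters) > 1:
        best = None
        for left, right in combinations(clusters, 2):
            # Complete linkage: cluster distance is the largest pairwise distance.
            height = max(dist[frozenset((x, y))] for x in left for y in right)
            if best is None or height < best[2]:
                best = (left, right, height)
        left, right, height = best
        clusters = [c for c in clusters if c not in (left, right)]
        clusters.append(left + right)
        merges.append(best)
    return merges


# Toy data: complexes solved in the top 10 by three hypothetical functions.
top10_successes = {
    "F1": {"c1", "c2", "c3"},
    "F2": {"c1", "c2", "c4"},
    "F3": {"c5"},
}
sizes = symmetric_difference_sizes(top10_successes)
merges = complete_linkage(list(top10_successes), sizes)
```

The height recorded at each merge is the cophenetic distance between members of the two merged clusters, which is what the U-link height in the dendrogram shows.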
Figure 4. The cardinalities of unions for pairs of high performing scoring functions. Clustering was performed as described in Figure 3, with the union defining the matrix. High cardinalities indicate that if a scoring function could be created from the two methods capable of identifying all the near natives correctly identified in the top 10 models for both methods, then it would identify a large proportion of the benchmark complexes.
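As a small illustration of the union measure, the sketch below computes, for each pair of functions, how many complexes an idealised combination of the two would solve and what fraction of the benchmark that represents. The set representation, the names and the benchmark size of 5 are toy assumptions.

```python
from itertools import combinations
from typing import Dict, Set, Tuple


def union_sizes(
    successes: Dict[str, Set[str]], n_complexes: int
) -> Dict[Tuple[str, str], Tuple[int, float]]:
    """Map each pair (a, b), a < b, to (size of A | B, fraction of benchmark)."""
    result = {}
    for a, b in combinations(sorted(successes), 2):
        size = len(successes[a] | successes[b])
        result[(a, b)] = (size, size / n_complexes)
    return result


# Toy data: complexes solved in the top 10 by three hypothetical functions.
top10_successes = {
    "F1": {"c1", "c2", "c3"},
    "F2": {"c1", "c2", "c4"},
    "F3": {"c1"},
}
pair_unions = union_sizes(top10_successes, n_complexes=5)
```

The union is an upper bound: it assumes a combined function that loses none of the successes of either parent, which a real combination may not achieve.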
Figure 5. The cardinalities of relative complements for pairs of high performing scoring functions. Indices are ordered by individual top 10 acceptable or better success rate, as shown in the leftmost histogram, with acceptable, medium and high quality success rates shown in yellow, orange and red respectively. This matrix indicates the extent to which the method corresponding to each column can benefit from being able to identify the near-native solutions identified by the methods corresponding to each row. Equivalently, each row indicates the extent to which its method could contribute to the methods in each respective column.
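Unlike the symmetric difference and the union, the relative complement is asymmetric, so the matrix must be read by row and column. A minimal sketch, assuming the same hypothetical set-per-function representation, with rows and columns ordered by individual success count:

```python
from typing import Dict, List, Set, Tuple


def relative_complement_matrix(
    successes: Dict[str, Set[str]]
) -> Tuple[List[str], List[List[int]]]:
    """Return (order, matrix) where matrix[i][j] = |S_row_i minus S_col_j|.

    Entry [i][j] counts complexes solved by the row method but not by the
    column method, i.e. what the row method could contribute to the column.
    """
    # Order by descending number of successes, breaking ties by name.
    order = sorted(successes, key=lambda name: (-len(successes[name]), name))
    matrix = [
        [len(successes[row] - successes[col]) for col in order]
        for row in order
    ]
    return order, matrix


# Toy data: complexes solved in the top 10 by three hypothetical functions.
top10_successes = {
    "F1": {"c1", "c2", "c3"},
    "F2": {"c1", "c4"},
    "F3": {"c5"},
}
order, matrix = relative_complement_matrix(top10_successes)
```

In this toy case F3 solves only one complex, yet its row is non-zero in every other column: a weak method can still contribute successes that stronger methods miss.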
A summary of the scoring functions evaluated
| Scoring function and reference | Calculation | Description/notes |
| --- | --- | --- |
| CP_DECK [ | r | The DECK potential, reimplemented based on the original source code. |
| CP_RMFCA [ | r | An α-carbon potential. |
| CP_RMFCEN1 [ | r | A 6 bin distance-dependent centroid-centroid potential. |
| CP_RMFCEN2 [ | r | A 7 bin distance-dependent centroid-centroid potential. |
| CP_SKOIP [ | r | A statistical intermolecular contact potential. |
| CP_TB [ | r | A docking contact potential. |
| CP_TSC [ | r | A 2 bin docking potential. |
| PAIR [ | p | Residue potentials that have been factorised into different energetic contributions (E_pair, E_local, E_ZS3DC, E_3DC and E_3D respectively). These are prefixed with either ‘CP_E’ for energies or ‘CP_Z’ for z-scores, and suffixed with ‘_CB’ for the β-carbon potential and ‘_MIN’ for the minimum inter-residue distance potential. The combination of these into the MixRank ranking strategy is also included. For this method, the 5 largest complexes failed to produce scores and are thus omitted. |
| LOCAL[ | p | |
| S3DC [ | p | |
| 3DC [ | p | |
| 3D [ | p | |
| CP_MIXRANK [ | p | |
| CP_DDGrw [ | r | The weighted intermolecular contact potential extracted from ΔΔG data, a preliminary model. |
| CP_DDGru [ | r | The unweighted intermolecular contact potential extracted from ΔΔG data, a preliminary model. |
| CP_BFVK [ | r | A number of residue-level contact potentials which have been used for protein folding studies. For these, the naming scheme and descriptions can be found elsewhere [ |
| CP_BL [ | r | |
| CP_BT [ | r | |
| CP_GKS [ | r | |
| CP_HLPL [ | r | |
| CP_MJ1 [ | r | |
| CP_MJ2 [ | r | |
| CP_MJ2h [ | r | |
| CP_MJ3h [ | r | |
| CP_MJPL [ | r | |
| CP_MS [ | r | |
| CP_MSBM [ | r | |
| CP_Qa [ | r | |
| CP_Qm [ | r | |
| CP_Qp [ | r | |
| CP_RO [ | r | |
| CP_SJKG [ | r | |
| CP_SKOa [ | r | |
| CP_SKOb [ | r | |
| CP_TD [ | r | |
| CP_TEl [ | r | |
| CP_TEs [ | r | |
| CP_TS [ | r | |
| CP_VD [ | r | |
| AP_DCOMPLEX [ | r | The DComplex potential, reimplemented based on the original data file. |
| AP_dDFIRE [ | d | The dDFIRE potential. |
| AP_DFIRE2 [ | d | The DFIRE 2.0 potential. |
| AP_T1 [ | r | The first of two two-step docking potentials. |
| AP_T2 [ | r | The second of two two-step docking potentials. |
| AP_DOPE [ | r | The standard DOPE potential. |
| AP_DOPE_HR [ | r | The high-resolution potentials implemented in MODELLER [ |
| AP_ACE [ | d | The atomic contact energy desolvation score, calculated using FireDock [ |
| AP_OPUS_PSP [ | d | The OPUS_PSP folding potential. |
| AP_GEOMETRIC | d | The geometric potential reported in Li and Liang: Geometric packing potential function for model selection in protein structure and protein-protein binding predictions, unpublished. |
| AP_DARS [ | r | The DARS decoys-as-reference-state statistical potential. |
| AP_URS [ | r | The URS statistical potential. |
| AP_MPS [ | r | The MFP statistical potential. |
| AP_WENG [ | r | An atomic contact potential. |
| AP_calRW [ | d | The distance-dependent calRW potential. |
| AP_calRWp [ | d | The orientation-dependent calRWplus potential. |
| AP_GOAP_ALL [ | d | The GOAP potential and its two constituent terms. |
| AP_GOAP_DF [ | d | |
| AP_GOAP_G [ | d | |
| AP_PISA [ | d | The PISA score. |
| AP_DDGrw [ | r | The weighted intermolecular contact potential extracted from ΔΔG data. |
| AP_DDGru [ | r | The unweighted intermolecular contact potential extracted from ΔΔG data. |
| ATTRACT [ | d | The ATTRACT scoring function, as calculated in PTools [ |
| PYDOCK_TOT [ | i | The PyDock scoring function and the electrostatics, van der Waals and desolvation terms it is composed from. |
| ELE [ | i | |
| VDW [ | i | |
| DESOLV [ | i | |
| FIREDOCK [ | d | The general purpose, enzyme-inhibitor and antibody-antigen FireDock scores and the insideness concavity score and hydrogen-bonding, π-π, cation-π and aliphatic potentials they are composed from. |
| FIREDOCK_EI [ | d | |
| FIREDOCK_AB [ | d | |
| INSIDE [ | d | |
| HBOND [ | d | |
| PI_PI [ | d | |
| CAT_PI [ | d | |
| ALIPH [ | d | |
| SIPPER [ | i | The SIPPER score and its amino-acid propensity and desolvation constituents. |
| PROPNSTS [ | i | |
| ODA [ | i | |
| ZRANK [ | d | The original ZRANK scoring function. |
| ZRANK2 [ | d | The reoptimised ZRANK scoring function. |
| NIP [ | d | Interface packing score. |
| NSC [ | d | Surface complementarity score. |
| ROSETTA [ | d | The unweighted Rosetta energy, calculated using PyRosetta. |
| ROSETTADOCK [ | d | The optimised RosettaDock energy, calculated using PyRosetta. |
| CG_PP [ | d | The coarse-grain PyRosetta pair-potential, van der Waals, environment potential and β-potential. |
| CG_VDW [ | d | |
| CG_ENV [ | d | |
| CG_BETA [ | d | |
| HBOND2 [ | d | The atomic-resolution PyRosetta hydrogen bonding potential, amino-acid propensity scores, attractive and repulsive van der Waals energies, pair potential and desolvation energy. |
| AA_PROP [ | d | |
| FA_ATR [ | d | |
| FA_REP [ | d | |
| PA_PP [ | d | |
| LK_SOLV [ | d | |
| NHB [ | d | The total number of hydrogen bonds, calculated using PyRosetta. |
| CHARMM_TOT [ | d | The total CHARMM energy, electrostatic energy, SASA energy and van der Waals energy, as calculated using the enerCHARMM script in the MMTSB toolset. |
| CHARMM_ELE [ | d | |
| CHARMM_SASA [ | d | |
| CHARMM_VDW [ | d | |
| SPIDER [ | d | The sub-graph mining based SPIDER score. As the SPIDER program only allowed scoring using a fixed receptor molecule, the unbound receptor conformation was used for this method, with a relaxed parameter set (dRMSD_CutOff = 1.0, intrCvrAbs_CutOff = 20, intrCvrPer_CutOff = 0.3, intrNumPat_CutOff = 10 and intrAveOcc_CutOff = 2). |
Shown are the name of the scoring function and reference, how it was calculated (r for reimplemented, d for downloaded, p for personal communication, i for in-house), and a description/notes.