Matthias Dietzen, Olga V Kalinina, Katerina Taškova, Benny Kneissl, Anna-Katharina Hildebrandt, Elmar Jaenicke, Heinz Decker, Thomas Lengauer, Andreas Hildebrandt.
Abstract
Macromolecular oligomeric assemblies are involved in many biochemical processes of living organisms. The benefits of such assemblies in crowded cellular environments include increased reaction rates, efficient feedback regulation, cooperativity and protective functions. However, an atom-level structural determination of large assemblies is challenging due to the size of the complex and the difference in binding affinities of the involved proteins. In this study, we propose a novel combinatorial greedy algorithm for assembling large oligomeric complexes from information on the approximate position of interaction interfaces of pairs of monomers in the complex. Prior information on complex symmetry is not required; instead, the symmetry is inferred during assembly. We implement an efficient geometric score, the transformation match score, that bypasses the model ranking problems of state-of-the-art scoring functions by scoring the similarity between the inferred dimers of the same monomer bound simultaneously to different binding partners in a (sub)complex and a set of pregenerated docking poses. We compiled a diverse benchmark set of 308 homo- and heteromeric complexes containing 6 to 60 monomers. To explore the applicability of the method, we considered 48 sets of parameters and selected the three sets for which the algorithm correctly reconstructs the maximum number of complexes, namely 252 (81.8%), in at least one of the respective three runs. The crossvalidation coverage, that is, the mean fraction of correctly reconstructed benchmark complexes during crossvalidation, was 78.1%, which demonstrates the ability of the presented method to correctly reconstruct the topology of a large variety of biological complexes.
Keywords: 3D-MOSAIC; complex match score; macromolecular assembly; protein-protein interactions; structural modeling; transformation match score
Year: 2015 PMID: 26248608 PMCID: PMC5049452 DOI: 10.1002/prot.24873
Source DB: PubMed Journal: Proteins ISSN: 0887-3585
Figure 1. Exemplary assembly of the homo‐hexameric hemocyanin from Panulirus interruptus (PDB code 1HCY) using 3D‐MOSAIC. In each iteration, new monomers can be attached to all previously retained solutions. If a matching interface is found, the complex match score increases and the corresponding complex might be ranked further up in the list of solutions (green double‐tilted arrows). Solutions similar to better‐ranked ones or yielding severe steric clashes are discarded. After complex construction, a symmetry optimization can be performed. Complex images created with PyMOL [41].
Performance of 3D‐MOSAIC in the Benchmark
| Parameter setting | N (cov [%]) | covcv [%] | | |
|---|---|---|---|---|
| Best one | 221 (71.8) | 69.1 | 110 (35.7) | 60 (19.5) |
| Best two | 245 (79.5) | 76.6 | 125 (40.6) | 69 (22.4) |
| Best three | 252 (81.8) | 78.1 | 128 (41.6) | 73 (23.7) |
Number N (and coverage cov [%]) of the benchmark complexes reconstructed using the best one, two or three combinations of parameters, with corresponding crossvalidation coverage (covcv [%]) rates.
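The coverage values in parentheses can be reproduced from the counts and the benchmark size of 308 complexes stated in the abstract. The following is a minimal sketch of that arithmetic; the function name `coverage` and the dictionary layout are illustrative assumptions, not part of 3D‐MOSAIC.

```python
# Benchmark size as stated in the abstract.
BENCHMARK_SIZE = 308

# Counts N of reconstructed complexes, taken from the table above.
reconstructed = {"Best one": 221, "Best two": 245, "Best three": 252}


def coverage(n, total=BENCHMARK_SIZE):
    """Return the coverage in percent, rounded to one decimal place."""
    return round(100.0 * n / total, 1)


# Coverage per parameter setting, e.g. 252 of 308 complexes for "Best three".
coverages = {setting: coverage(n) for setting, n in reconstructed.items()}
```

The crossvalidation coverage (covcv) cannot be derived this way, since it is a mean over crossvalidation folds that are not listed here.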
Joint Performance of the Best Three Combinations of Parameters
| Category | Number of complexes | Top 1 | Top 10 | Top 25 | Top 100 | All |
|---|---|---|---|---|---|---|
| Unbound | 9 | 6 | 7 | 7 | 7 | 7 |
| | | 1.00 (0.69) | 1.19 (0.78) | 1.19 (0.78) | 1.19 (0.78) | 1.19 (0.78) |
| Dimer | 8 | 5 | 6 | 6 | 6 | 6 |
| | | 1.00 (0.67) | 1.77 (0.75) | 1.77 (0.75) | 1.77 (0.75) | 1.77 (0.75) |
| Foreign | 108 | 74 | 83 | 86 | 86 | 90 |
| | | 1.00 (0.67) | 1.29 (0.74) | 1.88 (0.77) | 2.01 (0.77) | 8.06 (0.80) |
| Same | 183 | 130 | 143 | 146 | 148 | 149 |
| | | 1.00 (0.68) | 1.36 (0.75) | 1.47 (0.76) | 1.87 (0.77) | 3.18 (0.77) |
| Total | 308 | 215 | 239 | 245 | 247 | 252 |
| | | 1.00 (0.68) | 1.34 (0.75) | 1.62 (0.76) | 1.90 (0.77) | 4.84 (0.78) |
The number of complexes with a correct solution within the top‐ranked N solutions is reported. The mean rank of the first correct solution and the crossvalidation accuracy (in parentheses) are given in the row below each category. The ranks are computed by generating three ranked lists, one for each combination of parameters. Each list is ordered lexicographically with respect to symmetry, then complex match score (cms). The resulting ranked lists are merged, and items with equal rank are ordered with respect to the accumulated docking score.
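The ranking procedure described in this caption can be sketched as follows. This is an illustrative reading of the caption, not the authors' implementation: the `Solution` fields, the function names, and the score directions for symmetry and docking score (higher is better) are assumptions. Only the direction of cms is suggested by the source, which states that a higher complex match score can move a complex up the list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Solution:
    name: str
    symmetry: int         # assumed: larger value = more symmetry
    cms: float            # complex match score, higher ranks better
    docking_score: float  # accumulated docking score, assumed higher = better


def rank_list(solutions):
    """Order one parameter set's solutions by symmetry, then cms (descending)."""
    return sorted(solutions, key=lambda s: (-s.symmetry, -s.cms))


def merge_ranked(ranked_lists):
    """Merge ranked lists; items with equal rank are ordered by docking score."""
    tagged = []
    for ranked in ranked_lists:
        for rank, sol in enumerate(ranked, start=1):
            tagged.append((rank, -sol.docking_score, sol))
    # Sort by per-list rank first, then by docking score (descending).
    tagged.sort(key=lambda t: (t[0], t[1]))
    return [sol for _, _, sol in tagged]


# Two hypothetical parameter sets with made-up scores.
set_a = [Solution("a1", 2, 5.0, 10.0), Solution("a2", 6, 3.0, 7.0)]
set_b = [Solution("b1", 6, 4.0, 9.0), Solution("b2", 1, 9.0, 1.0)]
merged = merge_ranked([rank_list(set_a), rank_list(set_b)])
```

In this toy input, `a2` and `b1` both hold rank 1 in their own lists, so the assumed docking score decides their order in the merged list.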
Effect of Symmetry Optimization on Ranking of Solutions
| Ranking by | Top 1 | Top 10 | Top 25 | Mean Rank |
|---|---|---|---|---|
| cms | 78.2 (3.8%) | 92.9 (1.8%) | 96.0 (1.0%) | 4.00 ± 11.27 |
| symmetry, cms | 82.9 (5.4%) | 94.9 (2.2%) | 97.2 (1.3%) | 3.18 ± 9.77 |
Mean percentage (and standard deviation) of correctly reconstructed benchmark complexes per parameter setting with a near‐native solution among the top 1, 10, or 25 ranks, as well as the mean rank of the first correctly reconstructed complex. Ranking is based either on cms alone or on a lexicographical ordering, first by the extent of symmetry involved and then by cms.
Figure 2. Histogram of docking performance over all 1044 reference binding modes: for each binding mode, the minimum, median and maximum Cα dimer RMSD from the reference mode over all 10,000 docking poses were determined.
Figure 3. Distribution of the difference in best tRMSDs per assembly between CombDock and 3D‐MOSAIC. In only seven out of 190 cases, CombDock yielded a better tRMSD than 3D‐MOSAIC (bars below zero). Images created with Matplotlib [42].
Figure 4. The seven complexes that could be reconstructed in the single residue‐pair interaction constraints experiments. Each assembled complex is superimposed onto the respective reference. Complex images created with PyMOL [41].
Figure 5. Examples of successfully reconstructed assemblies, superimposed onto the corresponding reference complex. Images created with PyMOL [41].
Figure 6. Examples of complexes and corresponding topology graphs for hard cases: (a) ring‐like topology of T4 lysozyme hexamer (PDB code 3SBA), (b) cage‐like topology of pyruvate dehydrogenase E2 60‐mer core complex (PDB code 1B5S), (c) inovirus coat protein filament (PDB code 2C0W) composed of helical monomers, and (d) human cystatin C complex (PDB code 1R4C) forming interchain β‐sheets. Different node colors correspond to different protein types, different edge colors to different binding modes. Images created with PyMOL [41].