Ezgi Karaca, Adrien S J Melquiond, Sjoerd J de Vries, Panagiotis L Kastritis, Alexandre M J J Bonvin.
Abstract
Over the past years, large scale proteomics studies have generated a wealth of information on biomolecular complexes. Adding the structural dimension to the resulting interactomes represents a major challenge that classical experimental structural methods alone will have difficulty meeting. Complementary modeling techniques such as docking are thus needed. Among the current docking methods, HADDOCK (High Ambiguity-Driven DOCKing) distinguishes itself by the use of experimental and/or bioinformatics data to drive the modeling process and has shown strong performance in the critical assessment of prediction of interactions (CAPRI), a blind experiment for the prediction of interactions. Although most docking programs are limited to binary complexes, HADDOCK can deal with multiple molecules (up to six), a capability that will be required to build large macromolecular assemblies. We present here a novel web interface of HADDOCK that allows the user to dock up to six biomolecules simultaneously. This interface allows the inclusion of a large variety of experimental and/or bioinformatics data and supports several types of cyclic and dihedral symmetries in the docking of multibody assemblies. The server was tested on a benchmark of six cases, containing five symmetric homo-oligomeric protein complexes and one symmetric protein-DNA complex. Our results reveal that, in the presence of bioinformatics and/or experimental data, HADDOCK shows excellent performance: in all cases, HADDOCK was able to generate good to high quality solutions and ranked them at the top, demonstrating its ability to model symmetric multicomponent assemblies. Docking methods can thus play an important role in adding the structural dimension to interactomes. However, although the current docking methodologies were successful for a vast range of cases, considering the variety and complexity of macromolecular assemblies, inclusion of some kind of experimental information (e.g. from mass spectrometry, nuclear magnetic resonance, cryoelectron microscopy, etc.) will remain highly desirable to obtain reliable results.
Year: 2010 PMID: 20305088 PMCID: PMC2938057 DOI: 10.1074/mcp.M000051-MCP201
Source DB: PubMed Journal: Mol Cell Proteomics ISSN: 1535-9476 Impact factor: 5.911
Various experimental data that can be incorporated into HADDOCK
| Experimental data | HADDOCK representation |
|---|---|
| Mutagenesis data | Active and passive residues |
| Hydrogen/deuterium exchange data | Active and passive residues |
| Bioinformatics interface predictions | Active and passive residues |
| Mass spectrometry data | |
| Cross-linking data | Custom CNS restraints |
| Radical probe mass spectrometry | Active and passive residues |
| Limited proteolysis mass spectrometry | Active and passive residues or directly as an MTMDAT-generated HADDOCK parameter file |
| NMR data | |
| Chemical shift perturbation data | Active and passive residues |
| Cross-saturation experiments | Active and passive residues |
| Residual dipolar couplings | Directly |
| Diffusion anisotropy restraints | Directly |
| NOEs | Custom CNS restraints |
| Dihedral angles | Directly |
| Hydrogen bonds | Directly |
| Paramagnetic restraints | Under development |
| Shape data | |
| SAXS | Under development |
| EM | Under development |
NOEs, nuclear Overhauser effects.
Fig. 1. Illustration of AIRs used in HADDOCK to drive docking. Active residues correspond to residues experimentally identified or predicted to be at the interface. Passive residues are surface neighbors of active residues. AIRs are defined for each active residue with the effective distance being calculated from the sum of all individual distances between any atom of an active residue and any atom of all active and passive residues on the partner molecule (Equation 1).
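
Equation 1 itself is not reproduced in this record. As a minimal sketch, assuming the effective distance takes the inverse-sixth-power summed form commonly used for ambiguous distance restraints, d_eff = (Σ d⁻⁶)^(−1/6), the calculation for one active residue could look as follows. The function name and the coordinates are hypothetical and serve only as illustration.

```python
import math

def effective_distance(active_atoms, partner_atoms):
    """Effective distance between the atoms of one active residue and all
    atoms of the active and passive residues on the partner molecule.

    Assumed form (not taken from this record): d_eff = (sum d^-6)^(-1/6).
    Coordinates are (x, y, z) tuples in Angstrom.
    """
    inv6_sum = 0.0
    for a in active_atoms:
        for b in partner_atoms:
            inv6_sum += math.dist(a, b) ** -6
    return inv6_sum ** (-1.0 / 6.0)

# Hypothetical coordinates: one atom of an active residue and three
# partner atoms at 3, 4 and 10 Angstrom from it.
active = [(0.0, 0.0, 0.0)]
partner = [(3.0, 0.0, 0.0), (0.0, 4.0, 0.0), (0.0, 0.0, 10.0)]
d_eff = effective_distance(active, partner)
print(f"effective distance: {d_eff:.2f} A")
```

With this form the effective distance is never longer than the shortest individual distance, so a single close contact with any active or passive residue of the partner is enough to bring it down.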
Definition and illustration of symmetry restraining options in HADDOCK
Properties of multimer docking benchmark
| Protein Data Bank code | CATH classification | Complex type | Docking type | Symmetry type | Number of amino acids |
|---|---|---|---|---|---|
| | Mainly β | Homotrimer | Bound | C3 | 128 |
| | Mainly α/mainly β | Homotrimer | Unbound | C3 | 400 |
| | αβ | Homotetramer | Bound | D2 | 114 |
| | αβ | Homotetramer | Bound | D2 | 200 |
| | Mainly β | Homopentamer | Bound | C5 | 289 |
| | Mainly α | Homodimer-double-stranded DNA | Unbound | C2 | 71 (protein), 20 (DNA) |
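
The cyclic symmetry types in the table (C2, C3, C5) mean that the assembly maps onto itself under rotation by 360°/n about a single axis. The sketch below illustrates one consequence of that definition: the distance from an atom in one subunit to an atom in the next subunit is the same for every neighboring pair. It is a geometric illustration under stated assumptions, not HADDOCK's restraint implementation; the helper names and coordinates are hypothetical.

```python
import math

def rotate_z(point, angle):
    """Rotate an (x, y, z) point about the z axis by `angle` radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, z)

def build_cn(monomer, n):
    """Build n copies of a monomer related by Cn symmetry about the z axis."""
    return [[rotate_z(p, 2 * math.pi * k / n) for p in monomer]
            for k in range(n)]

# Hypothetical two-atom monomer, coordinates in Angstrom.
monomer = [(5.0, 0.0, 0.0), (6.0, 1.0, 2.0)]
trimer = build_cn(monomer, 3)  # subunits A, B, C

# Distance from atom 0 of one subunit to atom 1 of the next subunit.
d_ab = math.dist(trimer[0][0], trimer[1][1])
d_bc = math.dist(trimer[1][0], trimer[2][1])
d_ca = math.dist(trimer[2][0], trimer[0][1])
print(d_ab, d_bc, d_ca)  # equal up to floating-point error
```

Equalities of this kind can be checked on a docking model without knowing the position of the symmetry axis, because they involve only intermolecular distances.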
Fig. 2. View of the multibody web interface of HADDOCK for data-driven docking.
List of active and passive residues used in HADDOCK to dock various benchmark complexes
| Protein Data Bank code | Active residues | Passive residues |
|---|---|---|
| | 3, 4, 6, 7, 8, 11–18, 21, 28–31, 33, 69, 72, 73, 75, 77, 81, 82, 85, 88, 92, 100–114, 120, 122, 124 | 2, 9, 23, 24, 26, 36, 37, 38, 42, 52, 58, 63, 64, 67, 70, 79, 80, 83, 86, 89, 90, 93, 96, 97, 98, 99, 115, 116, 118, 126–128 |
| | 5, 8, 9, 10, 11, 13, 54, 71, 73, 75, 76, 78, 79, 87, 93, 98, 110, 118, 193, 196, 219, 222, 244, 248, 251, 267, 269, 270 | 4, 7, 12, 15, 21, 22, 24, 26, 28, 34, 36, 56, 57, 64, 66–70, 72, 77, 81, 83, 86, 92, 94, 95, 96, 107, 108, 120, 131, 150, 152–154, 192, 194, 195, 216–218, 224, 243, 246, 250, 253–263, 266, 271, 272, 273 |
| | 3, 15, 17, 19, 41, 42, 47–52, 71, 76–87, 89, 91, 93, 98, 99, 100–103, 106, 108, 110, 112–114 | 1, 2, 5, 7, 9, 12, 13, 21, 24, 25, 27, 39, 43, 45, 46, 53, 54, 64, 66, 68, 69, 70, 72, 73–75, 96, 97 |
| | 2–10, 16, 41–47, 50, 51, 54, 55, 57, 63–73, 138, 140–142, 144, 145, 147, 150, 151, 154, 155, 158, 159, 162, 163, 176–185 | 12–19, 35, 60–62, 74, 75, 89, 91, 95, 102, 129, 133–137, 146, 165–168, 170–174 |
| | 32–38, 52, 71, 74, 75, 78, 79, 107, 111–119, 123, 127, 130–137, 139, 142, 152, 160, 162, 225, 228, 229, 239, 240–245, 250, 252–260, 264–269, 274, 275, 288, 289, 291, 296, 299, 300, 303, 314, 316 | 39–41, 50, 51, 54, 56, 58, 60, 63–68, 72, 73, 77, 80, 81, 88, 93, 101, 102, 104–106, 108–110, 117, 124, 126, 128, 138, 140, 141, 143–146, 150, 151, 153–156, 158, 170, 177, 179, 183, 185, 231–236, 238, 244, 246–249, 251, 261, 262, 276, 290, 292–295, 297, 305, 307, 309, 311, 312 |
| | Protein: 29, 31, 32, 42–44; DNA: 4–7, 13–18, 22–25, 32–36 | Protein: 9, 18–20, 27, 28, 30, 34, 36, 37, 40, 41, 45, 46 |
The active and passive residue information is gathered via CPORT.
The active and passive residue information is gathered via CPORT and literature data.
The active and passive residue information is gathered via conservation and experimental data (mutagenesis and ethylation interference).
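
The residue lists in the table above mix single residue numbers with en-dash ranges such as "11–18". As a minimal sketch, assuming the lists need to be turned into explicit residue numbers before further use, a small parser could expand them as follows. The function name is hypothetical, and the example string is the start of the first active-residue entry in the table.

```python
import re

def expand_residues(spec):
    """Expand a residue list such as '3, 4, 11–18' into a list of integers.

    Accepts both the en dash used in the table and a plain hyphen.
    """
    residues = []
    for token in spec.split(","):
        token = token.strip()
        if not token:
            continue
        match = re.fullmatch(r"(\d+)\s*[–-]\s*(\d+)", token)
        if match:
            start, end = int(match.group(1)), int(match.group(2))
            residues.extend(range(start, end + 1))  # ranges are inclusive
        else:
            residues.append(int(token))
    return residues

active = expand_residues("3, 4, 6, 7, 8, 11–18, 21")
print(active)
```

Entries that carry a molecule label, such as "Protein: …; DNA: …" in the last row, would need to be split on the label first; that step is left out of the sketch.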
Multibody docking results obtained using the multibody interface of the HADDOCK web server
| Protein Data Bank code | Best structure quality/rank | Best structure i-r.m.s.d./l-r.m.s.d. (Å) | Best cluster quality/rank | Best cluster i-r.m.s.d./l-r.m.s.d. (Å) |
|---|---|---|---|---|
| | ★★★/1 | 0.8/0.7 | ★★★/1 | 0.8 ± 0.1/0.7 ± 0.1 |
| | ★★/1 | 1.7/5.2 | ★★/1 | 1.8 ± 0.1/5.3 ± 0.1 |
| | ★★★/1 | 0.9/1.2 | ★★★/1 | 0.8 ± 0.1/1.3 ± 0.6 |
| | ★★★/1 | 1.0/1.2 | ★★★/1 | 1.2 ± 0.2/1.3 ± 0.2 |
| | ★★★/1 | 0.7/0.7 | ★★/1 | 4.1 ± 0.1/4.0 ± 0.1 |
| | ★★/1 | 1.79/2.2 | ★★/1 | 2.12 ± 0.3/2.8 ± 0.6 |
For the definitions of i-r.m.s.d. and l-r.m.s.d. refer to “Materials and Methods.”
Bound docking; the docking was performed with the separated monomers taken from the reference crystal structure.
Unbound docking; the docking was performed with the free form of the monomers (see “Materials and Methods” for details).
Fig. 3. View of the best HADDOCK solutions (having colored monomers) superimposed onto their respective crystal reference structures: a, 1QU9; b, 1URZ; c, 1OUS; d, 1VIM; e, 1VPN; and f, 3CRO. The figures were generated with PyMOL (DeLano Scientific LLC).