Marc van Dijk, Aalt D J van Dijk, Victor Hsu, Rolf Boelens, Alexandre M J J Bonvin.
Abstract
Intrinsic flexibility of DNA has hampered the development of efficient protein-DNA docking methods. In this study we extend HADDOCK (High Ambiguity Driven DOCKing) [C. Dominguez, R. Boelens and A. M. J. J. Bonvin (2003) J. Am. Chem. Soc. 125, 1731-1737] to explicitly deal with DNA flexibility. HADDOCK uses non-structural experimental data to drive the docking during a rigid-body energy minimization stage followed by semi-flexible and water refinement stages. The refinement stages allow for flexibility of all DNA nucleotides and of the protein residues at the predicted interface. We evaluated our approach on the monomeric repressor-DNA complexes formed by bacteriophage 434 Cro, the Escherichia coli Lac headpiece and bacteriophage P22 Arc. Starting from unbound proteins and canonical B-DNA, we correctly predict the spatial disposition of the complexes and the specific conformation of the DNA in the published complexes. This information is subsequently used to generate a library of pre-bent and twisted DNA structures that serves as input for a second docking round. The resulting top ranking solutions exhibit high similarity to the published complexes in terms of root mean square deviations, intermolecular contacts and DNA conformation. Our two-stage docking method is thus able to successfully predict protein-DNA complexes from unbound constituents using non-structural experimental data to drive the docking.
Year: 2006 PMID: 16820531 PMCID: PMC1500871 DOI: 10.1093/nar/gkl412
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Average DNA base pair and base pair step parameters
| Parameter (canonical B-DNA) | Cro Ref. (3CRO) | Cro from B-DNA | Cro from DNA lib. | Lac Ref. (1LCC) | Lac from B-DNA | Lac from DNA lib. | Arc Ref. (1BDT) | Arc from B-DNA | Arc from DNA lib. |
|---|---|---|---|---|---|---|---|---|---|
| Twist (°) | 34.4 ± 5.3 | 36.4 ± 1.0 | 34.9 ± 3.5 | 34.2 ± 5.4 | 36.8 ± 0.7 | 36.5 ± 3.6 | 32.5 ± 4.1 | 34.6 ± 1.2 | 35.5 ± 3.3 |
| Roll (°) | 2.5 ± 3.2 | −0.2 ± 2.0 | 1.0 ± 8.1 | 2.6 ± 11.2 | 0.3 ± 1.7 | 0.2 ± 10.4 | 3.3 ± 5.5 | 4.2 ± 1.9 | 1.0 ± 7.7 |
| Tilt (°) | 0.5 ± 3.7 | 0.0 ± 2.0 | 0.4 ± 5.4 | −2.7 ± 7.9 | 0.2 ± 1.6 | 0.2 ± 4.9 | −0.3 ± 3.3 | 0.9 ± 1.5 | 0.4 ± 5.9 |
| Rise (3.4 ± 0.0 Å) | 3.4 ± 0.3 | 3.3 ± 0.2 | 3.4 ± 0.4 | 3.2 ± 0.2 | 3.3 ± 0.2 | 3.3 ± 0.3 | 3.3 ± 0.2 | 3.3 ± 0.1 | 3.3 ± 0.4 |
| Slide (0.3 ± 0.2 Å) | −0.4 ± 0.4 | 0.0 ± 0.1 | −0.6 ± 0.6 | −0.4 ± 0.7 | 0.2 ± 0.2 | 0.1 ± 0.7 | −0.4 ± 0.7 | 0.0 ± 1.7 | −0.3 ± 0.5 |
| Shift (0.0 ± 0.1 Å) | 0.0 ± 0.5 | 0.1 ± 0.1 | 0.0 ± 0.6 | −0.1 ± 0.7 | 0.1 ± 0.3 | 0.0 ± 0.5 | −0.1 ± 0.9 | 0.1 ± 0.3 | 0.1 ± 0.8 |
| Opening (−3.3 ± 2.5°) | −4.5 ± 4.8 | −4.6 ± 2.2 | −3.3 ± 4.0 | −6.7 ± 7.9 | −2.0 ± 2.8 | −2.0 ± 3.8 | 0.4 ± 4.3 | −0.8 ± 2.0 | −0.8 ± 4.7 |
| Propeller (°) | −145 | −7.5 ± 4.4 | −0.9 ± 12.7 | −14.6 ± 4.4 | −8.5 ± 5.0 | −9.3 ± 10.1 | −4.3 ± 8.3 | −4.7 ± 3.8 | −1.1 ± 14.1 |
| Buckle (°) | 1.0 ± 8.1 | −1.4 ± 5.1 | −0.6 ± 10.8 | −6.9 ± 13.2 | 4.6 ± 3.5 | −0.2 ± 11.8 | −2.7 ± 6.5 | 5.2 ± 4.9 | −2.5 ± 13.5 |
| Stagger (0.1 ± 0.0 Å) | −0.1 ± 0.5 | −0.1 ± 0.2 | −0.3 ± 0.6 | 0.1 ± 0.8 | −0.1 ± 0.2 | 0.1 ± 0.5 | 0.0 ± 0.3 | −0.2 ± 0.3 | −0.2 ± 0.5 |
| Stretch (−0.1 ± 0.0 Å) | −0.3 ± 0.2 | −0.1 ± 0.1 | −0.2 ± 0.1 | −0.1 ± 0.2 | −0.2 ± 0.1 | −0.1 ± 0.1 | −0.2 ± 0.1 | −0.1 ± 0.1 | −0.1 ± 0.1 |
| Shear (0.0 ± 0.1 Å) | 0.2 ± 0.5 | 0.1 ± 0.0 | 0.0 ± 0.3 | −0.3 ± 0.5 | 0.1 ± 0.1 | −0.1 ± 0.2 | −0.1 ± 0.3 | 0.0 ± 0.4 | −0.1 ± 0.2 |
| Correlations | | | | | | | | | |
| Roll-twist (0.26) | −0.47 | −0.55 | −0.44 | −0.65 | −0.61 | −0.76 | −0.85 | −0.16 | −0.23 |
| Roll-slide (0.30) | −0.40 | −0.43 | −0.37 | −0.65 | −0.48 | −0.61 | −0.44 | 0.00 | −0.43 |
Average parameters ± standard deviations are shown for the published complexes (Ref.) and for the top five ranking solutions from unbound flexible docking starting from canonical B-DNA (from B-DNA) and from a library of pre-bent and twisted DNA structures (from DNA lib.). For comparison, the average values for the canonical B-DNA input structure are shown in brackets next to each parameter in the first column.
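
The summary statistics in this table (a mean with its standard deviation, and a roll-twist or roll-slide correlation) can be illustrated with a short sketch. It assumes the correlation is a Pearson coefficient over per-step values and that the standard deviation is the sample standard deviation; the text here states neither. The `roll` and `twist` lists are made-up values for illustration, not data from the paper.

```python
from math import sqrt
from statistics import mean, stdev


def summarize(values):
    """Return (mean, sample standard deviation) of per-step values."""
    return mean(values), stdev(values)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical base pair step parameters (degrees), one value per step.
roll = [2.0, -1.5, 6.0, 0.5, -3.0, 8.0]
twist = [35.0, 37.5, 31.0, 36.0, 38.5, 29.5]

roll_mean, roll_sd = summarize(roll)
roll_twist_corr = pearson(roll, twist)
print(f"roll = {roll_mean:.1f} +/- {roll_sd:.1f}, roll-twist r = {roll_twist_corr:.2f}")
```

A negative coefficient, as in most columns of the table, means that steps with higher roll tend to have lower twist.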
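Definition of the AIRs for the three repressor/operator systems
| | Protein | DNA | Reference |
|---|---|---|---|
| Cro – O1R | | | |
| Active | K27<sup>a</sup>, Q29<sup>a</sup>, S30<sup>a</sup>, L33<sup>a,b</sup> | T3<sup>c</sup>, A4<sup>a,c</sup>, C5<sup>a</sup>, A6<sup>a</sup>, G30<sup>c</sup>, T31<sup>a,b,c</sup>, T32<sup>a,b,c</sup>, T33<sup>a</sup>, G34<sup>a</sup>, T35<sup>a</sup> | (50–53) |
| Passive | R10, K40, R41, P42 | - | |
| Lac – O1 | | | |
| Active | T5<sup>a,b</sup>, S16<sup>a</sup>, Y17<sup>b</sup>, Q18<sup>b</sup>, R22<sup>b</sup>, V30<sup>b</sup> | T4<sup>c</sup>, G5<sup>a,c</sup>, T6<sup>a</sup>, G7<sup>a</sup>, A8<sup>a</sup>, C14<sup>c</sup>, T15<sup>a,c</sup>, C16<sup>a</sup>, A17<sup>a</sup>, C18<sup>a</sup> | (54–57) |
| Passive | H29, S31 | - | |
| Arc – operator | | | |
| Active | F10<sup>a</sup>, R13<sup>a</sup>, S32<sup>a</sup> | T1<sup>c</sup>, A2<sup>c</sup>, T3<sup>c</sup>, G5<sup>c</sup>, T6<sup>a</sup>, A7<sup>a</sup>, G8<sup>a</sup>, A9<sup>a</sup>, A14<sup>c</sup>, C15<sup>c</sup>, T16<sup>c</sup>, C17<sup>a</sup>, T18<sup>a</sup>, A19<sup>a</sup> | (58) |
| Passive | Q9, N11, R16, D20, R23 | - | |
The Arc monomer is composed of two symmetric subunits and only the restraints for one subunit are shown.
<sup>a</sup>Conserved residues derived from the database of homology-derived secondary structure of proteins (HSSP).
<sup>b</sup>Mutagenesis data.
<sup>c</sup>Ethylation interference.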
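
To make the active/passive distinction concrete, the sketch below expands the Cro – O1R row into ambiguous interaction restraints (AIRs). It assumes the general HADDOCK convention that each active residue of one partner is restrained to the set of all active and passive residues of the other partner; this page does not spell that rule out. The `Partner` class and `build_airs` function are hypothetical names for illustration and do not correspond to HADDOCK's own input format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partner:
    """One docking partner with its active and passive residues."""
    name: str
    active: tuple
    passive: tuple = ()


def build_airs(source, target):
    """Map each active residue of `source` to every active or passive
    residue of `target` (assumed AIR convention, see lead-in)."""
    candidates = tuple(target.active) + tuple(target.passive)
    return {residue: candidates for residue in source.active}


# Residues taken from the Cro - O1R row of the table above.
cro = Partner(
    name="Cro",
    active=("K27", "Q29", "S30", "L33"),
    passive=("R10", "K40", "R41", "P42"),
)
o1r = Partner(
    name="O1R",
    active=("T3", "A4", "C5", "A6", "G30", "T31", "T32", "T33", "G34", "T35"),
)

protein_airs = build_airs(cro, o1r)  # one restraint per active Cro residue
dna_airs = build_airs(o1r, cro)      # one restraint per active DNA base

for residue, candidates in protein_airs.items():
    print(f"{cro.name} {residue} -> any of {', '.join(candidates)}")
```

Each restraint is ambiguous because it is satisfied when the active residue ends up close to any one of its candidate partners, so no specific residue pairing has to be known in advance.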
The r.m.s.d. values from the target and fraction of native contacts for the top five ranking docking solutions of the best cluster
| | Total r.m.s.d.<sup>a</sup> (Å) | Interface r.m.s.d.<sup>b</sup> (Å) | Backbone r.m.s.d.<sup>c</sup> (Å) | DNA r.m.s.d.<sup>d</sup> (Å) | Fnat<sup>e</sup> |
|---|---|---|---|---|---|
| Cro–O1R | | | | | |
| Bound | 0.27 ± 0.00 | 0.24 ± 0.00 | 0.28 ± 0.00 | 0.00 ± 0.00 | 0.88 ± 0.00 |
| Unbound rigid | 2.62 ± 0.01 | 2.37 ± 0.06 | 1.92 ± 0.02 | 2.31 ± 0.00 | 0.53 ± 0.12 |
| Unbound flex. | 2.30 ± 0.07 | 2.07 ± 0.12 | 1.80 ± 0.09 | 1.97 ± 0.15 | 0.80 ± 0.07 |
| DNA lib. | 1.99 ± 0.05 | 1.69 ± 0.06 | 1.51 ± 0.09 | 1.46 ± 0.07 | 0.94 ± 0.00 |
| Lac–O1 | | | | | |
| Bound | 0.34 ± 0.00 | 0.31 ± 0.00 | 0.36 ± 0.00 | 0.00 ± 0.00 | 0.89 ± 0.00 |
| Unbound rigid | 2.84 ± 0.00 | 2.88 ± 0.00 | 2.56 ± 0.00 | 1.71 ± 0.00 | 0.33 ± 0.00 |
| Unbound flex. | 2.64 ± 0.10 | 2.56 ± 0.12 | 2.41 ± 0.12 | 1.90 ± 0.18 | 0.51 ± 0.03 |
| DNA lib. | 2.33 ± 0.06 | 2.29 ± 0.08 | 2.06 ± 0.08 | 1.57 ± 0.09 | 0.54 ± 0.01 |
| Arc–operator | | | | | |
| Bound | 0.22 ± 0.00 | 0.23 ± 0.00 | 0.19 ± 0.00 | 0.00 ± 0.00 | 0.95 ± 0.00 |
| Unbound rigid | 2.58 ± 0.01 | 2.58 ± 0.01 | 1.97 ± 0.02 | 2.52 ± 0.00 | 0.43 ± 0.00 |
| Unbound flex. | 2.24 ± 0.08 | 2.13 ± 0.10 | 1.64 ± 0.10 | 1.88 ± 0.15 | 0.50 ± 0.04 |
| DNA lib. | 2.20 ± 0.15 | 2.19 ± 0.19 | 1.73 ± 0.15 | 1.99 ± 0.11 | 0.51 ± 0.08 |
Average r.m.s.d. values (Å, ± standard deviation) calculated over the entire complex (a), the interface (b), the backbone (c) and the DNA (d) for the five top ranking solutions. The r.m.s.d. values are reported for the bound rigid-body docking (Bound), unbound docking before (Unbound rigid) and after semi-flexible refinement (Unbound flex.) starting from canonical B-DNA, and unbound semi-flexible docking using a library of pre-bent and twisted DNA as input structures (DNA lib.). <sup>e</sup>Fnat is the fraction of native contacts.
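
The two quality measures in this table can be sketched in a few lines. The sketch assumes the model has already been superimposed on the reference, and it calls two residues in contact when any of their atoms lie within a cutoff distance. The 5.0 Å cutoff, the function names and the toy coordinates are illustrative assumptions; the page gives neither the contact definition nor the fitting procedure.

```python
from math import dist, sqrt


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally ordered,
    already superimposed lists of (x, y, z) coordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return sqrt(squared / len(coords_a))


def contacts(protein, dna, cutoff=5.0):
    """Set of (protein residue, nucleotide) pairs with any atom pair
    within `cutoff` Å. Inputs map residue ids to lists of atom coordinates."""
    return {
        (p_res, d_res)
        for p_res, p_atoms in protein.items()
        for d_res, d_atoms in dna.items()
        if any(dist(a, b) <= cutoff for a in p_atoms for b in d_atoms)
    }


def fnat(native_contacts, model_contacts):
    """Fraction of native contacts that are reproduced in the model."""
    if not native_contacts:
        raise ValueError("reference structure has no contacts")
    return len(native_contacts & model_contacts) / len(native_contacts)


# Toy one-atom-per-residue structures (hypothetical coordinates, in Å).
ref_protein = {"K27": [(0.0, 0.0, 0.0)], "Q29": [(10.0, 0.0, 0.0)]}
ref_dna = {"T31": [(0.0, 0.0, 3.0)], "T32": [(10.0, 0.0, 4.0)]}
model_protein = {"K27": [(0.0, 0.0, 1.0)], "Q29": [(10.0, 0.0, -3.0)]}
model_dna = ref_dna

native = contacts(ref_protein, ref_dna)
predicted = contacts(model_protein, model_dna)
model_fnat = fnat(native, predicted)
protein_rmsd = rmsd(
    [atom for atoms in ref_protein.values() for atom in atoms],
    [atom for atoms in model_protein.values() for atom in atoms],
)
print(f"Fnat = {model_fnat:.2f}, r.m.s.d. = {protein_rmsd:.2f} Å")
```

In the toy model one of the two reference contacts is lost, so Fnat is 0.5 even though the r.m.s.d. stays near 2 Å, which illustrates why the table reports both measures.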
Figure 1. HADDOCK score versus r.m.s.d. from the target (all heavy atoms of the complex) for the Cro (A), Lac (B) and Arc (C) repressors in complex with their operator. Solutions of the unbound flexible docking with canonical B-DNA are shown as small black squares, with the five top ranking solutions identified by red squares. Solutions from the docking using a library of pre-bent and twisted DNA structures are shown as small orange circles, with the top five ranking solutions identified by red circles. False positives for Arc are shown within a solid ellipse; these correspond to solutions in which the repressor is shifted by 1 or 2 bp along the DNA.
Figure 2. Major groove width, slide and twist parameters of the five top ranking solutions of the Cro (A, D and G), Lac (B, E and H) and Arc (C, F and I) repressor/operator complexes. Average values plus standard deviations for the solutions of the unbound flexible docking with canonical B-DNA are shown as grey bars and those using a library of pre-bent and twisted DNA structures are shown as white bars. The values as measured in the published complexes are presented as black bars and those of the canonical B-DNA input structures as striped bars for slide (D, E, F) and twist (G, H, I) and as a horizontal solid line for the major groove width. All values for the major groove width are corrected by 5.8 Å to account for van der Waals radii of the phosphate groups. A value of 36° twist is presented as a dashed line for clarification of the twist-slide relationship.
Figure 3. Best solutions of the unbound flexible docking using a library of pre-bent and twisted DNA structures (blue) superimposed on the reference structure (yellow): Cro-O1R (A), Lac-O1 (B) and Arc-operator (C). The structures were superimposed on all heavy atoms of the interface residues (interface r.m.s.d. values: Cro, 1.62 Å; Lac, 2.02 Å; Arc, 1.90 Å). The figures were generated using PyMOL (DeLano Scientific).