Hana Faiger, Marina Ivanchenko, Tali E Haran.
Abstract
TBP recognizes its target sites, TATA boxes, by recognizing their sequence-dependent structure and flexibility. Studying this mode of TATA-box recognition, termed 'indirect readout', is important for elucidating the binding mechanism in this system, as well as for developing methods to locate new binding sites in genomic DNA. We determined the binding stability and TBP-induced TATA-box bending for consensus-like TATA boxes. In addition, we calculated the individual information score of all studied sequences. We show that various non-additive effects exist in TATA boxes, dependent on their structural properties. By several criteria, we divide TATA boxes into two main groups. The first group contains sequences with 3-4 consecutive adenines. Sequences in this group have a rigid, context-independent, cooperative structure, best described by a nearest-neighbor non-additive model. Sequences in the second group have a flexible, context-dependent conformation, which cannot be described by an additive model or by a nearest-neighbor non-additive model. Classifying TATA boxes by these and other structural rules clarifies the different recognition pathways and binding mechanisms used by TBP upon binding to different TATA boxes. We discuss the structural and evolutionary sources of the difficulties in predicting new binding sites by probabilistic weight-matrix methods for proteins in which indirect readout is dominant.
Year: 2007 PMID: 17576671 PMCID: PMC1935006 DOI: 10.1093/nar/gkm451
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Table 1. Various analyses of the TATA boxes studied here
| Name | Sequence (a) | yTBPc-induced TATA-box bending (b) | Half-life of fraction B (minutes) (c) | 'A' fraction (c) | 'B' fraction (c) | EPD occurrences (8 bp) (d) | EPD frequency at positions 7–8 (e) | Dinucleotide slide flexibility (kJ/mol) (f) | Tetranucleotide conformational energy at positions 6–9 (kJ/mol) (g) | Total tetranucleotide conformational energy (kJ/mol) (h) | Helical twist at A4A5 or A4T5 (i) | Z statistics for positions 6–9 (j) | Information score, mononucleotide (k) | Information score, dinucleotide (k) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| I. MLP-like TATA boxes: | ||||||||||||||
| MLP* | CGGGC | 65(±4)° | 255(±24) | 0.17(±3) | 0.83(±3) | 184 | 0.243 | 7.58 | −166.8 | −2662 | 11.4° | 2.7 | 0.0 | 0.0 |
| T7A8* | CGGGC | 63(±3)° | 230(±12) | 0.19(±8) | 0.81(±8) | 142 | 0.222 | 7.13 | −170.3 | −2666 | | 5.0 | 0.32 | 0.21 |
| A8 | CGGGC | 53(±2)° | 239(±4) | 0.13(±1) | 0.87(±1) | 74 | 0.085 | 13.72 | −175.6 | −2679 | 4.9° | −4.1 | 0.02 | 0.68 |
| T8 | CGGGC | 50(±3)° | 122(±8) | 0.74(±3) | 0.26(±3) | 27 | 0.021 | 11.69 | −171.6 | −2672 | 13.5° | −2.3 | 1.27 | 2.33 |
| T7 | CGGGC | 45(±4)° | 110(±5) | 0.75(±3) | 0.25(±3) | 19 | 0.036 | 1.35 | −163.1 | −2656 | 14.0° | −4.0 | 0.30 | 1.89 |
| II. E4-like TATA boxes: | ||||||||||||||
| T5* | CGGGC | 76(±4)° | 139(±13) | 0.29(±4) | 0.71(±4) | 99 | 0.266 | 7.58 | −166.8 | −2648 | 8.3° | 4.0 | 0.71 | 0.71 |
| (TA)4* | CGGGC | 65(±4)° | 163(±6) | 0.26(±4) | 0.74(±4) | 75 | 0.190 | 7.13 | −170.3 | −2652 | 1.5° | 1.9 | 1.03 | 0.92 |
| T5A8 | CGGGC | 63(±3)° | 332(±16) | 0.19(±3) | 0.81(±3) | 91 | 0.370 | 13.72 | −175.6 | −2664 | 4.8° | 1.6 | 0.73 | 1.39 |
| T5T8 | CGGGC | 53(±3)° | 195(±8) | 0.20(±2) | 0.80(±2) | 9 | 0.027 | 11.69 | −171.6 | −2657 | 8.3° | −1.0 | 1.98 | 3.04 |
| T5T7* | CGGGC | 43(±3)° | 78(±6) | 0.13(±2) | 0.87(±2) | 12 | 0.033 | 1.35 | −163.1 | −2641 | | −3.1 | 1.01 | 2.6 |
Numbers in parentheses are the SE of the mean, which includes the experimental error between the independent experiments and the deviation of the experimental points from the curve-fitting model.
(a) The 20-bp sequence in the stem of the hairpin constructs. The letters in bold are the 8-bp core TATA box.
(b) Bend angles determined at 30°C. The bend center is between the 5th and the 6th bp, or on the 6th bp, and is pointing into the major groove at the bend center. Bend angles for sequences with an asterisk are from Bareket-Samish et al. (13).
(c) Equation used: $F(t)/F(0) = A e^{-k_1 t} + B e^{-k_2 t}$, where $F(t)$ is the fraction of molecules bound at time $t$, and A and B are the fractions of molecules dissociating with macroscopic rate constants $k_1$ and $k_2$, respectively.
The half-life was determined from $t_{1/2} = \ln 2 / k_2$. Asterisks denote sequences for which dissociation kinetics data were measured by Bareket-Samish et al. (13) and re-analyzed here with this equation.
(d) Number of occurrences of 8-bp TATA boxes in EPD release 89.
(e) Frequency of occurrence of the dinucleotide at positions 7/8 in EPD release 89, calculated separately from the YWTAAADN and YWTATADN datasets, for groups I and II, respectively.
(f) Slide flexibility (in kJ/mol) of the dinucleotide at positions 7/8, determined from the curvature of the slide/shift stacking potential at the minimum energy (Packer et al. (54)).
(g) Minimum conformational energy (in kJ/mol) for the tetranucleotides at positions 6–9, calculated from the data of Packer et al. (39).
(h) Minimum tetranucleotide conformational energy (in kJ/mol) summed along each sequence. Calculations are based on the data of Packer et al. (39).
(i) Observed in TBP/TATA-box co-crystal structures having these TATA-box sequences.
(j) Z statistics, the deviation of the observed frequency of DNA tracts from that expected based on mononucleotide composition. See text for details.
(k) Individual information score calculated from either mononucleotide or dinucleotide weight matrices. See text for details.
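The Z statistic in the footnotes is defined only in the main text, which is not reproduced here. As a rough illustration of the idea (observed tract count versus the count expected from mononucleotide composition), the sketch below uses a simple binomial Z score. The formula, the function name `tract_z_score`, and all input numbers are assumptions made for illustration, not the authors' procedure or data.

```python
import math


def tract_z_score(observed, n_sites, base_freqs, tract):
    """Binomial Z score for a DNA tract (assumed form, not the paper's).

    observed   -- number of sites at which the tract was seen
    n_sites    -- total number of sites examined
    base_freqs -- mononucleotide frequencies, e.g. {"A": 0.25, ...}
    tract      -- the tract sequence, e.g. "AAAA"
    """
    # Probability of the tract if bases were independent draws
    # from the mononucleotide composition.
    p = 1.0
    for base in tract:
        p *= base_freqs[base]
    expected = n_sites * p
    std_dev = math.sqrt(n_sites * p * (1.0 - p))
    return (observed - expected) / std_dev


# Hypothetical inputs: uniform base composition, 2560 sites, 20 hits.
uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
z = tract_z_score(observed=20, n_sites=2560, base_freqs=uniform, tract="AAAA")
print(f"Z = {z:.2f}")
```

A positive value means the tract occurs more often than mononucleotide composition alone would predict, and a negative value means it occurs less often.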
Figure 2. Dissociation kinetics of yTBPc (27 nM) from consensus-like TATA-box variants embedded in hairpin constructs (0.4 nM). The number below each gel denotes the time after adding competitor DNA (1.76 μM).
Figure 4. Dissociation kinetics experiments using methylated DNA targets. Left: double-stranded stem of DNA hairpin containing the MLP target with methylated cytosine residues (denoted by M). Right: stem of DNA hairpin containing the T7 target with methylated cytosine residues. For other details see Figure 2.
Figure 1. Phasing analysis of yTBPc-induced TATA-box bending. Shown are the relative mobilities of the bound DNA divided by the relative mobilities of the free DNA as a function of the linker length. The values shown are of one representative experiment (of 3–4 independent experiments). The line is from the best fit to a cosine function (44).
Figure 3. The fraction of molecules bound to consensus-like TATA-box variants at time (t), divided by the fraction of molecules bound at time zero, plotted as a function of time. The lines are the best fits to a double-exponential curve. Solid squares, MLP; solid circles, T7A8; solid down triangles, A8; solid up triangles, T8; solid diamonds, T7; open squares, T5; open circles, (TA)4; open down triangles, T5A8; open up triangles, T5T8; open diamonds, T5T7. The experimental points shown are from a single experiment, out of 3–6 independent experiments conducted with each DNA target. Hence, they may deviate slightly from the averaged values presented in Table 1.
Figure 5. Comparison between the dissociation kinetics of yTBPc from methylated and non-methylated TATA boxes. Fraction of yTBPc molecules bound at time (t) divided by the fraction of molecules bound at time (0) is plotted as a function of time. Solid squares, MLP; open squares, methylated MLP; solid circles, T7; open circles, methylated T7. For other details see Figure 3.
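The double-exponential dissociation model used for the curves in Figures 3 and 5, together with the half-life relation t1/2 = ln2/k2 given in the table footnotes, can be sketched as follows. This is a minimal illustration, not the authors' fitting code. The A, B and half-life values are taken from the MLP row of the table, while the fast rate constant `K1_ASSUMED` is not reported in the table and is purely hypothetical.

```python
import math


def rate_constant_from_half_life(t_half):
    """Invert t1/2 = ln2 / k to get k (1/min when t_half is in minutes)."""
    return math.log(2) / t_half


def fraction_bound(t, a, k1, b, k2):
    """Fraction still bound relative to time zero:
    F(t)/F(0) = A*exp(-k1*t) + B*exp(-k2*t)."""
    return a * math.exp(-k1 * t) + b * math.exp(-k2 * t)


# MLP row of the table: A = 0.17, B = 0.83, half-life of fraction B = 255 min.
K2_MLP = rate_constant_from_half_life(255.0)
# k1 is not listed in the table; this value is an assumption for illustration.
K1_ASSUMED = 0.1  # 1/min

for t in (0, 60, 255):
    value = fraction_bound(t, 0.17, K1_ASSUMED, 0.83, K2_MLP)
    print(f"t = {t:3d} min  F(t)/F(0) = {value:.3f}")
```

Because A + B = 1 for each sequence in the table, the modelled curve starts at 1 at time zero. Once the fast component has decayed, the curve is dominated by the B term, which is why the half-life is reported for fraction B.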