Victor X Jin, Gregory A C Singer, Francisco J Agosto-Pérez, Sandya Liyanarachchi, Ramana V Davuluri.
Abstract
BACKGROUND: The canonical core promoter elements consist of the TATA box, initiator (Inr), downstream core promoter element (DPE), TFIIB recognition element (BRE) and the newly discovered motif 10 element (MTE). The motifs for these core promoter elements are highly degenerate, which tends to lead to a high false discovery rate when attempting to detect them in promoter sequences.
Year: 2006 PMID: 16522199 PMCID: PMC1475891 DOI: 10.1186/1471-2105-7-114
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Schematic diagram of the core promoter elements.
Figure 2. Average sequence similarity between orthologous human and mouse promoter regions is very high at the transcription start site (vertical dotted line), and drops sharply both up- and downstream of this point. Points indicate the mean percent identity in sliding 20-base windows along our dataset of 9,010 orthologous mouse-human promoter pairs. 95% confidence interval bars are plotted at every 10 bases.
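The sliding-window statistic described in the Figure 2 legend can be sketched as follows. This is only an illustration, assuming each human-mouse pair is already aligned to equal length with `-` marking gaps; the function names `window_identity` and `mean_profile` are hypothetical and do not come from the paper.

```python
def window_identity(seq_a, seq_b, window=20):
    """Percent identity in every sliding window of two aligned sequences.

    Assumes both sequences are already aligned to the same length;
    a gap character ('-') never counts as a match.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = [a == b and a != "-" for a, b in zip(seq_a.upper(), seq_b.upper())]
    return [
        100.0 * sum(matches[i:i + window]) / window
        for i in range(len(matches) - window + 1)
    ]


def mean_profile(pairs, window=20):
    """Average the per-window identity across many aligned promoter pairs."""
    profiles = [window_identity(a, b, window) for a, b in pairs]
    return [sum(column) / len(profiles) for column in zip(*profiles)]
```

Calling `mean_profile` on a list of aligned `(human, mouse)` string pairs gives one mean value per window start, which is the kind of curve the legend describes; confidence intervals are left out of this sketch.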
Enumeration of core promoter elements in the human genome, with and without considering conservation in the mouse genome
| Motif (core, PWM score cutoffs) | Elements found in 9,010 promoter sequences: real sequences | Elements found in 9,010 promoter sequences: randomized sequences | Elements conserved in the orthologous mouse promoters: real sequences | Elements conserved in the orthologous mouse promoters: randomized sequences | Significance of increase in signal-to-noise ratio (one-tailed) [a] |
| BRE (N/A [b], 0.81) | 2696 (29.9%) | 2503 (27.8%) | 1952 (21.7%) | 1476 (16.4%) | 1.90 × 10^-6 |
| TATA (0.73, 0.58) | 1848 (20.5%) | 1167 (13.0%) | 1483 (16.5%) | 567 (6.3%) | 1.70 × 10^-16 |
| INR (0.72, 0.62) | 5949 (66.0%) | 4527 (50.2%) | 5648 (62.7%) | 4040 (44.8%) | 0.02 |
| MTE (0.79, 0.53) | 5833 (64.7%) | 5464 (60.6%) | 5123 (56.9%) | 4555 (50.6%) | 0.03 |
| DPE (0.92, 0.92) | 1733 (19.2%) | 1650 (18.3%) | 1078 (12.0%) | 695 (7.7%) | 2.90 × 10^-11 |
[a] One-tailed Fisher's exact test of the real:randomized ratio in columns 4 and 5 versus columns 2 and 3. The p-value in the BRE row, for example, indicates that 1952/1476 is significantly greater than 2696/2503.
[b] The "core score" was found to be uninformative for the BRE element, so only the PWM score was used in this case.
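The comparison in the first footnote can be illustrated with a small one-tailed Fisher's exact test. The sketch below assumes the 2×2 table is arranged as [[conserved real, conserved randomized], [all real, all randomized]]; the record does not state how the authors laid out the table, so that arrangement and the helper names are assumptions.

```python
from math import exp, lgamma


def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_one_tailed(a, b, c, d):
    """P(X >= a) for the 2x2 table [[a, b], [c, d]] with all margins fixed.

    Sums hypergeometric probabilities in log space so that counts in the
    thousands do not overflow.
    """
    row1, col1, total = a + b, a + c, a + b + c + d
    log_denominator = log_comb(total, col1)
    p_value = 0.0
    for x in range(a, min(row1, col1) + 1):
        p_value += exp(
            log_comb(row1, x) + log_comb(total - row1, col1 - x) - log_denominator
        )
    return min(p_value, 1.0)


# BRE row: real:randomized ratio before and after the conservation filter.
ratio_single_genome = 2696 / 2503   # about 1.08
ratio_conserved = 1952 / 1476       # about 1.32
# Assumed table layout (see lead-in); not confirmed by the source.
p_bre = fisher_one_tailed(1952, 1476, 2696, 2503)
```

The two ratios show the direction of the effect, and `p_bre` is the one-tailed probability under the assumed layout; it need not reproduce the published value exactly.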
Figure 3. The number of core promoter elements in the real promoter sequences (dark bars) significantly exceeds the numbers found in the randomized sequences (light bars) in all cases. When we add the criterion that the element must be conserved in the mouse genome, we find that the gap between the number of elements found in the real data versus the random data widens, indicating an increase in the signal-to-noise ratio.
Figure 4. The number of each motif discovered in its expected position relative to the true TSS (position zero) represents a local maximum when compared to the sequence immediately upstream. The dotted lines show the results for a single-genome scan, while the solid lines show the results when only those motifs that are conserved in the orthologous mouse promoter are accepted.
Figure 5. Pairs of motifs are much more likely to occur at the TSS than within sequences up- or downstream of the TSS. The dotted and solid lines are as described in the legend for Figure 4.
Conditional probabilities that two core promoter elements are spaced in a synergistically-favorable manner, given that both elements are present
| Core promoter element pairs | P(synergy given both elements present): real sequences | P(synergy given both elements present): randomized sequences | One-tailed significance of synergy | P(synergy given both elements present and conserved in mouse counterparts): real sequences | P(synergy given both elements present and conserved in mouse counterparts): randomized sequences | One-tailed significance of synergy |
| TATA-MTE | 0.75 | 0.57 | 2.3 × 10^-17 | 0.72 | 0.45 | 2.3 × 10^-17 |
| BRE-Inr | 0.54 | 0.46 | 4.0 × 10^-5 | 0.45 | 0.31 | 7.2 × 10^-8 |
| TATA-Inr | 0.68 | 0.69 | 0.67 | 0.67 | 0.60 | 0.01 |
| BRE-DPE | 0.43 | 0.42 | 0.46 | 0.32 | 0.24 | 0.10 |
| Inr-DPE | 0.48 | 0.47 | 0.20 | 0.42 | 0.33 | 0.00 |
| MTE-DPE | 0.45 | 0.49 | 0.98 | 0.33 | 0.24 | 0.00 |
| Inr-MTE | 0.53 | 0.46 | 1.9 × 10^-8 | 0.44 | 0.34 | 3.9 × 10^-17 |
| BRE-TATA | 0.45 | 0.38 | 0.08 | 0.46 | 0.08 | 0.00 |
| BRE-MTE | 0.31 | 0.27 | 0.00 | 0.20 | 0.12 | 5.4 × 10^-7 |
| TATA-DPE | 0.50 | 0.45 | 0.11 | 0.45 | 0.22 | 0.00 |
BRE PWM with defined core elements in bold
| Position | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
| A | 0 | 0 | 26 | 0 | 0 | 0 | 0 |
| C | 51 | 50 | 0 | 74 | 0 | 74 | 74 |
| G | 23 | 24 | 48 | 0 | 74 | 0 | 0 |
| T | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Con |
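To make the role of a PWM score cutoff concrete, the sketch below scans a sequence with the BRE counts from the table above. It uses a simple min-max normalized count score, which is an assumption: the record does not give the authors' scoring formula or their definition of the separate "core score". The names `pwm_score` and `scan` are hypothetical.

```python
# BRE count matrix copied from the table above (positions 1-7).
BRE_COUNTS = {
    "A": [0, 0, 26, 0, 0, 0, 0],
    "C": [51, 50, 0, 74, 0, 74, 74],
    "G": [23, 24, 48, 0, 74, 0, 0],
    "T": [0, 0, 0, 0, 0, 0, 0],
}


def pwm_score(site, counts):
    """Min-max normalized score in [0, 1] for one candidate site.

    Simplified stand-in for the paper's PWM score (an assumption):
    summed counts of the observed bases, rescaled between the worst
    and best possible sums for this matrix.
    """
    width = len(counts["A"])
    if len(site) != width:
        raise ValueError("site length must equal the matrix width")
    columns = [[counts[base][i] for base in "ACGT"] for i in range(width)]
    lowest = sum(min(col) for col in columns)
    highest = sum(max(col) for col in columns)
    raw = sum(counts[base][i] for i, base in enumerate(site.upper()))
    return (raw - lowest) / (highest - lowest)


def scan(sequence, counts, cutoff):
    """Return (start, score) for every window scoring at or above cutoff."""
    width = len(counts["A"])
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width].upper()
        if set(window) <= set("ACGT"):  # skip windows with ambiguous bases
            score = pwm_score(window, counts)
            if score >= cutoff:
                hits.append((start, score))
    return hits


# 0.81 is the BRE PWM score cutoff listed in the enumeration table.
hits = scan("TTTCCGCGCCTTT", BRE_COUNTS, cutoff=0.81)
```

Because the normalization differs from whatever the authors used, the 0.81 cutoff is applied here only to show the mechanics of thresholding, not to reproduce their hit counts.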
TATA PWM with defined core elements in bold
| Position | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 |
| A | 224 | 228 | 186 | 146 | 213 | 67 | 756 | 0 | 865 | 670 | 770 | 444 | 282 | 131 | 182 | 164 | 185 | 184 | 141 | 179 | 165 |
| C | 226 | 203 | 239 | 229 | 310 | 84 | 0 | 5 | 0 | 0 | 0 | 15 | 99 | 274 | 272 | 273 | 273 | 264 | 286 | 283 | 304 |
| G | 301 | 291 | 333 | 302 | 225 | 71 | 0 | 8 | 2 | 0 | 119 | 224 | 437 | 394 | 301 | 337 | 275 | 303 | 328 | 276 | 275 |
| T | 146 | 175 | 139 | 220 | 149 | 675 | 141 | 884 | 30 | 201 | 8 | 214 | 79 | 98 | 142 | 123 | 164 | 146 | 142 | 159 | 153 |
| Con | N | N | N | N | N | N | N | N | N | N | N | N | N |
Inr PWM with defined core elements in bold
| Position | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
| A | 0 | 0 | 506 | 81 | 143 | 0 | 0 |
| C | 248 | 388 | 0 | 223 | 0 | 193 | 320 |
| G | 0 | 0 | 0 | 102 | 0 | 0 | 0 |
| T | 258 | 118 | 0 | 100 | 363 | 313 | 186 |
| Con |
MTE PWM with defined core elements in bold
| Position | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
| A | 2 | 14 | 51 | 5 | 1 | 1 | 6 | 25 | 7 | 3 | 1 | 10 |
| C | 20 | 24 | 2 | 3 | 55 | 24 | 26 | 0 | 5 | 50 | 3 | 20 |
| G | 35 | 18 | 5 | 43 | 0 | 31 | 26 | 33 | 39 | 2 | 52 | 27 |
| T | 1 | 2 | 0 | 7 | 2 | 2 | 0 | 0 | 7 | 3 | 2 | 1 |
| Con |
DPE PWM with defined core elements in bold
| Position | 1 | 2 | 3 | 4 | 5 |
| A | 514 | 0 | 585 | 0 | 214 |
| C | 0 | 0 | 0 | 549 | 303 |
| G | 481 | 995 | 0 | 0 | 478 |
| T | 0 | 0 | 410 | 446 | 0 |
| Con |