Domènec Farré, Nicolás Bellora, Loris Mularoni, Xavier Messeguer, M Mar Albà.
Abstract
BACKGROUND: Understanding the constraints that operate in mammalian gene promoter sequences is key to understanding the evolution of gene regulatory networks. The level of promoter conservation varies greatly across orthologous genes, denoting differences in the strength of the evolutionary constraints. Here we test the hypothesis that the number of tissues in which a gene is expressed is significantly related to the extent of promoter sequence conservation.
Year: 2007 PMID: 17626644 PMCID: PMC2323216 DOI: 10.1186/gb-2007-8-7-r140
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Figure 1. Mouse tissue expression distribution. We define three groups: low expression breadth (Restricted; 1-10 tissues), intermediate expression breadth (Intermediate; 11-50 tissues), and high expression breadth (Housekeeping; 51-55 tissues).
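The grouping in the Figure 1 caption can be written as a small classifier. This is a minimal sketch: the function name `expression_breadth_group` is hypothetical, and rejecting counts outside 1-55 is an assumption rather than something stated in the source.

```python
def expression_breadth_group(n_tissues: int) -> str:
    """Assign a gene to an expression-breadth group from its tissue count.

    Thresholds follow the Figure 1 caption (55 tissues in total).
    """
    if not 1 <= n_tissues <= 55:
        # Assumption: counts outside 1-55 are rejected rather than classified.
        raise ValueError("tissue count must be between 1 and 55")
    if n_tissues <= 10:
        return "Restricted"
    if n_tissues <= 50:
        return "Intermediate"
    return "Housekeeping"
```

For example, a gene detected in 12 tissues falls in the Intermediate group, and one detected in 53 tissues is classed as Housekeeping.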
Sequence divergence versus tissue expression breadth
| No. of tissues | N (total = 3,893) | Statistic | dSM | Kp | Ka | Ks | Ka/Ks |
| 01-10 | 986 | Mean | 0.688 | 0.337 | | | |
| | | Median | 0.735 | 0.328 | 0.084 | 0.673 | 0.119 |
| | | SD | 0.221 | 0.110 | 0.093 | 0.299 | 0.122 |
| 11-50 | 1,889 | Mean | 0.701 | 0.333 | | 0.708 | 0.116 |
| | | Median | 0.752 | 0.328 | 0.058 | 0.633 | 0.089 |
| | | SD | 0.216 | 0.093 | 0.073 | 0.307 | 0.103 |
| 51-55 | 1,018 | Mean | | 0.328 | | | |
| | | Median | 0.791 | 0.323 | 0.031 | 0.572 | 0.054 |
| | | SD | 0.208 | 0.079 | 0.057 | 0.305 | 0.085 |
| K-W test p value | | | <10^-5 | 0.226 | <10^-75 | <10^-18 | <10^-62 |
N, number of genes; dSM, promoter divergence (see text); Kp, promoter substitution rate; Ka, non-synonymous substitution rate; Ks, synonymous substitution rate. Mean (top), median (middle), and standard deviation (bottom) are indicated for each variable. Numbers in bold indicate significant differences at p < 0.001 in each expression group with respect to the rest (two-sample Wilcoxon-Mann-Whitney test). The last row shows the p value of Kruskal-Wallis (K-W) test that evaluates differences between the three tissue expression breadth groups.
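To make the Kruskal-Wallis comparison across the three expression-breadth groups concrete, the sketch below computes the textbook H statistic with mid-ranks and the standard tie correction. It is not the authors' code: the function name `kruskal_wallis_three_groups` is hypothetical, and the p value relies on the chi-square approximation, which for three groups (2 degrees of freedom) reduces to exp(-H/2).

```python
import math
from collections import Counter

def kruskal_wallis_three_groups(a, b, c):
    """Return (H, p) for a Kruskal-Wallis test on exactly three samples.

    With three groups the statistic has 2 degrees of freedom, so the
    chi-square upper tail has the closed form exp(-H / 2).
    """
    groups = [list(a), list(b), list(c)]
    if any(len(g) == 0 for g in groups):
        raise ValueError("each group needs at least one observation")

    pooled = sorted(v for g in groups for v in g)
    n_total = len(pooled)
    counts = Counter(pooled)
    if len(counts) == 1:
        raise ValueError("all observations are identical")

    # Mid-ranks: tied values share the average of the ranks they occupy.
    first_pos = {}
    for i, v in enumerate(pooled):
        first_pos.setdefault(v, i + 1)  # 1-based rank of first occurrence
    rank = {v: first_pos[v] + (counts[v] - 1) / 2 for v in counts}

    h = 12 / (n_total * (n_total + 1)) * sum(
        sum(rank[v] for v in g) ** 2 / len(g) for g in groups
    ) - 3 * (n_total + 1)

    # Standard tie correction.
    ties = sum(t ** 3 - t for t in counts.values())
    h /= 1 - ties / (n_total ** 3 - n_total)
    return h, math.exp(-h / 2)
```

In practice the three inputs would be the per-gene values of one variable (for instance Ka/Ks) in the Restricted, Intermediate, and Housekeeping groups. An established statistics package is preferable for real analyses; this version only shows what the test computes.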
Figure 2. Promoter sequence conservation in HK and non-HK genes. The x-axis shows 100-nucleotide bins along the 2 kb upstream of the TSS. The y-axis shows percent conservation ((1 - dSM) × 100). Genes were grouped according to the presence or absence of a CpG island and Ka/Ks values. Significant p values for 2 kb promoter sequence divergence comparisons are indicated below the curves. Beneath these, the p values obtained for regions -2,000 to -500 (left), and -500 to the TSS (right), are given in smaller font size.
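The axes of Figure 2 follow directly from the caption: percent conservation is (1 - dSM) × 100, evaluated in 100-nucleotide bins over the 2 kb upstream of the TSS. The sketch below illustrates both. The function names are hypothetical, and the coordinate convention (negative upstream positions, exclusive bin end) is an assumption made for illustration.

```python
def percent_conservation(d_sm: float) -> float:
    """Convert a promoter divergence score dSM into percent conservation."""
    return (1.0 - d_sm) * 100.0

def upstream_bins(length: int = 2000, bin_size: int = 100):
    """Yield (start, end) coordinates relative to the TSS, most distal first.

    Coordinates are negative upstream positions; the end is exclusive,
    so the last bin is (-100, 0).
    """
    for start in range(-length, 0, bin_size):
        yield (start, start + bin_size)
```

With the defaults, `upstream_bins()` yields 20 bins from (-2000, -1900) to (-100, 0), matching the 100-nucleotide resolution of the figure.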
Average promoter divergence values (dSM) for HK and non-HK genes classified in different GO classes
| GO term | Description | N (All) | dSM (HK, All) | dSM (non-HK, All) | N (CpG+) | dSM (HK, CpG+) | dSM (non-HK, CpG+) | N (CpG-) | dSM (HK, CpG-) | dSM (non-HK, CpG-) |
| Molecular function | ||||||||||
| GO:0000166 | Nucleotide binding | 464 | 0.727 | 0.699 | 101 | 0.684 | 0.700 | |||
| GO:0004872 | Receptor activity | 259 | 0.734 | 0.675 | 131 | 0.747 | 0.656 | 128 | 0.655 | 0.692 |
| GO:0004871 | Signal transducer activity | 440 | 0.689 | 0.658 | 246 | 0.692 | 0.656 | 194 | 0.663 | 0.661 |
| GO:0003700 | Transcription factor activity | 183 | 0.673 | 0.602 | 113 | 0.657 | 0.600 | 70 | 0.766 | 0.605 |
| GO:0043169 | Cation binding | 485 | 0.711 | 0.671 | 177 | 0.582 | 0.671 | |||
| Biological process | ||||||||||
| GO:0044249 | Cellular biosynthesis | 73 | 0.629 | 0.741 | ||||||
| GO:0045184 | Establishment of protein transport | 162 | 0.720 | 0.737 | 138 | 0.723 | 0.731 | 24 | 0.677 | 0.760 |
| GO:0007049 | Cell cycle | 188 | 0.697 | 0.706 | 152 | 0.703 | 0.724 | 36 | 0.656 | 0.646 |
| GO:0019538 | Protein metabolism | 177 | 0.682 | 0.713 | ||||||
| GO:0044260 | Cellular macromolecule metabolism | 201 | 0.686 | 0.713 | ||||||
| GO:0050874 | Organismal physiological process | 183 | 0.756 | 0.685 | ||||||
| GO:0009605 | Response to external stimulus | 209 | 0.676 | 0.711 | 85 | 0.758 | 0.699 | |||
| GO:0007166 | Cell surface receptor linked signal transduction | 221 | 0.683 | 0.626 | 113 | 0.659 | 0.645 | 108 | 0.762 | 0.609 |
| GO:0048513 | Organ development | 111 | 0.633 | 0.598 | ||||||
| GO:0009653 | Morphogenesis | 130 | 0.664 | 0.615 | ||||||
| GO:0009607 | Response to biotic stimulus | 166 | 0.761 | 0.723 | 92 | 0.680 | 0.745 | |||
| GO:0007165 | Signal transduction | 563 | 0.684 | 0.656 | 342 | 0.687 | 0.668 | 221 | 0.666 | 0.643 |
| Cellular component | ||||||||||
| GO:0005739 | Mitochondrion | 171 | 0.785 | 0.756 | 148 | 0.780 | 0.770 | 23 | 0.869 | 0.707 |
| GO:0005737 | Cytoplasm | 194 | 0.728 | 0.707 | ||||||
| GO:0005783 | Endoplasmic reticulum | |||||||||
| GO:0005576 | Extracellular region | 219 | 0.653 | 0.621 | 77 | 0.718 | 0.591 | 142 | 0.523 | 0.635 |
| GO:0005886 | Plasma membrane | 184 | 0.663 | 0.666 | ||||||
Entries in bold are those that have a significantly different dSM distribution (p < 0.05). The number of genes (N) is indicated for each GO class. Results for CpG+ and CpG- genes are shown.
Transcription factors with predicted binding motifs over-represented in HK gene promoters
| Transcription factor | Description | Expression breadth |
| AHR and ARNT | Aryl hydrocarbon receptor; it can interact with ARNT (AHR:ARNT heterodimer) | INT |
| ATF family | Activating transcription factor | HK |
| CREB family | cAMP responsive element binding protein | INT |
| E2F family | E2F transcription factor | INT and HK |
| HIF1A | Hypoxia inducible factor 1, alpha subunit; as AHR, it can interact with ARNT | HK |
| MYC and MAX | Proto-oncogene protein c-myc and MYC associated factor X; they can form MYC:MAX heterodimers | INT and HK |
| NRF1 and NRF2 | Nuclear respiratory factor 1 and 2 | INT and HK |
| SP1 | SP1 transcription factor | HK |
| USF | Upstream transcription factor (USF1 and USF2) | INT |
HK, housekeeping; INT, intermediate.