Alison P Lee, Yuchen Yang, Sydney Brenner, Byrappa Venkatesh.
Abstract
BACKGROUND: Transcription factors (TFs) regulate gene transcription and play pivotal roles in various biological processes such as development, cell cycle progression, cell differentiation and tumor suppression. Identifying cis-regulatory elements associated with TF-encoding genes is a crucial step in understanding gene regulatory networks. To this end, we have used a comparative genomics approach to identify putative cis-regulatory elements associated with TF-encoding genes in vertebrates.

DESCRIPTION: We have created a database named TFCONES (Transcription Factor Genes & Associated COnserved Noncoding ElementS) (http://tfcones.fugu-sg.org) which contains all human, mouse and fugu TF-encoding genes and conserved noncoding elements (CNEs) associated with them. The CNEs were identified by gene-by-gene alignments of orthologous TF-encoding gene loci using MLAGAN. We also predicted putative transcription factor binding sites within the CNEs. A significant proportion of human-fugu CNEs contain experimentally defined binding sites for transcriptional activators and repressors, indicating that a majority of the CNEs may function as transcriptional regulatory elements. The TF-encoding genes that are involved in nervous system development are generally enriched for human-fugu CNEs. Users can retrieve TF-encoding genes and their associated CNEs by conducting a keyword search or by selecting a family of DNA-binding proteins.
Year: 2007 PMID: 18045502 PMCID: PMC2148067 DOI: 10.1186/1471-2164-8-441
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. A flowchart of the protocol used for identifying human-mouse and human-fugu conserved noncoding elements.
DNA-binding TF-encoding genes in human, mouse and fugu genomes. The TFs were classified based on their DNA-binding domains using a classification scheme adapted from Messina et al. [1]
| TF family | Human | Mouse | Fugu |
| --- | --- | --- | --- |
| AP-2 | 5 | 5 | 6 |
| ARID | 11 | 9 | 15 |
| Beta-scaffold – CCAAT | 8 | 8 | 9 |
| Beta-scaffold – MADS | 5 | 5 | 9 |
| Beta-scaffold – p53 | 3 | 3 | 3 |
| Beta-scaffold – RUNT | 3 | 3 | 5 |
| Beta-scaffold – Others | 28 | 28 | 31 |
| BHLH | 96 | 94 | 134 |
| BZIP | 69 | 65 | 83 |
| Dwarfin | 8 | 8 | 13 |
| E2F | 11 | 9 | 11 |
| Forkhead | 40 | 35 | 60 |
| GCM | 2 | 2 | 2 |
| Heat shock factor | 7 | 4 | 10 |
| High mobility group box | 35 | 39 | 56 |
| Homeobox | 221 | 199 | 282 |
| Nuclear hormone receptor | 49 | 49 | 69 |
| Paired box | 9 | 9 | 14 |
| RFX | 6 | 6 | 7 |
| T-box | 16 | 15 | 19 |
| TEA | 4 | 4 | 5 |
| Trp cluster – Ets | 29 | 29 | 31 |
| Trp cluster – IRF | 9 | 9 | 15 |
| Trp cluster – Myb | 11 | 11 | 13 |
| ZnF-C2H2 | 472 | 290 | 254 |
| ZnF-C3H | 6 | 4 | 7 |
| ZnF-DM | 9 | 7 | 6 |
| ZnF-GATA | 10 | 10 | 12 |
| ZnF-Others | 61 | 55 | 73 |
| Others (e.g., SAND, RBPSUH) | 84 | 72 | 74 |
Figure 2. Plot of CNE density against the total length of CNEs associated with DNA-binding TF-encoding genes. CNE density is defined as the number of bases located in CNEs per unit length (1 kb in human; 100 bp in fugu) of non-repetitive noncoding sequence in a gene locus.
Top twenty TF-encoding genes associated with the highest density of human-fugu CNEs. For genes that are part of conserved clusters, we averaged out the number or length of CNEs present in the whole cluster over the number of genes in that cluster. CNE density is defined here as the number of bases located in CNEs per kilobase of non-repetitive noncoding sequence in a gene locus.
| Ensembl gene ID | Description | CNE density (bp/kb) |
| --- | --- | --- |
| ENSG00000164853 | PREDICTED: similar to Uncx4.1 | 37.61 |
| ENSG00000188620 | PREDICTED: similar to homeodomain protein | 31.82 |
| ENSG00000188816 | Homeobox (H6 family) 2 | 31.82 |
| ENSG00000166407 | Rhombotin-1 | 29.17 |
| ENSG00000167081 | Pre-B-cell leukemia transcription factor 3 | 29.16 |
| ENSG00000130940 | Probable transcription factor CST | 26.43 |
| ENSG00000139800 | Zinc finger protein ZIC 5 | 25.94 |
| ENSG00000109132 | Paired mesoderm homeobox protein 2B | 24.93 |
| ENSG00000075891 | Paired box protein Pax-2 | 24.27 |
| ENSG00000177508 | Iroquois-class homeodomain protein IRX-3 | 24.04 |
| ENSG00000165655 | zinc finger protein 503 | 21.39 |
| ENSG00000143032 | BarH-like 2 homeobox protein | 21.28 |
| ENSG00000171540 | orthopedia | 21.25 |
| ENSG00000125285 | Transcription factor SOX-21 | 19.99 |
| ENSG00000159387 | Iroquois-class homeodomain protein IRX-6 | 19.51 |
| ENSG00000176842 | Iroquois-class homeodomain protein IRX-5 | 19.51 |
| ENSG00000134138 | Homeobox protein Meis2 | 18.05 |
| ENSG00000128652 | Homeobox protein Hox-D3 | 17.34 |
| ENSG00000128709 | Homeobox protein Hox-D9 | 17.34 |
| ENSG00000128710 | Homeobox protein Hox-D10 | 17.34 |
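The density definition used in the captions above can be illustrated with a minimal Python sketch. The function name `cne_density`, its parameters, and the example locus are hypothetical and are not taken from the TFCONES pipeline; the sketch only restates the definition (bases in CNEs per unit length of non-repetitive noncoding sequence).

```python
def cne_density(cne_lengths_bp, nonrepetitive_noncoding_bp, unit_bp=1000):
    """Return the number of CNE bases per `unit_bp` of non-repetitive
    noncoding sequence in a gene locus.

    cne_lengths_bp: lengths (in bp) of the CNEs assigned to the locus.
    nonrepetitive_noncoding_bp: length (in bp) of the non-repetitive
        noncoding sequence in the locus.
    unit_bp: unit length; the captions use 1 kb for human and 100 bp for fugu.
    """
    if nonrepetitive_noncoding_bp <= 0:
        raise ValueError("noncoding length must be positive")
    return sum(cne_lengths_bp) * unit_bp / nonrepetitive_noncoding_bp


# Hypothetical locus: three CNEs (500 bp in total) in 20 kb of
# non-repetitive noncoding sequence.
human_scale = cne_density([120, 80, 300], 20_000)               # per 1 kb
fugu_scale = cne_density([120, 80, 300], 20_000, unit_bp=100)   # per 100 bp
print(human_scale, fugu_scale)
```

Under this definition, a value such as 37.61 in the table would mean that about 3.8% of the non-repetitive noncoding bases in that locus fall within CNEs.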
Top twenty TF-encoding genes associated with the highest number and total length of human-fugu CNEs. For genes that are part of conserved clusters, we averaged out the number or length of CNEs present in the whole cluster over the number of genes in that cluster.
| Ensembl gene ID | Description | Number of CNEs | Total CNE length (kb) |
| --- | --- | --- | --- |
| ENSG00000134138 | Homeobox protein Meis2 | 79 | 11.98 |
| ENSG00000177508 | Iroquois-class homeodomain protein IRX-3 | 68 | 10.87 |
| ENSG00000169554 | Zinc finger homeobox protein 1b | 76 | 10.25 |
| ENSG00000143032 | BarH-like 2 homeobox protein | 56 | 10.09 |
| ENSG00000169946 | Zinc finger protein ZFPM2 | 49 | 9.90 |
| ENSG00000167081 | Pre-B-cell leukemia transcription factor 3 | 57 | 9.10 |
| ENSG00000091656 | zinc finger homeodomain 4 | 77 | 9.05 |
| ENSG00000151514 | Sal-like protein 3 | 59 | 8.94 |
| ENSG00000170549 | Iroquois-class homeodomain protein IRX-1 | 48 | 8.67 |
| ENSG00000128573 | Forkhead box protein P2 | 57 | 8.41 |
| ENSG00000121297 | Teashirt homolog 3 | 59 | 7.38 |
| ENSG00000143995 | Homeobox protein Meis1 | 63 | 7.00 |
| ENSG00000148737 | Transcription factor 7-like 2 | 42 | 6.91 |
| ENSG00000153234 | Orphan nuclear receptor NR4A2 | 38 | 6.73 |
| ENSG00000159387 | Iroquois-class homeodomain protein IRX-6 | 40.5 | 6.54 |
| ENSG00000176842 | Iroquois-class homeodomain protein IRX-5 | 40.5 | 6.54 |
| ENSG00000110693 | Transcription factor SOX-6 | 47 | 6.41 |
| ENSG00000143013 | LIM domain transcription factor LMO4 | 32 | 5.92 |
| ENSG00000164853 | PREDICTED: similar to Uncx4.1 | 31 | 5.74 |
| ENSG00000075891 | Paired box protein Pax-2 | 37 | 5.68 |
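The cluster rule stated in the table captions (cluster-wide CNE totals averaged over the number of genes in the cluster) can be sketched as follows. The function name `average_over_cluster`, the gene labels, and the totals are hypothetical; they are chosen only to show how a fractional per-gene count such as 40.5 can arise, and they are not values reported by the authors.

```python
def average_over_cluster(genes, cluster_cne_count, cluster_cne_length_kb):
    """Assign every gene in a conserved cluster the cluster-wide CNE count
    and total CNE length divided by the number of genes in the cluster.

    Returns a dict mapping gene -> (average CNE count, average length in kb).
    """
    n_genes = len(genes)
    if n_genes == 0:
        raise ValueError("a cluster must contain at least one gene")
    per_gene = (cluster_cne_count / n_genes, cluster_cne_length_kb / n_genes)
    return {gene: per_gene for gene in genes}


# Hypothetical two-gene cluster sharing 81 CNEs with a total length of 13.0 kb.
averaged = average_over_cluster(["geneA", "geneB"], 81, 13.0)
print(averaged["geneA"])  # both genes receive the same averaged values
```

Because every gene in a cluster receives the same averaged values, clustered genes appear in the tables as consecutive rows with identical numbers.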
Figure 3. Human-fugu and mouse-fugu VISTA alignments of the MEIS2 gene locus. Pink peaks denote conserved noncoding elements (CNEs) and blue peaks denote conserved exonic sequences. The MEIS2 locus contains the highest number of CNEs (79 CNEs with a total length of 12.0 kb) among TF-encoding genes in the human, mouse and fugu genomes.
Figure 4. A keyword search of genes in the TFCONES database.
Figure 5. Gene record for forkhead transcription factor gene FOXA2.
Figure 6. List of human-fugu CNEs associated with FOXA2.
Figure 7. An image of the location of CNEs relative to the associated TF-encoding gene FOXA2.