David M McGaughey, Zachary E Stine, Jimmy L Huynh, Ryan M Vinton, Andrew S McCallion.
Abstract
BACKGROUND: Transcriptional regulatory elements are central to development and interspecific phenotypic variation. Current regulatory element prediction tools rely heavily upon conservation for prediction of putative elements. Recent in vitro observations from the ENCODE project combined with in vivo analyses at the zebrafish phox2b locus suggest that a significant fraction of regulatory elements may fall below commonly applied metrics of conservation. We propose to explore these observations in vivo at the human PHOX2B locus, and also evaluate the potential evidence for genome-wide applicability of these observations through a novel analysis of extant data.
Year: 2009 PMID: 19128492 PMCID: PMC2630312 DOI: 10.1186/1471-2164-10-8
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Conserved and non-conserved amplicons tiled across the . (a) The human PHOX2B promoter proximal region (chr4:41,440,000–41,456,600; hg18) was divided into 11 amplicons (total size 11,965 base pairs) excluding exons, 5' UTR, and 3' UTR, according to whether intervals contained PhastCons Placental Mammal Conserved Elements, 28-way Multiz Alignment sequences [41]. The amplicons are represented as gray scale rectangles: black (PHOX2B-HCS); gray (PHOX2B-HNCS); black (zebrafish alignment). Amplicon names are defined by their distance from the PHOX2B transcriptional start site and are displayed as custom tracks on the UCSC Genome browser [54]. (b) Lateral images of G0 48-hpf zebrafish embryos exhibiting PHOX2B-appropriate expression, with the element name marked on each picture. Fb, Forebrain; VDi, Ventral Diencephalon; Hb, Hindbrain; CG, Cranial Ganglia; SC, Spinal Cord; ENS, Enteric Nervous System. *G1 embryo at 72-hpf. **G1 embryo at 48-hpf. ***Dorsal photo.
Figure 2. The interval displayed as a custom track on UCSC Genome browser [54]. The amplicons are represented as gray scale rectangles: black (conserved), gray (non-conserved), black (zebrafish alignment). (A) Region containing sequence aligning to phox2b-ZCS -8.3 (chr4:41,516,361–41,521,080; hg18). (B) Region containing sequences aligning to phox2b-ZCS -16.6, phox2b-ZCS -20.1, phox2b-ZCS -23.7 and phox2b-ZCS -30.0 (chr4:41,549,434–41,580,142; hg18). (C) Lateral images of G0 transgenic zebrafish embryos corresponding to functional human conserved (PHOX2B-HCS -73.5, PHOX2B-HCS -108.3, PHOX2B-HCS -114.8, PHOX2B-HCS -116.7, PHOX2B-HCS -130.4 and PHOX2B-HCS -133.5) and human non-conserved (PHOX2B-HNCS -112.3) amplicons. Fb, Forebrain; OT, Oculomotor and Trochlear Motor Progenitors; Hb, Hindbrain; CG, Cranial Ganglia; SC, Spinal Cord; ENS, Enteric Nervous System. Closed arrowheads point to hindbrain expression. Open arrowheads point to cranial ganglia expression.
Functional non-conserved elements exhibit non-uniform distribution.
| Functional non-conserved sequences | Zebrafish | Human |
| ≤ 10 kb from gene | 3/5 functional | 2/3 functional |
| >10 kb from gene | 1/8 functional | 1/6 functional |
The distribution and function of tested non-conserved elements at the zebrafish phox2b locus [29] and human PHOX2B locus are detailed; elements are grouped according to position (within 10 kb of the PHOX2B gene region, or more than 10 kb from it).
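The counts in the table above can be tallied per distance class with a short script. This is a minimal sketch rather than part of the original analysis: the dictionary layout and the helper name `pooled_counts` are hypothetical, and pooling the zebrafish and human counts is an illustrative choice made here, not one stated in the source.

```python
# (functional, tested) counts copied from the table above.
counts = {
    ("zebrafish", "<=10 kb"): (3, 5),
    ("zebrafish", ">10 kb"): (1, 8),
    ("human", "<=10 kb"): (2, 3),
    ("human", ">10 kb"): (1, 6),
}


def pooled_counts(distance_class):
    """Sum functional and tested elements across species for one distance class."""
    functional = 0
    tested = 0
    for (_species, dist), (n_functional, n_tested) in counts.items():
        if dist == distance_class:
            functional += n_functional
            tested += n_tested
    return functional, tested


for distance_class in ("<=10 kb", ">10 kb"):
    functional, tested = pooled_counts(distance_class)
    print(f"{distance_class}: {functional}/{tested} functional "
          f"({functional / tested:.0%})")
```

With so few tested elements per class, the pooled fractions are best read as descriptive only.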
Human PHOX2B elements conserved to zebrafish phox2b locus demonstrate activity consistent with orthologous zebrafish sequences.
| Human Amplicon | Expression | Zebrafish Amplicon | Expression | Coincident Control |
| | Fb, CG | | Hb, SC | No |
| | VDi, Hb | | Hb, SC | Yes |
| | Fb, Hb | | Mb, Hb, SC | Yes |
| | OT, CG, Hb | | Mb, Hb, CG | Yes |
| | Fb, Hb | | Mb, Hb | Yes |
| | Fb, Hb, SC | | Hb | Yes |
| | CG | | Hb, SC, ENS | No |
| | ENS | | Hb, SC, ENS | Yes |
HCS expression is the pattern driven by the human amplicon in G0 zebrafish embryos (* indicates G1 expression pattern). ZCS expression is the pattern driven by the orthologous zebrafish amplicon in G1 zebrafish embryos [29]. Overlap is categorized as Yes, tissue overlap in expression patterns, but additional tissues seen; No, no overlap in expression. Fb, Forebrain; OT, Oculomotor and trochlear motor progenitors; VDi, Ventral diencephalon; Hb, Hindbrain; CG, Cranial ganglia; SC, Spinal cord; ENS, Enteric nervous system.
Distribution of putative transcriptional regulatory regions (pTRRs) identified by King et al. [26].
| Type of gene | ENCODE Sub-regions Analyzed | Conserved pTRRs in sub-region | Non-conserved pTRRs in sub-region | Base pairs in sub-region |
| All | 5' UTR | 71 | 46 | 99,440 |
| All | 3' UTR | 15 | 12 | 382,329 |
| All | Intergenic Proximal | 61 | 163 | 2,429,196 |
| All | Intergenic Distal | 48 | 171 | 11,055,834 |
| All | Intronic Proximal | 173 | 457 | 8,903,959 |
| All | Intronic Distal | 55 | 122 | 6,462,925 |
| All | Coding sequences | 0 | 0 | 671,166 |
| Developmental | Intergenic Proximal | 20 | 22 | 392,692 |
| Developmental | Intergenic Distal | 5 | 20 | 1,636,075 |
| Non-developmental | Intergenic Proximal | 24 | 86 | 733,487 |
| Non-developmental | Intergenic Distal | 10 | 51 | 2,309,353 |
| Non-gene Desert | Intergenic Distal | 39 | 159 | 7,147,316 |
| Gene Desert | Intergenic Distal | 9 | 12 | 3,908,518 |
| Gene Desert | Intergenic Proximal | 0 | 3 | 75,000 |
| Non-gene Desert | Intergenic Proximal | 61 | 160 | 2,354,196 |
pTRRs were partitioned into ENCODE-defined regions and grouped as conserved versus non-conserved based on pTRR overlap with PhastCons Placental Mammal Conserved Elements, 28-way Multiz Alignment [43]. The gene type "All" represents analysis of the whole ENCODE region. Developmental genes represent regions flanking genes labeled with Gene Ontology term GO:0032502, while non-developmental genes were those that were not labeled with GO:0032502. Gene deserts were ENCODE intervals overlapping regions ≥500 kb without a Reference Sequence gene. Non-gene deserts were all sub-regions that did not overlap gene deserts. The "Base pairs in sub-region" column represents the sum of the genomic intervals represented by each type of sub-region.
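Because gene deserts and non-gene deserts partition each intergenic sub-region, their rows should sum to the corresponding "All" row. The sketch below checks this and also reports the share of pTRRs that are non-conserved. It uses only values from the table above; the names `table`, `non_conserved_share`, and `partition_is_consistent` are hypothetical helpers, not part of the original analysis.

```python
# (conserved pTRRs, non-conserved pTRRs, base pairs) copied from the table above.
table = {
    ("All", "Intergenic Proximal"): (61, 163, 2_429_196),
    ("All", "Intergenic Distal"): (48, 171, 11_055_834),
    ("Gene Desert", "Intergenic Proximal"): (0, 3, 75_000),
    ("Gene Desert", "Intergenic Distal"): (9, 12, 3_908_518),
    ("Non-gene Desert", "Intergenic Proximal"): (61, 160, 2_354_196),
    ("Non-gene Desert", "Intergenic Distal"): (39, 159, 7_147_316),
}


def non_conserved_share(row):
    """Fraction of all pTRRs in a row that are non-conserved."""
    conserved, non_conserved, _base_pairs = row
    return non_conserved / (conserved + non_conserved)


def partition_is_consistent(sub_region):
    """Gene desert + non-gene desert rows should equal the 'All' row."""
    desert = table[("Gene Desert", sub_region)]
    non_desert = table[("Non-gene Desert", sub_region)]
    summed = tuple(a + b for a, b in zip(desert, non_desert))
    return summed == table[("All", sub_region)]


for sub_region in ("Intergenic Proximal", "Intergenic Distal"):
    share = non_conserved_share(table[("All", sub_region)])
    print(f"{sub_region}: non-conserved share = {share:.2f}, "
          f"partition consistent = {partition_is_consistent(sub_region)}")
```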
Non-conserved pTRR density is higher in intergenic proximal regions than intergenic distal regions.
| pTRR Density | |||||
| Gene Type | ENCODE Sub-region | Conserved pTRRs/Conserved bp | Non-conserved pTRRs/Non-conserved bp | Conserved pTRRs/Conserved Non-repeat bp | Non-conserved pTRRs/Non-conserved Non-repeat bp |
| All | 5' UTR | 1/457 | 1/1,456 | 1/410 | 1/931 |
| All | 3' UTR | 1/5,625 | 1/24,829 | 1/5,460 | 1/18,610 |
| All | Intergenic Proximal | 1/1,185 | 1/14,460 | 1/1,094 | 1/7,005 |
| All | Intergenic Distal | 1/7,528 | 1/62,541 | 1/7,073 | 1/29,635 |
| All | Intronic Proximal | 1/1,463 | 1/18,930 | 1/1,356 | 1/10,801 |
| All | Intronic Distal | 1/4,259 | 1/51,055 | 1/4,026 | 1/28,231 |
| Developmental | Intergenic Proximal | 1/1,591 | 1/16,403 | 1/1,508 | 1/8,796 |
| Developmental | Intergenic Distal | 1/11,370 | 1/78,961 | 1/10,714 | 1/47,792 |
| Non-developmental | Intergenic Proximal | 1/867 | 1/8,287 | 1/821 | 1/3,619 |
| Non-developmental | Intergenic Distal | 1/6,160 | 1/44,074 | 1/5,776 | 1/18,446 |
| Non-gene Desert | Intergenic Proximal | 1/1,078 | 1/14,303 | 1/991 | 1/6,877 |
| Gene Desert | Intergenic Proximal | N/A | 1/22,824 | N/A | 1/13,830 |
| Non-gene Desert | Intergenic Distal | 1/5,307 | 1/43,650 | 1/4,954 | 1/19,047 |
| Gene Desert | Intergenic Distal | 1/17,151 | 1/312,847 | 1/16,254 | 1/151,566 |
Density of putative transcriptional regulatory regions (pTRRs) identified by King et al. [26]. The ENCODE interval was partitioned into sub-regions. The gene type "All" represents analysis of the whole ENCODE region. Developmental genes represent regions flanking genes labeled with Gene Ontology term GO:0032502, while non-developmental genes were those that were not labeled with GO:0032502. Gene deserts were ENCODE intervals overlapping ≥500 kb regions without a Reference Sequence gene. Non-gene deserts were all sub-regions that did not overlap gene deserts. Density of pTRRs was calculated by dividing the total number of conserved or non-conserved base pairs in the ENCODE-defined region by the number of conserved or non-conserved pTRRs in that region, and is reported as one pTRR per that many base pairs. N/A = not applicable due to lack of conserved pTRRs in gene desert intergenic proximal regions.
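The density calculation described in this legend (base pairs divided by pTRR count, with N/A when the count is zero) can be sketched as follows. The function names `bp_per_ptrr` and `format_density` are hypothetical. The source does not report conserved or non-conserved base-pair totals, so the 68,472 bp input is an assumption back-calculated from the table (3 non-conserved pTRRs at 1/22,824 in gene desert intergenic proximal regions) and serves only to exercise the function.

```python
from typing import Optional


def bp_per_ptrr(base_pairs: int, n_ptrrs: int) -> Optional[float]:
    """Return base pairs per pTRR, or None when there are no pTRRs (N/A)."""
    if n_ptrrs == 0:
        return None
    return base_pairs / n_ptrrs


def format_density(value: Optional[float]) -> str:
    """Render a density in the table's '1/x' style."""
    return "N/A" if value is None else f"1/{round(value):,}"


# Assumed input: 68,472 bp is back-calculated (3 x 22,824), not a reported value.
non_conserved = bp_per_ptrr(68_472, 3)
# Zero conserved pTRRs in this sub-region, so the bp total is irrelevant here.
conserved = bp_per_ptrr(0, 0)

print(format_density(non_conserved))
print(format_density(conserved))
```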
Fold change in pTRR density of intergenic proximal versus intergenic distal regions.
| Fold change of pTRR density between Intergenic Proximal and Intergenic Distal Sub-regions | ||||
| Gene Type | Conserved pTRRs/Conserved bp | Non-conserved pTRRs/Non-conserved bp | Conserved pTRRs/Conserved Non-repeat bp | Non-conserved pTRRs/Non-conserved Non-repeat bp |
| All | 6.35 | 4.33 | 6.47 | 4.23 |
| Developmental | 7.15 | 4.81 | 7.10 | 5.43 |
| Non-developmental | 7.10 | 5.32 | 7.04 | 5.10 |
| Gene Desert | N/A | 13.71 | N/A | 10.96 |
| Non-gene Desert | 4.92 | 3.05 | 5.00 | 2.77 |
The gene type "All" represents analysis of the whole ENCODE region. Developmental genes represent regions flanking genes labeled with Gene Ontology term GO:0032502, while non-developmental genes were those that were not labeled with GO:0032502. Gene deserts were ENCODE intervals overlapping ≥500 kb regions without a Reference Sequence gene. Non-gene deserts were all sub-regions that did not overlap gene deserts. Fold change in pTRR density was calculated for each gene type by dividing the intergenic proximal pTRR density by the intergenic distal pTRR density (Table 4). The pTRR densities were calculated for conserved pTRRs divided by conserved bp, non-conserved pTRRs divided by non-conserved bp, conserved pTRRs divided by conserved non-repeat bp and non-conserved pTRRs divided by non-conserved non-repeat bp (Table 3). N/A = not applicable due to lack of conserved pTRRs in gene desert intergenic proximal regions.
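The fold changes can be recomputed from the published densities. Each density is given as one pTRR per x base pairs, so the proximal-to-distal density ratio equals the distal x divided by the proximal x. The sketch below uses the first two density columns of the density table; the names `densities` and `fold_change` are hypothetical, and because the inputs are already rounded, recomputed values may differ from the published ones in the last digit.

```python
# Base pairs per pTRR ("1/x" in the density table) as (conserved, non-conserved).
# None marks the N/A entry for gene desert intergenic proximal regions.
densities = {
    "All": {"proximal": (1_185, 14_460), "distal": (7_528, 62_541)},
    "Developmental": {"proximal": (1_591, 16_403), "distal": (11_370, 78_961)},
    "Non-developmental": {"proximal": (867, 8_287), "distal": (6_160, 44_074)},
    "Gene Desert": {"proximal": (None, 22_824), "distal": (17_151, 312_847)},
    "Non-gene Desert": {"proximal": (1_078, 14_303), "distal": (5_307, 43_650)},
}


def fold_change(proximal_bp, distal_bp):
    """Proximal/distal density ratio; density is 1/bp, so this is distal/proximal bp."""
    if proximal_bp is None or distal_bp is None:
        return None
    return distal_bp / proximal_bp


for gene_type, regions in densities.items():
    ratios = [fold_change(p, d)
              for p, d in zip(regions["proximal"], regions["distal"])]
    shown = ["N/A" if r is None else f"{r:.2f}" for r in ratios]
    print(f"{gene_type}: conserved {shown[0]}, non-conserved {shown[1]}")
```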