Mukta Kundu, Alexander Kuzin, Tzu-Yang Lin, Chi-Hon Lee, Thomas Brody, Ward F Odenwald.
Abstract
Analysis of cis-regulatory enhancers has revealed that they consist of clustered blocks of highly conserved sequences. Although most characterized enhancers reside near their target genes, a growing number of studies have shown that enhancers located over 50 kb from their minimal promoter(s) are required for appropriate gene expression, and many of these 'long-range' enhancers are found in genomic regions that are devoid of identified exons. To gain insight into the complexity of Drosophila cis-regulatory sequences within exon-poor regions, we have undertaken an evolutionary analysis of 39 of these regions located throughout the genome. This survey revealed that within these genomic expanses, clusters of conserved sequence blocks (CSBs) are positioned once every 1.1 kb, on average, and that a typical cluster contains multiple (5 to 30 or more) CSBs that have been maintained for at least 190 My of evolutionary divergence. As an initial step toward assessing the cis-regulatory activity of conserved clusters within gene-free genomic expanses, we have tested the in vivo enhancer activity of 19 consecutive CSB clusters located in the middle of a 115 kb gene-poor region on the 3rd chromosome. Our studies revealed that each cluster functions independently as a specific spatial/temporal enhancer. In total, the enhancers possess a diversity of regulatory functions, including dynamically activating expression in defined patterns within subsets of cells in discrete regions of the embryo, larvae and/or adult. We also observed that many of the enhancers are multifunctional; that is, they activate expression during multiple developmental stages. By extending these results to the rest of the Drosophila genome, which contains over 70,000 non-coding CSB clusters, we suggest that most function as enhancers.
Year: 2013 PMID: 23613719 PMCID: PMC3632565 DOI: 10.1371/journal.pone.0060137
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
D. melanogaster CSB clusters within non-coding regions >65 kb.
| Chromosome Location | Flanking Genes (5′ – 3′) | Length (kb) | # of CSCs |
| chr2R:2,416,250–2,487,059 | jing→–CG15233← | 70.8 | 63 |
| chr2R:4,156,848–4,234,973 | CG30371←–Pdm3→ | 78.1 | 70 |
| chr2R:10,944,185–11,025,474 | CG33467←–chn→ | 81.3 | 59 |
| chr2R:15,893,693–15,999,491 | CG16898←–18w→ | 105.8 | 89 |
| chr2R:16,243,641–16,316,603 | CG11192←–CG12484→ | 73.0 | 57 |
| chr2R:17,300,726–17,367,629 | Sdc←–Sdc← | 67.0 | 73 |
| chr2L:1,614,340–1,690,699 | RFeSP–chinmo | 76.4 | 65 |
| chr2L:8,564,841–8,631,270 | Sema-1a→–Sema-1a→ | 66.4 | 53 |
| chr2L:12,822,253–12,911,000 | Kek1←–ACXC→ | 88.7 | 88 |
| chr2L:17,827,895–17,914,722 | CadN2←–CG43271← | 86.8 | 85 |
| chr2L:20,970,616–21,052,985 | CG42238←–betaInt-nu→ | 82.4 | 66 |
| chr3L:5,013,564–5,097,509 | CG12027←–CG34047← | 83.9 | 68 |
| chr3L:5,269,532–5,337,221 | Shep←–Lama← | 65.6 | 61 |
| chr3L:6,788,236–6,903,573 | vvl→–Prat2← | 115.3 | 90 |
| chr3L:10,691,176–10,765,559 | NijA←–CG43245→ | 74.4 | 78 |
| chr3L:13,658,522–13,771,330 | bru3←–CG34243← | 112.8 | 98 |
| chr3L:15,335,752–15,401,048 | Toll6→–CG33259→ | 65.3 | 76 |
| chr3L:15,722,537–15,801,891 | Comm←–CG6244→ | 79.4 | 66 |
| chr3L:18,297,975–18,389,946 | grim←–rpr← | 92.0 | 99 |
| chr3L:19,140,122–19,229,706 | Fz2←–CG33647→ | 89.6 | 86 |
| chr3L:21,835,505–21,900,787 | CG14563←–mub→ | 65.3 | 50 |
| chr3L:22,076,308–22,158,687 | msopa→–Olf413→ | 99.2 | 86 |
| chr3R:814,412–910,094 | CG2022←–corto→ | 95.7 | 58 |
| chr3R:2,735,634–2,878,317 | Antp←–Sod-1→ | 142.6 | 140 |
| chr3R:4,269,254–4,335,814 | PQBP-1←–OR85b← | 66.6 | 57 |
| chr3R:6,262,387–6,337,098 | Cyp12e1←–hth← | 74.7 | 63 |
| chr3R:7,096,987–7,172,245 | CG31386←–KP78b← | 75.3 | 63 |
| chr3R:10,258,903–10,335,707 | cv-c←–HtrA1← | 76.8 | 70 |
| chr3R:10,747,590–10,840,491 | CG3837←–CG14861→ | 92.9 | 92 |
| chr3R:11,378,494–11,458,604 | CG18516←–CG5302→ | 80.1 | 65 |
| chr3R:18,671,035–18,741,129 | CG4704←–klg→ | 70.1 | 66 |
| chr3R:19,238,330–19,307,587 | CG4374←–CG31225→ | 69.3 | 52 |
| chr3R:21,226,341–21,299,237 | CG31439←–CG5127→ | 72.9 | 70 |
| chr3R:24,179,225–24,287,776 | Or98b→–beat-VI→ | 108.6 | 97 |
| chrX:972,189–1,039,710 | CG3655←–CG14626→ | 67.5 | 62 |
| chrX:3,866,525–3,971,906 | CG6414←–CG32790→ | 105.4 | 83 |
| chrX:7,004,130–7,091,461 | fz4←–CG9650→ | 87.3 | 75 |
| chrX:16,018,498–16,105,580 | disco-r←–disco← | 87.1 | 72 |
| chrX:17,218,418–17,291,800 | B-H2→–B-H1→ | 73.4 | 73 |
Arrows indicate direction of transcription.
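The lengths and cluster counts in the table can be checked with simple arithmetic. The following is a minimal sketch, assuming the coordinates are inclusive start and end positions and using only five rows copied from the table; the helper names `span_kb` and `kb_per_cluster` are illustrative and do not come from the paper.

```python
# Rows copied from the table above: (location, listed length in kb, listed cluster count).
ROWS = [
    ("chr2R:2,416,250–2,487,059", 70.8, 63),
    ("chr2R:4,156,848–4,234,973", 78.1, 70),
    ("chr3L:6,788,236–6,903,573", 115.3, 90),
    ("chr3L:13,658,522–13,771,330", 112.8, 98),
    ("chrX:972,189–1,039,710", 67.5, 62),
]


def span_kb(location: str) -> float:
    """Return the length in kb of a 'chr:start–end' location string."""
    _, coords = location.split(":")
    # Normalise the en dash used in the table to a plain hyphen before splitting.
    start, end = (int(p.replace(",", "")) for p in coords.replace("–", "-").split("-"))
    return (end - start) / 1000


def kb_per_cluster(rows) -> float:
    """Pooled spacing: total listed length divided by total listed cluster count."""
    total_kb = sum(length for _, length, _ in rows)
    total_clusters = sum(count for _, _, count in rows)
    return total_kb / total_clusters


if __name__ == "__main__":
    for location, listed_kb, count in ROWS:
        print(f"{location}: computed {span_kb(location):.1f} kb, "
              f"listed {listed_kb} kb, {listed_kb / count:.2f} kb per cluster")
    print(f"pooled spacing: {kb_per_cluster(ROWS):.2f} kb per cluster")
```

For this five-row subset the pooled spacing comes out slightly above 1.1 kb per cluster, in line with the genome-wide average quoted in the abstract; the full 39-region figure would require all rows.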
Figure 1. DNA conservation spanning the Drosophila vvl locus.
(A) Shown is a UCSC Genome Browser view covering 150 kb of DNA located on the left arm of the D. melanogaster 3rd chromosome (http://genome.ucsc.edu). The linear representation includes the vvl transcribed region and flanking DNA starting from the upstream neighboring CG32392 gene and extending to the 3′ end of the downstream Prat2 transcribed sequence. A 12-species Drosophila DNA conservation track [19] reveals the presence of conserved sequence clusters throughout the locus. The red bar (positioned 30 kb downstream of the vvl transcribed sequence) covers the 27 kb of the non-coding intergenic region that was examined in this study for the presence of independent cis-regulatory enhancers. Aligned below the conservation track are identified LINE and LTR repeat elements present within the D. melanogaster DNA that are not present in the same orthologous positions within many of the other species included in the conservation analysis. (B) An expanded view of the intergenic region studied for its cis-regulatory activity (highlighted in panel A) reveals 19 consecutive conserved sequence clusters that were independently tested for their cis-regulatory activity. Cluster numbers correspond to their designation in the cis-Decoder D. melanogaster genome-wide sequence conservation database [17].
Figure 2. Gene-distant conserved sequence clusters are made up of multiple conserved sequence blocks.
Shown is a D. melanogaster relaxed EvoPrint spanning the first (most 5′) 6,351 bp of the vvl 3′ flanking intergenic region that includes the conserved sequence block (CSB) clusters vvl-37 through vvl-41 (indicated by vertical bars in the left margin). CSB clusters are resolved by their flanking less-conserved inter-cluster sequences of 150 or more bp. Capital letters represent bases in the D. melanogaster reference sequence that are conserved in all, or all but one, of the orthologous regions within the D. simulans, D. sechellia, D. erecta, D. yakuba, D. ananassae, D. pseudoobscura, D. persimilis, D. willistoni, D. virilis, D. mojavensis and D. grimshawi genomes. Less-conserved or non-conserved DNA is shown as lower-case gray letters, and the lower-case red-font bases indicate invariant spacer-length DNA between CSBs. Colored highlighted conserved sequences within the vvl-38 (blue), vvl-39 (yellow), and vvl-41 (purple) clusters represent repeat elements that are discussed in Text S1.
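The cluster-resolution rule stated in the Figure 2 caption (clusters are separated by less-conserved stretches of 150 bp or more) can be sketched as a simple interval-grouping step. This is only an illustration under stated assumptions: CSBs are represented as hypothetical (start, end) coordinates with an exclusive end, and the function name `group_csbs` is invented here; the actual EvoPrint and cis-Decoder procedures may differ in detail.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end) of one CSB in bp, end exclusive


def group_csbs(csbs: List[Interval], min_gap: int = 150) -> List[List[Interval]]:
    """Group CSBs into clusters; a gap of min_gap bp or more starts a new cluster."""
    clusters: List[List[Interval]] = []
    cluster_end = 0  # right-most position covered by the current cluster
    for start, end in sorted(csbs):
        if clusters and start - cluster_end < min_gap:
            # Gap to the current cluster is shorter than the threshold: same cluster.
            clusters[-1].append((start, end))
            cluster_end = max(cluster_end, end)
        else:
            # Gap of min_gap or more (or first CSB): open a new cluster.
            clusters.append([(start, end)])
            cluster_end = end
    return clusters


if __name__ == "__main__":
    # Hypothetical coordinates, not taken from the vvl locus.
    example = [(0, 12), (40, 55), (90, 110), (300, 320), (330, 350), (700, 715)]
    for i, cluster in enumerate(group_csbs(example), start=1):
        print(f"cluster {i}: {cluster}")
```

With the hypothetical input above, the 190 bp and 350 bp gaps split the six CSBs into three clusters, while the shorter gaps keep neighboring CSBs together.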
Figure 3. Evolutionary constraints on CSB cluster structure.
Multi-species analysis of CSB clusters and their flanking spacer regions reveals that the less-conserved spacer DNA shows greater evolutionary variability in sequence length than the CSB clusters it flanks. Shown are percentage base pair length differences between D. melanogaster (blue), D. virilis (red) and D. grimshawi (yellow) vvl clusters 38 through 49 and the percent differences within their flanking spacer regions (each column represents 100%).
Figure 4. Conserved cluster cis-regulatory enhancer activity during embryonic development.
Enhancer/reporter transgene expression analysis during embryonic development reveals that many of the tested CSB clusters are functionally independent embryonic enhancers that direct expression in different spatial/temporal patterns within the developing embryo. Shown are enhancer-reporter embryo expression patterns for 16 of the 19 consecutive clusters tested. Whole-mount embryos (staged according to Hartenstein and Campos-Ortega [58]; dorsal or ventral views are shown adjacent to lateral views for each cluster-reporter transgene; anterior up) were stained for mRNA to reveal peak reporter expression, detected with a digoxigenin-labeled Gal4 riboprobe for each of the cluster/enhancer-reporter constructs. The numbers in the lower right corner of each panel correspond to the clusters shown in Figure 1 and Figures S1, S2, S3, S4 and described in Table 2. (A) Dorsal and lateral views of a stage 13 embryo. vvl-38 activates transgene reporter expression in a small cluster of cells within or near the developing antenno-maxillary complex and within a cluster of anterior gut epidermal cells positioned adjacent to the cephalic lobes. (B) Dorsal and lateral surface views of a stage 13 embryo. vvl-39 drives expression in putative PNS cells. (C) Dorsal and lateral views of a stage 10 embryo. vvl-40 activates expression in two adjacent NBs within each cephalic brain lobe. (D) Ventral and lateral views of a stage 11 embryo. vvl-41 drives expression in a set of NBs after they have generated their first GMC progeny. (E) Dorsal and lateral views of a stage 15 embryo. vvl-42 drives expression in cells of the gut ectoderm. (F) Dorsal and lateral surface views of a stage 13 embryo. vvl-43 drives expression in late lateral ectodermal cells. (G) Dorsal and lateral views of a stage 10 embryo. vvl-44 drives expression in a single midline cell per segment and in segmentally repeated lateral cells, possibly PNS cells. (H) Dorsal and lateral views of a stage 14 embryo. vvl-45 drives expression in a bilateral pair of brain neurons. (I) Dorsal and dorsal-lateral views of a stage 13 embryo. vvl-46 drives expression in the posterior midgut. (J) Dorsal and lateral views of a stage 11 embryo. vvl-47 drives expression in a few unidentified cells per hemisegment in the neuroectoderm and CNS. (K) Deep ventral and lateral views of a stage 10 embryo. vvl-48 drives expression in segmentally repeated clusters that appear to be tracheal placodes. (L) Ventral and lateral views of a stage 11 embryo. The vvl-49 cluster activates reporter expression in ventral cord midline glial cells (also shown in Figure 8). (M) Ventral and lateral views of a stage 13 embryo. vvl-51 drives expression in segmentally repeated putative neurons in the peripheral nervous system. (N) Dorsal and lateral views of a stage 14 embryo. The vvl-52 cluster activates reporter expression in two bilaterally symmetrical cells within the antenno-maxillary complex. (O) Ventral and lateral views of a stage 12 embryo. The vvl-53 cluster drives expression in CNS NBs (both brain and ventral cord) during late NB lineage development. (P) Ventral and lateral surface views of a stage 14 embryo. vvl-55 activates expression in cells that line tracheal branches.
Figure 5. Expression of enhancer/reporter transgenes in the larval CNS.
(A–L) During 3rd-instar larval development, most enhancer-Gal4 transgenes from vvl-37 to vvl-55 (twelve are illustrated) activate UAS/GFP-CD8 tagged reporter expression in neural precursors, neurons or glia within sub-regions of the cephalic lobes and in the thoracic ventral cord. Shown are stacked images of dorsal views of dissected CNS preparations from wandering third-instar larvae (anterior up). (A) vvl-39 activates reporter expression in a subset of brain and ventral cord glia. (B) vvl-41 drives reporter expression in a set of subesophageal ganglion (SOG) interneurons. (C–E) vvl-43, -45 and -46 activate expression in different subsets of ventral cord and/or brain neurons. (F) vvl-48 drives expression in cells that line the tracheal tubes associated with the brain and ventral cord. (G) vvl-49 activates reporter expression in CNS midline cells, presumably glia. (H and I) vvl-50 and -51 drive expression in subsets of brain and ventral cord neurons. (J) vvl-53 activates reporter expression in brain and ventral cord NB lineages and in their neurons. (K) The vvl-54 cluster drives reporter expression in subsets of brain and ventral cord neurons. (L) vvl-55 activates reporter expression in a subset of both brain and ventral cord neurons. Based on the presence of membrane-tagged GFP within axons that exit the ventral cord, many of the neurons are most likely motor neurons.
Figure 6. Expression of enhancer/reporters within the adult brain.
Many of the tested conserved regions (six are illustrated) activate reporter expression in neurons or glia positioned within different sub-regions of the central brain. Shown are ventral (A) or anterior (B–F) views of adult brain. (A) vvl-37 drives expression in several SOG neurons whose axons project across the midline. (B) vvl-41 drives expression in several SOG neurons that project across the midline or dorsally. (C–F) vvl-44, -45, -51 and -55 all activate reporter expression in putative insulin-producing neurons (IPCs) [59]. (C) vvl-44 drives expression in IPCs and a set of lateral neurons whose dendrites fill the olfactory lobe. (D) vvl-45 drives expression in a set of ventral brain neurons whose dendrites fill the olfactory lobe and the lateral brain. (E) vvl-51 drives expression in IPCs and a set of ventral neurons whose axons and dendrites project into the olfactory lobe. (F) vvl-55 drives expression in IPCs and in presumptive ellipsoid body neurons.
cis-Regulatory activity of consecutive vvl conserved sequence clusters.
| Cluster | Embryo | Larva | Adult | Figure |
| vvl-37 | Negative | Negative | Subset of brain neurons | 6A |
| vvl-38 | Antenno-maxillary complex & anterior gut | Negative | Negative | 4A |
| vvl-39 | PNS glia | Putative ventral cord glia | Putative glia | 4B, 5A |
| vvl-40 | Pair of cephalic lobe NBs | Negative | Putative glia | 4C |
| vvl-41 | CNS neuroblast late lineage | Subset of brain neurons | Subset of brain neurons | 4D, 5B, 6B |
| vvl-42 | Posterior gut and ectoderm | Negative | Negative | 4E |
| vvl-43 | Late ectoderm | Ventral cord neurons | Subset of brain neurons | 4F, 5C |
| vvl-44 | Midline & PNS precursors | Negative | Central brain neurons and IPCs | 4G, 6C |
| vvl-45 | At stage 15, single neuron per cephalic lobe | Subset of brain and ventral cord neurons | Subset of central brain neurons and IPCs | 4H, 5D, 6D |
| vvl-46 | Gut | Ventral midline and brain neurons | Subset of brain neurons | 4I, 5E |
| vvl-47 | CNS and neuroectoderm | A few cells in the SOG | Negative | 4J |
| vvl-48 | Trachea | Tracheal tubes associated with the brain & ventral cord | Optic lobe and central brain trachea | 4K, 5F |
| vvl-49 | Ventral cord midline glia | Ventral cord midline glia | Putative glia | 4L, 5G, 8 |
| vvl-50 | Negative | Subset of brain and ventral cord neurons | Brain and optic lobe neurons | 5H |
| vvl-51 | Ventral cord and PNS | Subset of brain and ventral cord neurons including motor neurons | Subset of optic lobe, brain neurons and IPCs | 4M, 5I, 6E |
| vvl-52 | Anterior tip | Negative | Negative | 4N, 5J |
| vvl-53 | Late temporal network NBs and neurons | Brain lineages including NBs and ventral cord neurons | Negative | 4O |
| vvl-54 | Gut ring | Subset of brain and ventral cord neurons | Subset of central brain neurons | 5K |
| vvl-55 | Placode cells or PNS neurons | Motor neurons | Subset of central brain neurons and IPCs | 4L, 6F |
Figure 8. Expression analysis of vvl-49 midline enhancer sub-domains.
Temporal expression of vvl-49 enhancer/reporter transgenes during embryonic development. (A) The entire vvl-49 cluster drives reporter expression in a set of midline cells continuously from stages 11 through 13. (B) At stage 11, the D. melanogaster vvl-49a sub-fragment initiates expression in a subset of midline cells that is indistinguishable from that of the whole cluster; however, expression progressively declines in stage 12 and stage 13 embryos. (C) The vvl-49b sub-region activates expression in only a subset of the midline cells compared to the full vvl-49 enhancer activity.
Figure 7. Species-specific flexibility within the vvl-49 ventral cord midline enhancer.
Twelve-species EvoPrint analysis of the vvl-49 CSB cluster reveals that its central non-conserved region has experienced a 466 bp insertion in D. grimshawi that is absent from the other drosophilids. (A) The D. melanogaster reference sequence EvoPrint of the vvl-49 cluster. cis-Decoder analysis of vvl-49 CSBs reveals four consensus Single-minded/Tango TF DNA-binding sites (ACGTG). Two different repeat elements were identified that contain different flanking repeat sequences (highlighted green and blue). The yellow highlighted 94 bp non-conserved region corresponds to the central D. grimshawi region shown in panel (B). Cluster sub-fragments (49a and 49b) that were tested for enhancer activity are indicated by vertical bars in the left margin. (B) An EvoPrint of the vvl-49 CSB cluster using D. grimshawi as the reference sequence. The EvoPrint identified a 466 bp insertion (highlighted yellow) within the non-conserved central region (when compared to the D. melanogaster EvoPrint).
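The Figure 7 caption reports consensus Single-minded/Tango binding sites (ACGTG) within the vvl-49 CSBs. A scan for such a core motif on both DNA strands can be sketched as below. This is a plain exact-match search written for illustration, not the cis-Decoder method; the function names and the example sequence are hypothetical and are not taken from the vvl-49 cluster.

```python
from typing import List, Tuple

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T, N only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def find_motif(seq: str, motif: str = "ACGTG") -> List[Tuple[int, str]]:
    """Return sorted (0-based position, strand) pairs for exact motif matches.

    Matches on the '-' strand are reported at the position where the reverse
    complement of the motif starts on the '+' strand.
    """
    seq = seq.upper()  # EvoPrints mix upper and lower case; ignore case here
    hits = []
    for strand, pattern in (("+", motif.upper()), ("-", reverse_complement(motif))):
        pos = seq.find(pattern)
        while pos != -1:
            hits.append((pos, strand))
            pos = seq.find(pattern, pos + 1)  # allow overlapping matches
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical sequence, not taken from the vvl-49 cluster.
    example = "ttACGTGaaCACGTcc"
    for position, strand in find_motif(example):
        print(f"ACGTG site at {position} on the {strand} strand")
```

Because the ACGT core is palindromic, a sequence such as CACGTG is reported once on each strand; whether to count that as one site or two is a choice left to the reader.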