Marc S Halfon, Qianqian Zhu, Elizabeth R Brennan, Yiyun Zhou.
Abstract
BACKGROUND: Cis-regulatory modules are bound by transcription factors to regulate gene expression. Characterizing these DNA sequences is central to understanding gene regulatory networks and gaining insight into mechanisms of transcriptional regulation, but genome-scale regulatory module discovery remains a challenge. One popular approach is to scan the genome for clusters of transcription factor binding sites, especially those conserved in related species. When such approaches are successful, it is typically assumed that the activity of the modules is mediated by the identified binding sites and their cognate transcription factors. However, the validity of this assumption is often not assessed.
Year: 2011 PMID: 22115527 PMCID: PMC3235160 DOI: 10.1186/1471-2164-12-578
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
CRM predictions tested in this study
| CRM name | coordinates (r5/dm3) | reporter activity | putative target gene | primary expression pattern |
|---|---|---|---|---|
| cooc164 | chr3L:12306977-12307990 | + | CG32105 | ventral nerve cord, amnioserosa |
| cooc404 | chr3R:27173861-27175568 | + | CG34347 | ventral nerve cord (midline*), amnioserosa |
| cooc110 | chr2L:17140480-17141559 | + | | pupal wing, eye, leg and other tissues |
| | chr3R:19617439-19618229 | - | | none |
| cooc102 | chr2L:16364204-16365055 | + | jhamt | somatic and visceral muscle from stage 12 onward |
| cooc310 | chr3L:16013542-16014711 | + | notum | segmentally repeated stripes, mostly ectodermal with limited mesoderm expression in dorsal regions and in the anal ring |
*a shorter version of cooc404 also had ventral nerve cord activity but ectopic relative to CG34347 expression (data not shown)
Figure 1. Successful prediction of new CRMs. Reporter gene expression is shown in the left-hand panels (anti-GFP: A, C, E, G; GFP fluorescence, I), in situ hybridization to mRNA of the assigned target gene in the right-hand panels (B, D, F, H). Embryos are oriented anterior to the left and ventral side up (A-D) or laterally with dorsal to the top (G-H). (A) cooc164 drives reporter gene expression primarily in the ventral nerve cord in a pattern similar to that of target gene CG32105 (B). (C) cooc404 drives gene expression in the midline of the ventral nerve cord. (D) CG34347, putative target gene for cooc404, is expressed in the same midline cells (arrowheads). (E) Reporter gene expression from cooc102 can be observed throughout the mesoderm (black arrows, arrowheads). Expression in the visceral mesoderm (not shown) and anterior segments (white arrows) is not observed for the assigned target gene jhamt (F). Arrows and arrowheads in panel F mark somatic mesodermal cells corresponding to those similarly marked in panel E. (G) cooc310 reporter gene expression is observed in segmentally repeated stripes, primarily in the embryonic ectoderm. Inset shows cells in the mesoderm co-labeled (white cells, marked with arrows) for GFP (green) and the mesodermal marker Mef2 (magenta). (H) Corresponding stripes of expression are seen for target gene notum. (I) cooc110 drives gene expression in pupal tissues including the wing.
CRM activity and conservation
| CRM (a) | activity | AVID | AVID+ | phastCons | peakPhastCons | peakPhastCons | peakPhastCons | TFBS presence (b) | TFBS mismatches (c) |
|---|---|---|---|---|---|---|---|---|---|
| | + | 0.6002 | 0.7710 | 0.5532 | 0.6651 | 0.6337 | 0.6431 | 0.8 | 0.21 |
| | + | 0.6345 | 0.8230 | 0.4246 | 0.4836 | 0.5038 | 0.5325 | 0.8 | 0.15 |
| DME2 | + | 0.6694 | 0.7860 | 0.4472 | 0.5416 | 0.5296 | 0.5133 | 0.8 | 0.19 |
| | + | 0.6766 | 0.5250 | 0.6649 | 0.7616 | 0.7459 | 0.8099 | 0.8 | 0.26 |
| | + | 0.6874 | 0.7530 | 0.6466 | 0.6718 | 0.6740 | 0.6778 | 0.8 | 0.29 |
| | + | 0.6867 | 0.7150 | 0.5072 | 0.5808 | 0.5767 | 0.5930 | 0.8 | 0.34 |
| DME31 | - | 0.5037 | 0.5010 | 0.4509 | 0.6249 | 0.4786 | 0.4628 | 0.6 | 0.38 |
| DME30 | - | 0.5597 | 0.5840 | 0.5403 | 0.6183 | 0.5900 | 0.5292 | 0.4 | 0.35 |
| DME7 | - | 0.5819 | 0.5350 | 0.5822 | 0.5609 | 0.5103 | 0.4809 | 0.6 | 0.20 |
| DME3 | - | 0.6095 | 0.6920 | 0.5168 | 0.4784 | 0.4826 | (<500 bp) | 0.8 | 0.22 |
| DME25 | - | 0.7401 | 0.8830 | 0.4480 | 0.5255 | 0.4964 | 0.4640 | 1 | 0.14 |
| | - | 0.6430 | 0.7840 | 0.5899 | 0.5970 | 0.5971 | 0.6145 | 0.6 | 0.12 |
| DME4 | - | nd | nd | 0.5025 | 0.6021 | 0.5307 | 0.5200 | nd | nd |
| p value (Wilcoxon one-sided) | 0.08983 | 0.2424 | 0.4726 | 0.2226 | 0.0507 | 0.06287 | 0.5909 | ||
| with DME25 switched | 0.05303 | 0.6859 | 0.4178 | 0.1474 | 0.1010 | 0.3194 | |||
(a) Bold type indicates CRMs tested for activity in this paper; others are from [11].
(b) TFBS presence was scored as number of sites present in D. pse. divided by number of sites in D. mel.
(c) TFBS mismatches were scored as the fraction of unaligned nucleotides in each D. pse. vs. aligned D. mel. site averaged over all of the D. mel. TFBSs for the CRM region (e.g., a fully conserved site = 0, a completely unaligned site = 1, average for the CRM if these were the only two sites = 0.5).
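The two TFBS scores defined in the footnotes, and the one-sided Wilcoxon comparison of active versus inactive CRMs, can be illustrated with a minimal Python sketch. This is not the authors' code: the function names and the `(unaligned_nt, site_length)` input format are assumptions made for illustration, and the exact permutation form of the rank-sum test is one plausible reading of "Wilcoxon one-sided". Applied to the AVID scores of the six active and six inactive CRMs in the table, it gives 83/924, or about 0.0898, which agrees with the first p value listed in the Wilcoxon row.

```python
from itertools import combinations

def tfbs_presence(sites_dpse, sites_dmel):
    """Footnote (b): sites present in D. pse. divided by sites in D. mel."""
    return sites_dpse / sites_dmel

def tfbs_mismatch(sites):
    """Footnote (c): mean fraction of unaligned nucleotides per D. mel. site.

    `sites` is a hypothetical list of (unaligned_nt, site_length) pairs.
    """
    return sum(unaligned / length for unaligned, length in sites) / len(sites)

def u_statistic(pos, neg):
    """Number of (pos, neg) pairs with pos > neg; ties count one half."""
    return sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)

def exact_one_sided_p(pos, neg):
    """P(U >= observed) under the null, enumerating every relabeling."""
    pooled = list(pos) + list(neg)
    observed = u_statistic(pos, neg)
    hits = total = 0
    for idx in combinations(range(len(pooled)), len(pos)):
        chosen = set(idx)
        a = [pooled[i] for i in idx]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        if u_statistic(a, b) >= observed:
            hits += 1
    return hits / total

# Worked example from footnote (c): one fully conserved site and one
# completely unaligned site (a site length of 8 nt is an arbitrary assumption).
print(tfbs_mismatch([(0, 8), (8, 8)]))  # 0.5
print(tfbs_presence(4, 5))              # 0.8

# AVID scores copied from the table: active (+) versus inactive (-) CRMs.
avid_pos = [0.6002, 0.6345, 0.6694, 0.6766, 0.6874, 0.6867]
avid_neg = [0.5037, 0.5597, 0.5819, 0.6095, 0.7401, 0.6430]
p_avid = exact_one_sided_p(avid_pos, avid_neg)
print(round(p_avid, 5))  # 0.08983
```

Enumerating all 924 relabelings is feasible only because the groups are small; for larger samples a library routine would be the more practical choice.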
Figure 2. Effects of binding site mutagenesis on reporter gene expression. Flies with the wild-type cooc310, cooc102, or mib2_FCenhancer driving nuclear lacZ were crossed to flies with mutated versions of the CRMs driving cytoplasmic GFP, and the resulting embryos were double-stained for both reporters (lacZ, magenta; GFP, green; overlap, white). Areas of direct overlap are limited due to the nuclear vs. cytoplasmic expression of the two reporters, but coincident expression can readily be observed. Embryos are oriented anterior to the left and dorsal to the top, except for panel H, which is a dorsal view. (A-D) Mutagenesis of the predicted dTcf, Tin, Pnt, or Twi binding sites in the notum_cooc310 CRM has no discernible effect on CRM activity. (E) Mutation of the Tin site in CRM jhamt_cooc102 causes a quantitative reduction in reporter gene expression but has no effect on expression pattern (compare with Fig. 1E). (F) An intact dTcf site is required for expression mediated by jhamt_cooc102 in the anterior (arrow); its mutation has no effect on the remainder of the reporter gene expression. (G) Mutation of the jhamt_cooc102 Pnt site leads to a near-total loss of reporter gene expression. Arrows indicate a few remaining GFP-positive cells. (H) Mutation of the putative Pnt binding sites in the mib2_FCenhancer CRM leads to an expansion of reporter gene expression throughout the trunk visceral mesoderm (arrows) and to additional cells with reporter gene expression in the ventral midline (panel I). Somatic mesoderm cells around the periphery of the pictured embryo express both reporters. (I) Close-up view of the stage 11 ventral midline showing additional cells expressing the mutated mib2_FCenhancer reporter gene (arrows). Arrowheads mark cells which also have expression driven by the wild-type enhancer. (J) In a yan background (yan), additional cells express the wild-type mib2_FCenhancer reporter gene (arrows). The smaller apparent size of these cells compared to the similar arrow-marked cells in panel I is mainly due to the cytoplasmic vs. nuclear nature of the two reporter genes, although we cannot fully rule out additional ectopic expression using the mutated enhancer. Arrowheads indicate the same wild-type mib2_FCenhancer-expressing cells marked with arrowheads in panel I.