Pierre B Cattenoz1, Anna Popkova1, Tony D Southall2, Giuseppe Aiello1, Andrea H Brand2, Angela Giangrande3.
Abstract
High-throughput screens allow us to understand how transcription factors trigger developmental processes, including cell specification. A major challenge is identification of their binding sites because feedback loops and homeostatic interactions may mask the direct impact of those factors in transcriptome analyses. Moreover, this approach dissects the downstream signaling cascades and facilitates identification of conserved transcriptional programs. Here we show the results and the validation of a DNA adenine methyltransferase identification (DamID) genome-wide screen that identifies the direct targets of Glide/Gcm, a potent transcription factor that controls glia, hemocyte, and tendon cell differentiation in Drosophila. The screen identifies many genes that had not been previously associated with Glide/Gcm and highlights three major signaling pathways interacting with Glide/Gcm: Notch, Hedgehog, and JAK/STAT, which all involve feedback loops. Furthermore, the screen identifies effector molecules that are necessary for cell-cell interactions during late developmental processes and/or in ontogeny. Typically, immunoglobulin (Ig) domain-containing proteins control cell adhesion and axonal navigation. This shows that early and transiently expressed fate determinants not only control other transcription factors that, in turn, implement a specific developmental program but also directly affect late developmental events and cell function. Finally, while the mammalian genome contains two orthologous Gcm genes, their function has been demonstrated in vertebrate-specific tissues, placenta, and parathyroid glands, begging questions on the evolutionary conservation of the Gcm cascade in higher organisms. Here we provide the first evidence for the conservation of Gcm direct targets in humans. In sum, this work uncovers novel aspects of cell specification and sets the basis for further understanding of the role of conserved Gcm gene regulatory cascades.
Keywords: DamID; Drosophila; glide/gcm; mGcm; screen
Year: 2015 PMID: 26567182 PMCID: PMC4701085 DOI: 10.1534/genetics.115.182154
Source DB: PubMed Journal: Genetics ISSN: 0016-6731 Impact factor: 4.562
Figure 1. The DamID peaks are enriched for GBSs. (A) Schematic of the MICRA algorithm used to identify enriched motifs in the DamID peaks. For each peak, 1000 nt of sequence was extracted and filtered for conserved sequence, and then the frequency of every 6- to 10-mer was compared to the background frequency in nonexonic DNA and ranked accordingly [for details, see Southall and Brand (2009)]. The most highly represented motif corresponds to the canonical GBS. (B) Canonical GBS reported with the strength of Gcm binding and references. (C) Distance between the DamID peaks and the closest canonical GBS. (D) Distribution of the number of canonical GBSs per kilobase in the whole genome and under the DamID peaks. The box delimits the second and third quartiles; the thick black bar indicates the median for the two populations; the P-values are indicated as follows: ns = nonsignificant (P > 0.05); *P = 0.05–0.01; **P = 0.01–0.001; ***P < 0.001. (E) Distribution of conservation scores of canonical GBSs in the whole genome and under the DamID peaks. Box, thick black bar, and asterisks are as in D. (F) Coding status of the genomic region covered by the DamID peaks.
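The motif-ranking step described in Figure 1A can be illustrated with a minimal Python sketch. This is not the MICRA implementation: the function names, the pseudocount, and the log2 ratio score are assumptions made for illustration, the conservation filter is omitted, and the toy sequences in the usage note are invented.

```python
from collections import Counter
from math import log2


def count_kmers(seqs, k_min=6, k_max=10):
    """Count every k-mer of length k_min..k_max in a list of DNA sequences."""
    counts = Counter()
    for seq in seqs:
        seq = seq.upper()
        for k in range(k_min, k_max + 1):
            for i in range(len(seq) - k + 1):
                kmer = seq[i:i + k]
                # Skip windows containing ambiguous bases such as N.
                if set(kmer) <= set("ACGT"):
                    counts[kmer] += 1
    return counts


def rank_enriched(peak_seqs, background_seqs, k_min=6, k_max=10, pseudocount=1.0):
    """Rank k-mers by log2(frequency under peaks / frequency in background).

    Frequencies are normalised separately for each k-mer length; the
    pseudocount (an assumption of this sketch) avoids division by zero
    for k-mers that never occur in the background.
    """
    fg = count_kmers(peak_seqs, k_min, k_max)
    bg = count_kmers(background_seqs, k_min, k_max)
    fg_total, bg_total = Counter(), Counter()
    for kmer, c in fg.items():
        fg_total[len(kmer)] += c
    for kmer, c in bg.items():
        bg_total[len(kmer)] += c
    scores = {}
    for kmer, c in fg.items():
        k = len(kmer)
        fg_freq = (c + pseudocount) / (fg_total[k] + pseudocount)
        bg_freq = (bg.get(kmer, 0) + pseudocount) / (bg_total[k] + pseudocount)
        scores[kmer] = log2(fg_freq / bg_freq)
    # Highest score first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

For example, `rank_enriched(["TTTTATGCGGGTTTTT", "CCCCATGCGGGTCCCC"], ["TTTTTTTTTTTTTTTT", "CCCCCCCCCCCCCCCC", "ACGTACGTACGTACGT"])` gives the shared 8-mer a positive score because it occurs under both toy "peaks" and never in the toy background. A real analysis would also need to handle reverse complements and sequence conservation, which this sketch ignores.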
Figure 2. Known targets of Gcm are found in the DamID screen. (A) Schematic representation of the gcm locus. The gene is indicated by the blue rectangles, thin ones indicating the untranslated regions (UTRs) and thick ones indicating the coding exons (CDSs); pale blue arrowheads indicate the direction of transcription. In this and the following figures, GBSs are indicated in red, and the histograms above the locus show a region of 1 kb on each side of a DamID peak scoring an FDR < 0.001. The histograms in gray indicate the nonsignificant DamID peaks with an FDR > 0.001. The genomic coordinates of the loci (genome version BDGP R5/Dm3) are indicated above the histograms. (B) Euler diagram representing the overlap between the downstream targets of Gcm identified by Altenhein et al. (2002), Freeman et al., and Egger et al. (2003). The size of each area is proportional to the number of genes included in the category. (C) Subset of genes identified in the three transcriptome assays mentioned in B that are also identified as direct targets by the DamID screen. The size of each area is proportional to the number of genes included in the category. (D) Names of the genes in common between the screens of Altenhein et al. (2002), Freeman et al., and Egger et al. (2003) and the DamID. (E) Distribution of the genes whose expression decreases in gcm LOF according to the earliest developmental stages at which they were identified [data set Altenhein et al. (2002)]. (F) Same distribution as in E but for genes present in both the DamID screen and the Altenhein et al. (2002) LOF data set.
Targets of Gcm involved in nervous system development
| Gene symbol | Annotation | References |
|---|---|---|
| Axon guidance | | |
| Axon ensheathment and glial cell migration | | |
| Blood-brain barrier (amino acid transport, septate junction) | | |
| Blood-brain barrier, axon ensheathment, and glial cell migration | | |
| Dendritic plasticity | | |
| Epileptic seizure | | |
| Insulin regulation | | |
| Glial cell development | | |
| Late embryonic brain development | | |
| Longitudinal glia precursor division | | |
| Neural stem cell regulation | | |
| Optic lobe development | | |
| Enriched in glia | | |
| Expressed in the CNS | | |
Genes not coexpressed in embryos according to the Berkeley Drosophila Genome Project in situ database. No mark for genes that were not assayed.
Genes coexpressed in embryos with gcm. No mark for genes that were not assayed.
Targets of Gcm involved in immune system development
| Gene symbol | Annotation | References |
|---|---|---|
| Antimicrobial humoral response (response to fungi, response to Gram-negative bacteria) | | |
| Autophagy | | |
| Coagulation | | |
| Hemocyte migration | Zanet | |
| Melanotic encapsulation of foreign targets | | |
| Phagocytosis | | |
| Wnt-mediated inflammatory cascade | | |
| Hemocyte proliferation | | |
| Lymph gland development | | |
| Plasmatocyte differentiation | | |
| Repression of lamellocyte differentiation | | |
Genes not coexpressed in embryos according to the Berkeley Drosophila Genome Project in situ database. No mark for genes that were not assayed.
Genes coexpressed in embryos with gcm. No mark for genes that were not assayed.
Figure 4. Gcm regulates the Hedgehog signaling pathway in the embryonic epidermis. (A–H) Immunolabeling of Ci (in gray), Ptc (in red), and Smo (in green) in stage 15 embryos of the following genotypes: enGal4/+ (control) and enGal4/+; UASgcm/+ (gcm GOF). The areas delimited in white in A, C, E, and G are magnified in B–B″, D–D″, F and F′, and H and H′, respectively. Full projections of the embryos are shown in A, C, E, and G, and projections of four optical sections taken at 2-µm intervals are shown in the magnified regions of B, D, F, and H. DAPI-labeled nuclei are in blue. Note that the three proteins involved in the Hedgehog signaling pathway are expressed at higher levels in the gcm GOF than in the control embryo. Bar, 50 µm. (I and J) Histograms showing the expression of gcm (I), ci, Pka-C1, and rdx (J) in enGal4/+ (control, red) and enGal4/+; UASgcm/+ (blue) embryos at stage 13–14. The y-axis represents the expression levels relative to those observed in the control embryos (red columns). Error bars and P-values are calculated as in Figure 3D.
Figure 5. Gcm overlaps with and regulates Dh31 expression in the larval CNS. (A) Histogram representing the endogenous expression levels of Dh31 in S2 cells with and without transfected Gcm. Error bars and P-values are calculated as in Figure 3, A–D. (B–F″) Immunolabeling of Dh31 and GFP in larval CNS (Dh31 in red; GFP in green). White arrowheads indicate cells coexpressing Dh31 and Gcm (GFP+ cells); asterisks indicate cells expressing only Dh31. The gcm-expressing cells correspond to the dorsolateral neuronal cluster. Areas delimited in white in B and D are magnified in C–C″ and E–E″, respectively. (B) Full projection of the CNS of a third-instar larva heterozygous for gcmGal4, UASmCD8GFP. (C–C″) Projection of three optical sections taken at 2-µm intervals: (C) overlay of Dh31 and GFP, (C′) Dh31 alone, and (C″) GFP alone. (D–E″) Same as B–C″ in a gcmGal4, UASmCD8GFP homozygous larva. (F–F″) Same as C–C″ in a gcm knockdown (KD) larva of the following genotype: gcmGal4, repoGal80, UASmCD8GFP/+; UASgcmRNAi/+. In all genotypes, the three sections contain the whole dorsolateral cluster (white oval). Note that in the gcmGal4 homozygous and in the gcm KD (arrowheads) larvae, the intensity of Dh31 labeling is reduced in the GFP+ cells compared to that observed in the control gcmGal4 heterozygous animals; the surrounding Dh31+ cells (asterisks) serve as a comparison. (B and D) DAPI nuclear labeling is in blue. Bar, 50 µm.
Figure 3. Validation of new Gcm targets identified by the DamID screen. (A–D) The left panels represent the loci containing the DamID peaks and the GBSs for CG30002 (A), CycA (B), E(spl)m8 (C), and ptc (D) (as described in Figure 2A), and the histograms on the right (in red) represent the results of the luciferase assays carried out on each GBS. The red bars indicate enrichment of luciferase activity in the presence of Gcm compared to no transfected Gcm (“no Gcm”); error bars indicate SEM; and n represents the number of independent transfection assays. P-values are indicated as in Figure 1D. (E) Histogram representing the endogenous levels of expression of CycA, E(spl)m8, CG30002, ptc, and repo in S2 cells with and without transfected Gcm. The y-axis is in log10 scale; error bars and P-values are calculated as indicated in Figure 3, A–D. (F) Same as E after FACS sorting of the Gcm+ cells. (G–I) FACS analyses of S2 cells (G), of S2 cells transfected with the Gal4 expression vector ppacGal4 and the GFP reporter UAS-GFP (H), and of S2 cells transfected with ppacGcm and repoGFP (I). The dot plots show the forward scatter area (FSC-A) on the y-axis and the GFP intensity on the x-axis. The area in blue indicates the GFP+ cells that were sorted for further analysis, and the number under the area indicates the percentage of cells that are GFP+. (J) Diagram representing the distribution of the DamID targets according to their levels of expression in stage 10–11 embryos (4- to 8-hr embryos in modENCODE development RNA-seq).
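The quantities plotted in Figure 3, A–D (enrichment of luciferase activity relative to "no Gcm", SEM, and n) can be illustrated with a short Python sketch. The caption does not specify how readings were normalized, so dividing each Gcm reading by the mean of the "no Gcm" readings is an assumption of this sketch, and the function name and the numbers in the usage note are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def fold_enrichment(with_gcm, without_gcm):
    """Return (mean fold enrichment, SEM, n) for one reporter construct.

    with_gcm    -- luciferase readings from independent transfections with Gcm
    without_gcm -- readings from the matching "no Gcm" transfections
    Assumption: each Gcm reading is divided by the mean "no Gcm" reading.
    """
    baseline = mean(without_gcm)
    folds = [x / baseline for x in with_gcm]
    n = len(folds)
    # SEM = sample standard deviation / sqrt(n); undefined for a single assay.
    sem = stdev(folds) / sqrt(n) if n > 1 else float("nan")
    return mean(folds), sem, n
```

With the invented readings `fold_enrichment([20.0, 30.0, 40.0], [10.0, 10.0, 10.0])`, the per-assay fold values are 2, 3, and 4, so the function reports a mean enrichment of 3 with n = 3 and an SEM of 1/√3.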
Oligonucleotides used to generate the pGL4.23 vectors used in S2 cells
| Probe name | |
|---|---|
| CG30002_GBS1_mutF | |
| CG30002_GBS1_mutR | |
| CG30002_GBS1_wtF | |
| CG30002_GBS1_wtR | |
| CG30002_GBS2_mutF | |
| CG30002_GBS2_mutR | |
| CG30002_GBS2_wtF | |
| CG30002_GBS2_wtR | |
| cycA_GBS_mutF | |
| cycA_GBS_mutR | |
| cycA_GBS_wtF | |
| cycA_GBS_wtR | |
| E(spl)m8_GBS_mutF | |
| E(spl)m8_GBS_mutR | |
| E(spl)m8_GBS_wtF | |
| E(spl)m8_GBS_wtR | |
| ptc_GBS1_mutF | |
| ptc_GBS1_mutR | |
| ptc_GBS1_wtF | |
| ptc_GBS1_wtR | |
| ptc_GBS2_mutF | |
| ptc_GBS2_mutR | |
| ptc_GBS2_wtF | |
| ptc_GBS2_wtR |
Figure 9. Conservation of Gcm targets in mammals. (A and B) Schematic representation of GCM1 (A) and GCM2 (B) loci in humans. The genes are represented as in Figure 2A. GBSs are indicated as bars. The color coding refers to their similarity with the canonical GBSs used in mammals in Yu, from 62.5% (blue) to 100% (red). The genomic coordinates of the loci (genome version GRCh37/hg19) are indicated above the GBSs. Red rectangles indicate the regions used for the luciferase reporter assays, and the mutated sites are indicated by red asterisks. (C and D) Characterization of mGcm1 and mGcm2 effects on GCM1 (C) and GCM2 expression (D). Histograms represent the endogenous expression levels of transcripts of each gene in HeLa cells on transfection with mGcm1 or mGcm2; values are relative to those of housekeeping genes GAPDH and ACTB. (E) Histogram representing the luciferase activity in HeLa cells transfected with mGcm1 or mGcm2 and luciferase reporters containing the region of GCM1 with WT or mutant GBSs. Levels of luciferase activity are relative to those observed in the absence of mGcm transfection. (F) Histogram representing the luciferase activity in HeLa cells transfected with mGcm2 and luciferase reporters containing the region of GCM2 with WT or mutant GBSs as in E. (G–M) Characterization of mGcm1 and mGcm2 effects on TBX1 (G), GATA3 (H), GATA4 (I), GATA6 (J), FGFR1 (K), FGFR2 (L), and DLL1 expression (M), as in C and D. (N) Histogram representing the S2 cell endogenous levels of Doc1, pnr, and btl on transfection with a Gcm expression vector, as indicated in Figure 3D.
Oligonucleotides used to generate the pGL4.23 vectors used in HeLa cells
| Probe name | |
|---|---|
| GCM1 F | |
| GCM1 R | |
| GCM1 GBS1mut forward | |
| GCM1 GBS2mut reverse | |
| GCM1 GBS2mut forward | |
| GCM1 GBS3mut reverse | |
| GCM2 F | |
| GCM2 R | |
| GCM2 GBS1mut forward | |
| GCM2 GBS2mut reverse | |
| GCM2 GBS2mut forward | |
| GCM2 GBS3mut reverse |
Figure 6. Gcm targets encode Ig domain proteins. (A) Histogram summarizing the protein domain enrichment analysis. The x-axis indicates the enrichment score; the shade of gray is representative of the P-value: lightest gray, P = 10^-5; black, P = 10^-14. The y-axis indicates the name of the protein domain; n indicates the number of genes in the DamID screen containing that domain. (B) List of genes from the DamID screen containing immunoglobulin (Ig) domains. The genes indicated in green are known to be involved in nervous system development; the genes in blue are expressed in the nervous system. (C) Histogram showing the endogenous expression of Ig domain–containing genes on S2 cell transfection with a Gcm expression vector and FACS sorting. The y-axis represents the relative expression levels in cells transfected with Gcm compared to cells without Gcm. The y-axis is in log10 scale; error bars and P-values are calculated as in Figure 3D.
Figure 7. Gcm targets involved in nervous system development and function. (A) List of DamID genes involved in neural development and function. The genes in red were previously characterized as downstream targets of Gcm, and those underlined were confirmed by qPCR in FACS-sorted S2 cells in this study. (B) Schematic representation of E(spl)-C. Top panel represents the DamID peak histogram on the 0.8 Mb around E(spl)-C. Note that the peaks are all localized within E(spl)-C. Bottom panel shows the 50-kb window of E(spl)-C. (C) Histogram representing the S2 cell endogenous levels of E(spl)-C transcripts on transfection with a Gcm expression vector, as indicated in Figure 3, A–D.
Figure 8. Gcm targets involved in immune system development and function. (A) List of the DamID genes. Green area covers those involved in immune system function; red area covers those involved in immune system development. The genes in red were previously characterized as downstream targets of Gcm, and those underlined were confirmed by qPCR in FACS-sorted S2 cells in this study. (B) Histogram representing the S2 cell endogenous levels of genes involved in immune system development on transfection with a Gcm expression vector, as indicated in Figure 3, A–D.
Figure 10. Molecular pathways affected by Gcm. (A) Hedgehog signaling pathway. (B) JAK/STAT signaling pathway. (C) Notch signaling pathway. The proteins and genes in black and red are targeted by Gcm, according to the DamID screen. The genes in red were previously characterized as downstream targets of Gcm, and underlined genes were validated by qPCR in FACS-sorted S2 cells in this study. The proteins in gray are part of the pathway but are not targeted by Gcm.