Magali Naville1, Minaka Ishibashi2, Marco Ferg3, Hemant Bengani4, Silke Rinkwitz2, Monika Krecsmarik5, Thomas A Hawkins6, Stephen W Wilson6, Elizabeth Manning2, Chandra S R Chilamakuri7, David I Wilson8, Alexandra Louis1, F Lucy Raymond9, Sepand Rastegar3, Uwe Strähle3, Boris Lenhard10, Laure Bally-Cuif5, Veronica van Heyningen4, David R FitzPatrick4, Thomas S Becker11, Hugues Roest Crollius1.
Abstract
Enhancers can regulate the transcription of genes over long genomic distances. This is thought to lead to selection against genomic rearrangements within such regions that may disrupt this functional linkage. Here we test this concept experimentally using the human X chromosome. We describe a scoring method to identify evolutionary maintenance of linkage between conserved noncoding elements and neighbouring genes. Chromatin marks associated with enhancer function are strongly correlated with this linkage score. We test >1,000 putative enhancers by transgenesis assays in zebrafish to ascertain the identity of the target gene. The majority of active enhancers drive a transgenic expression in a pattern consistent with the known expression of a linked gene. These results show that evolutionary maintenance of linkage is a reliable predictor of an enhancer's function, and provide new information to discover the genetic basis of diseases caused by the mis-regulation of gene expression.
Year: 2015 PMID: 25908307 PMCID: PMC4423230 DOI: 10.1038/ncomms7904
Source DB: PubMed Journal: Nat Commun ISSN: 2041-1723 Impact factor: 14.919
Figure 1. Scoring evolutionary linkage.
(a) Strategy to compute the linkage score. The presence of human genes within a 1-Mb radius around a CNE is recorded, as well as the simultaneous presence/absence of their orthologues in the vicinity of the orthologous CNEs in different species (green ticks/red crosses, respectively, in the middle panel; hash signs indicate genes located beyond the 1-Mb threshold). The presence of an orthologue is weighted by the degree of conserved synteny R between this genome and the human genome, while the cost for the absence of a gene accounts for the sequencing coverage C of the genome. The final linkage score S is the sum of these weights over the different genomes where the CNE is present (right panel). The gene or genes showing the maximum linkage score to a given CNE are considered to be the most likely target. (b) The linkage scores of the CNE-target predictions were grouped in bins according to the genomic distance between the CNE and its predicted target (x axis). The median linkage score of the distributions (y axis) is stable for genes located up to ∼600 kb from the RegHsa element. (c) The linkage score is strongly correlated with an enrichment in annotations linked to enhancer function. An asterisk indicates data generated during this project.
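The scoring strategy in panel (a) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the caption does not give the exact weighting formula, so the sketch assumes that a linked orthologue adds R and a missing one subtracts C, summed over the genomes where the CNE is present. The names `Genome`, `linkage_score`, `predict_targets`, the species values and the gene labels `GENE_A` and `GENE_B` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Genome:
    name: str
    synteny: float   # R: conserved synteny with human (assumed scale 0..1)
    coverage: float  # C: sequencing coverage factor (assumed scale 0..1)


def linkage_score(gene, observations):
    """Sum per-genome weights for one human gene.

    `observations` lists (genome, linked_genes) pairs, one per genome in
    which the CNE is present; `linked_genes` holds the human genes whose
    orthologues lie within 1 Mb of the orthologous CNE.
    ASSUMPTION: presence adds R, absence subtracts C.
    """
    score = 0.0
    for genome, linked_genes in observations:
        if gene in linked_genes:
            score += genome.synteny
        else:
            score -= genome.coverage
    return score


def predict_targets(candidates, observations):
    """Return the gene(s) with the maximum score, plus all scores."""
    scores = {g: linkage_score(g, observations) for g in candidates}
    best = max(scores.values())
    targets = sorted(g for g, s in scores.items() if s == best)
    return targets, scores


# Illustrative values only; they are not taken from the paper.
mouse = Genome("mouse", synteny=0.75, coverage=1.0)
chicken = Genome("chicken", synteny=0.5, coverage=0.5)
zebrafish = Genome("zebrafish", synteny=0.25, coverage=0.5)

observations = [
    (mouse, {"GENE_A", "GENE_B"}),
    (chicken, {"GENE_A"}),
    (zebrafish, {"GENE_A"}),
]

targets, scores = predict_targets(["GENE_A", "GENE_B"], observations)
```

In this toy case `GENE_A` stays linked to the CNE in all three genomes and is returned as the predicted target, while `GENE_B` is penalised in the two genomes where its orthologue is no longer nearby.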
Figure 2. Cis-regulatory interactions predicted by the linkage score are experimentally tested in developing zebrafish.
(a) Individual exons of the predicted target gene are depicted in green and those of neighbouring genes in pink. The arrowhead indicates the direction of transcription. The distance in kilobases between the CNE and the promoter of the predicted gene is indicated. (b) The predictions are supported by transgenic analysis in zebrafish. Expression at 48 hpf: NX_hs79: telencephalon (scale bar, 125 μm); NX_hs54: hindbrain, telencephalon (scale bar, 125 μm); NX_hs162: telencephalon, hypothalamus, otic vesicle (scale bar, 125 μm); NX_hs226: hindbrain (scale bar, 200 μm); NX_hs375: midbrain (scale bar, 200 μm).
Figure 3. Neuroanatomical characterization of the element NX_hs54.
This element includes RegHsa0032185 and was characterized in transgenic adult and juvenile zebrafish. (a–d) Immunohistochemical analysis of S100β (grey, radial glial stem cells), GFP (green), and Hu (magenta, neurons) expression in the telencephalon (level in g) in two different transgene integrations (2–1 and 4–1). Radial glial stem cells outline the telencephalic surface (yellow arrows, b) and generate neurons (white arrows, b; ref. 40). In one integration, GFP is expressed by virtually all neurons and their fibres underneath the radial glial cell layer (b). In the other integration (c,d), likely due to positional effects, GFP expression is restricted to individual neuronal clones (grey arrows). (e) In situ hybridization for endogenous bcor mRNA in the adult zebrafish telencephalon (level in g). bcor mRNA is expressed by the newborn neurons (white arrow, f) underlying the first cell layer of radial glial stem cells (yellow arrow, f). The extended GFP expression in transgenic lines is in agreement with GFP protein stability in neurons after endogenous bcor expression is switched off, and/or with the absence of a repressor element. (g) Schematic lateral and dorsal views of an adult zebrafish brain showing the region (red line) examined in a,c,e. (h,j) Immunohistochemical characterization of juvenile GFP expression in NX_hs54#4-1 demonstrates overlap with endogenous bcor expression (l,m). Two anatomical markers, acetylated tubulin (h,i,j,k; magenta) and nuclear staining (i,k; greyscale), allow GFP expression in the telencephalon to be described at two different section levels by confocal microscopy (h anterior to j). At 3 dpf in NX_hs54#4-1 transgenic embryos, GFP is widely expressed at a low level but also shows strong expression in the dorsal and lateral area adjacent to the ventricle (h,j; white arrowheads).
This is similar to endogenous bcor mRNA, which also shows low-level expression throughout the telencephalon and whole brain but has an area of strong expression next to the ventricle (l,m; yellow arrowheads, ventricle boundary marked by red dashed line). Abbreviations: AC, anterior commissure; tel, telencephalon; OB, olfactory bulb. Scale bars: a,c,e, 100 μm; b,f, 60 μm; d, 40 μm; h,i,j,k, 100 μm; l,m, 40 μm.
Figure 4. Motifs shared between RegHsa elements suggest co-regulated genes.
(a) The NEUROD1/NEUROD2 binding site is recurrently found in multiple RegHsa elements linked to nine genes on the human X chromosome. (b) AFF2 and IL1RAPL1 share five overrepresented motifs in their linked RegHsa elements. Each motif logo is indicated together with the number of occurrences (occ.) in the set of RegHsa elements. Motif 3 is similar to the binding site of the KLF12 transcription factor. (c) BCOR and MAGEB10 share four overrepresented motifs in their linked RegHsa elements.