Maria Warnefors, Britta Hartmann, Stefan Thomsen, Claudio R Alonso.
Abstract
Ultraconserved elements (UCEs) are discrete genomic elements conserved across large evolutionary distances. Although UCEs have been linked to multiple facets of mammalian gene regulation, their extreme evolutionary conservation remains largely unexplained. Here, we apply a computational approach to investigate this question in Drosophila, exploring the molecular functions of more than 1,500 UCEs shared across the genomes of 12 Drosophila species. Our data indicate that Drosophila UCEs are hubs for gene regulatory functions and suggest that UCE sequence invariance originates from their combinatorial roles in gene control. We also note that the gene regulatory roles of intronic and intergenic UCEs (iUCEs) are distinct from those found in exonic UCEs (eUCEs). In iUCEs, transcription factor (TF) and epigenetic factor binding data strongly support iUCE roles in transcriptional and epigenetic regulation. In contrast, analyses of eUCEs indicate that they are two orders of magnitude more likely than expected to simultaneously include protein-coding sequence, TF-binding sites, splice sites, and RNA editing sites but have reduced roles in transcriptional or epigenetic regulation. Furthermore, we use a Drosophila cell culture system and transgenic Drosophila embryos to validate the notion of UCE combinatorial regulatory roles using an eUCE within the Hox gene Ultrabithorax and show that its protein-coding region also contains alternative splicing regulatory information. Taken together, our experiments indicate that UCEs emerge as a result of combinatorial gene regulatory roles and highlight common features in mammalian and insect UCEs, implying that similar processes might underlie ultraconservation in diverse animal taxa.
Keywords: Hox genes; UCEs; alternative splicing; epigenetic regulation; organismal development; transcriptional regulation; ultraconserved elements
Year: 2016 PMID: 27247329 PMCID: PMC4989106 DOI: 10.1093/molbev/msw101
Source DB: PubMed Journal: Mol Biol Evol ISSN: 0737-4038 Impact factor: 16.240
Fig. 1. Genomic distribution of Drosophila UCEs. (A) Relative frequencies of UCEs overlapping ncRNAs, exons, intron–exon junctions, introns and intergenic regions, in comparison to reference elements with the same length distribution drawn from the entire genome. Of the 1,516 UCEs we identified, 186 overlapped with exons of protein-coding genes, 393 were purely intronic, 919 were located in intergenic regions and 18 overlapped with annotated ncRNAs, primarily tRNAs. (B) Frequencies of Drosophila UCEs of various lengths. (C) Chromosomal location of the Drosophila UCEs in the D. melanogaster and D. pseudoobscura genomes. For each chromosome, coordinates increase from left to right (D. melanogaster) or from bottom to top (D. pseudoobscura). UCE clusters with at least 10 members are indicated by red circles. (D) Validation of the 466 UCEs on D. melanogaster chromosome arm 3R in the 15-way alignment available from the dm3 release of the UCSC Genome Browser. The majority of the UCEs from our pipeline were immediately detected in the alignment (“intact UCEs”), while others could be retrieved after taking into account ambiguous gap placements, sequences spanning two or more alignment blocks and other alignment artifacts. See main text for further details.
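The comparison in (A) relies on reference elements whose lengths match those of the UCEs. The record does not describe how these were drawn, so the following is only a minimal sketch of one plausible approach: uniform random placement of length-matched intervals. The function name `sample_reference_elements`, the chromosome names and all sizes are hypothetical and are not taken from the study.

```python
import random

def sample_reference_elements(element_lengths, chrom_sizes, seed=0):
    """Draw one random interval per input length (hypothetical helper).

    Assumption: chromosomes are picked in proportion to their size and the
    start position is uniform, so every valid position is roughly equally
    likely. Returns (chrom, start, end) tuples with half-open coordinates.
    """
    rng = random.Random(seed)
    chroms = list(chrom_sizes)
    weights = [chrom_sizes[c] for c in chroms]
    longest = max(weights)
    reference = []
    for length in element_lengths:
        if length > longest:
            raise ValueError("element longer than every chromosome")
        # Redraw until the chosen chromosome can hold the element.
        while True:
            chrom = rng.choices(chroms, weights=weights, k=1)[0]
            if chrom_sizes[chrom] >= length:
                break
        start = rng.randrange(chrom_sizes[chrom] - length + 1)
        reference.append((chrom, start, start + length))
    return reference

# Toy input: three element lengths and two made-up chromosomes.
toy_sizes = {"chrA": 1000, "chrB": 500}
toy_lengths = [50, 120, 75]
toy_reference = sample_reference_elements(toy_lengths, toy_sizes, seed=1)
```

Because each sampled interval inherits the length of one input element, the reference set has exactly the same length distribution as the input, which is the property the caption describes.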
The 15 Largest UCE Clusters in the D. melanogaster Genome.
| Cluster ID | Total UCEs | Intergenic | Intronic | Junction | Exonic | ncRNA | |
|---|---|---|---|---|---|---|---|
| cluster_186 | 21 | 15 | 6 | – | – | – | |
| cluster_247 | 14 | 14 | – | – | – | – | |
| cluster_47 | 11 | 11 | – | – | – | – | |
| cluster_147 | 11 | – | 10 | – | 1 | – | |
| cluster_180 | 11 | 2 | 7 | – | 2 | – | |
| cluster_194 | 11 | 10 | 1 | – | – | – | |
| cluster_235 | 11 | – | – | 10 | 1 | – | |
| cluster_253 | 11 | 11 | – | – | – | – | |
| cluster_5 | 10 | 6 | 4 | – | – | – | |
| cluster_10 | 10 | 10 | – | – | – | – | |
| cluster_94 | 10 | 10 | – | – | – | – | |
| cluster_157 | 10 | 10 | – | – | – | – | |
| cluster_187 | 10 | 10 | – | – | – | – | |
| cluster_195 | 10 | 10 | – | – | – | – | |
| cluster_208 | 10 | – | 10 | – | – | – | |
a Underlined genes are conserved across all 12 investigated Drosophila species and associated with the same cluster in D. pseudoobscura and D. virilis.
Fig. 2. Involvement of Drosophila UCEs in transcriptional regulation. (A) Significant (P < 0.05) enrichment or depletion of 34 TFs in UCEs relative to reference elements in early Drosophila development. A χ²-test was applied to each factor and UCE type, followed by Benjamini–Hochberg correction for multiple tests. The datasets used to generate this figure are listed in supplementary table 5, Supplementary Material online. Bab1, Bric a brac 1; Cad, Caudal; Chinmo, Chronologically inappropriate morphogenesis; Cnc, Cap-n-collar; CTCF, CTCF; D, Dichaete; Disco, Disconnected; Dll, Distal-less; En, Engrailed; Fru, Fruitless; Ftz-f1, Ftz transcription factor 1; GATAe, GATAe; H, Hairy; Hkb, Huckebein; Hr46, Hormone receptor-like in 46; Hth, Homothorax; Inv, Invected; Jumu, Jumeau; Kn, Knot; Kr, Kruppel; Lola, Longitudinals lacking; Pan, Pangolin; Pcl, Polycomblike; Prd, Paired; Run, Runt; Sc, Scute; Sens, Senseless; Sin3A, Sin3A; Stat92E, Signal-transducer and activator of transcription protein at 92E; Su(H), Suppressor of Hairless; Ttk, Tramtrack; Ubx, Ultrabithorax; Usp, Ultraspiracle; Zfh1, Zn finger homeodomain 1. (B) Enrichment and depletion of four PcG proteins and one Trithorax-group protein. Analyses were performed as in (A). Ash1, absent, small, or homeotic discs 1; E(z), Enhancer of zeste; Pc, Polycomb; Psc, Posterior sex combs; Sce, Sex combs extra. (C) Enrichment and depletion of Cad binding at five points of Drosophila development. Analyses were performed as in (A). Double asterisks indicate 0.001 < P < 0.01 and triple asterisks P < 0.001.
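The testing procedure named in (A), a χ²-test per factor followed by Benjamini–Hochberg correction, can be sketched as follows. This is an illustrative sketch only, not the authors' code: it assumes a 2×2 contingency table (bound/unbound × UCE/reference) per factor and no continuity correction, and the function names and all counts are hypothetical.

```python
import math

def chi2_2x2(a, b, c, d):
    """Pearson chi-square test on the 2x2 table [[a, b], [c, d]].

    Assumption: 1 degree of freedom and no continuity correction.
    For 1 d.f. the upper-tail p-value equals erfc(sqrt(stat / 2)).
    """
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("degenerate table: a row or column sums to zero")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    return stat, math.erfc(math.sqrt(stat / 2.0))

def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # stay monotone (each is capped by the one ranked above it).
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical counts per factor:
# (UCEs bound, UCEs unbound, reference bound, reference unbound)
toy_counts = {
    "factor_A": (20, 80, 10, 190),
    "factor_B": (10, 10, 10, 10),
}
raw = {name: chi2_2x2(*table)[1] for name, table in toy_counts.items()}
corrected = dict(zip(raw, benjamini_hochberg(list(raw.values()))))
```

A factor would then be reported as significantly enriched or depleted when its corrected value falls below the 0.05 threshold given in the caption.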
Fig. 3. Overlapping functions of eUCEs. (A) Proportion of UCEs and reference elements that overlap with between zero and four types of functional sequences. The investigated functions were protein-coding capacity, splicing (overlap with a splice site), RNA editing (from the RADAR database; Ramaswami and Li 2014) and TF binding (based on modENCODE data; Celniker et al. 2009). (B) Venn diagram of the number of UCEs that overlap with different combinations of functions. The four function types are the same as in (A). The number of UCEs in each category is displayed. Colors indicate enrichment or depletion of UCEs relative to the proportion of reference elements that fall within each category.
Fig. 4. Analysis of an ultraconserved exon within the Ubx gene. (A) The Ubx gene (purple) contains 12 UCEs (red triangles) within the transcribed region. Coding sequences are shown as thick boxes and untranslated regions (UTRs) as narrow boxes. The gene is transcribed from left to right. One of the UCEs overlaps with the alternatively spliced mI exon. An alignment of the mI-UCE sequence from 12 Drosophila species and three more distantly related fly species is shown. Substitutions relative to the Drosophila sequence are shown in orange and insertions as a vertical line, with the number of inserted bases added above the sequence. The light blue box highlights the mI exon, while the surrounding sequences are intronic. The amino acid sequence (corresponding to the Drosophila nucleotide sequences) is shown above the alignment. For positions with observed substitutions, it is noted below the alignment whether these are synonymous (s) or nonsynonymous (n). (B) To explore the roles of microexon mI ultraconservation in Ubx splicing control, we engineered a series of Ubx minigene constructs so that they included wild type (wt) or mutated versions of microexon mI (mutA) in which the protein coding potential of the gene was maintained while the ultraconserved nucleotide sequence of mI was disrupted by means of synonymous mutations (red). Approximate positions of splicing primers Ubx_E1F (forward) and Ubx_3′U (reverse) and expression primers expF/R (forward/reverse) are indicated. (C) Experiments in Drosophila Schneider 2 (S2) cells. Semi-quantitative RT-PCR analysis of wild type and mutA Ubx minigenes expressed in S2 cells reveals distinct patterns of Ubx mRNA splicing, in which the mutA minigene construct shows a marked reduction of Ubx Ia isoform production. Ubx.AS refers to the signal detected with primers Ubx_E1F and Ubx_3′R (see B), which detects all alternative splicing variants of the gene; Ubx.exp denotes the signal amplified with primers expF/R (see B), which are positioned in the 3′ exon, a constitutive segment of Ubx mRNAs. (D) Expression of Ubx wild type and Ubx mutA minigenes in the Drosophila embryo. We produced HA-tagged UAS versions of wt and mutA Ubx minigenes (see A) and generated independent transgenic UAS-lines with insertions in identical chromosomal loci by means of site-specific recombination. The resulting UAS-Ubx lines (wt and mutA) were crossed with the elav-gal4 (elav) driver to express Ubx transgenes selectively within the developing embryonic nervous system. Note that expression patterns obtained with anti-HA antibodies in the embryonic CNS and PNS were identical across genotypes, confirming comparable gene expression conditions. (E) Semi-quantitative RT-PCR analysis of wt and mutA Ubx minigene expression in the embryonic Drosophila nervous system reveals effects of mI on Ubx splicing control. In line with the results obtained in S2 cells (see C), we observed that the mutA minigene produced a reduced amount of Ubx Ia isoform when compared with its wild-type counterpart (see C for definition of labels Ubx.AS and Ubx.exp and text for further details). Statistical analyses: **P < 0.01 and *P < 0.05, obtained in a one-tailed t-test (P-value for S2 cells = 0.0035 (**); P-value for embryos = 0.0318 (*)). Error bars indicate standard error of the mean. HA, haemagglutinin tag; B, Ubx B-element; mI, microexon I; mII, microexon II.