Slavica Dimitrieva, Philipp Bucher.
Abstract
MOTIVATION: Genomic context analysis, also known as phylogenetic profiling, is widely used to infer functional interactions between proteins but is rarely applied to non-coding cis-regulatory DNA elements. We asked whether this approach could provide insights into ultraconserved non-coding elements (UCNEs). These elements are organized in large clusters, so-called gene regulatory blocks (GRBs), around key developmental genes. Their molecular functions and the reasons for their high degree of conservation remain enigmatic.
Year: 2012 PMID: 22962458 PMCID: PMC3436827 DOI: 10.1093/bioinformatics/bts400
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. Alternative models of UCNE action and corresponding retention patterns after whole-genome duplication (WGD). Grey rectangles represent UCNEs (assumed to be remote control elements); black circles represent the promoter of the target gene (not assumed to be ultraconserved). (A) Stand-alone model: each UCNE drives expression of the target gene in one particular tissue, independently of the other UCNEs. (B) Cooperative model: the simultaneous activity of at least two UCNEs is required for target gene expression. Different combinations of UCNEs drive expression in different tissues. (C) Reciprocal retention pattern after WGD expected under the stand-alone model: UCNEs are randomly distributed over the two daughter genes. (D) ‘Winner-takes-all’ retention pattern expected under the cooperative model: UCNEs need to be retained by the same daughter gene in order to ensure expression in all tissues.
Fig. 2. Schematic representation of a typical winner-takes-all example: one of the two orthologs of the OLA1 gene in Fugu retains many UCNEs, while the other ortholog retains none (introns are not drawn to scale).
Classification of retention patterns of intronic/UTR UCNEs in fish orthologs for the top UCNE-enriched genes
| Gene | #UCNEs | | | Medaka | Stickleback | Zebrafish | Classification |
|---|---|---|---|---|---|---|---|
| NPAS3 | 53 | 6–0–0 | n/a2 | n/a1 | 1–0–1 | n/a2 | winner: 1, recipr: 1 |
| DACH1 | 39 | 22–1–0 | 21–2–0 | 16–3–0 | 20–1–0 | 18–6–0 | winner: 4, concord: 1 |
| FOXP2 | 38 | 29–2–0 | n/a2 | n/a2 | 27–1–1 | n/a2 | winner: 2 |
| EBF3 | 38 | n/a2 | 27–1–0 | 24–4–1 | 28–0–0 | 31–0–0 | winner: 4 |
| FOXP1 | 38 | 25–0–0 | 19–0–0 | 22–0–0 | 24–0–0 | 24–0–0 | winner: 5 |
| AUTS2 | 34 | n/a2 | n/a2 | n/a2 | n/a2 | 15–4–0 | concord: 1 |
| ZEB2 | 27 | 22–0–0 | 15–0–0 | 6–1–2 | 19–0–0 | 8–5–2 | winner: 3, concord: 1, recipr: 1 |
| ZFPM2 | 25 | n/a2 | n/a2 | n/a2 | n/a2 | 14–7–0 | concord: 1 |
| SOX6 | 22 | 14–0–0 | 13–2–0 | 14–1–0 | 12–0–0 | n/a2 | winner: 4 |
| ESRRG | 22 | 7–0–0 | 7–0–0 | 8–0–0 | 5–0–0 | 17–0–0 | winner: 5 |
| EBF1 | 21 | 2–1–2 | 4–0–2 | 3–2–0 | 3–1–0 | 5–5–0 | concord: 3, recipr: 2 |
| PBX3 | 21 | n/a2 | n/a2 | n/a2 | n/a2 | 16–2–0 | winner: 1 |
| MEIS2 | 18 | n/a2 | 15–0–0 | 13–3–0 | 14–1–0 | 7–3–0 | winner: 3, concord: 1 |
| OLA1 | 16 | 14–0–0 | 13–1–0 | 14–0–0 | 11–0–0 | n/a2 | winner: 4 |
| EHBP1 | 15 | 9–0–0 | n/a2 | n/a2 | 7–0–0 | n/a2 | winner: 2 |
| DACH2 | 12 | n/a3 | n/a1 | n/a1 | n/a3 | 9–0–0 | winner: 1 |
| MEIS1 | 12 | 8–0–0 | n/a1 | n/a1 | n/a2 | 9–0–0 | winner: 2 |
| NBEA | 12 | 8–0–1 | 8–0–1 | 8–0–1 | 7–0–0 | 5–0–0 | winner: 5 |
| POLA1 | 12 | n/a2 | n/a2 | n/a2 | 5–0–0 | n/a2 | winner: 1 |
| SATB1 | 10 | 5–0–0 | 11–0–0 | 7–0–0 | 5–0–0 | 7–0–0 | winner: 5 |
An a–b–c pattern stands for: a, the number of UCNEs in the ‘major ortholog’ only; b, the number of UCNEs in both orthologs; c, the number of UCNEs in the ‘minor ortholog’ only. The classification column gives the number of cases in which the corresponding pattern is observed. Notation: n/a1, no orthologous gene present in the corresponding fish; n/a2, only one ortholog present; n/a3, no UCNEs retained in the fish orthologous genes.
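As an illustration of the a–b–c notation, the sketch below computes the triple from two sets of UCNE identifiers, one per duplicated ortholog. The function names `abc_pattern` and `format_pattern`, the identifiers, and the tie-breaking rule (the first copy is treated as ‘major’ when both retain the same number) are assumptions made for this example and are not taken from the paper.

```python
def abc_pattern(copy1, copy2):
    """Return (a, b, c) for two collections of UCNE identifiers.

    a: UCNEs only in the 'major' ortholog (the copy retaining more UCNEs)
    b: UCNEs in both orthologs
    c: UCNEs only in the 'minor' ortholog
    """
    copy1, copy2 = set(copy1), set(copy2)
    # Assumption: 'major' is the copy with more UCNEs; ties go to copy1.
    major, minor = (copy1, copy2) if len(copy1) >= len(copy2) else (copy2, copy1)
    return len(major - minor), len(major & minor), len(minor - major)


def format_pattern(pattern):
    """Render a triple in the table's a–b–c style (joined by en dashes)."""
    return "\u2013".join(str(n) for n in pattern)


# Hypothetical identifiers, for illustration only.
copy_a = {"u1", "u2", "u3", "u4"}
copy_b = {"u4", "u5"}
print(format_pattern(abc_pattern(copy_a, copy_b)))  # 3–1–1
```

A pattern such as 14–0–0 then corresponds to one copy holding every retained UCNE, while a pattern with a large third number means the retained UCNEs are split between the two copies.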
Fig. 3. (A) Distribution of the number of conserved UCNEs retained by the ‘major’ orthologous gene in zebrafish; (B) expected distribution under a random retention model based on shuffled data. Error bars represent the standard deviation computed from 500 simulations.
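The caption does not spell out the shuffling procedure. One simple way to picture a random retention null model is sketched below: each retained UCNE of a gene is assigned to one of the two duplicated copies with equal probability, the fraction held by the ‘major’ copy is recorded, and the per-bin histogram counts are averaged over repeated simulations, with the per-bin standard deviation playing the role of the error bars. The 50/50 assignment, the binning, the function names and the input counts are all assumptions for illustration; only the figure of 500 simulations comes from the caption.

```python
import random
import statistics


def major_fraction(n_ucnes, rng):
    """Assign each UCNE to one of two copies at random; return the larger share."""
    in_first = sum(rng.random() < 0.5 for _ in range(n_ucnes))
    return max(in_first, n_ucnes - in_first) / n_ucnes


def simulate_histograms(ucne_counts, n_sims=500, n_bins=5, seed=0):
    """Per-bin mean and standard deviation of major-copy fractions.

    Fractions lie in [0.5, 1.0]; that range is split into n_bins equal bins.
    """
    rng = random.Random(seed)
    per_sim = []
    for _ in range(n_sims):
        hist = [0] * n_bins
        for n in ucne_counts:
            frac = major_fraction(n, rng)
            # Map [0.5, 1.0] onto bin indices; a fraction of 1.0 goes in the last bin.
            idx = min(int((frac - 0.5) / 0.5 * n_bins), n_bins - 1)
            hist[idx] += 1
        per_sim.append(hist)
    means = [statistics.mean(col) for col in zip(*per_sim)]
    sds = [statistics.stdev(col) for col in zip(*per_sim)]
    return means, sds


# Hypothetical per-gene counts of retained UCNEs, not data from the paper.
counts = [30, 25, 20, 15, 12, 10, 8, 6]
means, sds = simulate_histograms(counts)
width = 0.5 / len(means)
for i, (m, s) in enumerate(zip(means, sds)):
    lo = 0.5 + i * width
    print(f"{lo:.1f}-{lo + width:.1f}: mean={m:.2f} sd={s:.2f}")
```

Under this kind of null model most genes fall into the bins near 0.5, so an observed distribution concentrated near 1.0 would point away from random retention.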
Classification of retention patterns of UCNE clusters in fish genomes for the top 25 clusters
| #UCNEs | Associated genes | | | Medaka | Stickleback | Zebrafish | Classification |
|---|---|---|---|---|---|---|---|
| 134 | | 67–0–0 | 47–0–0 | 36–1–3 | 56–0–0 | 24–6–15 | winner: 4, recipr: 1 |
| 96 | CCNE1; | 37–11–4 | 31–11–4 | 39–15–3 | 13–8–18 | 59–4–0 | winner: 1, concord: 3, recipr: 1 |
| 96 | | 48–0–0 | 48–0–0 | 45–2–0 | 48–0–0 | 34–2–26 | winner: 4, recipr: 1 |
| 92 | | 19–6–1 | 24–7–0 | 23–6–1 | 18–4–1 | 47–3–0 | winner: 1, concord: 4 |
| 83 | | 57–1–0 | 42–0–0 | 59–0–0 | 53–0–1 | 43–0–24 | winner: 4, recipr: 1 |
| 79 | | 36–0–0 | 35–0–0 | 37–0–0 | 30–0–0 | 58–0–0 | winner: 5 |
| 73 | | 35–1–0 | 32–1–0 | 29–1–0 | 34–1–0 | 29–6–0 | winner: 5 |
| 72 | | 18–0–0 | 19–0–0 | 22–1–1 | 10–0–0 | 49–0–1 | winner: 5 |
| 71 | AKAP6; EGLN3; NPAS3; SPTSSA | 7–0–2 | 9–0–0 | 1–0–0 | 6–0–0 | 39–0–0 | winner: 4, recipr: 1 |
| 67 | | 51–0–0 | 46–2–0 | 46–4–1 | 41–0–0 | 25–3–0 | winner: 5 |
| 67 | ANKRD32; | 36–0–0 | 34–0–0 | 39–0–0 | 32–0–0 | 40–0–0 | winner: 5 |
| 60 | | 34–3–0 | 37–0–0 | 41–0–0 | 22–0–8 | 34–3–0 | winner: 4, recipr: 1 |
| 60 | | 15–22–8 | 17–20–6 | 16–22–8 | 16–22–4 | 40–0–4 | winner: 1, concord: 2, recipr: 2 |
| 59 | | 12–12–6 | 14–11–2 | 11–11–6 | 9–8–12 | 28–0–0 | winner: 1, concord: 1, recipr: 3 |
| 57 | | 23–0–2 | 23–1–0 | 26–1–0 | 24–0–0 | 35–0–0 | winner: 5 |
| 49 | | 30–0–0 | 23–0–0 | 28–0–0 | 30–0–0 | 30–0–0 | winner: 5 |
| 45 | C1D; ETAA1; MEIS1; PNO1; PPP3R1; SPRED2; WDR92 | 22–0–0 | 14–0–0 | / | 21–0–0 | 25–0–0 | winner: 4 |
| 44 | MRPS9; | 9–0–0 | 17–1–0 | 2–0–2 | 3–0–0 | 23–1–0 | winner: 4, recipr: 1 |
| 44 | | 8–13–3 | 10–9–4 | 13–10–3 | 10–11–3 | 17–2–0 | winner: 1, concord: 4 |
| 43 | | 24–0–0 | 26–0–0 | 25–0–0 | 24–0–0 | 17–0–0 | winner: 5 |
| 42 | | 8–1–0 | 8–0–0 | 8–1–0 | 7–1–0 | 9–0–0 | winner: 5 |
| 42 | | 32–0–0 | 34–0–0 | 35–0–0 | 30–0–0 | 27–1–3 | winner: 5 |
| 41 | C8orf83; | 13–0–0 | 11–0–0 | 12–0–0 | 9–0–0 | 17–0–0 | winner: 5 |
| 40 | | 16–1–0 | 15–0–0 | 17–1–0 | 14–0–0 | 18–3–0 | winner: 5 |
| 40 | | 21–0–0 | 19–0–0 | 19–0–0 | 15–0–0 | 5–0–1 | winner: 5 |
The potential target genes of a cluster are marked in bold. An a–b–c pattern stands for: a, the number of UCNEs in the ‘major’ cluster only; b, the number of UCNEs present in both the ‘major’ and ‘minor(s)’ clusters; c, the number of UCNEs present in the ‘minor(s)’ cluster only. The classification column gives the number of cases in which the corresponding pattern is observed.
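To total the classification counts across rows, the cells can be parsed mechanically. The sketch below is one way to do so; the function name `parse_classification` and the mapping of abbreviated labels to canonical names by their first letter are assumptions made for this example, chosen so that the different spellings used in the tables (winner/win/w, concord/conc/c, recipr/rec/r) are counted together.

```python
import re
from collections import Counter

# Assumption: a label's first letter identifies the pattern, which covers
# the spellings seen in the tables (winner/win/w, concord/conc/c, recipr/rec/r).
CANONICAL = {"w": "winner", "c": "concord", "r": "recipr"}


def parse_classification(cell):
    """Turn a cell such as 'winner: 4, recipr:1' into a Counter of pattern counts."""
    tally = Counter()
    for label, count in re.findall(r"([A-Za-z]+)\s*:\s*(\d+)", cell):
        tally[CANONICAL[label[0].lower()]] += int(count)
    return tally


# Three classification cells copied from the cluster table above.
cells = ["winner:4, recipr:1", "winner: 1, concord: 3, recipr:1", "winner: 5"]
total = sum((parse_classification(c) for c in cells), Counter())
print(dict(total))
```

A label starting with any other letter raises a `KeyError`, which is deliberate here: an unexpected spelling should be inspected rather than silently dropped.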
Fig. 4. Flowchart of the analysis of UCNE clusters.
Fig. 5. Retention patterns of genes and UCNEs of the ZEB2 cluster in five fish species. Genes are shown as colored boxes above or below the chromosomes according to their orientation. UCNEs are shown as vertical segments. Line breaks indicate discontinuities in the fish genome assemblies. Question marks indicate line breaks that may not be real. The break in the major zebrafish cluster corresponds to a local inversion potentially resulting from an assembly error.