Dimitris Polychronopoulos, James W D King, Alexander J Nash, Ge Tan, Boris Lenhard.
Abstract
Comparative genomics has revealed a class of non-protein-coding genomic sequences that display an extraordinary degree of conservation between two or more organisms, regularly exceeding that found within protein-coding exons. These elements, collectively referred to as conserved non-coding elements (CNEs), are non-randomly distributed across chromosomes and tend to cluster in the vicinity of genes with regulatory roles in multicellular development and differentiation. CNEs are organized into functional ensembles called genomic regulatory blocks: dense clusters of elements that collectively coordinate the expression of shared target genes, and whose span in many cases coincides with topologically associated domains. CNEs display sequence properties that set them apart from other sequences under constraint, and have recently been proposed as useful markers for the reconstruction of the evolutionary history of organisms. Disruption of several of these elements is known to contribute to cancer and to diseases linked with development. The emergence, evolutionary dynamics and functions of CNEs still remain poorly understood, and new approaches are required to enable comprehensive CNE identification and characterization. Here, we review current knowledge and identify challenges that need to be tackled to resolve the impasse in understanding extreme non-coding conservation.
Year: 2017 PMID: 29121339 PMCID: PMC5728398 DOI: 10.1093/nar/gkx1074
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. The phenomenon of extreme non-coding conservation. The CNE shown here (Human–Tetraodon CNE, on the left) is more conserved than a protein-coding sequence (HIST1H4D, on the right). The multiple sequence alignment of 46 vertebrate species and the corresponding phyloP scores illustrate the evolutionary conservation of the CNE and the protein-coding sequence. PhyloP scores range from negative (red) to positive (blue), indicating positive and negative selective pressure, respectively. The 46-way alignment was downloaded from the UCSC genome browser and spans ∼600 million years of evolution since the last common ancestor of humans and lampreys.
Conserved non-coding elements (CNE) resources
| Abbreviation | Description | Identification | Reference |
|---|---|---|---|
| ANCORA | Atlas of non-coding conserved regions in animals | ≥70% seq. id. over 30 or 50 nt in different metazoa | ( |
| CEGA | Conserved elements from genomic alignments | threshold-free phylogenetic modeling | ( |
| cneViewer | Conserved non-encoding element viewer | user-specified | ( |
| CONDOR | COnserved Non-coDing Orthologous Regions database | multiple and multi-pairwise alignments of orthologous regions between | ( |
| UCbase | Ultraconserved elements database | 100% seq. id. over 200 nt between human and mouse | ( |
| UCNEbase | Ultraconserved CNEs | ≥95% seq. id. over 200 nt in the human and chicken genomes; coding regions are removed | ( |
| VISTA | ViSualization tool for alignment | extremely conserved sequences between human and rodents that have been tested | ( |
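Several of the resources above define elements by a percent-identity threshold over a fixed window (for example, ≥70% over 30 or 50 nt, or 100% over 200 nt). The following is a minimal sketch of that idea only, assuming a gapped pairwise alignment supplied as two equal-length strings; the function name `conserved_windows` and its parameters are hypothetical, and this is not the pipeline used by any of the listed databases.

```python
from typing import List, Tuple


def conserved_windows(
    aln_a: str, aln_b: str, window: int, min_identity_pct: int
) -> List[Tuple[int, int]]:
    """Return merged [start, end) alignment-column ranges covered by at least
    one window whose identity meets the threshold (illustrative only)."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    n = len(aln_a)
    if window <= 0 or n < window:
        return []

    # 1 where both rows carry the same non-gap base, otherwise 0.
    match = [
        int(a == b and a != "-")
        for a, b in zip(aln_a.upper(), aln_b.upper())
    ]

    regions: List[Tuple[int, int]] = []
    count = sum(match[:window])  # matches in the first window
    for start in range(n - window + 1):
        if start > 0:
            # Slide the window by one column.
            count += match[start + window - 1] - match[start - 1]
        # Integer comparison avoids floating-point rounding at the threshold.
        if count * 100 >= min_identity_pct * window:
            end = start + window
            if regions and start <= regions[-1][1]:
                # Overlapping or adjacent to the previous region: extend it.
                regions[-1] = (regions[-1][0], end)
            else:
                regions.append((start, end))
    return regions


if __name__ == "__main__":
    a = "AAAAATTTTT"
    b = "AAAAACCCCC"
    print(conserved_windows(a, b, window=5, min_identity_pct=100))
    print(conserved_windows(a, b, window=5, min_identity_pct=80))
```

Real pipelines additionally filter out coding regions and repeats and work from whole-genome alignments, none of which is modelled here.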
Figure 2. (A) The GRB model. The regulatory input for one or more target genes (red) is provided by long-range interactions (dashed lines) between CNEs (green) and the target gene’s promoter. Bystander genes (gray) often contain CNEs in their introns but remain unresponsive to CNE-mediated regulation. For more details, see text. (B) CNE clustering across chromosome 15 and at the MEIS2 locus (shown zoomed in below the whole chromosome track). The human MEIS2 GRB (brown) is a 3.3 Mb region defined by an array of conserved non-coding elements. MEIS2 (red) encodes a transcription factor involved in lens development through regulation of PAX6. Regardless of species used in the pairwise comparison against human, CNEs (black) clearly mark the boundaries of this GRB. The boundaries of the topologically associated domain (TAD) covering MEIS2 (TAD in blue; H1-ESC TAD calls are generated using HMM_calls) and the GRB spanning this locus are highly concordant.
Figure 3. Sequence heatmaps showing dinucleotide content within and outside vertebrate CNEs. Plots are generated using the heatmaps package (https://bioconductor.org/packages/release/bioc/html/heatmaps.html). CNEs that show sequence identity >98% over >50 nt between human and chicken are identified using CNEr (https://bioconductor.org/packages/release/bioc/html/CNEr.html). Sequences are ordered from shortest to longest on the Y-axis (aligned on the center), and the X-axis shows distance in nucleotides from the center of each CNE.
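The layout described in this caption (rows sorted by length, aligned on the center, one column per distance from the center) can be sketched as a simple occurrence matrix. This is an illustrative Python sketch, not the Bioconductor heatmaps package, which is written in R; the function name `centered_dinucleotide_matrix` and the `half_width` parameter are assumptions introduced here.

```python
from typing import List, Optional


def centered_dinucleotide_matrix(
    seqs: List[str], dinuc: str, half_width: int
) -> List[List[Optional[int]]]:
    """Build one row per sequence, ordered shortest to longest.

    Column j corresponds to offset (j - half_width) from the sequence center.
    A cell is 1 if `dinuc` starts at that position, 0 if it does not, and
    None where the position falls outside the sequence.
    """
    dinuc = dinuc.upper()
    rows: List[List[Optional[int]]] = []
    for seq in sorted(seqs, key=len):
        seq = seq.upper()
        center = len(seq) // 2
        row: List[Optional[int]] = []
        for offset in range(-half_width, half_width):
            pos = center + offset
            if pos >= 0 and pos + 2 <= len(seq):
                row.append(int(seq[pos:pos + 2] == dinuc))
            else:
                row.append(None)  # no sequence available at this offset
        rows.append(row)
    return rows


if __name__ == "__main__":
    for row in centered_dinucleotide_matrix(["GGTAGG", "TATA"], "TA", 2):
        print(row)
```

Passing sequences that include flanking regions on both sides would give columns both within and outside each element; a plotting library could then render the matrix as a heatmap.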
Figure 4. Methodology describing how CNEs are utilized as markers for constructing phylogenies (adapted and modified with permission from Brant Faircloth and John McCormack). Panels (A-G) describe the steps for constructing phylogenies, from CNE sequences through to species trees.