
UCNEbase--a database of ultraconserved non-coding elements and genomic regulatory blocks.

Abstract

UCNEbase (http://ccg.vital-it.ch/UCNEbase) is a free, web-accessible information resource on the evolution and genomic organization of ultra-conserved non-coding elements (UCNEs). It currently covers 4351 such elements in 18 different species. The majority of UCNEs are supposed to be transcriptional regulators of key developmental genes. As most of them occur as clusters near potential target genes, the database is organized along two hierarchical levels: individual UCNEs and ultra-conserved genomic regulatory blocks (UGRBs). UCNEbase introduces a coherent nomenclature for UCNEs reflecting their respective associations with likely target genes. Orthologous and paralogous UCNEs share components of their names and are systematically cross-linked. Detailed synteny maps between the human and other genomes are provided for all UGRBs. UCNEbase is managed by a relational database system and can be accessed by a variety of web-based query pages. As it relies on the UCSC genome browser as visualization platform, a large part of its data content is also available as browser viewable custom track files. UCNEbase is potentially useful to any computational, experimental or evolutionary biologist interested in conserved non-coding DNA elements in vertebrates.

Substances:
DNA, Intergenic

Year: 2012 PMID: 23193254 PMCID: PMC3531063 DOI: 10.1093/nar/gks1092

Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971

INTRODUCTION

Several comparative studies on whole vertebrate genomes uncovered non-coding sequences that exhibit extremely high conservation. DNA regions of perfect identity shared between human, mouse and rat with length >200 bp have been called ultraconserved elements (UCEs) (1). Similar DNA regions have been referred to by others as conserved non-genic regions (2), conserved non-coding elements (CNE) (3) or highly conserved non-coding elements (HCNE) (4). The exact number of such elements depends on the criteria used for their identification; see (5) for a review. The multitude of different names may appear unfortunate and confusing. Nevertheless, we decided to use yet another term, ultraconserved non-coding elements (UCNE), to make clear that our resource is restricted to the most highly conserved class of such elements and excludes protein-coding regions. Although the strong conservation of these sequences points to an important biological role, no molecular mechanism that would require such a high degree of conservation is currently known. Experimental studies in transgenic animals suggest that most of these sequences act as tissue-specific enhancers during developmental processes (6–8). A striking property of vertebrate UCNEs is that they cluster in genomic regions containing genes coding for transcription factors and developmental regulators (so-called trans-dev genes) (4,9). These clusters show conserved synteny between distant genomes and are called ‘genomic regulatory blocks’ (GRBs) (10). The experimental characterization of UCNEs faces limitations. For instance, it is currently not possible to study molecular interactions of UCNEs within single cells of a developing organism. In the absence of adequate experimental techniques, comparative genomics approaches represent a promising alternative to gain some clues about their function. Understanding how UCNEs have evolved in the past may tell us something about what they do today.
Here, we present UCNEbase, a comprehensive resource on the genomic organization and evolution of vertebrate UCNEs and ultraconserved genomic regulatory blocks (UGRBs). We will first explain the procedures by which the database was built, before we describe its contents and the user interfaces. A comparison of UCNEbase with already existing resources on CNEs will be presented in the ‘Discussion’ section.

DATABASE CONCEPT AND DATA ACQUISITION

UCNEbase provides information on the evolution and genomic organization of 4351 UCNEs in multiple vertebrate species. Around half of these elements are located within intergenic regions (2139) and the rest are located within non-coding parts of genes: introns (1713) and UTRs (499). As most UCNEs occur as arrays near key trans-dev genes, our resource is organized along two hierarchical levels: individual UCNEs and UGRBs. The information provided by UCNEbase is generated by a combination of automatic procedures and manual curation steps. The methodology used for the creation of UCNEbase is schematically shown in Figure 1. A brief description of each step follows below. Technical details are provided in the Supplementary Methods.

Figure 1.

Schematic representation of the methodology used for the creation of UCNEbase.

Definition of UCNEs

We defined UCNEs as non-coding human DNA regions that exhibit ≥95% sequence identity between human and chicken and are >200 bp. The sequence identity threshold corresponds to a base substitution rate of ∼1% per 100 million years. We have previously shown that sequences fulfilling such stringent criteria exist only in vertebrates (11). To compile a list of human UCNEs, we scanned whole-genome alignments between human and chicken downloaded from UCSC (12) with a sliding window technique. Human and chicken were selected as reference species for two main reasons: (i) their evolutionary distance provides high specificity in detecting functional elements (13) and (ii) both genome assemblies are of high quality and thus suitable for identifying large syntenic regions. From the initially extracted set of ultra-conserved sequence elements, we eliminated coding regions and a few human repetitive sequences aligning with the same chicken sequence. The remaining 4351 sequences composed our reference set of UCNEs. Each element of this set was then classified as either ‘intergenic’, ‘intronic’ or ‘UTR associated’ according to the human gene annotation from RefSeq. The length of the UCNEs identified in this way ranged from 200 to 1419 bp with a mean of 325 bp and a median of 283 bp. The total length is 1.4 Mb. The criteria used to identify UCNEs are admittedly arbitrary, like any other criteria used before. In particular, there is no objective boundary between UCNEs and HCNEs. However, the primary goal of UCNEbase is not to be a comprehensive resource. It should rather be considered an exploratory tool to study the general features of UCNEs with the aid of a stringently selected collection of prominent examples.
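The sliding window scan can be illustrated with a minimal sketch. It assumes a pairwise alignment given as two equal-length gapped strings and measures identity over alignment columns; the function name, the default parameters and the merging of overlapping windows are illustrative assumptions, not the procedure documented in the Supplementary Methods.

```python
def ultraconserved_windows(human, chicken, window=200, min_identity=0.95):
    """Return merged (start, end) column intervals covered by at least one
    window whose fraction of identical, ungapped columns is >= min_identity.

    Hypothetical helper: identity is computed over alignment columns, which
    is an assumption about how the threshold is applied.
    """
    if len(human) != len(chicken):
        raise ValueError("aligned sequences must have equal length")
    n = len(human)
    if n < window:
        return []
    # 1 where both rows carry the same base, 0 for mismatches and gaps.
    match = [1 if a == b and a != "-" else 0
             for a, b in zip(human.upper(), chicken.upper())]
    regions = []
    count = sum(match[:window])
    for start in range(n - window + 1):
        if start > 0:
            # Slide by one column: add the entering column, drop the leaving one.
            count += match[start + window - 1] - match[start - 1]
        if count / window >= min_identity:
            end = start + window
            if regions and start <= regions[-1][1]:
                regions[-1] = (regions[-1][0], end)  # extend the previous region
            else:
                regions.append((start, end))
    return regions
```

Coding regions and repeats would still have to be filtered from the resulting intervals, as described above.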

Definition of UGRBs

We defined ‘UGRBs’ (also referred to as ‘UCNE clusters’) as arrays of UCNEs that are syntenically conserved between the human and chicken genomes. Syntenic conservation means that the orthologues of the individual UCNEs of a human UGRB occur in the same order within a restricted area of a chicken chromosome. During the initial scan, we required that neighbouring UCNEs must not be separated by >0.5 Mb in both human and chicken. However, a few exceptions to this rule were made during subsequent manual curation based on visual inspections of the genomic context. Currently, UCNEbase comprises 239 UGRBs encompassing 3868 UCNEs. The number of UCNEs within a UGRB varies considerably from 134 in the ZEB2 cluster to only 2 in the ONECUT2 cluster (with an average of 16 and a median of 8 UCNEs). The genomic size of the identified UGRBs also varies significantly from 4.9 Mb (IRXB cluster) to ∼2 kb (CPEB4 cluster). For each UGRB, we defined a corresponding set of UGRB-associated genes comprising all genes that fall within or overlap with a genomic region spanned by the block. If the UGRB starts or ends with an intergenic UCNE, the upstream and downstream flanking genes were also included. The set of UGRB-associated genes was sometimes expanded during subsequent manual curation steps, for instance by including paralogues of genes from a paralogous UGRB.
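The gap rule of the initial scan can be sketched as a simple chaining step. The sketch below assumes that each UCNE is described by one position in human and one in chicken, and it reads the rule as requiring a gap of at most 0.5 Mb in each genome. The `Ucne` record and the function name are hypothetical. The check for conserved order and the manual curation described above are not reproduced.

```python
from typing import NamedTuple

class Ucne(NamedTuple):
    name: str
    hs_chrom: str  # human chromosome
    hs_pos: int    # human position (bp)
    gg_chrom: str  # chicken chromosome
    gg_pos: int    # chicken position (bp)

MAX_GAP = 500_000  # 0.5 Mb threshold used during the initial scan

def cluster_ucnes(ucnes, max_gap=MAX_GAP):
    """Chain neighbouring UCNEs into candidate blocks (gap rule only)."""
    ordered = sorted(ucnes, key=lambda u: (u.hs_chrom, u.hs_pos))
    blocks = []
    for u in ordered:
        if blocks:
            prev = blocks[-1][-1]
            linked = (u.hs_chrom == prev.hs_chrom
                      and u.gg_chrom == prev.gg_chrom
                      and u.hs_pos - prev.hs_pos <= max_gap
                      and abs(u.gg_pos - prev.gg_pos) <= max_gap)
            if linked:
                blocks[-1].append(u)
                continue
        blocks.append([u])
    # A block needs at least two UCNEs (the smallest UGRB above has 2).
    return [b for b in blocks if len(b) >= 2]
```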

Identification of human UCNE paralogues

Human genomic regions that exhibit significant sequence similarity to a UCNE are considered paralogues of that UCNE. In general, CNEs have fewer paralogues compared with protein-coding genes. However, the relatively rare cases of UCNE paralogues are highly informative with regard to the origin of UGRBs. Most paralogous UCNEs originate from ancient whole-genome duplication events that happened at the root of the vertebrate tree (14). As we do not expect high sequence conservation between UCNE paralogues, we compared each UCNE against the complete human genome (split into overlapping pieces of 10 kb) using the program SSEARCH v34 from the FASTA package (15). This program is an implementation of the sensitive Smith–Waterman local alignment algorithm. The initial scan was performed with a permissive E-value threshold. We subsequently re-evaluated the statistical significance of each match by computing base composition-adjusted E-values using the classical window shuffling test (16). All matches with E-value ≤10⁻⁴ were retained as paralogues. Our systematic search for paralogues confirmed the expectation that most UCNEs are unique. Only 464 UCNEs have at least one human paralogue. Of the 1252 paralogous regions found, only 177 were UCNEs themselves.
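Splitting a chromosome into overlapping pieces can be sketched as follows. The piece size of 10 kb is taken from the text; the overlap of 2 kb is an assumption (chosen here only because it exceeds the longest UCNE of 1419 bp), and the function name is hypothetical.

```python
def overlapping_pieces(length, piece=10_000, overlap=2_000):
    """Yield 0-based, half-open (start, end) intervals that tile a sequence
    of the given length with pieces of size `piece` sharing `overlap` bases.

    The overlap value is an assumed parameter, not one stated in the source.
    """
    if not 0 <= overlap < piece:
        raise ValueError("overlap must be non-negative and smaller than piece")
    if length <= 0:
        return
    step = piece - overlap
    start = 0
    while True:
        end = min(start + piece, length)
        yield (start, end)
        if end >= length:
            break
        start += step
```

With an overlap longer than the query, a match that straddles a piece boundary is still fully contained in at least one piece.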

Identification of paralogous UGRBs

We also tried to identify paralogous relationships at the level of GRBs. This was mostly done by manual curation. As a minimal condition, we required that two UGRBs share at least one paralogous gene. However, most paralogous blocks also share paralogous UCNEs. In some cases, synteny across paralogous blocks was used to redefine the extension of individual UGRBs. In total, 82 UGRBs were found to have at least one paralogous UGRB forming 39 groups.
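The minimal condition (two UGRBs share at least one paralogous gene) and the grouping of blocks can be sketched with a union-find structure. This is only an illustration: the input dictionaries, the notion of a gene family identifier and the function name are assumptions, and the manual curation that shaped the actual groups is not modelled.

```python
def group_paralogous_blocks(block_genes, gene_family):
    """Group blocks that share at least one gene family.

    block_genes: dict mapping a block name to the set of its associated genes.
    gene_family: dict mapping a gene name to a (hypothetical) family identifier.
    Returns a sorted list of sorted groups containing two or more blocks.
    """
    parent = {b: b for b in block_genes}

    def find(x):
        # Follow parent links to the representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    first_block = {}  # family id -> first block seen with a member gene
    for block, genes in block_genes.items():
        for gene in genes:
            family = gene_family.get(gene)
            if family is None:
                continue
            if family in first_block:
                parent[find(block)] = find(first_block[family])
            else:
                first_block[family] = block

    groups = {}
    for block in block_genes:
        groups.setdefault(find(block), []).append(block)
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)
```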

Detection of UCNE orthologues in other species

Currently, UCNEbase contains information about UCNE homologues (orthologues and paralogues) in 18 vertebrate genome assemblies that include four mammals: mouse (mm10), armadillo (dasNov1), opossum (monDom5) and platypus (ornAna1); two birds: chicken (galGal3) and zebra finch (taeGut1); two reptiles: lizard (anoCar2) and painted turtle (chrPic1); one amphibian: Xenopus (xenTro3) and five fishes: fugu (fr2), medaka (oryLat2), stickleback (gasAcu1), Tetraodon (tetNig2) and zebrafish (danRer7). We also identified a few UCNEs (mainly located in UTRs) that have orthologues in lamprey (petMar1), Ciona intestinalis (ci2), sea urchin (strPur2) and lancelet (braFlo1). To identify these homologues, we performed Smith–Waterman searches against the complete genomes using the same protocol as for the identification of paralogues within the human genome. Once the homologous regions were defined, we classified each of these regions as either an orthologue or a paralogue. In doing so, we made the assumption that a homologue of a human UCNE could be an orthologue of the same UCNE or an orthologue of one of its paralogous regions in the human genome. To distinguish between these two cases, we compared each homologue of a human UCNE with all human paralogues of that UCNE (if there were any). If a better alignment score (lower E-value) was obtained with a paralogue, then the UCNE homologue was classified as paralogue. If the alignment scores were very close, we visually inspected the corresponding genomic regions and based our judgement on orthology annotation for nearby genes or synteny with other UCNE homologues.
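The decision rule for separating orthologues from paralogues can be written down as a small sketch. The criterion for "very close" scores is not quantified in the text, so the ratio threshold `close_ratio` is a hypothetical parameter, as is the function name.

```python
def classify_homologue(evalue_vs_ucne, evalues_vs_paralogues, close_ratio=10.0):
    """Classify a homologue of a human UCNE found in another species.

    evalue_vs_ucne: E-value of the homologue aligned to the human UCNE.
    evalues_vs_paralogues: E-values of the same homologue aligned to each
        human paralogue of that UCNE (empty if there are none).
    close_ratio: assumed cut-off; E-values within this factor of each other
        are treated as too close to call automatically.
    """
    if not evalues_vs_paralogues:
        return "orthologue"
    best_paralogue = min(evalues_vs_paralogues)
    low, high = sorted((evalue_vs_ucne, best_paralogue))
    if high <= close_ratio * low:
        return "manual_review"  # inspect gene orthology and synteny by eye
    return "paralogue" if best_paralogue < evalue_vs_ucne else "orthologue"
```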

Identification of syntenic subclusters of UCNEs in vertebrate genomes

For each human UGRB, we identified ‘orthologous syntenic subclusters’ of UCNEs in other vertebrates. An orthologous syntenic subcluster is a set of UCNE orthologues that occurs as a cluster on the same chromosome, scaffold or contig in another vertebrate genome assembly such that any two neighbouring UCNEs are separated by ≤0.5 Mb. For most species, we would expect only one orthologous cluster per UGRB. In reality, we often find one cluster plus a few isolated orthologous UCNEs located on sequence contigs not assigned to chromosomes. The situation could be different in the five fish species that have undergone a lineage-specific whole-genome duplication (see example in Figure 1).

Identification of possible target genes

GRBs are generally assumed to control only one target gene belonging to the so-called trans-dev family. With all the information on orthologous and paralogous regions in other genomes at hand, we tried to identify the most likely target gene for each cluster. To this end, we primarily relied on a genomic context analysis approach. We reasoned that target genes will always be conserved together with UCNEs after whole-genome duplication events. Based on the analysis of the gene content of paralogous UGRBs in human and the fate of UGRB-associated UCNEs in duplicated fish genomes, we were often able to identify a single target gene. In the cases where we were left with several candidates, we gave preference to genes encoding transcription factors. In fact, the overwhelming majority of target genes uniquely defined by genomic context analysis turned out to be transcription factors, most of them containing either zinc fingers or homeodomains, or both. Perhaps, we could have used transcriptional and epigenetic features as well to single out the most likely target genes from multiple candidates, as suggested by a recent study (17). Technically, however, it is not obvious how this should be done. We are currently exploring several ways in which such experimental data could be exploited for target gene identification in the future.

UGRB and UCNE nomenclature

One of the distinctive features of UCNEbase is the establishment of a coherent nomenclature for UCNEs that ensures that orthologous UCNEs have the same names in all species. These names may also serve as unique identifiers for UCNEs, which is a real issue because nowadays non-coding elements are often referred to by genomic coordinates relating to a specific genome assembly. If the corresponding genome assembly is not mentioned, the location of the element will no longer be traceable in a few years from now. In UCNEbase, we try to define names that carry some information about the function and genomic location of a UCNE, as well as its evolutionary relationship to other UCNEs. UCNE names are typically composed of two parts: a UGRB name and an element name. For example, DACH1_Ava and DACH1_Benjamin are two UCNEs that belong to the DACH1 cluster. UGRBs have the same name as their putative target genes. Elements are identified by common people’s names or names from mythology. Within a UGRB, the alphabetical order of the elements reflects the linear arrangement of the elements along a chromosome. Importantly, paralogous UCNEs share the same element name (e.g. DACH1_Hana is a paralogue of DACH2_Hana). For elements that are not part of a UGRB, the corresponding chromosome name replaces the block name, e.g. chr2_Nemo. The rule that paralogues should have the same name extends to non-clustered UCNEs (e.g. chr10_Sherlock is a paralogue of CPEB2_Sherlock). A small number of UCNEs are very close to each other and thus could be part of the same functional entity. To specifically mark such cases, UCNEs that are separated by ≤50 bp in both human and chicken are given the same element name, however extended by different serial numbers (e.g. DACH1_Scheherazade_1 and DACH1_Scheherazade_2). Note further that our naming scheme is extensible by design. 
If we were to add a new UCNE to an existing UGRB in the future, it would be easy to find a name that alphabetically fits between the two neighbouring UCNEs.
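The naming rules lend themselves to a small parsing sketch. It assumes that neither block names nor element names contain underscores and that chromosome prefixes look like `chr2` or `chrX`; the record type and function names are hypothetical. Sharing an element name across different blocks is treated as a hint of paralogy, following the convention described above.

```python
import re
from typing import NamedTuple, Optional

class UcneName(NamedTuple):
    block: str             # UGRB name, or chromosome for non-clustered UCNEs
    element: str           # personal or mythological name
    serial: Optional[int]  # serial number for closely spaced UCNEs, if any
    clustered: bool        # False when the prefix is a chromosome name

_NAME = re.compile(r"^([^_]+)_([^_]+)(?:_(\d+))?$")
_CHROM = re.compile(r"^chr(\d+|[XY])$")

def parse_ucne_name(name):
    """Split a UCNE name such as 'DACH1_Scheherazade_1' into its parts."""
    m = _NAME.match(name)
    if m is None:
        raise ValueError(f"not a UCNE-style name: {name!r}")
    block, element, serial = m.groups()
    return UcneName(block, element,
                    int(serial) if serial is not None else None,
                    _CHROM.match(block) is None)

def names_suggest_paralogy(a, b):
    """True when two names share the element part but differ in the prefix."""
    pa, pb = parse_ucne_name(a), parse_ucne_name(b)
    return pa.element == pb.element and pa.block != pb.block
```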

CONTENT AND USER INTERFACES

As UCNEbase is organized along two hierarchical levels, there are two types of entries, UCNEs and UGRBs. About 90% of UCNE entries are related to UGRBs. There is only one entry per UCNE or UGRB, containing information for all vertebrate species covered by the resource. UCNEbase is organized in a human-centric fashion. Each entry type has two parts: one providing detailed information relating to the human genome and a second part providing information on homologous elements and clusters of elements in other species. There is a standard html display for each entry with internal links to paralogous conserved regions and external links to other databases and genome browsers. The UCSC genome browser is used as the major visualization platform. A large part of the information contained in UCNEbase is provided as browser-viewable BED files.

Content of a UCNE entry

The first part of a UCNE entry contains the following data items: a unique name; the location relative to the nearest genes (intergenic, intron or UTR); the genome coordinates in UCSC format; the length of the UCNE; the sequence in FASTA format; the names of overlapping genes (for intronic and UTR-associated UCNEs), or the nearest upstream and downstream genes (for intergenic UCNEs); a list of human paralogous UCNEs (identified by name and genomic coordinates) and other paralogous regions (identified by genomic coordinates only); the name of the corresponding UGRB (if any); cross-references to overlapping entries from the CONDOR database (18), VISTA Enhancer Browser (19) and Bejerano’s UCE collection (1). The second part contains information about homologous regions in other vertebrates. The regions are defined by genomic coordinates and classified as either orthologues or paralogues. In addition, the sequence identity, E-value and bitscore of the local alignments are stored. A web display of a UCNE entry is shown in Figure 2. Note that all genomic coordinates are linked to the UCSC genome browser through hyperlinks that automatically pre-load a number of custom tracks from UCNEbase. The web display also provides links to the Ancora (20), ECR (21) and Ensembl (22) genome browsers. Some information contained in a UCNE entry will not automatically be displayed. The DNA sequence is only accessible through a hyperlink. Under the section header ‘Conservation in other species’, only orthologous regions are displayed initially. The paralogous regions can be made visible by clicking on the ‘+’ button.

Figure 2.

Web display of a UCNE entry.

Content of a UGRB entry

The main section of a UGRB entry contains the following data items: a unique name corresponding to the most likely target gene; the genome coordinates in UCSC format; the number of UCNEs forming the block; a list of all human genes associated with the block; a list of possible target genes (in most cases only one); a list of all UCNEs forming the block; a list of paralogous UGRBs. The section on sequence conservation contains synteny maps of UGRBs across multiple vertebrate genomes. A complete synteny map for a given species consists of one or several syntenic blocks referred to as ‘subclusters’. The information associated with a subcluster comprises the genomic coordinates, the number of orthologous UCNEs and the names of these UCNEs. An example of a web display of a UGRB entry is shown in Figure 3. As for UCNE entries, the genomic coordinates are all linked to the UCSC genome browser and will automatically pre-load a number of custom tracks. The web display also includes a locally stored image (a UCSC browser snapshot) providing an overview of orthologous UCNEs in different vertebrate genomes (Figure 4A). It is initially hidden, but can be made visible through an on–off button. Note that the image shows only a part of the information contained in the custom tracks provided by UCNEbase. A mouse click on the image will open a UCSC genome browser window, in which the tracks can be explored in more detail (Figure 4B).

Figure 3.

Web display of a UGRB entry.

Figure 4.

UCSC browser view of the EBF1 cluster with custom tracks from UCNEbase. (A) Summary picture of cross-genome conservation provided by UCNEbase. (B) Detailed view of the ‘human/chicken UCNEs’ and ‘UCNE paralogues’ tracks accompanied by a dense view of the tracks indicating conserved elements from other resources.

Data access and visualization

UCNEbase provides several query mechanisms to find UCNEs and UGRBs based on different search criteria. All entries can be accessed by their chromosomal location in the human genome or by proximity to particular genes through the web links ‘Browse UCNE clusters’ and ‘Browse individual UCNEs’. The ‘Advanced search’ page allows searches by additional criteria, including genomic location in other vertebrate species. Yet another page provides access through external database IDs from the CONDOR database, VISTA Enhancer Browser and Bejerano’s UCEs collection. UCNEbase also provides three fully hyperlinked summary tables, one containing a list of paralogous UGRBs, another one containing a list of paralogous UCNEs and a third one (entitled ‘species cluster summary’) showing the numbers of conserved UCNEs for each UGRB in all species (Figure 5).

Figure 5.

Species cluster summary. The table shows the number of orthologous UCNEs found in different genomes for the 25 largest UGRBs.

UCNEbase relies on the UCSC genome browser for data visualization. A large part of the information content is available as custom track files. This has the principal advantage that information from UCNEbase can be explored together with a great variety of genome annotations from other sources. The UCSC browser also serves as a navigation platform. All data items from UCNEbase that can be displayed in a browser window are back-linked to the corresponding UCNE and UGRB entries. For instance, in the example shown in Figure 4B, clicking on the blue box labelled ‘EBF1_Oberon’ will take the user back to the corresponding UCNEbase entry. From there, one could use a link back to the genome browser to view an orthologous UCNE from another species. For the human genome, UCNEbase provides custom tracks for UCNEs, UGRBs, UCNE paralogues, CONDOR CNEs, Vista elements and UCEs from Bejerano’s collections. In addition, there is a group of tracks showing the subset of UCNEs conserved in different species. For non-human species, there are tracks for UCNE orthologues, UCNE paralogues and subclusters of UCNEs corresponding to human UGRBs.
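For readers who want to build their own custom track lines, the following sketch converts a UCSC-style position string into a four-column BED record. UCSC position strings are 1-based and inclusive, whereas BED intervals are 0-based and half-open, so only the start coordinate shifts. The function name is hypothetical, and the coordinates in the usage example are made up for illustration and do not come from UCNEbase.

```python
def ucsc_to_bed(position, name):
    """Convert 'chr:start-end' (1-based, inclusive) to a tab-separated
    BED line 'chrom<TAB>start<TAB>end<TAB>name' (0-based, half-open)."""
    chrom, _, span = position.partition(":")
    start_text, _, end_text = span.replace(",", "").partition("-")
    start, end = int(start_text), int(end_text)
    if not chrom or start < 1 or end < start:
        raise ValueError(f"invalid position: {position!r}")
    return f"{chrom}\t{start - 1}\t{end}\t{name}"

# Illustrative call with invented coordinates:
# ucsc_to_bed("chr5:158,000,001-158,000,300", "EBF1_Oberon")
```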

DISCUSSION

There are several other resources on CNEs with partially overlapping objectives, in particular: CONDOR, CORG (23), cneViewer (24), Ancora, VISTA Enhancer browser, ECR browser and TFCONES (25). Despite a common theme, the scopes of these resources are quite different, which makes a direct comparison difficult. For instance, a significant portion of CONDOR and the VISTA Enhancer browser consists of experimental annotation of non-coding elements based on in vivo reporter gene assays in zebrafish and mouse. Such information is not within the scope of UCNEbase. Other resources are primarily genome browsers. In the following, we will present and discuss distinctive and unique features of UCNEbase. UCNEbase is block-centric and provides complete synteny maps of UGRBs for many different vertebrate genomes, including orphan UCNEs located in unassembled contigs. With the exception of CONDOR, Ancora and TFCONES, none of the other resources assigns conserved regions to GRBs. Most other resources cover fewer species. For instance, cneViewer provides information for human and zebrafish only, TFCONES for human, mouse and fugu. UCNEbase covers 18 vertebrate genomes. Only Ancora and ECR browser offer data for a comparable number of species. Some existing resources are restricted to selected genomic regions. TFCONES considers only CNEs near transcription factor genes. The CONDOR database excludes elements outside synteny blocks. With the exception of CONDOR and VISTA Enhancer browser, none of the other resources uses unique identifiers. CNEs are simply defined by genomic coordinates, which will be outdated when a new genome assembly replaces the current one. UCNEbase is the only resource that uses informative names, indicating the evolutionary relationships between elements. UCNEbase is highly interoperable with the UCSC genome browser. Most other resources display the data in their own browsers. Virtually all information from UCNEbase is available in custom tracks that are automatically pre-loaded by the hyperlinks to the UCSC browser. This allows exploration of its content in a rich data environment.

DATABASE AVAILABILITY AND TECHNICAL SPECIFICATIONS

UCNEbase is publicly available at http://ccg.vital-it.ch/UCNEbase/ without need of preregistration. All data can be downloaded from the UCNEbase web site as flat files or as MySQL dumps, or by anonymous FTP from ftp://ccg.vital-it.ch/UCNEbase/. UCNEbase is maintained as a relational database using MySQL as the database management system. The web interface was created with PHP and Java scripts and runs on an Apache web server hosted by the Vital-IT high-performance computing centre. The database schema diagram (ER model) is available from the UCNEbase web site.

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online: Supplementary Methods.

FUNDING

The Swiss National Science Foundation [PDFM33-120719 to S.D.]. Funding for open access charge: Swiss government. Conflict of interest statement. None declared.

25 in total

1. Numerous potentially functional but non-genic conserved sequences on human chromosome 21.

Authors: Emmanouil T Dermitzakis; Alexandre Reymond; Robert Lyle; Nathalie Scamuffa; Catherine Ucla; Samuel Deutsch; Brian J Stevenson; Volker Flegel; Philipp Bucher; C Victor Jongeneel; Stylianos E Antonarakis
Journal: Nature Date: 2002-12-05 Impact factor: 49.962

2. Scanning human gene deserts for long-range enhancers.

Authors: Marcelo A Nobrega; Ivan Ovcharenko; Veena Afzal; Edward M Rubin
Journal: Science Date: 2003-10-17 Impact factor: 47.728

3. CORG: a database for COmparative Regulatory Genomics.

Authors: C Dieterich; H Wang; K Rateitschak; H Luz; M Vingron
Journal: Nucleic Acids Res Date: 2003-01-01 Impact factor: 16.971

4. Ultraconserved elements in the human genome.

Authors: Gill Bejerano; Michael Pheasant; Igor Makunin; Stuart Stephen; W James Kent; John S Mattick; David Haussler
Journal: Science Date: 2004-05-06 Impact factor: 47.728

5. ECR Browser: a tool for visualizing and accessing data from comparisons of multiple vertebrate genomes.

Authors: Ivan Ovcharenko; Marcelo A Nobrega; Gabriela G Loots; Lisa Stubbs
Journal: Nucleic Acids Res Date: 2004-07-01 Impact factor: 16.971

Review 6. Organization of conserved elements near key developmental regulators in vertebrate genomes.

Authors: Adam Woolfe; Greg Elgar
Journal: Adv Genet Date: 2008 Impact factor: 1.944

7. Ensembl 2012.

Authors: Paul Flicek; M Ridwan Amode; Daniel Barrell; Kathryn Beal; Simon Brent; Denise Carvalho-Silva; Peter Clapham; Guy Coates; Susan Fairley; Stephen Fitzgerald; Laurent Gil; Leo Gordon; Maurice Hendrix; Thibaut Hourlier; Nathan Johnson; Andreas K Kähäri; Damian Keefe; Stephen Keenan; Rhoda Kinsella; Monika Komorowska; Gautier Koscielny; Eugene Kulesha; Pontus Larsson; Ian Longden; William McLaren; Matthieu Muffato; Bert Overduin; Miguel Pignatelli; Bethan Pritchard; Harpreet Singh Riat; Graham R S Ritchie; Magali Ruffier; Michael Schuster; Daniel Sobral; Y Amy Tang; Kieron Taylor; Stephen Trevanion; Jana Vandrovcova; Simon White; Mark Wilson; Steven P Wilder; Bronwen L Aken; Ewan Birney; Fiona Cunningham; Ian Dunham; Richard Durbin; Xosé M Fernández-Suarez; Jennifer Harrow; Javier Herrero; Tim J P Hubbard; Anne Parker; Glenn Proctor; Giulietta Spudich; Jan Vogel; Andy Yates; Amonida Zadissa; Stephen M J Searle
Journal: Nucleic Acids Res Date: 2011-11-15 Impact factor: 16.971

8. TFCONES: a database of vertebrate transcription factor-encoding genes and their associated conserved noncoding elements.

Authors: Alison P Lee; Yuchen Yang; Sydney Brenner; Byrappa Venkatesh
Journal: BMC Genomics Date: 2007-11-29 Impact factor: 3.969

9. Vertebrate conserved non coding DNA regions have a high persistence length and a short persistence time.

Authors: Dorota Retelska; Emmanuel Beaudoing; Cédric Notredame; C Victor Jongeneel; Philipp Bucher
Journal: BMC Genomics Date: 2007-10-31 Impact factor: 3.969

10. CONDOR: a database resource of developmentally associated conserved non-coding elements.

Authors: Adam Woolfe; Debbie K Goode; Julie Cooke; Heather Callaway; Sarah Smith; Phil Snell; Gayle K McEwen; Greg Elgar
Journal: BMC Dev Biol Date: 2007-08-30 Impact factor: 1.978
