Richard J Challis, Sujai Kumar, Lewis Stevens, Mark Blaxter.
Abstract
As the generation and use of genomic datasets become increasingly common in all areas of biology, the need for resources to collate, analyse and present data from one or more genome projects is becoming more pressing. The Ensembl platform is a powerful tool to make genome data and cross-species analyses easily accessible through a web interface and a comprehensive application programming interface. Here we introduce GenomeHubs, which provide a containerized environment to facilitate the setup and hosting of custom Ensembl genome browsers. This simplifies mirroring of existing content and import of new genomic data into the Ensembl database schema. GenomeHubs also provide a set of analysis containers to decorate imported genomes with results of standard analyses and functional annotations, and support export to flat files, including EMBL format for submission of assemblies and annotations to the International Nucleotide Sequence Database Collaboration.
Database URL: http://GenomeHubs.org
Year: 2017 PMID: 28605774 PMCID: PMC5467552 DOI: 10.1093/database/bax039
Source DB: PubMed Journal: Database (Oxford) ISSN: 1758-0463 Impact factor: 3.451
Figure 1. Schematic diagram of GenomeHubs containers and data flows.
Comparing the predicted gene sets of two versions of the R. varieornatus genome
| | RVARI v1 | RVARI v1.1 |
|---|---|---|
| Total gene count | 21 493 | 13 920 |
| Single-exon genes | 5340 (25%) | 1711 (12%) |
| Single-exon genes with no structural domain hits | 3251 (15%) | 673 (5%) |
An excess of single-exon genes, which are typically rare in eukaryotic genomes, can indicate poor-quality gene predictions. RVARI v1 was found to have a high proportion of single-exon genes, the majority of which have no structural domain hits and are therefore likely to be erroneous. The gene set was re-predicted prior to analysis (27).
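The single-exon proportion reported in the table can be approximated from a GFF3 annotation file. The sketch below is illustrative only and is not the GenomeHubs pipeline: it assumes a standard gene/mRNA/exon hierarchy linked through `ID` and `Parent` attributes, treats a gene as single-exon when none of its transcripts has more than one exon, and does not attempt the additional filter on structural domain hits. The function names and the example records are hypothetical.

```python
from collections import defaultdict


def parse_attributes(field):
    """Parse a GFF3 attribute column such as 'ID=t1;Parent=g1' into a dict."""
    attrs = {}
    for pair in field.strip().split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def single_exon_summary(gff3_lines):
    """Return (total gene count, sorted single-exon gene IDs, fraction single-exon)."""
    genes = set()
    transcript_gene = {}            # transcript ID -> parent gene ID
    exon_counts = defaultdict(int)  # transcript ID -> number of exons

    for line in gff3_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # not a well-formed GFF3 feature line
        feature_type = cols[2]
        attrs = parse_attributes(cols[8])
        if feature_type == "gene" and "ID" in attrs:
            genes.add(attrs["ID"])
        elif feature_type in ("mRNA", "transcript") and "ID" in attrs and "Parent" in attrs:
            transcript_gene[attrs["ID"]] = attrs["Parent"]
        elif feature_type == "exon" and "Parent" in attrs:
            # An exon may be shared by several transcripts (comma-separated parents).
            for parent in attrs["Parent"].split(","):
                exon_counts[parent] += 1

    # Largest exon count over all transcripts of each gene.
    max_exons = defaultdict(int)
    for transcript_id, gene_id in transcript_gene.items():
        max_exons[gene_id] = max(max_exons[gene_id], exon_counts[transcript_id])

    single = sorted(g for g in genes if max_exons[g] == 1)
    total = len(genes)
    fraction = len(single) / total if total else 0.0
    return total, single, fraction


def gff3_line(feature_type, start, end, attributes):
    """Build one tab-separated GFF3 record for the made-up example below."""
    return "\t".join(
        ["scaffold1", "example", feature_type, str(start), str(end), ".", "+", ".", attributes]
    )


# Hypothetical records: g1 has one exon, g2 has three.
EXAMPLE_GFF3 = [
    "##gff-version 3",
    gff3_line("gene", 1, 500, "ID=g1"),
    gff3_line("mRNA", 1, 500, "ID=t1;Parent=g1"),
    gff3_line("exon", 1, 500, "ID=e1;Parent=t1"),
    gff3_line("gene", 1000, 3000, "ID=g2"),
    gff3_line("mRNA", 1000, 3000, "ID=t2;Parent=g2"),
    gff3_line("exon", 1000, 1200, "ID=e2;Parent=t2"),
    gff3_line("exon", 1500, 1800, "ID=e3;Parent=t2"),
    gff3_line("exon", 2500, 3000, "ID=e4;Parent=t2"),
]

total, single, fraction = single_exon_summary(EXAMPLE_GFF3)
print(f"{len(single)} of {total} genes are single-exon ({fraction:.0%})")
```

On the example records, one of the two genes is counted as single-exon. For a real annotation, the same function could be given the lines of the GFF3 file, and the resulting fraction compared between gene set versions as in the table above.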