Flora J Logan-Klumpler, Nishadi De Silva, Ulrike Boehme, Matthew B Rogers, Giles Velarde, Jacqueline A McQuillan, Tim Carver, Martin Aslett, Christian Olsen, Sandhya Subramanian, Isabelle Phan, Carol Farris, Siddhartha Mitra, Gowthaman Ramasamy, Haiming Wang, Adrian Tivey, Andrew Jackson, Robin Houston, Julian Parkhill, Matthew Holden, Omar S Harb, Brian P Brunk, Peter J Myler, David Roos, Mark Carrington, Deborah F Smith, Christiane Hertz-Fowler, Matthew Berriman.
Abstract
GeneDB (http://www.genedb.org) is a genome database for prokaryotic and eukaryotic pathogens and closely related organisms. The resource provides a portal to genome sequence and annotation data, which are primarily generated by the Pathogen Genomics group at the Wellcome Trust Sanger Institute. It combines data from completed and ongoing genome projects with curated annotation, all readily accessible from a web-based resource. The development of the database in recent years has focused on providing database-driven annotation tools and pipelines, as well as catering for increasingly frequent assembly updates. The website has been significantly redesigned to take advantage of current web technologies and to improve usability. The current release stores 41 data sets, of which 17 are manually curated and maintained by biologists, who review and incorporate data from the scientific literature, as well as other sources. GeneDB is primarily a production and annotation database for the genomes of predominantly pathogenic organisms.
Year: 2011 PMID: 22116062 PMCID: PMC3245030 DOI: 10.1093/nar/gkr1032
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
GeneDB genomes and curation statuses
Project type: yellow indicates improved high-quality draft; cyan indicates non-contiguous finished; green indicates finished. Curation: blue indicates reactive curation; hatched blue indicates proactive curation; white indicates no curation.
Project type is based on categories defined in (1).
(a) Not a whole-genome project.
Snapshot of annotation and curation—statistics from a 12-month period between August 2010 and August 2011
| Annotation event type | ||||
|---|---|---|---|---|
| Assigned or updated Product | 304 | 293 | 241 | 391 |
| Updated GO term | 751 | 121 | 1718 | 1163 |
| Phenotype curation term added | 165 | – | 6791 | 315 |
| Linked to publication (PMID) | 496 | 244 | 1456 | 693 |
| User Comment added | 252 | 110 | 312 | 519 |
| All unique genes with new functional annotations | 1220 | 839 | 8750 | 1869 |
| All unique genes with new structural annotations | 13 | 291 | 41 | – |
Annotation event types are shown for each of the four reference genomes in GeneDB.
(a) L. major represents ∼50% of all Leishmania species curation activity.
(b) User comment entered at Tritrypdb.org.
Figure 1. The Rationaliser tool is used to remove inconsistencies from curated data. (A) Search boxes to locate terms. (B) List of colour-coded CV terms used within the annotations of selected organisms. (C) List of all available terms in the CV (also colour-coded). (D) Text box to insert new terms if the correct term does not exist.
Figure 2. Screen shot of GeneDB homepage with an example of an entry point into an individual organism homepage. (A) Links to available tools. (B) Available data sets. (C) Clicking on ‘Select an organism’ opens a drop-down box with available bacterial genomes. (D) Quick search option. (E) Blast search options. (F) Links to available tools. (G) Links to ongoing projects and data release policy.
Figure 3. Genes can be accessed by browsing clickable whole chromosome or contig maps. (A) Screen shot of an organism home page showing annotation statistics, blast tools and search tools. (B) List of all chromosomes. (C) Part of a chromosome map. Mousing over a gene shows its systematic ID; clicking on a gene of interest opens its gene page.
Figure 4. Screen shot of GeneDB gene page. (A) Graphical display of the gene in web-artemis. The chromosomal location of the gene is shown on the top. (B) Basic gene information. (C) Additional information, including relevant literature. (D) Phenotype information. (E) Gene Ontology terms.
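
The kind of inconsistency that the Rationaliser tool (Figure 1) is described as removing can be illustrated with a minimal sketch. This is not GeneDB's implementation: the function names, the normalisation rule (ignoring case, hyphens, underscores and extra whitespace) and the example terms are all assumptions made for illustration. The sketch groups annotation terms that differ only in spelling details, then rewrites them to a preferred controlled-vocabulary (CV) spelling.

```python
from collections import defaultdict


def normalise(term: str) -> str:
    """Collapse case, hyphens, underscores and repeated whitespace (assumed rule)."""
    return " ".join(term.lower().replace("-", " ").replace("_", " ").split())


def find_inconsistent_terms(annotations):
    """Return groups of distinct spellings that share one normalised form."""
    groups = defaultdict(set)
    for term in annotations:
        groups[normalise(term)].add(term)
    # Keep only normalised forms that appear under more than one spelling.
    return {key: sorted(variants) for key, variants in groups.items() if len(variants) > 1}


def rationalise(annotations, preferred):
    """Replace each term with the preferred spelling of its normalised form.

    Terms with no preferred spelling are left unchanged.
    """
    lookup = {normalise(p): p for p in preferred}
    return [lookup.get(normalise(t), t) for t in annotations]


# Hypothetical product terms, not taken from GeneDB.
terms = ["Protein kinase", "protein  kinase", "protein-kinase", "hypothetical protein"]
inconsistent = find_inconsistent_terms(terms)
cleaned = rationalise(terms, preferred=["protein kinase"])
```

Here `inconsistent` holds the three spellings of "protein kinase" under one key, and `cleaned` replaces each of them with the single preferred term while leaving "hypothetical protein" untouched. A curator-facing tool would additionally need the manual review step that the figure's search and insert boxes imply.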