Ioanna Pagani, Konstantinos Liolios, Jakob Jansson, I-Min A Chen, Tatyana Smirnova, Bahador Nosrat, Victor M Markowitz, Nikos C Kyrpides.
Abstract
The Genomes OnLine Database (GOLD, http://www.genomesonline.org/) is a comprehensive resource for centralized monitoring of genome and metagenome projects worldwide. Both complete and ongoing projects, along with their associated metadata, can be accessed in GOLD through precomputed tables and a search page. As of September 2011, GOLD, now on version 4.0, contains information for 11,472 sequencing projects, of which 2907 have been completed, with their sequence data deposited in a public repository. Of these complete projects, 1918 are finished and 989 are permanent drafts. Moreover, GOLD contains information for 340 metagenome studies associated with 1927 metagenome samples. GOLD continues to expand, moving toward the goal of providing the most comprehensive repository of metadata information related to the projects and their organisms/environments, in accordance with the Minimum Information about any (x) Sequence specification and beyond.
Year: 2011 PMID: 22135293 PMCID: PMC3245063 DOI: 10.1093/nar/gkr1100
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Statistical information from GOLD data as of September 2011. (A) Evolution of the complete, incomplete and total number of projects monitored in GOLD. Genome projects in GOLD: 11 472. (B) Evolution of the complete projects monitored in GOLD separated into finished and permanent drafts. Complete genome projects in GOLD: 2907. (C) Distribution of the 340 metagenome projects in GOLD across the three major metagenome classification categories. (D) Phylogenetic distribution of the 8448 bacterial genome projects.
Figure 2. Project and metadata information available from GOLD. (A) Distribution of the 11 472 genomic and metagenomic projects in GOLD as of September 2011, across the major sequencing centers. (B) Distribution of the 5831 genome projects in GOLD in September 2009 across the major sequencing centers. Abbreviations: JGI, Joint Genome Institute; JCVI, J. Craig Venter Institute; Broad, Broad Institute; Univ of Maryland–IGS, University of Maryland, Institute for Genome Sciences; WashU, Washington University; Sanger, the Wellcome Trust Sanger Institute; BCM-HGSC, Baylor College of Medicine Human Genome Sequencing Center; WORLD, all other sequencing centers. (C) Distribution of the 11 132 total genome projects in GOLD according to type strain. (D) Percentage of the 8473 bacterial genome projects for which a culture of the sequenced strain is available from one of the public culture collections. (E) Percentage of archaeal genome projects for which a culture of the sequenced strain is available from one of the public culture collections. (F) Distribution of publications of finished genome projects across publication journal.
Project and sequencing status definitions and number of projects
| | Project/sequencing status | Definition | Projects |
|---|---|---|---|
| 1. | Complete | Genome project has been completed and the final sequence is deposited in INSDC | 2907 |
| | Finished | Completely sequenced and deposited in INSDC | 1918 |
| | Permanent Draft | Draft sequenced and deposited in INSDC | 989 |
| 2. | Incomplete | Genome project is incomplete | 7629 |
| | Complete | Completely sequenced but not yet deposited in INSDC | 25 |
| | Draft | Draft sequenced and deposited in INSDC | 1568 |
| | In progress | Sequencing is in progress but no available sequence yet | 3404 |
| | DNA received | DNA has been received by sequencing center | 211 |
| | Awaiting DNA | DNA has not yet been received by sequencing center | 437 |
| 3. | Targeted | Project is targeted, but has not yet been picked by any sequencing center | 445 |
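
The relationship between the Complete total and its two sub-statuses can be checked with a few lines of arithmetic. The following is only an illustrative sketch: the counts are copied from the table above, while the names `complete_total`, `complete_breakdown` and `share` are hypothetical and not part of GOLD.

```python
# Counts copied from the status table above (GOLD, September 2011).
# Variable and function names are illustrative assumptions, not GOLD identifiers.
complete_total = 2907
complete_breakdown = {"Finished": 1918, "Permanent Draft": 989}

# The two sub-statuses should account for every complete project.
assert sum(complete_breakdown.values()) == complete_total


def share(count: int, total: int) -> float:
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Report each sub-status as a share of all complete projects.
for status, count in complete_breakdown.items():
    print(f"{status}: {count} of {complete_total} ({share(count, complete_total)}%)")
```

By this calculation, roughly two thirds of the complete projects are finished genomes and about one third are permanent drafts.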
Number of classified subdivisions with genome projects over the number of classified subdivisions of this phylogenetic group and the coverage of genome projects per taxonomic level
| Domain | Projects (2011) | Projects (2009) | Phyla (2011) | Phyla (2009) | Class (2011) | Class (2009) | Order (2011) | Order (2009) | Family (2011) | Family (2009) | Genus (2011) | Genus (2009) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Archaea | 327 | 179 | 5/5 | 5/5 | 10/10 | 10/10 | 18/18 | 18/18 | 28/29 | 24/26 | 96/118 | 85/109 |
| Percentage coverage | | | 100 | 100 | 100 | 100 | 100 | 100 | 97 | 92 | 81 | 78 |
| Bacteria | 8458 | 4184 | 32/34 | 27/29 | 51/53 | 45/47 | 109/118 | 234/281 | 254/298 | 234/281 | 885/2106 | 730/1930 |
| Percentage coverage | | | 94 | 93 | 100 | 96 | 92 | 83 | 85 | 83 | 42 | 38 |
| Eukarya | 2205 | 1280 | 33/57 | 29/55 | 93/182 | 80/188 | 258/1037 | 350/6288 | 458/6689 | 350/6288 | 729/54 K | 536/48 K |
| Percentage coverage | | | 58 | 53 | 51 | 43 | 25 | 6 | 7 | 6 | 1 | 1 |
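
Each "Percentage coverage" entry appears to be the fraction in the cell above it expressed as a whole-number percentage. The sketch below illustrates that reading under two assumptions: the helper name `coverage_percent` is hypothetical, and rounding to the nearest integer is inferred from the table rather than stated by the authors. Cells with approximate denominators such as `729/54 K` are deliberately not handled.

```python
from fractions import Fraction


def coverage_percent(cell: str) -> int:
    """Convert a 'covered/total' cell such as '28/29' to a rounded percentage.

    Assumes both parts are plain integers; approximate cells like '729/54 K'
    are out of scope and would raise ValueError.
    """
    covered, total = (int(part) for part in cell.split("/"))
    # Fraction keeps the division exact before rounding to the nearest integer.
    return round(100 * Fraction(covered, total))


# Archaeal family and genus cells copied from the table above.
for cell in ("28/29", "24/26", "96/118", "85/109"):
    print(cell, "->", coverage_percent(cell))
```

Applied to the archaeal family and genus cells, this reproduces the 97, 92, 81 and 78 shown in the corresponding coverage row.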
Figure 3. Metagenome sample list.