Nikos C Kyrpides, Tanja Woyke, Jonathan A Eisen, George Garrity, Timothy G Lilburn, Brian J Beck, William B Whitman, Phil Hugenholtz, Hans-Peter Klenk.
Abstract
The Genomic Encyclopedia of Bacteria and Archaea (GEBA) project was launched by the JGI in 2007 as a pilot project with the objective of sequencing 250 bacterial and archaeal genomes. The two major goals of that project were (a) to test the hypothesis that there are many benefits to using the phylogenetic diversity of organisms in the tree of life as a primary criterion for generating their genome sequences and (b) to develop the necessary framework, technology and organization for large-scale sequencing of microbial isolate genomes. While the GEBA pilot project has not yet been entirely completed, both of the original goals have already been accomplished, leading the way for the next phase of the project. Here we propose taking the GEBA project to the next level by generating high-quality draft genomes for 1,000 bacterial and archaeal strains. This represents a combined 16-fold increase in both scale and speed as compared to the GEBA pilot project (250 isolate genomes in 4+ years). We will follow an approach to organism selection and sequencing prioritization similar to that of the GEBA pilot project (i.e. phylogenetic novelty, availability and growth of cultures of type strains, and DNA extraction capability), focusing on type strains, as this ensures reproducibility of our results and provides the strongest linkage between genome sequences and other knowledge about each strain. In turn, this project will constitute a pilot phase of a larger effort that will target the genome sequences of all available type strains of the Bacteria and Archaea.
Year: 2013 PMID: 25197443 PMCID: PMC4148999 DOI: 10.4056/sigs.5068949
Source DB: PubMed Journal: Stand Genomic Sci ISSN: 1944-3277
Figure 1. Genome project coverage of bacterial and archaeal type strains as of June 2011. From a total of approximately 9,000 bacterial and archaeal type strains, 1,219 (13%, non-redundant) have a publicly known genome project.
Figure 2. Phylogenetic diversity of the type strains of Bacteria and Archaea based on the SSU rRNA genes as of June 2011. Blue: phylogenetic diversity of current complete and ongoing genome projects from 1,219 type strains (GOLD 1/2011); Red: phylogenetic diversity of the 1,000 type strains proposed to be sequenced here; Pink: phylogenetic diversity of all type strains available at the Living Tree Project (LTP) [9]. All our calculations are based on the LTP tree from September 2010 (latest version), which contains 8,029 of the approximately 9,000 type strains.
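The proportions quoted in the two figure captions can be checked with a short calculation. The sketch below uses only the counts given in the captions; the constant and function names are illustrative, and the denominator is the caption's "approximately 9,000", so the results are rough. With that denominator the genome-project coverage comes out at about 13.5%, in line with the 13% quoted if the true total is slightly above 9,000.

```python
# Counts quoted in the figure captions (June 2011 snapshot).
# TOTAL_TYPE_STRAINS is the caption's "approximately 9,000", so it is only a rough denominator.
TOTAL_TYPE_STRAINS = 9_000
WITH_GENOME_PROJECT = 1_219   # type strains with a publicly known genome project
IN_LTP_TREE = 8_029           # type strains present in the September 2010 LTP tree


def percent(part: int, whole: int) -> float:
    """Return `part` as a percentage of `whole`."""
    return 100.0 * part / whole


coverage = percent(WITH_GENOME_PROJECT, TOTAL_TYPE_STRAINS)
ltp_share = percent(IN_LTP_TREE, TOTAL_TYPE_STRAINS)

print(f"Type strains with a genome project: {coverage:.1f}%")
print(f"Type strains represented in the LTP tree: {ltp_share:.1f}%")
```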
Summary table for KMG project (including non-redundant non-type strains)
| | | | | | | | |
|---|---|---|---|---|---|---|---|
| 57 | 61 | 6.6 | 33 | 0 | 57.9 | | |
| 314 | 388 | 19.1 | 143 | 4 | 45.5 | | |
| 1 | 1 | 0.0 | 1 | 0 | 100.0 | | |
| 29 | 31 | 6.5 | 8 | 0 | 27.6 | | |
| 37 | 38 | 2.6 | 14 | 1 | 37.8 | | |
| 7 | 8 | 12.5 | 2 | 1 | 28.6 | | |
| 71 | 76 | 6.6 | 19 | 0 | 26.8 | | |
| 4 | 4 | 0.0 | 2 | 0 | 50.0 | | |
| 27 | 28 | 3.6 | 11 | 0 | 40.7 | | |
| 12 | 12 | 0.0 | 2 | 0 | 16.7 | | |
| 12 | 12 | 0.0 | 6 | 0 | 50.0 | | |
| 88 | 90 | 2.2 | 9 | 2 | 10.2 | | |
| 16 | 22 | 27.3 | 9 | 0 | 56.3 | | |
| 3,541 | 4,323 | 18.1 | 364 | 35 | 10.3 | | |
| 1,875 | 2,263 | 17.1 | 311 | 14 | 16.6 | | |
| 234 | 258 | 9.3 | 25 | 0 | 10.7 | | |
| 2,439 | 2,953 | 17.4 | 145 | 5 | 5.9 | | |
| 15 | 19 | 21.1 | 10 | 0 | 66.7 | | |
| 17 | 20 | 15.0 | 8 | 0 | 47.1 | | |
| 112 | 127 | 11.8 | 25 | 0 | 22.3 | | |
| 3 | 5 | 40.0 | 1 | 0 | 33.3 | | |
| 11 | 11 | 0.0 | 3 | 0 | 27.3 | | |
| 767 | 914 | 16.1 | 131 | 9 | 17.1 | | |
| 38 | 47 | 19.1 | 12 | 0 | 31.6 | | |
| 35 | 35 | 0.0 | 6 | 2 | 17.1 | | |
| 1 | 1 | 0.0 | 1 | 0 | 100.0 | | |
| 2 | 3 | 33.3 | 2 | 0 | 100.0 | | |
| 2 | 2 | 0.0 | 1 | 0 | 50.0 | | |
| 17 | 18 | 5.6 | 10 | 1 | 58.8 | | |
| 1 | 1 | 0.0 | 1 | 0 | 100.0 | | |
| 1 | 1 | 0.0 | 1 | 0 | 100.0 | | |
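The column headers and row labels of the summary table are missing from this copy, so the meaning of each column cannot be stated here. The numbers themselves do appear internally consistent: in every row sampled, the third column equals the difference between the second and first columns as a percentage of the second, and the sixth column equals the fourth column as a percentage of the first. The sketch below checks that relationship on a few rows copied from the table. The relationship is an observation from the figures, not something stated in the source, and the names used are hypothetical.

```python
# A few rows copied from the summary table: (col1, col2, col3, col4, col6).
# Column meanings are unknown here; only the arithmetic relationship is tested.
ROWS = [
    (57, 61, 6.6, 33, 57.9),
    (314, 388, 19.1, 143, 45.5),
    (16, 22, 27.3, 9, 56.3),
    (3541, 4323, 18.1, 364, 10.3),
    (2439, 2953, 17.4, 145, 5.9),
]


def is_consistent(row, tol=0.051):
    """Check the two apparent derived columns, allowing for one-decimal rounding."""
    col1, col2, col3, col4, col6 = row
    derived_col3 = 100.0 * (col2 - col1) / col2   # assumed relationship, inferred from the data
    derived_col6 = 100.0 * col4 / col1            # assumed relationship, inferred from the data
    return abs(derived_col3 - col3) <= tol and abs(derived_col6 - col6) <= tol


for row in ROWS:
    print(row, "consistent" if is_consistent(row) else "inconsistent")
```

A tolerance is used instead of comparing rounded values directly, because the table's one-decimal figures may have been rounded half-up while Python's `round` rounds half to even.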