Pavel S Novichkov, Igor Ratnere, Yuri I Wolf, Eugene V Koonin, Inna Dubchak.
Abstract
The database of Alignable Tight Genomic Clusters (ATGCs) consists of closely related genomes of archaea and bacteria, and is a resource for research into prokaryotic microevolution. Constructing a data set with appropriate characteristics is a major hurdle for studies of this type. At the current rate of genome sequencing, it is difficult to follow the progress of the field and to determine which of the available genome sets meet the requirements of a given research project, in particular with respect to the minimum and maximum levels of similarity between the included genomes. Additionally, extracting specific content, such as genomic alignments or families of orthologs, from a selected set of genomes is a complicated and time-consuming process. The database addresses these problems by providing an intuitive and efficient web interface to browse precomputed ATGCs, select appropriate ones and access ATGC-derived data, such as multiple alignments of orthologous proteins and matrices of pairwise intergenomic distances based on genome-wide analysis of synonymous and nonsynonymous substitution rates, among others. The ATGC database will be regularly updated following new releases of the NCBI RefSeq. The database is hosted by the Genomics Division at Lawrence Berkeley National Laboratory and is publicly available at http://atgc.lbl.gov.
Year: 2008 PMID: 18845571 PMCID: PMC2686458 DOI: 10.1093/nar/gkn684
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. The distribution of the number of ATGCs among the main taxonomic groups of bacteria and archaea.
Figure 2. Distribution of the ATGCs by the number of included genomes.
Characteristics of large (>10 species) ATGCs
| Phylum | Genus | Number of species | Genome size, Mb | GC content, % | dS | dY |
|---|---|---|---|---|---|---|
| Proteobacteria | | 34 | 4.8 (3.0–5.5) | 50.8 (50.3–56.7) | (0.00–1.99) | (0.00–0.27) |
| Proteobacteria | | 21 | 7.0 (5.2–7.5) | 68.3 (67.6–68.5) | (0.00–0.35) | (0.00–0.14) |
| Proteobacteria | | 16 | 4.6 (4.3–5.1) | 47.6 (46.9–49.0) | (0.00–1.08) | (0.00–0.23) |
| Proteobacteria | | 13 | 1.9 (1.8–2.0) | 38.0 (38.0–38.2) | (0.00–0.09) | (0.00–0.13) |
| Firmicutes | | 13 | 2.9 (2.7–2.9) | 32.8 (32.7–32.9) | (0.00–0.05) | (0.00–0.04) |
| Firmicutes | | 13 | 3.0 (2.8–3.1) | 37.8 (36.4–38.0) | (0.00–1.03) | (0.01–0.14) |
| Proteobacteria | | 13 | 4.0 (3.8–4.1) | 47.5 (47.4–47.6) | (0.00–0.04) | (0.01–0.26) |
| Firmicutes | | 12 | 1.9 (1.8–1.9) | 38.5 (38.3–38.7) | (0.00–0.02) | (0.00–0.04) |
| Firmicutes | | 12 | 5.2 (4.1–5.6) | 35.4 (34.8–35.9) | (0.00–1.82) | (0.00–0.24) |
| Proteobacteria | | 11 | 5.0 (4.7–5.3) | 46.2 (44.4–47.9) | (0.01–1.72) | (0.00–0.18) |
| Proteobacteria | | 11 | 1.7 (1.6–1.8) | 30.5 (30.3–31.1) | (0.00–1.88) | (0.00–0.09) |
| Firmicutes | | 11 | 2.1 (2.0–2.2) | 39.7 (39.6–39.8) | (0.00–0.01) | (0.00–0.08) |
Median, minimum and maximum values are given for genome size and GC content; minimum and maximum values are given for synonymous substitution rate (dS) and synteny distance (dY).
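The selection problem described in the abstract (finding genome sets whose divergence falls within the limits a project needs) can be illustrated with the rows of the table above. The following is only a minimal sketch: the `Cluster` type, the `select_clusters` function and the thresholds are hypothetical names and values chosen for illustration, not part of the ATGC web interface, and the genus names are omitted because they are not present in this copy of the table.

```python
from typing import NamedTuple


class Cluster(NamedTuple):
    """One row of the table above (hypothetical container, for illustration)."""
    phylum: str
    n_species: int         # "Number of species" column
    median_size_mb: float  # median genome size, Mb
    ds_max: float          # maximum synonymous substitution rate (dS)
    dy_max: float          # maximum synteny distance (dY)


# Values transcribed from the table of large ATGCs.
CLUSTERS = [
    Cluster("Proteobacteria", 34, 4.8, 1.99, 0.27),
    Cluster("Proteobacteria", 21, 7.0, 0.35, 0.14),
    Cluster("Proteobacteria", 16, 4.6, 1.08, 0.23),
    Cluster("Proteobacteria", 13, 1.9, 0.09, 0.13),
    Cluster("Firmicutes", 13, 2.9, 0.05, 0.04),
    Cluster("Firmicutes", 13, 3.0, 1.03, 0.14),
    Cluster("Proteobacteria", 13, 4.0, 0.04, 0.26),
    Cluster("Firmicutes", 12, 1.9, 0.02, 0.04),
    Cluster("Firmicutes", 12, 5.2, 1.82, 0.24),
    Cluster("Proteobacteria", 11, 5.0, 1.72, 0.18),
    Cluster("Proteobacteria", 11, 1.7, 1.88, 0.09),
    Cluster("Firmicutes", 11, 2.1, 0.01, 0.08),
]


def select_clusters(clusters, min_species=0, ds_max_range=(0.0, float("inf"))):
    """Keep clusters with at least `min_species` members whose maximum dS
    lies inside `ds_max_range`, sorted by median genome size (ascending)."""
    lo, hi = ds_max_range
    picked = [
        c for c in clusters
        if c.n_species >= min_species and lo <= c.ds_max <= hi
    ]
    return sorted(picked, key=lambda c: c.median_size_mb)


if __name__ == "__main__":
    # Example: tightly related clusters (maximum dS of at most 0.1)
    # with at least 12 members.
    for c in select_clusters(CLUSTERS, min_species=12, ds_max_range=(0.0, 0.1)):
        print(c)
```

Filtering on the maximum dS is one possible reading of "maximum level of similarity"; a project that also needs a minimum divergence could add a corresponding lower bound, or a bound on dY, in the same way.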
Figure 3. A screenshot of the ATGC web page. The left panel shows the taxon explorer with one of the clusters selected; the right panel shows the site navigation menu at the top and the genomes and properties tables for the selected cluster at the bottom.
Figure 4. Cluster properties table sorted by genome size, with a filter applied to cluster size (the number of genomes in a cluster). A filtered column can then be easily identified by its bold italic header.
Figure 5. The ‘Download’ pull-down menu allows various types of data to be downloaded for a selected list of genomes.