Jan P Meier-Kolthoff, Megan Lu, Marcel Huntemann, Susan Lucas, Alla Lapidus, Alex Copeland, Sam Pitluck, Lynne A Goodwin, Cliff Han, Roxanne Tapia, Gabriele Pötter, Miriam Land, Natalia Ivanova, Manfred Rohde, Markus Göker, John C Detter, Tanja Woyke, Nikos C Kyrpides, Hans-Peter Klenk.
Abstract
Saccharomonospora cyanea Runmao et al. 1988 is a member of the genus Saccharomonospora in the family Pseudonocardiaceae that is moderately well characterized at the genome level thus far. Members of the genus Saccharomonospora are of interest because they originate from diverse habitats, such as soil, leaf litter, manure, compost, surface of peat, moist, over-heated grain, and ocean sediment, where they probably play a role in the primary degradation of plant material by attacking hemicellulose. Species of the genus Saccharomonospora are usually Gram-positive, non-acid fast, and are classified among the actinomycetes. S. cyanea is characterized by a dark blue (= cyan blue) aerial mycelium. After S. viridis, S. azurea, and S. marina, S. cyanea is only the fourth member in the genus for which a completely sequenced (non-contiguous finished draft status) type strain genome will be published. Here we describe the features of this organism, together with the draft genome sequence and annotation. The 5,408,301 bp long chromosome with its 5,139 protein-coding and 57 RNA genes was sequenced as part of the DOE-funded Community Sequencing Program (CSP) 2010 at the Joint Genome Institute (JGI).
Keywords: CSP 2010; Gram-positive; Pseudonocardiaceae; aerobic; chemoheterotrophic; draft genome; non-motile; soil bacterium; spore-forming; vegetative and aerial mycelia
Year: 2013 PMID: 24501643 PMCID: PMC3910552 DOI: 10.4056/sigs.4207886
Source DB: PubMed Journal: Stand Genomic Sci ISSN: 1944-3277
Figure 1. Phylogenetic tree highlighting the position of S. cyanea relative to the type strains of the other species within the family Pseudonocardiaceae. The tree was inferred from 1,371 aligned characters [16,17] of the 16S rRNA gene sequence under the maximum likelihood (ML) criterion [18]. Rooting was done initially using the midpoint method [19] and then checked for its agreement with the current classification (Table 1). The branches are scaled in terms of the expected number of substitutions per site. Numbers adjacent to the branches are support values from 600 ML bootstrap replicates [20] (left) and from 1,000 maximum-parsimony bootstrap replicates [21] (right) if larger than 60%. Lineages with type strain genome sequencing projects registered in GOLD [22] are labeled with one asterisk, those also listed as 'Complete and Published' with two asterisks [4,23,24] ( [25] and [26] miss their second asterisk due to very recent publication). Ruan et al. 1994 was ignored in the tree, because a proposal for the transfer of this species to the genus [27] was recently rejected on formal criteria [3].
Table 1. Classification and general features of S. cyanea NA-134T according to the MIGS recommendations [28] published by the Genome Standards Consortium [29].
| MIGS ID | Property | Term | Evidence code |
|---|---|---|---|
| | Current classification | Domain | TAS |
| | | Phylum | TAS |
| | | Class | TAS |
| | | Subclass | TAS |
| | | Order | TAS |
| | | Suborder | TAS |
| | | Family | TAS |
| | | Genus | TAS |
| | | Species | TAS |
| | | Type-strain NA-134 | TAS |
| | Gram stain | positive | NAS |
| | Cell shape | variable, substrate and aerial mycelia | TAS |
| | Motility | non-motile | TAS |
| | Sporulation | small, non-motile spores with warty surface; single and mostly from aerial mycelium | TAS |
| | Temperature range | mesophile, 24-40°C | TAS |
| | Optimum temperature | 28-37°C | TAS |
| | Salinity | grows well in up to 10% (w/v) NaCl | TAS |
| MIGS-22 | Oxygen requirement | aerobic | TAS |
| | Carbon source | pentoses, hexoses, but not D-glucose | TAS |
| | Energy metabolism | chemoheterotrophic | NAS |
| MIGS-6 | Habitat | soil | TAS |
| MIGS-15 | Biotic relationship | free living | TAS |
| MIGS-14 | Pathogenicity | none | NAS |
| | Biosafety level | 1 | TAS |
| MIGS-23.1 | Isolation | soil | TAS |
| MIGS-4 | Geographic location | Guangyan City, Sichuan, China | TAS |
| MIGS-5 | Sample collection time | 1988 or before | NAS |
| MIGS-4.1 | Latitude | 32.450 | TAS |
| MIGS-4.2 | Longitude | 105.843 | TAS |
| MIGS-4.3 | Depth | not reported | |
| MIGS-4.4 | Altitude | about 40 m | NAS |
Evidence codes - TAS: Traceable Author Statement (i.e., a direct report exists in the literature); NAS: Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [41].
Figure 2. Scanning electron micrograph of S. cyanea NA-134T
Genome sequencing project information
| MIGS ID | Property | Term |
|---|---|---|
| MIGS-31 | Finishing quality | Non-contiguous finished |
| MIGS-28 | Libraries used | Three genomic libraries: one 454 pyrosequence standard library, one 454 PE library (12 kb insert size), one Illumina library |
| MIGS-29 | Sequencing platforms | Illumina GAii, 454 GS FLX Titanium |
| MIGS-31.2 | Sequencing coverage | 1,005.1 × Illumina; 8.6 × pyrosequence |
| MIGS-30 | Assemblers | Newbler version 2.3, Velvet version 1.0.13, phrap version SPS - 4.24 |
| MIGS-32 | Gene calling method | Prodigal, GenePRIMP |
| | INSDC ID | CM001440, AHLY00000000.1 |
| | GenBank Date of Release | February 3, 2012 |
| | GOLD ID | Gi07556 |
| | NCBI project ID | 61997 |
| | Database: IMG | 2508501013 |
| MIGS-13 | Source material identifier | DSM 44106 |
| | Project relevance | Bioenergy and phylogenetic diversity |
Genome Statistics
| Attribute | Value | % of Total |
|---|---|---|
| Genome size (bp) | 5,408,301 | 100.00% |
| DNA coding region (bp) | 4,926,834 | 91.10% |
| DNA G+C content (bp) | 3,771,475 | 69.74% |
| Number of replicons | 1 | |
| Extrachromosomal elements | 0 | |
| Total genes | 5,196 | 100.00% |
| RNA genes | 57 | 1.10% |
| rRNA operons | 3 | |
| tRNA genes | 47 | 0.90% |
| Protein-coding genes | 5,139 | 98.90% |
| Pseudo genes | 93 | 1.79% |
| Genes with function prediction (proteins) | 3,880 | 74.67% |
| Genes in paralog clusters | 2,852 | 54.89% |
| Genes assigned to COGs | 3,834 | 73.79% |
| Genes assigned Pfam domains | 4,014 | 77.25% |
| Genes with signal peptides | 1,512 | 29.10% |
| Genes with transmembrane helices | 1,206 | 23.21% |
| CRISPR repeats | 0 | |
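
The percentage column of the genome statistics table can be re-derived from the raw counts. The following Python sketch does so under one assumption that the table itself does not state: base-pair rows are divided by the genome size, and gene rows by the total gene count. The variable names are illustrative only.

```python
# Raw values and reported percentages copied from the genome statistics table.
GENOME_SIZE_BP = 5_408_301
TOTAL_GENES = 5_196

# name: (raw value, reported %, denominator)
# The choice of denominator per row is an assumption, not stated in the table.
rows = {
    "DNA coding region (bp)": (4_926_834, 91.10, GENOME_SIZE_BP),
    "DNA G+C content (bp)": (3_771_475, 69.74, GENOME_SIZE_BP),
    "RNA genes": (57, 1.10, TOTAL_GENES),
    "tRNA genes": (47, 0.90, TOTAL_GENES),
    "Protein-coding genes": (5_139, 98.90, TOTAL_GENES),
    "Pseudo genes": (93, 1.79, TOTAL_GENES),
    "Genes with function prediction": (3_880, 74.67, TOTAL_GENES),
    "Genes in paralog clusters": (2_852, 54.89, TOTAL_GENES),
    "Genes assigned to COGs": (3_834, 73.79, TOTAL_GENES),
    "Genes assigned Pfam domains": (4_014, 77.25, TOTAL_GENES),
    "Genes with signal peptides": (1_512, 29.10, TOTAL_GENES),
    "Genes with transmembrane helices": (1_206, 23.21, TOTAL_GENES),
}

# Unrounded percentage for every row.
derived = {name: 100.0 * value / denom for name, (value, _, denom) in rows.items()}

# Largest absolute gap between a derived and a reported percentage.
max_diff = max(abs(derived[name] - reported) for name, (_, reported, _) in rows.items())

for name, pct in derived.items():
    print(f"{name}: {pct:.2f}%")
print(f"largest gap to the reported values: {max_diff:.4f} percentage points")
```

Under these assumptions every derived value lies within 0.01 percentage points of the reported one, and the protein-coding and RNA gene counts add up to the total gene count.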
Figure 3. Graphical map of the chromosome. From outside to the center: Genes on forward strand (color by COG categories), Genes on reverse strand (color by COG categories), RNA genes (tRNAs green, rRNAs red, other RNAs black), GC content, GC skew (purple/olive).
Number of genes associated with the general COG functional categories
| Code | Value | % | Description |
|---|---|---|---|
| J | 184 | 4.3 | Translation, ribosomal structure and biogenesis |
| A | 1 | 0.0 | RNA processing and modification |
| K | 512 | 11.8 | Transcription |
| L | 182 | 4.2 | Replication, recombination and repair |
| B | 2 | 0.1 | Chromatin structure and dynamics |
| D | 34 | 0.8 | Cell cycle control, cell division, chromosome partitioning |
| Y | 0 | 0.0 | Nuclear structure |
| V | 79 | 1.8 | Defense mechanisms |
| T | 209 | 4.8 | Signal transduction mechanisms |
| M | 174 | 4.0 | Cell wall/membrane biogenesis |
| N | 5 | 0.1 | Cell motility |
| Z | 0 | 0.0 | Cytoskeleton |
| W | 0 | 0.0 | Extracellular structures |
| U | 35 | 0.8 | Intracellular trafficking and secretion, and vesicular transport |
| O | 130 | 3.0 | Post-translational modification, protein turnover, chaperones |
| C | 274 | 6.3 | Energy production and conversion |
| G | 327 | 7.6 | Carbohydrate transport and metabolism |
| E | 341 | 7.9 | Amino acid transport and metabolism |
| F | 96 | 2.2 | Nucleotide transport and metabolism |
| H | 202 | 4.7 | Coenzyme transport and metabolism |
| I | 213 | 4.9 | Lipid transport and metabolism |
| P | 210 | 4.9 | Inorganic ion transport and metabolism |
| Q | 196 | 4.5 | Secondary metabolites biosynthesis, transport and catabolism |
| R | 588 | 13.6 | General function prediction only |
| S | 330 | 7.6 | Function unknown |
| - | 1,362 | 26.2 | Not in COGs |
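
The table does not say which totals its percentages refer to. The Python sketch below tests one plausible reading, which is an inference from the numbers rather than a documented convention: category rows are shares of the summed category counts, while the "Not in COGs" row is a share of all 5,196 genes.

```python
# Counts and reported percentages copied from the COG table.
cog_counts = {
    "J": 184, "A": 1, "K": 512, "L": 182, "B": 2, "D": 34, "Y": 0, "V": 79,
    "T": 209, "M": 174, "N": 5, "Z": 0, "W": 0, "U": 35, "O": 130, "C": 274,
    "G": 327, "E": 341, "F": 96, "H": 202, "I": 213, "P": 210, "Q": 196,
    "R": 588, "S": 330,
}
reported = {
    "J": 4.3, "A": 0.0, "K": 11.8, "L": 4.2, "B": 0.1, "D": 0.8, "Y": 0.0,
    "V": 1.8, "T": 4.8, "M": 4.0, "N": 0.1, "Z": 0.0, "W": 0.0, "U": 0.8,
    "O": 3.0, "C": 6.3, "G": 7.6, "E": 7.9, "F": 2.2, "H": 4.7, "I": 4.9,
    "P": 4.9, "Q": 4.5, "R": 13.6, "S": 7.6,
}
TOTAL_GENES = 5_196      # from the genome statistics table
NOT_IN_COGS = 1_362      # last row of the COG table

# Assumed denominator for the category rows: the sum of all category counts.
assignment_total = sum(cog_counts.values())
shares = {code: 100.0 * n / assignment_total for code, n in cog_counts.items()}
max_diff = max(abs(shares[code] - reported[code]) for code in cog_counts)

# Assumed denominator for the last row: the total gene count.
not_in_cogs_share = 100.0 * NOT_IN_COGS / TOTAL_GENES

print(f"sum of category counts: {assignment_total}")
print(f"largest gap to the reported shares: {max_diff:.3f} percentage points")
print(f"not in COGs: {not_in_cogs_share:.1f}% of all genes")
```

With this reading the category shares agree with the table to within about 0.06 percentage points (category B is the loosest fit). The summed category counts exceed the 3,834 genes assigned to COGs, which is consistent with some genes being counted in more than one category.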
Pairwise comparison of with and using the GGDC (Genome-to-Genome Distance Calculator).
| | | |
|---|---|---|
| 71 | 85 | 61 |
| 28 | 79 | 22 |
| 55 | 82 | 45 |
Pearson's correlation coefficients according to the similarity on the level of Pfam, COG category and TIGRfam (in this order and separated by slashes).
| | | | | |
|---|---|---|---|---|
| 1.00 / 1.00 / 1.00 | - | - | - | |
| 0.97 / 0.96 / 0.93 | 1.00 / 1.00 / 1.00 | - | - | |
| 0.95 / 0.90 / 0.87 | 0.96 / 0.93 / 0.90 | 1.00 / 1.00 / 1.00 | - | |
| 0.93 / 0.90 / 0.86 | 0.93 / 0.90 / 0.87 | 0.94 / 0.90 / 0.83 | 1.00 / 1.00 / 1.00 | |
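
For readers unfamiliar with the statistic, the following Python sketch shows how a Pearson correlation coefficient between two per-category count profiles can be computed. It is not the authors' pipeline: the function name `pearson` is illustrative, the first profile uses four counts from the COG table above (J, K, L, V), and the second profile is synthetic (the first one doubled), chosen only to show that the coefficient ignores overall scale.

```python
import math


def pearson(x, y):
    """Pearson's correlation coefficient of two equally long numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Square roots of the summed squared deviations.
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (dev_x * dev_y)


# Counts for COG categories J, K, L, V taken from the COG table above.
profile_a = [184, 512, 182, 79]
# Synthetic second profile (assumption for illustration): profile_a doubled.
profile_b = [2 * n for n in profile_a]

r_self = pearson(profile_a, profile_a)      # a genome compared with itself
r_scaled = pearson(profile_a, profile_b)    # same shape, different scale
print(f"self: {r_self:.2f}, scaled copy: {r_scaled:.2f}")
```

A genome compared with itself yields a coefficient of 1, which is why the diagonal of the table reads 1.00 / 1.00 / 1.00.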
The comparison of the number of genes belonging to the different COG categories revealed only small differences in the genomes of and with 0.4% deviation between the same COG categories on average. A slightly higher fraction of genes belonging to the categories transcription (11.8%, 10.6%), carbohydrate metabolism (7.6%, 7.0%), secondary catabolism (4.5%, 4.1%), defense mechanisms (1.8%, 1.6%), inorganic ion transport and metabolism (4.9%, 4.7%) and lipid transport (4.9%, 4.8%) was identified in S. cyanea (first value in each pair, matching the COG table). The gene count in further COG categories such as cell cycle control, cell motility, cell biogenesis, lipid metabolism, secondary catabolism, post-translational modification and signal transduction was also slightly increased in but differed at most by 5 genes. In contrast, a slightly smaller fraction of genes belonging to the categories posttranslational modification (3.0%, 3.6%), coenzyme metabolism (4.7%, 5.2%), amino acid metabolism (7.9%, 8.4%), replication system (4.2%, 4.7%), translation (4.3%, 4.6%), signal transduction (4.8%, 5.1%), energy production/conversion (6.3%, 6.6%), nucleotide transport (2.2%, 2.4%) and cell wall biogenesis (4.0%, 4.2%) was identified in S. cyanea. The remaining COG categories intracellular transport, cell cycle control, cell motility and RNA modification differed by not more than a single gene.
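
The average deviation quoted above can be approximated from the percentage pairs given in the paragraph. The Python sketch below covers only the 15 categories for which both values are quoted, so it is a partial check rather than a reproduction of the authors' calculation. In each pair the first value matches the S. cyanea COG table; the name of the second genome is not recoverable from this text, so it is simply called the comparison genome here.

```python
# (first value, second value) in percent, as quoted in the paragraph.
# First value: S. cyanea (matches the COG table); second: comparison genome.
cog_pairs = {
    "transcription": (11.8, 10.6),
    "carbohydrate metabolism": (7.6, 7.0),
    "secondary catabolism": (4.5, 4.1),
    "defense mechanisms": (1.8, 1.6),
    "inorganic ion transport and metabolism": (4.9, 4.7),
    "lipid transport": (4.9, 4.8),
    "posttranslational modification": (3.0, 3.6),
    "coenzyme metabolism": (4.7, 5.2),
    "amino acid metabolism": (7.9, 8.4),
    "replication system": (4.2, 4.7),
    "translation": (4.3, 4.6),
    "signal transduction": (4.8, 5.1),
    "energy production/conversion": (6.3, 6.6),
    "nucleotide transport": (2.2, 2.4),
    "cell wall biogenesis": (4.0, 4.2),
}

# Absolute difference per category, in percentage points.
abs_diffs = {name: abs(a - b) for name, (a, b) in cog_pairs.items()}

# Mean absolute deviation over the quoted categories only.
mean_abs_dev = sum(abs_diffs.values()) / len(abs_diffs)

# Category with the largest gap between the two genomes.
largest = max(abs_diffs, key=abs_diffs.get)

print(f"mean absolute deviation: {mean_abs_dev:.2f} percentage points")
print(f"largest gap: {largest} ({abs_diffs[largest]:.1f})")
```

Over these 15 categories the mean absolute deviation is about 0.41 percentage points, in line with the 0.4% stated in the text, and transcription shows the largest gap.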
Figure 4. Synteny dot plot based on the genome sequences of vs. those of and . Blue dots represent regions of similarity found on parallel strands and red dots show regions of similarity found on anti-parallel strands.
Figure 5. Venn diagram depicting the intersections of protein sets (total numbers in parentheses) of , and . The diagram was created with [59].