Ulrike Kappler, Karen Davenport, Scott Beatson, Susan Lucas, Alla Lapidus, Alex Copeland, Kerrie W Berry, Tijana Glavina Del Rio, Nancy Hammon, Eileen Dalin, Hope Tice, Sam Pitluck, Paul Richardson, David Bruce, Lynne A Goodwin, Cliff Han, Roxanne Tapia, John C Detter, Yun-Juan Chang, Cynthia D Jeffries, Miriam Land, Loren Hauser, Nikos C Kyrpides, Markus Göker, Natalia Ivanova, Hans-Peter Klenk, Tanja Woyke.
Abstract
Starkeya novella (Starkey 1934) Kelly et al. 2000 is a member of the family Xanthobacteraceae in the order 'Rhizobiales', which is thus far poorly characterized at the genome level. Cultures of this species are of particular interest because of their facultatively chemolithoautotrophic lifestyle, which allows them both to consume carbon dioxide and to produce it. This feature makes S. novella an interesting model organism for studying the genomic basis of the regulatory networks required for the switch between consumption and production of carbon dioxide, a key component of the global carbon cycle. In addition, S. novella is of interest for its ability to grow on various inorganic sulfur compounds and on several C1-compounds such as methanol. After Azorhizobium caulinodans, S. novella is only the second species in the family Xanthobacteraceae with a completely sequenced genome of a type strain. The current taxonomic classification of this group is in significant conflict with the 16S rRNA data. The genomic data indicate that the physiological capabilities of the organism might have been underestimated. The 4,765,023 bp long chromosome with its 4,511 protein-coding and 52 RNA genes was sequenced as part of the DOE Joint Genome Institute Community Sequencing Program (CSP) 2008.
Keywords: CSP 2008; Gram-negative; Xanthobacteraceae; facultatively chemoautotrophic; methylotrophic and heterotrophic; non-motile; rod-shaped; soil bacterium; strictly aerobic
Year: 2012 PMID: 23450099 PMCID: PMC3570799 DOI: 10.4056/sigs.3006378
Source DB: PubMed Journal: Stand Genomic Sci ISSN: 1944-3277
Figure 1. Phylogenetic tree highlighting the position of S. novella relative to the type strains of the other species within the family Xanthobacteraceae (blue font color). The tree was inferred from 1,381 aligned characters [34,35] of the 16S rRNA gene sequence under the maximum likelihood (ML) criterion [36]. (green font color for those species that caused conflict according to the Parafit test, black color for the remaining ones; see below for the difference) were included in the dataset for use as outgroup taxa but then turned out to be intermixed with the target family; hence, the rooting shown was inferred by the midpoint-rooting method [29]. The branches are scaled in terms of the expected number of substitutions per site. Numbers adjacent to the branches are support values from 550 ML bootstrap replicates [37] (left) and from 1,000 maximum-parsimony bootstrap replicates [38] (right) if larger than 60%. Lineages with type strain genome sequencing projects registered in GOLD [39] are labeled with one asterisk, those also listed as 'Complete and Published' with two asterisks (see [40] and CP000781 for , CP002083 for and CP002292 for ).
Figure 2. Transmission electron micrograph of S. novella ATCC 8093T. Scale bar: 500 nm
Classification and general features of S. novella according to the MIGS recommendations [47] and the NamesforLife database [48].
| MIGS ID | Property | Term | Evidence code |
|---|---|---|---|
| | Current classification | Domain Bacteria | TAS |
| | | Phylum Proteobacteria | TAS |
| | | Class Alphaproteobacteria | TAS |
| | | Order 'Rhizobiales' | TAS |
| | | Family Xanthobacteraceae | TAS |
| | | Genus Starkeya | TAS |
| | | Species Starkeya novella | TAS |
| | | Type strain ATCC 8093 | TAS |
| | Gram stain | negative | TAS |
| | Cell shape | rod-shaped (some coccobacilli) | TAS |
| | Motility | non-motile | TAS |
| | Sporulation | not reported | |
| | Temperature range | mesophile, 10–37°C | TAS |
| | Optimum temperature | 25–30°C | TAS |
| | Salinity | not reported | |
| MIGS-22 | Oxygen requirement | strictly aerobic | TAS |
| | Carbon source | CO2, citrate, glutamic acid (among others) | TAS |
| | Energy metabolism | facultative chemolithoautotroph and methylotroph, heterotroph | TAS |
| MIGS-6 | Habitat | soil | TAS |
| MIGS-15 | Biotic relationship | free living | NAS |
| MIGS-14 | Pathogenicity | none | NAS |
| | Biosafety level | 1 | TAS |
| MIGS-23.1 | Isolation | soil | TAS |
| MIGS-4 | Geographic location | not reported (probably New Jersey) | |
| MIGS-5 | Sample collection time | 1934 or before | TAS |
| MIGS-4.1 | Latitude | not reported | |
| MIGS-4.2 | Longitude | not reported | |
| MIGS-4.3 | Depth | not reported | |
| MIGS-4.4 | Altitude | not reported | |
Evidence codes - TAS: Traceable Author Statement (i.e., a direct report exists in the literature); NAS: Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). Evidence codes are from the Gene Ontology project [56].
Genome sequencing project information
| MIGS ID | Property | Term |
|---|---|---|
| MIGS-31 | Finishing quality | Finished |
| MIGS-28 | Libraries used | Three genomic libraries: one 454 pyrosequence standard library, |
| MIGS-29 | Sequencing platforms | Illumina GAii, 454 GS FLX Titanium |
| MIGS-31.2 | Sequencing coverage | 44.3 × Illumina; 53.5 × pyrosequence |
| MIGS-30 | Assemblers | Newbler version 2.0.1-PreRelease-03-30-2009, Velvet, phrap version SPS - 4.24 |
| MIGS-32 | Gene calling method | Prodigal |
| | INSDC ID | CP002026 |
| | GenBank Date of Release | November 21, 2011 |
| | GOLD ID | Gc01353 |
| | NCBI project ID | 37659 |
| | Database: IMG-GEBA | 648028054 |
| MIGS-13 | Source material identifier | DSM 506 |
| | Project relevance | Carbon cycle, Environmental |
Genome Statistics
| Attribute | Value | % of total |
|---|---|---|
| Genome size (bp) | 4,765,023 | 100.00% |
| DNA coding region (bp) | 4,222,317 | 88.61% |
| DNA G+C content (bp) | 3,234,723 | 67.88% |
| Number of replicons | 1 | |
| Extrachromosomal elements | 0 | |
| Total genes | 4,563 | 100.00% |
| RNA genes | 52 | 1.14% |
| rRNA operons | 1 | |
| tRNA genes | 46 | 1.01% |
| Protein-coding genes | 4,511 | 98.86% |
| Pseudo genes | 80 | 1.75% |
| Genes with function prediction (proteins) | 3,413 | 74.80% |
| Genes in paralog clusters | 2,690 | 58.95% |
| Genes assigned to COGs | 3,582 | 78.50% |
| Genes assigned Pfam domains | 3,730 | 81.74% |
| Genes with signal peptides | 1,730 | 37.91% |
| Genes with transmembrane helices | 1,169 | 25.62% |
| CRISPR repeats | 0 | |
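
The percentages in the Genome Statistics table can be recomputed from the raw counts. The following is a minimal sketch, assuming that the base-pair rows are expressed relative to the genome size and the gene rows relative to the total gene count (an inference from the numbers, not stated in the table). The names `percent` and `check` are illustrative helpers, not part of any annotation pipeline.

```python
# Counts copied from the Genome Statistics table.
GENOME_SIZE_BP = 4_765_023
TOTAL_GENES = 4_563

# Rows measured in base pairs: (count, percentage reported in the table).
bp_rows = {
    "DNA coding region": (4_222_317, 88.61),
    "DNA G+C content": (3_234_723, 67.88),
}

# Rows measured in genes: (count, percentage reported in the table).
gene_rows = {
    "RNA genes": (52, 1.14),
    "tRNA genes": (46, 1.01),
    "Protein-coding genes": (4_511, 98.86),
    "Pseudo genes": (80, 1.75),
    "Genes with function prediction": (3_413, 74.80),
    "Genes in paralog clusters": (2_690, 58.95),
    "Genes assigned to COGs": (3_582, 78.50),
    "Genes assigned Pfam domains": (3_730, 81.74),
    "Genes with signal peptides": (1_730, 37.91),
    "Genes with transmembrane helices": (1_169, 25.62),
}


def percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


def check(rows: dict, denominator: int, tolerance: float = 0.005) -> dict:
    """Recompute each percentage and return the rows that deviate from the
    reported value by more than two-decimal rounding allows."""
    mismatches = {}
    for name, (count, reported) in rows.items():
        computed = percent(count, denominator)
        if abs(computed - reported) > tolerance + 1e-9:
            mismatches[name] = (computed, reported)
    return mismatches


bp_mismatches = check(bp_rows, GENOME_SIZE_BP)
gene_mismatches = check(gene_rows, TOTAL_GENES)

# Protein-coding genes plus RNA genes should account for all genes.
genes_add_up = 4_511 + 52 == TOTAL_GENES

print("bp rows inconsistent with the table:", bp_mismatches)
print("gene rows inconsistent with the table:", gene_mismatches)
print("protein-coding + RNA genes equals total genes:", genes_add_up)
```

An empty dictionary for both checks means every reported percentage agrees with its count to within rounding under the assumed denominators.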
Figure 3. Graphical map of the chromosome. From outside to the center: Genes on forward strand (color by COG categories), Genes on reverse strand (color by COG categories), RNA genes (tRNAs green, rRNAs red, other RNAs black), GC content (black), GC skew (purple/olive).
Number of genes associated with the general COG functional categories
| Code | Value | % | Description |
|---|---|---|---|
| J | 176 | 4.5 | Translation, ribosomal structure and biogenesis |
| A | 0 | 0.0 | RNA processing and modification |
| K | 303 | 7.7 | Transcription |
| L | 118 | 3.0 | Replication, recombination and repair |
| B | 2 | 0.1 | Chromatin structure and dynamics |
| D | 30 | 0.8 | Cell cycle control, cell division, chromosome partitioning |
| Y | 0 | 0.0 | Nuclear structure |
| V | 54 | 1.4 | Defense mechanisms |
| T | 181 | 4.6 | Signal transduction mechanisms |
| M | 210 | 5.3 | Cell wall/membrane biogenesis |
| N | 8 | 0.2 | Cell motility |
| Z | 0 | 0.0 | Cytoskeleton |
| W | 0 | 0.0 | Extracellular structures |
| U | 36 | 0.9 | Intracellular trafficking and secretion, and vesicular transport |
| O | 148 | 3.8 | Posttranslational modification, protein turnover, chaperones |
| C | 291 | 7.4 | Energy production and conversion |
| G | 270 | 6.9 | Carbohydrate transport and metabolism |
| E | 504 | 12.8 | Amino acid transport and metabolism |
| F | 77 | 2.0 | Nucleotide transport and metabolism |
| H | 156 | 4.0 | Coenzyme transport and metabolism |
| I | 143 | 3.6 | Lipid transport and metabolism |
| P | 229 | 5.8 | Inorganic ion transport and metabolism |
| Q | 105 | 2.7 | Secondary metabolites biosynthesis, transport and catabolism |
| R | 487 | 12.4 | General function prediction only |
| S | 405 | 10.3 | Function unknown |
| - | 981 | 21.5 | Not in COGs |
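
The percentage column of the COG table does not state its denominator. The sketch below tests one reading that fits the numbers: category rows are taken relative to the sum of all category assignments, and the "Not in COGs" row relative to the total gene count from the Genome Statistics table. Both denominators are assumptions inferred from the values, not something the source spells out, and the variable names are illustrative.

```python
# (COG code, gene count, percentage reported in the table)
cog_rows = [
    ("J", 176, 4.5), ("A", 0, 0.0), ("K", 303, 7.7), ("L", 118, 3.0),
    ("B", 2, 0.1), ("D", 30, 0.8), ("Y", 0, 0.0), ("V", 54, 1.4),
    ("T", 181, 4.6), ("M", 210, 5.3), ("N", 8, 0.2), ("Z", 0, 0.0),
    ("W", 0, 0.0), ("U", 36, 0.9), ("O", 148, 3.8), ("C", 291, 7.4),
    ("G", 270, 6.9), ("E", 504, 12.8), ("F", 77, 2.0), ("H", 156, 4.0),
    ("I", 143, 3.6), ("P", 229, 5.8), ("Q", 105, 2.7), ("R", 487, 12.4),
    ("S", 405, 10.3),
]

# Values from the Genome Statistics table and the last row of the COG table.
TOTAL_GENES = 4_563
GENES_IN_COGS = 3_582
NOT_IN_COGS = 981

# Assumed denominator for the category rows: the sum of all assignments.
total_assignments = sum(count for _, count, _ in cog_rows)

# Absolute difference between recomputed and reported percentage per category.
deviations = {
    code: abs(100.0 * count / total_assignments - reported)
    for code, count, reported in cog_rows
}
max_deviation = max(deviations.values())

# Assumed denominator for the "Not in COGs" row: the total gene count.
not_in_cogs_percent = 100.0 * NOT_IN_COGS / TOTAL_GENES

print("sum of category assignments:", total_assignments)
print("largest deviation from reported %:", round(max_deviation, 3))
print("not in COGs, % of total genes:", round(not_in_cogs_percent, 1))
print("total - assigned == not in COGs:", TOTAL_GENES - GENES_IN_COGS == NOT_IN_COGS)
```

Under this reading the sum of category assignments exceeds the number of genes assigned to COGs, which would be consistent with some genes being counted in more than one functional category.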
Growth substrates utilized by S. novella
| Substrate | Growth | Substrate | Growth |
|---|---|---|---|
| D-glucose | + | | |
| D-fructose | + | Proline | + |
| Sucrose | - | L-Leucine | - |
| D-Galactose | + | L-Isoleucine | - |
| L-arabinose | + | L-Tryptophan | - |
| D-gluconate | + | DL-Serine | + |
| D-arabitol | + | D-alanine | (+) |
| Adonitol | + | L-alanine | - |
| Xylitol | + | L-Glutamate | - |
| D-sorbitol | + | L-threonine | + |
| D-Mannitol | + | L-aspartate | - |
| Lactose | - | Hydroxy-L-proline | + |
| | | L-Alaninamide | + |
| D-Ribose | (+) | DL-Lactate | + |
| Glycerol | + | Malate | - |
| Pyruvate | + | Succinate | (+) |
| Formate | + | Fumarate | - |
| Formamide | + | Citrate | - |
| Formaldehyde | - | Methylpyruvate | + |
| Methylamine | - | Monomethylsuccinate | + |
| Trimethylamine | - | Alpha ketobutyrate | + |
| H2/CO2 | - | Alpha hydroxybutyrate | + |
| Ethylamine | - | Beta hydroxy butyrate | + |
| Oxalate | + | Gamma aminobutyrate | + |
| Acetate | + | Benzoate | - |
| Propionate | + | p-Hydroxybenzoate | - |
| Butyrate | - | m-Hydroxybenzoate | - |
| Methanol | + | p-Aminobenzoate | - |
| Ethyl alcohol | + | Cyclohexanol | - |
| n-Propanol | + | Cyclohexane | - |
Results are combined from work done for this paper and [4-6]. + = substrate utilized, - = substrate not utilized, (+) = weak growth supported or ambiguous results in growth tests, italics = different results obtained in growth studies by different authors.