Johannes Sikorski, Hazuki Teshima, Matt Nolan, Susan Lucas, Nancy Hammon, Shweta Deshpande, Jan-Fang Cheng, Sam Pitluck, Konstantinos Liolios, Ioanna Pagani, Natalia Ivanova, Marcel Huntemann, Konstantinos Mavromatis, Galina Ovchinikova, Amrita Pati, Roxanne Tapia, Cliff Han, Lynne Goodwin, Amy Chen, Krishna Palaniappan, Miriam Land, Loren Hauser, Olivier D Ngatchou-Djao, Manfred Rohde, Rüdiger Pukall, Stefan Spring, Birte Abt, Markus Göker, John C Detter, Tanja Woyke, James Bristow, Victor Markowitz, Philip Hugenholtz, Jonathan A Eisen, Nikos C Kyrpides, Hans-Peter Klenk, Alla Lapidus.
Abstract
Mahella australiensis Bonilla Salinas et al. 2004 is the type species of the genus Mahella, which belongs to the family Thermoanaerobacteraceae. The species is of interest because it differs from other known anaerobic spore-forming bacteria in its G+C content and in certain phenotypic traits, such as carbon source utilization and relationship to temperature. Moreover, it has been suggested that this species might be an indigenous member of petroleum and oil reservoirs. This is the first completed genome sequence of a member of the genus Mahella and the ninth completed type strain genome sequence from the family Thermoanaerobacteraceae. The 3,135,972 bp long genome with its 2,974 protein-coding and 59 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.
Keywords: GEBA; Gram-positive; Thermoanaerobacteraceae; chemoorganotrophic; moderately thermophilic; motile; spore-forming; strictly anaerobic
Year: 2011 PMID: 21886860 PMCID: PMC3156404 DOI: 10.4056/sigs.1864526
Source DB: PubMed Journal: Stand Genomic Sci ISSN: 1944-3277
Figure 1. Phylogenetic tree highlighting the position of M. australiensis strain 50-1 BON^T relative to the other type strains within the order Thermoanaerobacterales. The tree was inferred from 1,275 aligned characters [5,6] of the 16S rRNA gene sequence under the maximum likelihood criterion [7] and rooted in accordance with the current taxonomy. The branches are scaled in terms of the expected number of substitutions per site. Numbers to the right of bifurcations are support values from 950 bootstrap replicates [8] if larger than 60%. Lineages with type strain genome sequencing projects registered in GOLD [9] are labeled with one asterisk, those registered as 'Complete and Published' with two asterisks [10,11]. Apparently, even the best BLAST hits show a low degree of similarity to M. australiensis (see above), in agreement with the isolated position of the species in the latest version of the 16S rRNA phylogeny from the All-Species-Living-Tree Project [12]. The species selection for Figure 1 was based on the current taxonomic classification (Table 1).
Table 1. Classification and general features of M. australiensis 50-1 BON^T according to the MIGS recommendations [13] and the NamesforLife database [14].
| MIGS ID | Property | Term | Evidence code |
|---|---|---|---|
| | Current classification | Domain | TAS |
| | | Phylum | TAS |
| | | Class | TAS |
| | | Order | TAS |
| | | Family | TAS |
| | | Genus | TAS |
| | | Species | TAS |
| | | Type strain 50-1 BON | TAS |
| | Gram stain | positive | TAS |
| | Cell shape | rod-shaped | TAS |
| | Motility | motile by peritrichous flagella | TAS |
| | Sporulation | swollen sporangia, terminal spores | TAS |
| | Temperature range | 30°C–60°C | TAS |
| | Optimum temperature | 50°C | TAS |
| | Salinity | 0.1%–4% NaCl | TAS |
| MIGS-22 | Oxygen requirement | strictly anaerobic | TAS |
| | Carbon source | arabinose, cellobiose, fructose, galactose, | TAS |
| | Energy metabolism | chemoorganotroph | TAS |
| MIGS-6 | Habitat | oil fields | TAS |
| MIGS-15 | Biotic relationship | free-living | NAS |
| MIGS-14 | Pathogenicity | not reported | |
| | Biosafety level | 1 | TAS |
| | Isolation | oil well in Queensland | TAS |
| MIGS-4 | Geographic location | Riverslea Oil Field in the Bowen-Surat basin, Queensland, Australia | TAS |
| MIGS-5 | Sample collection time | 1997 | NAS |
| MIGS-4.1 | Latitude | roughly -27.32 | NAS |
| MIGS-4.2 | Longitude | roughly 148.72 | NAS |
| MIGS-4.3 | Depth | not reported | |
| MIGS-4.4 | Altitude | not reported | |
Evidence codes - IDA: Inferred from Direct Assay (first time in publication); TAS: Traceable Author Statement (i.e., a direct report exists in the literature); NAS: Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [23]. If the evidence code is IDA, the property was directly observed by one of the authors or an expert mentioned in the acknowledgements.
Figure 2. Scanning electron micrograph of M. australiensis 50-1 BON^T
Genome sequencing project information
| MIGS ID | Property | Term |
|---|---|---|
| MIGS-31 | Finishing quality | Finished |
| MIGS-28 | Libraries used | Three genomic libraries: one 454 pyrosequence standard library, |
| MIGS-29 | Sequencing platforms | Illumina GAii, 454 GS FLX Titanium |
| MIGS-31.2 | Sequencing coverage | 52.1 × Illumina; 35.9 × pyrosequence |
| MIGS-30 | Assemblers | Newbler version 2.3, Velvet, phrap |
| MIGS-32 | Gene calling method | Prodigal 1.4, GenePRIMP |
| | INSDC ID | CP002360 |
| | GenBank Date of Release | May 13, 2011 |
| | GOLD ID | GC01760 |
| | NCBI project ID | 42243 |
| | Database: IMG-GEBA | 2503508009 |
| MIGS-13 | Source material identifier | DSM 15567 |
| | Project relevance | Tree of Life, GEBA |
Genome Statistics
| Attribute | Value | % of total |
|---|---|---|
| Genome size (bp) | 3,135,972 | 100.00% |
| DNA coding region (bp) | 2,822,780 | 90.01% |
| DNA G+C content (bp) | 1,362,640 | 43.45% |
| Number of replicons | 1 | |
| Extrachromosomal elements | 0 | |
| Total genes | 3,033 | 100.00% |
| RNA genes | 59 | 1.95% |
| rRNA operons | 3 | |
| Protein-coding genes | 2,974 | 98.05% |
| Pseudo genes | 104 | 3.43% |
| Genes with function prediction | 2,135 | 70.39% |
| Genes in paralog clusters | 103 | 3.40% |
| Genes assigned to COGs | 2,154 | 71.02% |
| Genes assigned Pfam domains | 2,341 | 77.18% |
| Genes with signal peptides | 596 | 19.65% |
| Genes with transmembrane helices | 813 | 26.81% |
| CRISPR repeats | 2 | |
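The percentages in the genome statistics table can be rechecked from the raw counts. The following is a minimal sketch, assuming that base-pair rows are expressed relative to the genome size and gene rows relative to the total gene count; the names `percent`, `recompute_percentages`, `BP_ROWS` and `GENE_ROWS` are hypothetical and are not part of the original report.

```python
# Counts copied from the genome statistics table.
GENOME_SIZE_BP = 3_135_972
TOTAL_GENES = 3_033

# Rows assumed to be a share of the genome length (base pairs).
BP_ROWS = {
    "DNA coding region": 2_822_780,
    "DNA G+C content": 1_362_640,
}

# Rows assumed to be a share of the total gene count.
GENE_ROWS = {
    "RNA genes": 59,
    "Protein-coding genes": 2_974,
    "Pseudo genes": 104,
    "Genes with function prediction": 2_135,
    "Genes in paralog clusters": 103,
    "Genes assigned to COGs": 2_154,
    "Genes assigned Pfam domains": 2_341,
    "Genes with signal peptides": 596,
    "Genes with transmembrane helices": 813,
}


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100.0 * part / whole, 2)


def recompute_percentages() -> dict[str, float]:
    """Recompute every percentage column entry from its raw count."""
    result = {name: percent(n, GENOME_SIZE_BP) for name, n in BP_ROWS.items()}
    result.update({name: percent(n, TOTAL_GENES) for name, n in GENE_ROWS.items()})
    return result


if __name__ == "__main__":
    for name, value in recompute_percentages().items():
        print(f"{name}: {value:.2f}%")
```

Keeping the two denominators separate mirrors the two "100.00%" rows in the table (genome size and total genes), so each remaining row is compared against the total it most plausibly refers to.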
Figure 3. Graphical circular map of the chromosome. From outside to the center: genes on forward strand (color by COG categories), genes on reverse strand (color by COG categories), RNA genes (tRNAs green, rRNAs red, other RNAs black), GC content, GC skew.
Number of genes associated with the general COG functional categories
| Code | Value | % | Description |
|---|---|---|---|
| J | 135 | 5.7 | Translation, ribosomal structure and biogenesis |
| A | 0 | 0.0 | RNA processing and modification |
| K | 164 | 7.0 | Transcription |
| L | 138 | 5.9 | Replication, recombination and repair |
| B | 1 | 0.0 | Chromatin structure and dynamics |
| D | 34 | 1.4 | Cell cycle control, cell division, chromosome partitioning |
| Y | 0 | 0.0 | Nuclear structure |
| V | 59 | 2.5 | Defense mechanisms |
| T | 127 | 5.4 | Signal transduction mechanisms |
| M | 121 | 5.1 | Cell wall/membrane/envelope biogenesis |
| N | 57 | 2.4 | Cell motility |
| Z | 0 | 0.0 | Cytoskeleton |
| W | 0 | 0.0 | Extracellular structures |
| U | 51 | 2.8 | Intracellular trafficking, secretion, and vesicular transport |
| O | 62 | 2.6 | Posttranslational modification, protein turnover, chaperones |
| C | 130 | 5.5 | Energy production and conversion |
| G | 382 | 16.2 | Carbohydrate transport and metabolism |
| E | 160 | 6.8 | Amino acid transport and metabolism |
| F | 63 | 2.7 | Nucleotide transport and metabolism |
| H | 123 | 5.2 | Coenzyme transport and metabolism |
| I | 37 | 1.6 | Lipid transport and metabolism |
| P | 87 | 3.7 | Inorganic ion transport and metabolism |
| Q | 25 | 1.1 | Secondary metabolites biosynthesis, transport and catabolism |
| R | 244 | 10.4 | General function prediction only |
| S | 153 | 6.5 | Function unknown |
| - | 879 | 29.0 | Not in COGs |
Pairwise comparison of M. australiensis, T. thermosaccharolyticum and C. saccharolyticus using the GGDC-Calculator.
| | HSP length / | identities / | identities / |
|---|---|---|---|
| | 2.02 | 86.8 | 1.84 |
| | 1.16 | 86.9 | 1.01 |
| | 2.37 | 85.5 | 2.03 |
Figure 4. Venn diagram depicting the intersections of protein sets (total number of derived protein sequences in parentheses) of M. australiensis, T. thermosaccharolyticum and C. saccharolyticus.