Phillip J Brumm, Miriam L Land, David A Mead.
Abstract
Geobacillus sp. WCH70 was one of several thermophilic organisms isolated from hot composts in the Middleton, WI area. Comparison of 16S rRNA sequences showed that the strain may be a new species and is most closely related to G. galactosidasius and G. toebii. The genome was sequenced, assembled, and annotated by the DOE Joint Genome Institute and deposited at the NCBI in December 2009 (CP001638). The genome of Geobacillus sp. WCH70 consists of one circular chromosome of 3,893,306 bp with an average G + C content of 43 %, and two circular plasmids of 33,899 and 10,287 bp with an average G + C content of 40 %. Among sequenced organisms, Geobacillus sp. WCH70 shares the highest Average Nucleotide Identity (86 %) with G. thermoglucosidasius strains, as well as a similar genome organization. Geobacillus sp. WCH70 appears to be a highly adaptable organism, with an exceptionally high number of annotated transposons (125) in the genome. The organism also possesses four predicted restriction-modification systems not found in other Geobacillus species.
Keywords: Geobacillus sp. WCH70; Restriction-modification; Thermophile; Transposons; Wood compost
Year: 2016 PMID: 27123157 PMCID: PMC4847372 DOI: 10.1186/s40793-016-0153-y
Source DB: PubMed Journal: Stand Genomic Sci ISSN: 1944-3277
Classification and general features of Geobacillus strain WCH70 [33]
| MIGS ID | Property | Term | Evidence code (a) |
|---|---|---|---|
| | Classification | Domain Bacteria | TAS |
| | | Phylum Firmicutes | TAS |
| | | Class Bacilli | TAS |
| | | Order Bacillales | TAS |
| | | Family Bacillaceae | TAS |
| | | Genus Geobacillus | TAS |
| | | Species | |
| | | Strain: WCH70 | |
| | Gram stain | Positive | IDA |
| | Cell shape | Rods and chains of rods | IDA |
| | Motility | Motile | IDA |
| | Sporulation | Subterminal spores | IDA |
| | Temperature range | 55 °C to 80 °C | IDA |
| | Optimum temperature | 70 °C | IDA |
| | pH range; Optimum | 5.8-8.0; 7.5 | IDA |
| | Carbon source | Carbohydrate or protein | IDA |
| MIGS-6 | Habitat | Compost | IDA |
| MIGS-6.3 | Salinity | Not reported | IDA |
| MIGS-22 | Oxygen requirement | Facultative anaerobe | IDA |
| MIGS-15 | Biotic relationship | Free-living | IDA |
| MIGS-14 | Pathogenicity | Non-pathogen | IDA |
| MIGS-4 | Geographic location | Middleton, WI, USA | IDA |
| MIGS-5 | Sample collection | September 2003 | IDA |
| MIGS-4.1 | Latitude | 43.097090 | IDA |
| MIGS-4.2 | Longitude | -89.504730 | IDA |
| MIGS-4.4 | Altitude | 342 | TAS |
(a) Evidence codes - IDA: Inferred from Direct Assay; TAS: Traceable Author Statement (i.e., a direct report exists in the literature); NAS: Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [41].
Fig. 1 Micrograph of Geobacillus sp. Y412MC52 cells showing individual cells and clumps of cells. Cells were grown in TSB plus 0.4 % glucose for 18 h at 70 °C. A 1.0 ml aliquot was removed, centrifuged, re-suspended in 0.2 ml of sterile water, and stained using a 50 μM solution of SYTO® 9 fluorescent stain in sterile water (Molecular Probes). Dark field fluorescence microscopy was performed using a Nikon Eclipse TE2000-S epifluorescence microscope at 2000× magnification with a high-pressure Hg light source and a 500 nm emission filter.
Fig. 2 The evolutionary history was inferred by using the Maximum Likelihood method based on the Tamura-Nei model [18]. The bootstrap consensus tree inferred from 500 replicates [42] is taken to represent the evolutionary history of the taxa analyzed [42]. Branches corresponding to partitions reproduced in fewer than 50 % of bootstrap replicates are collapsed. The percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (500 replicates) is shown next to the branches [42]. Initial tree(s) for the heuristic search were obtained automatically by applying the Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using the Maximum Composite Likelihood (MCL) approach, and then selecting the topology with the superior log likelihood value. The analysis involved 26 nucleotide sequences. All positions containing gaps and missing data were eliminated. There were a total of 1271 positions in the final dataset. Evolutionary analyses were conducted in MEGA5 [17]. The type strains of all validly described species are included (NCBI accession numbers): G. caldoxylosilyticus ATCC700356T (AF067651), G. galactosidasius CF1BT (AM408559), G. jurassicus DS1T (FN428697), G. kaustophilus NCIMB8547T (X60618), G. lituanicus N-3T (AY044055), G. stearothermophilus R-35646T (FN428694), G. subterraneus 34T (AF276306), G. thermantarcticus DSM9572T (FR749957), G. thermocatenulatus BGSC93A1T (AY608935), G. thermodenitrificans R-35647T (FN538993), G. thermoglucosidasius BGSC95A1T (FN428685), G. thermoleovorans DSM5366T (Z26923), G. toebii BK-1T (FN428690), G. uzenensis UT (AF276304) and G. vulcani 3S-1T (AJ293805). The 16S rRNA sequence of Paenibacillus lautus JCM9073T (AB073188) was used to root the tree.
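The caption states that all positions containing gaps and missing data were eliminated before tree building. As a rough illustration of that filtering step only (not of the MEGA5 implementation), the following Python sketch drops every alignment column in which any sequence has a gap or missing-data symbol. The function name `complete_deletion`, the toy sequences, and the choice of `-`, `?`, and `N` as the symbols to exclude are assumptions made for this example.

```python
def complete_deletion(alignment):
    """Drop every alignment column that holds a gap or missing-data symbol.

    `alignment` maps a taxon name to its aligned sequence. The symbols
    treated as gap/missing ("-", "?", "N", "n") are an assumption of this
    sketch, not a statement about MEGA5's behaviour.
    """
    missing = set("-?Nn")
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    sequences = list(alignment.values())
    # Keep a column only if no sequence has a gap/missing symbol there.
    keep = [
        i for i in range(lengths.pop())
        if all(seq[i] not in missing for seq in sequences)
    ]
    return {
        name: "".join(seq[i] for i in keep)
        for name, seq in alignment.items()
    }


# Hypothetical toy alignment, not data from the study.
toy = {
    "taxon_a": "ACG-TACGT",
    "taxon_b": "ACGTTAC-T",
    "taxon_c": "ACGTNACGT",
}
filtered = complete_deletion(toy)
for name, seq in filtered.items():
    print(name, seq)
```

Applied to a real 16S rRNA alignment, the length of each filtered sequence would correspond to the number of positions retained in the final dataset.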
Project information
| MIGS ID | Property | Term |
|---|---|---|
| MIGS-31 | Finishing quality | Finished |
| MIGS-28 | Libraries used | 8 kb and 40 kb |
| MIGS-29 | Sequencing platforms | Sanger and 454 |
| MIGS-31.2 | Fold coverage | 13× |
| MIGS-30 | Assemblers | Phred/Phrap/Consed |
| MIGS-32 | Gene calling method | Prodigal, GenePRIMP |
| | Locus Tag | GWCH70 |
| | GenBank ID | NC_012793 |
| | GenBank Date of Release | December 1, 2009 |
| | GOLD ID | Gs0012167 |
| | BIOPROJECT | PRJNA20805 |
| MIGS-13 | Source Material Identifier | Genome |
| | Project relevance | Biotechnological |
Summary of genome: one chromosome and two plasmids
| Label | Size (Mb) | Topology | INSDC identifier | RefSeq ID |
|---|---|---|---|---|
| Chromosome | 3.46 | Circular | CP001638.1 | NC_012793 |
| Plasmid 1 | 0.034 | Circular | CP001639.1 | NC_012794 |
| Plasmid 2 | 0.010 | Circular | CP001640.1 | NC_012790 |
Fig. 3 Graphical circular map of the Geobacillus sp. WCH70 chromosome. From outside to the center: genes on forward strand (color by COG categories); genes on reverse strand (color by COG categories); RNA genes (tRNAs green, rRNAs red, other RNAs black); GC content; GC skew.
Genome statistics
| Attribute | Value | % of Total |
|---|---|---|
| Genome size (bp) | 3,508,804 | 100.0 |
| DNA coding (bp) | 3,033,424 | 86.4 |
| DNA G + C (bp) | 1,501,708 | 42.8 |
| DNA scaffolds | 3 | |
| Total genes | 3597 | 100.0 |
| Protein coding genes | 3477 | 96.7 |
| RNA genes | 120 | 3.3 |
| Pseudo genes | 309 | 8.6 |
| Genes in internal clusters | | |
| Genes with function prediction | 2373 | 66.0 |
| Genes assigned to COGs | 2201 | 61.2 |
| Genes with Pfam domains | 2946 | 81.9 |
| Genes with signal peptides | 125 | 3.5 |
| Genes with transmembrane helices | 805 | 22.4 |
| CRISPR repeats | 6 | |
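The "% of Total" column can be reproduced from the raw counts in the table. The short Python sketch below is only a worked check under the assumption that the G + C percentage is relative to genome size in bp and the gene-category percentages are relative to total genes; the helper name `percent` and the variable names are introduced here for illustration.

```python
# Counts copied from the genome statistics table.
GENOME_SIZE_BP = 3_508_804
GC_BP = 1_501_708
TOTAL_GENES = 3597

gene_counts = {
    "Protein coding genes": 3477,
    "RNA genes": 120,
    "Pseudo genes": 309,
    "Genes with function prediction": 2373,
    "Genes assigned to COGs": 2201,
    "Genes with Pfam domains": 2946,
    "Genes with signal peptides": 125,
    "Genes with transmembrane helices": 805,
}


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Assumption: G + C is expressed relative to total genome size.
gc_percent = percent(GC_BP, GENOME_SIZE_BP)

# Assumption: gene categories are expressed relative to total genes.
gene_percents = {
    name: percent(count, TOTAL_GENES) for name, count in gene_counts.items()
}

print(f"G + C: {gc_percent} %")
for name, value in gene_percents.items():
    print(f"{name}: {value} %")
```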
Number of genes associated with general COG functional categories
| Code | Value | % of total | Description |
|---|---|---|---|
| J | 195 | 8.0 | Translation, ribosomal structure and biogenesis |
| A | 0 | 0.0 | RNA processing and modification |
| K | 143 | 5.8 | Transcription |
| L | 94 | 3.8 | Replication, recombination and repair |
| B | 1 | 0.1 | Chromatin structure and dynamics |
| D | 102 | 4.2 | Cell cycle control, cell division, chromosome partitioning |
| V | 65 | 2.6 | Defense mechanisms |
| T | 104 | 4.2 | Signal transduction mechanisms |
| M | 102 | 4.2 | Cell wall/membrane biogenesis |
| N | 62 | 2.5 | Cell motility |
| U | 33 | 1.4 | Intracellular trafficking and secretion |
| O | 97 | 4.0 | Posttranslational modification, protein turnover, chaperones |
| C | 140 | 5.7 | Energy production and conversion |
| G | 128 | 5.2 | Carbohydrate transport and metabolism |
| E | 222 | 9.1 | Amino acid transport and metabolism |
| F | 71 | 2.9 | Nucleotide transport and metabolism |
| H | 158 | 6.5 | Coenzyme transport and metabolism |
| I | 99 | 4.0 | Lipid transport and metabolism |
| P | 131 | 5.3 | Inorganic ion transport and metabolism |
| Q | 45 | 1.8 | Secondary metabolites biosynthesis, transport and catabolism |
| R | 194 | 7.9 | General function prediction only |
| S | 157 | 6.4 | Function unknown |
| - | 1396 | 38.8 | Not in COGs |
The total is based on the total number of protein coding genes in the genome
Fig. 4 Synteny plot of Geobacillus sp. WCH70 versus G. thermoglucosidasius C56-YS93
Comparison of predicted transposons
| Function Name | COG id | WCH70 | CIC9 (a) | NBRC (b) | YU (c) | YS93 (d) | GT20 (e) | M10EXG (f) |
|---|---|---|---|---|---|---|---|---|
| Transposase, IS605 family | COG0675 | 62 | 3 | 2 | 0 | 1 | 0 | 0 |
| REP element-mobilizing transposase RayT | COG1943 | 8 | 0 | 1 | 0 | 0 | 0 | 0 |
| Transposase | COG3316 | 3 | 0 | 0 | 0 | 0 | 0 | 0 |
| Transposase, mutator type | COG3328 | 15 | 0 | 1 | 4 | 7 | 4 | 3 |
| Transposase, IS66 family | COG3436 | 10 | 0 | 0 | 0 | 0 | 0 | 0 |
| Transposase, IS204 family | COG3464 | 9 | 0 | 0 | 0 | 0 | 0 | 1 |
| Transposase, IS116 family | COG3547 | 11 | 0 | 1 | 0 | 5 | 0 | 1 |
| Transposase | COG5421 | 7 | 0 | 0 | 0 | 0 | 0 | 0 |
| Transposase | Not in WCH70 | 0 | 4 | 10 | 13 | 13 | 13 | 19 |
| Total | | 125 | 7 | 15 | 17 | 26 | 17 | 24 |
(a) Geobacillus caldoxylosilyticus CIC9; (b) Geobacillus caldoxylosilyticus NBRC 107762; (c) Geobacillus thermoglucosidans YU; (d) Geobacillus thermoglucosidasius C56-YS93; (e) Geobacillus thermoglucosidasius GT20; (f) Geobacillus thermoglucosidasius M10EXG; Geobacillus thermoglucosidasius NBRC 107763
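The "Total" row of the transposon comparison can be checked by summing each strain's column. The Python sketch below does this with the counts transcribed from the table; the variable names (`strains`, `rows`, `reported_totals`) are chosen for this example and the short strain labels follow the table header.

```python
# Column order follows the table header.
strains = ["WCH70", "CIC9", "NBRC", "YU", "YS93", "GT20", "M10EXG"]

# Transposase counts per COG, transcribed from the table rows.
rows = {
    "COG0675": [62, 3, 2, 0, 1, 0, 0],
    "COG1943": [8, 0, 1, 0, 0, 0, 0],
    "COG3316": [3, 0, 0, 0, 0, 0, 0],
    "COG3328": [15, 0, 1, 4, 7, 4, 3],
    "COG3436": [10, 0, 0, 0, 0, 0, 0],
    "COG3464": [9, 0, 0, 0, 0, 0, 1],
    "COG3547": [11, 0, 1, 0, 5, 0, 1],
    "COG5421": [7, 0, 0, 0, 0, 0, 0],
    "Not in WCH70": [0, 4, 10, 13, 13, 13, 19],
}

# Values from the "Total" row of the table.
reported_totals = [125, 7, 15, 17, 26, 17, 24]

# zip(*...) transposes the rows so that each tuple is one strain's column.
computed_totals = [sum(column) for column in zip(*rows.values())]

for strain, computed, reported in zip(strains, computed_totals, reported_totals):
    status = "ok" if computed == reported else "MISMATCH"
    print(f"{strain}: computed {computed}, reported {reported} ({status})")
```

A check like this is a cheap way to catch transcription errors when a table has been re-extracted from a PDF or web page.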