
Complete genome sequences of Geobacillus sp. Y412MC52, a xylan-degrading strain isolated from obsidian hot spring in Yellowstone National Park.

Phillip Brumm, Miriam L Land, Loren J Hauser, Cynthia D Jeffries, Yun-Juan Chang, David A Mead.

Abstract

Geobacillus sp. Y412MC52 was isolated from Obsidian Hot Spring, Yellowstone National Park, Montana, USA under permit from the National Park Service. The genome was sequenced, assembled, and annotated by the DOE Joint Genome Institute and deposited at the NCBI in December 2011 (CP002835). Based on 16S rRNA genes and average nucleotide identity, Geobacillus sp. Y412MC52 and the related Geobacillus sp. Y412MC61 appear to be members of a new species of Geobacillus. The genome of Geobacillus sp. Y412MC52 consists of one circular chromosome of 3,628,883 bp with an average G + C content of 52 % and one circular plasmid of 45,057 bp with an average G + C content of 45 %. Y412MC52 possesses arabinan, arabinoglucuronoxylan, and aromatic acid degradation clusters for degradation of hemicellulose from biomass. Transport and utilization clusters are also present for other carbohydrates including starch, cellobiose, and α- and β-galactooligosaccharides.

Keywords:  Arabinan; Biomass; G. thermocatenulatus; Geobacillus sp. Y412MC52; Obsidian hot spring; Xylan

Year:  2015        PMID: 26500717      PMCID: PMC4617443          DOI: 10.1186/s40793-015-0075-0

Source DB:  PubMed          Journal:  Stand Genomic Sci        ISSN: 1944-3277


Introduction

Identification of new organisms that produce biomass-degrading enzymes is of considerable interest. Commercial uses for these enzymes include paper manufacturing, brewing, biomass deconstruction and the production of animal feeds [1-3]. Hot springs, especially those at Yellowstone National Park, have been a source of many new organisms [4-7] that possess enzymes with significant potential in biotechnological applications [8]. As part of a project in conjunction with the Great Lakes Bioenergy Research Center, Dept. of Energy, C5–6 Technologies and Lucigen Corp. isolated, characterized, and sequenced a number of new enzyme-producing aerobic organisms from Yellowstone hot springs. Geobacillus species were the most common aerobic organisms isolated during the cultivation of most hot spring samples.

Geobacillus species were originally classified as members of the genus Bacillus, but were subsequently reclassified as a separate genus based on 16S rRNA gene sequence analysis, lipid and fatty acid analysis, phenotypic characterization, and DNA-DNA hybridization experiments [9]. Geobacillus species have been isolated from a number of extreme environments including high-temperature oilfields [10], a corroded pipeline in an extremely deep well [11], African [12] and Russian [13] hot springs, marine vents [14], and the Mariana Trench [15], yet they can also be found in garden soils [16] and hay composts [17]. The ability of Geobacillus species to thrive in these varied and often hostile environments suggests that these species possess enzymes suitable for applications in challenging industrial environments. We therefore sequenced a number of these isolates, including strains Y412MC52, Y412MC61, C56-T3, and Y4.1MC1 [18], to identify new enzymes suitable for use in biomass conversion into fuels and chemicals.

Organism information

Classification and features

Geobacillus sp. Y412MC52 and Geobacillus sp. Y412MC61 are two thermophilic organisms isolated from Obsidian Hot Spring, Yellowstone National Park, Montana, USA (44.6100594° latitude and −110.4388217° longitude) under a sampling permit from the National Park Service. The hot spring possesses a pH of 6.37 and a temperature range of 42–90 °C. The organisms were isolated from a sample of hot spring water by enrichment and plating on YTP-2 medium [19] at 70 °C. The cultures are available from the Bacillus Genetic Stock Center as BGSCID: 96A11 (MC52) and BGSCID: 96A12 (MC61). Both cultures are routinely grown in YTP-2 medium and maintained on YTP-2 agar plates.

MC52 is a Gram-positive, rod-shaped facultative anaerobe (Table 1 and Additional file 1: Table S1), with an optimum growth temperature of 65 °C and a maximum growth temperature of 75 °C. MC52 appears to grow as a mixture of single cells and occasional large clumps of cells in liquid culture (Fig. 1). Growth is not observed on minimal medium supplemented with glucose, xylose or other sugars. Excellent growth is seen in Luria Broth, Terrific Broth, Tryptic Soy Broth and other common lab media with and without additional carbohydrate, indicating potential growth requirements for both vitamins and amino acids. Growth in YTP-2 medium is stimulated by addition of monosaccharides, disaccharides, soluble starch, xylan, arabinan, and arabinogalactan. Growth in YTP-2 medium is not stimulated by addition of cellulose, mannan, glucomannan, galactomannan, chitin, or pectin.

MC52 produces extracellular xylanase when grown in YTP-2 medium supplemented with pyruvate, xylose, xylooligosaccharides and arabinogalactan. No secreted xylanase is detected when MC52 is grown in YTP-2 medium supplemented with glucose or arabinose. Extracellular arabinase is detected only in cultures grown in YTP-2 medium supplemented with arabinogalactan. Extracellular amylase is detected in cultures grown in YTP-2 medium supplemented with soluble starch or pullulan. Blue (positive) colonies of MC52 are observed on plates containing either 5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside or 5-bromo-4-chloro-3-indolyl-α-D-galactopyranoside, indicating production of α-galactosidase and β-galactosidase. Fluorescent colonies are observed on plates containing 4-methylumbelliferyl-β-D-cellobioside, 4-methylumbelliferyl-β-D-xylopyranoside, and 4-methylumbelliferyl-β-D-glucopyranoside, indicating production of β-glucosidase and β-xylosidase.

Table 1

Classification and general features of Geobacillus sp. Y412MC52 [46]

| MIGS ID | Property | Term | Evidence code (a) |
|---|---|---|---|
| | Classification | Domain Bacteria | TAS [47] |
| | | Phylum Firmicutes | TAS [48, 49] |
| | | Class Bacilli | TAS [48, 49] |
| | | Order Bacillales | TAS [48, 49] |
| | | Family Bacillaceae | TAS [48, 49] |
| | | Genus Geobacillus | TAS [9, 49] |
| | | Species | IDA |
| | | Strain Y412MC52 | IDA |
| | Gram stain | Positive | IDA |
| | Cell shape | Rods | IDA |
| | Motility | Motile | IDA |
| | Sporulation | Spore former | NAS |
| | Temperature range | 55 to 75 °C | IDA |
| | Optimum temperature | 65 °C | IDA |
| | pH range; Optimum | 5.5–8.0; 7.5 | IDA |
| | Carbon source | Monosaccharides, xylan, arabinan | IDA |
| MIGS-6 | Habitat | Hot spring | IDA |
| MIGS-6.3 | Salinity | Not reported | IDA |
| MIGS-22 | Oxygen requirement | Facultative anaerobe | IDA |
| MIGS-15 | Biotic relationship | Free-living | IDA |
| MIGS-14 | Pathogenicity | Non-pathogen | NAS |
| MIGS-4 | Geographic location | Obsidian spring, Yellowstone National Park | IDA |
| MIGS-5 | Sample collection | September 2003 | IDA |
| MIGS-4.1 | Latitude | 44.6603028 | IDA |
| MIGS-4.2 | Longitude | −110.865194 | IDA |
| MIGS-4.4 | Altitude | 2416 m | IDA |

(a) Evidence codes - IDA: Inferred from Direct Assay; TAS: Traceable Author Statement (i.e., a direct report exists in the literature); NAS: Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [50]

Fig. 1

Micrograph of Geobacillus sp. Y412MC52 cells showing individual cells and clumps of cells. Cells were grown in TSB plus 0.4 % glucose for 18 h at 70 °C. A 1.0 ml aliquot was removed, centrifuged, re-suspended in 0.2 ml of sterile water, and stained using a 50 μM solution of SYTO® 9 fluorescent stain in sterile water (Molecular Probes). Dark field fluorescence microscopy was performed using a Nikon Eclipse TE2000-S epifluorescence microscope at 2000× magnification using a high-pressure Hg light source and a 500 nm emission filter.

A phylogenetic tree was constructed to identify the relationship of Geobacillus sp. Y412MC52 and Geobacillus sp. Y412MC61 to other members of the family. MC52 and MC61 both contain eight annotated 16S rRNA genes. The 16S rRNA genes located at MC52 genome coordinates 11,820 through 13,365 and MC61 genome coordinates 10,516 through 12,061 were used for tree construction. Trees constructed with the remaining seven MC52 16S rRNA genes were identical to the tree shown here. The phylogeny was determined using the described 16S rRNA gene sequences, 16S rRNA gene sequences of the type strains of all validly described Geobacillus species and full-length 16S rRNA gene sequences of Geobacillus species present in GenBank. The 16S rRNA gene sequences were aligned using MUSCLE [20], pairwise distances were estimated using the Maximum Composite Likelihood approach, and initial trees for heuristic search were obtained automatically by applying the Neighbour-Joining method in MEGA 5 [21]. The alignment and heuristic trees were then used to infer the phylogeny using the Maximum Likelihood method based on Tamura-Nei [22]. The phylogenetic tree (Fig. 2) indicates that MC52, MC61 and Geobacillus sp. C56-T3 cluster separately from other validly named species.

Fig. 2

The evolutionary history was inferred by using the Maximum Likelihood method based on the Tamura-Nei model [22]. The bootstrap consensus tree inferred from 500 replicates [45] is taken to represent the evolutionary history of the taxa analyzed [45]. Branches corresponding to partitions reproduced in less than 50 % bootstrap replicates are collapsed. The percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (500 replicates) is shown next to the branches [45]. Initial tree(s) for the heuristic search were obtained automatically by applying Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using the Maximum Composite Likelihood (MCL) approach, and then selecting the topology with superior log likelihood value. The analysis involved 26 nucleotide sequences. All positions containing gaps and missing data were eliminated. There were a total of 1271 positions in the final dataset. Evolutionary analyses were conducted in MEGA5 [21]. The type strains of all validly described species are included (NCBI accession numbers): G. caldoxylosilyticus ATCC700356T (AF067651), G. galactosidasius CF1BT (AM408559), G. jurassicus DS1T (FN428697), G. kaustophilus NCIMB8547T (X60618), G. lituanicus N-3T (AY044055), G. stearothermophilus R-35646T (FN428694), G. subterraneus 34T (AF276306), G. thermantarcticus DSM9572T (FR749957), G. thermocatenulatus BGSC93A1T (AY608935), G. thermodenitrificans R-35647T (FN538993), G. thermoglucosidasius BGSC95A1T (FN428685), G. thermoleovorans DSM5366T (Z26923), G. toebii BK-1T (FN428690), G. uzenensis UT (AF276304) and G. vulcani 3S-1T (AJ293805). Additional 16S rRNA sequences are from G. thermoleovorans strain NP54 (JN871595), G. thermoleovorans strain NP33 (JQ343209), G. thermoleovorans strain LEH-1 (NR_036985), G. thermocatenulatus strain DSM 730 (NR_119305), G. vulcani 3S-1 (NR_025426), Geobacillus sp. strain C56-T3 (NC_014206), Geobacillus sp. strain GHH01 (NC_020210), Geobacillus sp. strain C56-YS93 (CP002835), and Geobacillus sp. strain G11MC16 (CP002835)
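
The caption states that all positions containing gaps and missing data were eliminated before tree inference. The following is a minimal sketch of that kind of complete-deletion filter, not the MEGA5 implementation. The function name, the toy sequences, and the choice to treat every character other than A, C, G, T or U as a gap or missing base are assumptions made for illustration.

```python
def complete_deletion(alignment):
    """Remove every alignment column in which any sequence has a gap or missing base.

    `alignment` maps a sequence name to an aligned sequence string.
    Assumption: only A, C, G, T and U count as resolved bases.
    """
    names = list(alignment)
    seqs = [alignment[name].upper() for name in names]
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("aligned sequences must all have the same length")

    # Keep a column only if every sequence has a resolved base there.
    keep = [i for i in range(len(seqs[0])) if all(s[i] in "ACGTU" for s in seqs)]
    return {name: "".join(s[i] for i in keep) for name, s in zip(names, seqs)}


# Hypothetical toy alignment (not real 16S rRNA data).
toy = {
    "seq_a": "ACG-TTGA",
    "seq_b": "ACGATNGA",
    "seq_c": "A-GATTGA",
}
filtered = complete_deletion(toy)
for name, seq in filtered.items():
    print(name, seq, len(seq))
```

Applied to the real 26-sequence alignment, a filter of this kind is what reduces the dataset to the retained positions (1271 in the analysis described above).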


Genome sequencing and annotation

Genome project history

Y412MC52 was selected for sequencing on the basis of its biotechnological potential as part of the U.S. Department of Energy Genomic Science program (formerly Genomics:GTL). The genome sequence is deposited in the Genomes On Line Database [23, 24] (GOLD ID = Gc01757), and in GenBank (NCBI Reference Sequence = CP002442.1). Sequencing, finishing and annotation were performed by the DOE Joint Genome Institute. A summary of the project information and its association with MIGS identifiers is shown in Table 2.
Table 2

Project information

| MIGS ID | Property | Term |
|---|---|---|
| MIGS-31 | Finishing quality | Finished |
| MIGS-28 | Libraries used | 6 kb and 24 kb |
| MIGS-29 | Sequencing platforms | 454 Titanium, Illumina GAii |
| MIGS-31.2 | Fold coverage | 5.8 |
| MIGS-30 | Assemblers | Phred/Phrap/Consed |
| MIGS-32 | Gene calling method | Prodigal, GenePRIMP |
| | Locus tag | GYMC52 |
| | GenBank ID | CP002835.1 |
| | GenBank date of release | July 1, 2011 |
| | GOLD ID | Gc01757 |
| | BIOPROJECT | PRJNA30797 |
| MIGS-13 | Source material identifier | BGSCID: 96A11 |
| | Project relevance | Biotechnological |

Growth conditions and genomic DNA preparation

For preparation of genomic DNA, cultures of Y412MC52 were grown from a single colony in 1000 ml of YTP-2 medium in a 2000 ml Erlenmeyer flask at 70 °C, 200 rpm for 18 h. Cells were collected by centrifugation at 4 °C and stored frozen until used for DNA preparation. The cell concentrate was lysed using a combination of SDS and proteinase K, and genomic DNA was isolated using a phenol/chloroform extraction method [25]. The genomic DNA was precipitated and treated with RNase to remove residual contaminating RNA.

Genome sequencing and assembly

The genome of Geobacillus sp. Y412MC52 was sequenced at the Joint Genome Institute (JGI) using a combination of Sanger, Illumina and 454 technologies [26]. An Illumina GAii shotgun library with reads of 664 Mb, a 454 Titanium draft library with an average read length of 250 bp, and two Sanger libraries with average insert sizes of 3 and 8 Kb were generated for this genome. Illumina sequencing data were assembled with VELVET [27], and the consensus sequences were shredded into 1.5 Kb overlapped fake reads and assembled together with the 454 data. Draft assemblies were based on 95.5 Mb of 454 draft data. Newbler parameters were -consed -a 50 -l 350 -g -m -ml 20. The initial Newbler assembly contained 40 contigs in 18 scaffolds. We converted the initial 454 assembly into a phrap assembly by making fake reads from the consensus and collecting the read pairs in the 454 paired end library. The Phred/Phrap/Consed software package was used for sequence assembly and quality assessment [28-30] in the following finishing process. Illumina data were used to correct potential base errors and increase consensus quality using the Polisher software developed at JGI (Alla Lapidus, unpublished). After the shotgun stage, reads were assembled with parallel phrap (High Performance Software, LLC). Possible mis-assemblies were corrected with gapResolution (Cliff Han, unpublished), Dupfinisher, or sequencing cloned bridging PCR fragments with subcloning. Gaps between contigs were closed by editing in Consed, by PCR and by Bubble PCR primer walks. A total of 1069 additional reactions and 9 shatter libraries were necessary to close gaps and to raise the quality of the finished sequence. The overall average error rate achieved was 0.01 errors/10 Kb.
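
The shredding step, in which assembled consensus sequences are cut into 1.5 Kb overlapping fake reads, can be illustrated with a short sketch. This is not the JGI tool. The function name `shred`, the 500 bp overlap, and the toy contig are assumptions, since the text gives only the 1.5 Kb read length.

```python
def shred(consensus, read_len=1500, overlap=500):
    """Cut a consensus sequence into overlapping fixed-length fake reads.

    read_len=1500 follows the 1.5 Kb figure in the text; overlap=500 is an
    assumed value chosen only for illustration.
    """
    if not 0 <= overlap < read_len:
        raise ValueError("overlap must be non-negative and smaller than read_len")
    if not consensus:
        return []

    step = read_len - overlap
    reads = []
    start = 0
    while True:
        reads.append(consensus[start:start + read_len])
        # Stop once the current read reaches the end of the consensus.
        if start + read_len >= len(consensus):
            break
        start += step
    return reads


# Hypothetical 4000 bp contig used only to exercise the function.
contig = "ACGT" * 1000
reads = shred(contig)
print(len(reads), [len(r) for r in reads])
```

The last read is allowed to be shorter than `read_len` so that every base of the consensus is covered at least once; a production tool might instead anchor the final read to the contig end.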

Genome annotation

Genes were identified using Prodigal [31] as part of the Oak Ridge National Laboratory genome annotation pipeline, followed by a round of manual curation using the JGI GenePRIMP pipeline [32]. The predicted CDSs were translated and used to search the National Center for Biotechnology Information (NCBI) nonredundant database, UniProt, TIGRFam, Pfam, PRIAM, KEGG, COG, and InterPro databases. These data sources were combined to assert a product description for each predicted protein. Non-coding genes and miscellaneous features were predicted using tRNAscan-SE [32], RNAmmer [33], Rfam [34], TMHMM [35], and SignalP [35].

Genome properties

The genome of Geobacillus sp. Y412MC52 consists of one circular chromosome of 3,628,883 bp with an average G + C content of 52 % (Table 3 and Fig. 3) and one circular plasmid of 45,057 bp with an average G + C content of 45 % (Table 4). There are 88 tRNA genes, 25 rRNA genes and 3 “other” identified RNA genes. There are 3634 predicted protein-coding regions and 175 pseudogenes in the genome. A total of 2569 genes (68.51 %) have been assigned a predicted function, while the rest have been designated as hypothetical proteins (Table 4). The numbers of genes assigned to each COG functional category are listed in Table 5. About 35 % of the annotated genes were not assigned to a COG or have an unknown function.
Table 3

Summary of genome: 1 chromosome and 1 plasmid

| Label | Size (Mb) | Topology | INSDC identifier | RefSeq ID |
|---|---|---|---|---|
| Chromosome | 3.62 | Circular | CP002442 | NC_014915 |
| Plasmid 1 | 0.045 | Circular | CP002443 | NC_014916 |

Fig. 3

Graphical circular map of the Y412MC52 chromosome. From outside to the center: genes on forward strand (color by COG categories), genes on reverse strand (color by COG categories), RNA genes (tRNAs green, rRNAs red, other RNAs black), GC content, GC skew.

Table 4

Genome statistics

| Attribute | Value |
|---|---|
| Genome size (bp) | 3,673,940 |
| DNA coding (bp) | 3,199,671 |
| DNA G + C (bp) | 1,922,887 |
| DNA scaffolds | 2 |
| Total genes | 3750 |
| Protein-coding genes | 3634 |
| RNA genes | 116 |
| Pseudo genes | 175 |
| Genes in internal clusters | 1984 |
| Genes with function prediction | 2569 |
| Genes assigned to COGs | 2414 |
| Genes with Pfam domains | 3048 |
| Genes with signal peptides | 174 |
| Genes with transmembrane helices | 873 |
| CRISPR repeats | 6 |

Table 5

Number of genes associated with general COG functional categories

| Code | Value | Percent | Description |
|---|---|---|---|
| J | 149 | 5.59 | Translation, ribosomal structure and biogenesis |
| A | 0 | 0 | RNA processing and modification |
| K | 180 | 6.76 | Transcription |
| L | 156 | 5.86 | Replication, recombination and repair |
| B | 1 | 0.04 | Chromatin structure and dynamics |
| D | 31 | 1.16 | Cell cycle control, cell division, chromosome partitioning |
| V | 36 | 1.35 | Defense mechanisms |
| T | 124 | 4.65 | Signal transduction mechanisms |
| M | 104 | 3.90 | Cell wall/membrane/envelope biogenesis |
| N | 58 | 2.18 | Cell motility |
| U | 46 | 1.73 | Intracellular trafficking, secretion, and vesicular transport |
| O | 81 | 3.04 | Posttranslational modification, protein turnover, chaperones |
| C | 157 | 5.89 | Energy production and conversion |
| G | 193 | 7.24 | Carbohydrate transport and metabolism |
| E | 258 | 9.68 | Amino acid transport and metabolism |
| F | 71 | 2.07 | Nucleotide transport and metabolism |
| H | 126 | 4.73 | Coenzyme transport and metabolism |
| I | 118 | 4.43 | Lipid transport and metabolism |
| P | 121 | 4.54 | Inorganic ion transport and metabolism |
| Q | 70 | 2.63 | Secondary metabolites biosynthesis, transport and catabolism |
| R | 304 | 11.41 | General function prediction only |
| S | 280 | 10.51 | Function unknown |
| - | 1336 | 35.63 | Not in COGs |

The total is based on the total number of protein coding genes in the annotated genome
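
The summary figures quoted in the genome properties text can be re-derived from the values in Table 4. The short check below is a sketch added for illustration: the variable names and the `percent` helper are assumptions, and the numbers are copied from the text and table above.

```python
# Values copied from the genome properties text and Table 4.
chromosome_bp = 3_628_883
plasmid_bp = 45_057
genome_bp = 3_673_940
coding_bp = 3_199_671
gc_bp = 1_922_887
total_genes = 3750
protein_coding_genes = 3634
rna_genes = 116
trna, rrna, other_rna = 88, 25, 3
genes_with_function = 2569


def percent(part, whole):
    """Return part/whole as a percentage (unrounded)."""
    return 100.0 * part / whole


# The two replicons add up to the reported genome size.
replicon_sum = chromosome_bp + plasmid_bp

# Whole-genome G + C content, coding density and functional assignment rate.
gc_percent = percent(gc_bp, genome_bp)
coding_percent = percent(coding_bp, genome_bp)
function_percent = percent(genes_with_function, total_genes)

print(f"chromosome + plasmid = {replicon_sum} bp")
print(f"G + C content        = {gc_percent:.2f} %")
print(f"coding density       = {coding_percent:.2f} %")
print(f"function prediction  = {function_percent:.2f} %")
```

The functional assignment rate reproduces the 68.51 % quoted in the text when the denominator is the total gene count (3750) rather than the protein-coding gene count.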


Insights from the genome sequence

Average Nucleotide Identity (ANI) calculations [36] were used to compare the genomes of MC52 and other sequenced Geobacillus species. The comparison of the MC52 genome to the other genomes (Table 6) confirms the phylogenetic tree obtained using 16S rRNA genes. MC52 is most closely related to MC61 (100 % identity) followed by Geobacillus sp. C56-T3 (98.3 %). These values are above the species cutoff value of 98.2 % to 99.0 % [37], indicating that these are most likely strains of the same species. The ANI values for all other sequenced strains are below 98 %, suggesting that MC52, MC61, and C56-T3 represent members of a new species. Comparison of genes shows that MC52 and MC61 share 3329 genes (Fig. 4). MC52 has 52 unique genes and MC61 has 48. These unique genes code mostly for hypothetical proteins and are randomly distributed throughout both genomes. Alignment of the MC52 and MC61 genomes using progressiveMauve [38] shows one predominant, four medium, and two small Locally Collinear Blocks of conserved genes (Fig. 5). In Y412MC61, two of the medium blocks precede the predominant block, while these blocks follow the predominant block in Y412MC52. In addition to having alternate locations within these genomes, these two blocks reverse their orientation between the two genomes. Taken together, these results indicate that MC52 and MC61 are not two different isolates of the same strain, but are two closely related strains of the same species with a unique relationship to each other.
Table 6

Average Nucleotide Identity with MC52

| Strain | ANI (%) |
|---|---|
| Geobacillus sp. Y412MC61 | 100 |
| Geobacillus sp. C56-T3 | 98.3 |
| Geobacillus sp. CAMR12739 | 97.6 |
| Geobacillus sp. MAS1 | 96.9 |
| G. kaustophilus HTA426 | 96.7 |
| Geobacillus sp. A8 | 96.7 |
| G. thermoleovorans CCB_US3_UF5 | 96.7 |
| G. thermoleovorans B23 | 96.7 |
| Geobacillus sp. FW23 | 96.7 |
| G. kaustophilus GBlys | 96.6 |
| Geobacillus sp. GHH01 | 96.5 |
| G. kaustophilus NBRC 102445 | 96.4 |
| Geobacillus sp. WSUCF1 | 96.2 |
| Geobacillus sp. CAMR5420 | 96.1 |
| G. thermocatenulatus GS-1 | 94.7 |
| G. vulcani PSS1 | 91.3 |
| G. stearothermophilus 22 | 89.6 |

Values obtained from IMG database [51]
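
The species assignment described in the text amounts to comparing each ANI value in Table 6 against a cutoff. The sketch below applies the lower bound of the cutoff range quoted in the text (98.2 %); the function and variable names are assumptions, and the ANI values are copied from Table 6.

```python
# ANI (%) of each genome against MC52, copied from Table 6.
ani_to_mc52 = {
    "Geobacillus sp. Y412MC61": 100.0,
    "Geobacillus sp. C56-T3": 98.3,
    "Geobacillus sp. CAMR12739": 97.6,
    "Geobacillus sp. MAS1": 96.9,
    "G. kaustophilus HTA426": 96.7,
    "Geobacillus sp. A8": 96.7,
    "G. thermoleovorans CCB_US3_UF5": 96.7,
    "G. thermoleovorans B23": 96.7,
    "Geobacillus sp. FW23": 96.7,
    "G. kaustophilus GBlys": 96.6,
    "Geobacillus sp. GHH01": 96.5,
    "G. kaustophilus NBRC 102445": 96.4,
    "Geobacillus sp. WSUCF1": 96.2,
    "Geobacillus sp. CAMR5420": 96.1,
    "G. thermocatenulatus GS-1": 94.7,
    "G. vulcani PSS1": 91.3,
    "G. stearothermophilus 22": 89.6,
}

# Lower bound of the species cutoff range quoted in the text.
SPECIES_CUTOFF = 98.2


def same_species_candidates(ani_table, cutoff=SPECIES_CUTOFF):
    """Return strains whose ANI to the reference meets or exceeds the cutoff."""
    return sorted(name for name, ani in ani_table.items() if ani >= cutoff)


candidates = same_species_candidates(ani_to_mc52)
print(candidates)
```

Only Y412MC61 and C56-T3 pass this threshold, which matches the grouping reported in the text. The result depends on the cutoff chosen, so a different threshold would change which strains are grouped with MC52.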

Fig. 4

Venn Diagram of Y412MC52 and Y412MC61 determined using software at https://edgar.computational.bio.uni-giessen.de

Fig. 5

Prophage insert in Y412MC52 identified using phast [41, 42]

MC52 possesses a 45-gene arabinan and xylan degradation cluster that allows degradation of hemicellulose components of biomass (GYMC52_1817 through GYMC52_1867). The cluster contains one secreted xylanase (GYMC52_1825) and one secreted arabinase (GYMC52_1858), in agreement with the experimental results. The organization of the xylan degradation portion of the cluster matches the previously described glucuronic acid utilization cluster [39]. The arabinan degradation part of the cluster is smaller than the arabinan cluster described in [40], lacking the araP, araS, araT, araE, araG and araH genes.

MC52 also possesses three clusters annotated for degradation of aromatic acid molecules: GYMC52_1956 through GYMC52_1962, GYMC52_1990 through GYMC52_2001, and GYMC52_3134 through GYMC52_3141. Geobacillus species utilize xylan by transporting large xylooligosaccharides into the cell and then degrading these xylooligosaccharides intracellularly [39]. These aromatic acid degradation clusters may allow degradation and utilization of lignin fragments such as ferulic, sinapic, and cinnamic acids that are attached to the xylooligosaccharides. Utilization of these aromatic acids increases the metabolic energy obtained from the fragments and eliminates potential toxicity of these aromatic acids.

Transport and metabolic clusters for utilization of cellobiose and related oligosaccharides (GYMC52_1797 through GYMC52_1801), α- and β-galactooligosaccharides (GYMC52_12121 through GYMC52_2132), and α-1,4-linked glucooligosaccharides (GYMC52_06321 through GYMC52_0637) were identified, confirming the experimental observations of the corresponding enzymatic activities.

The smaller arabinan cluster in MC52 is the result of an 11-gene insert (GYMC52_1870 through GYMC52_1880) coding for a peptide utilization cluster that replaces part of the arabinan cluster. This peptide utilization cluster is found in only a few strains, including Geobacillus sp. Y412MC61 (GYMC61_2740 through GYMC61_2750), Geobacillus sp. Y4.1MC1 (GY4MC1_2192 through GY4MC1_2202), and Geobacillus sp. C56-YS93 (Geoth_2276 through Geoth_2288). The cluster does not code for a secreted protease or peptidase, but contains an annotated five-gene ABC peptide transporter system and two intracellular peptidases.

Strain Y412MC52 possesses a 54.4 Kb, 73-gene insert that codes for 47 phage genes identified using the phast [41, 42] phage identification software (Fig. 5); an identical insert is present in Y412MC61. The prophage insert has 39 % coverage and 83 % identity to phage E2 (GenBank NC_009552) [43], isolated from a deep sea location. The phage is not present in strain C56-YS93, which was also isolated from Obsidian Hot Spring, indicating that the phage may have a limited range of hosts in the hot spring.
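
Gene clusters in this section are given as locus tag ranges. As a rough aid for working with such ranges, the sketch below expands a "first through last" pair into individual tags and counts them. It assumes tags have the form PREFIX_NUMBER with consecutive integer numbering, which is an assumption about the annotation rather than something stated in the text; the function name is hypothetical.

```python
import re

# Assumed tag layout: an alphanumeric prefix, an underscore, then digits.
TAG_PATTERN = re.compile(r"^([A-Za-z0-9]+)_(\d+)$")


def expand_locus_range(first, last):
    """Expand a 'first through last' locus tag range into a list of tags."""
    m_first, m_last = TAG_PATTERN.match(first), TAG_PATTERN.match(last)
    if not m_first or not m_last:
        raise ValueError("tags must look like PREFIX_NUMBER")
    if m_first.group(1) != m_last.group(1):
        raise ValueError("both tags must share the same prefix")

    prefix = m_first.group(1)
    width = len(m_first.group(2))  # preserve zero padding, e.g. 0637
    lo, hi = int(m_first.group(2)), int(m_last.group(2))
    if hi < lo:
        raise ValueError("range end precedes range start")
    return [f"{prefix}_{n:0{width}d}" for n in range(lo, hi + 1)]


# Ranges quoted in the text.
peptide_insert = expand_locus_range("GYMC52_1870", "GYMC52_1880")
cellobiose = expand_locus_range("GYMC52_1797", "GYMC52_1801")
hemicellulose = expand_locus_range("GYMC52_1817", "GYMC52_1867")

print(len(peptide_insert), len(cellobiose), len(hemicellulose))
```

The peptide utilization range expands to 11 tags, in line with the 11-gene insert described above. The arabinan and xylan range spans 51 tag numbers while the text reports a 45-gene cluster, so a tag span should be read as an upper bound on gene count rather than an exact figure.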

Conclusions

Obsidian Hot Spring is home to a wide variety of organisms, including strains Y412MC10 [19] and C56-YS93 (manuscript submitted) as well as Geobacillus sp. Y412MC52 and Y412MC61. Of particular interest is the isolation of both low G + C (C56-YS93, 43.9 % G + C) and high G + C (Y412MC52 and Y412MC61, 52.3 % G + C) xylanolytic Geobacillus species from the same hot spring sample. This suggests that the high and low G + C Geobacillus species may occupy separate ecological niches that allow each strain to thrive in the same site. Based on the genomic analysis, Geobacillus sp. Y412MC52 appears to utilize only some biomass components such as xylan, arabinoglucuronoxylan, and the arabinan component of arabinogalactan. MC52 shows no genes coding for utilization of other biomass components such as cellulose, mannan, glucomannan, galactomannan, chitin, or pectin, confirming experimental observations. The limited range of substrates suggests that MC52 functions as part of a microbial consortium in degrading biomass. The presence of aromatic acid metabolic clusters and the lack of mannan-utilization clusters suggest that the organism has a preference for utilization of hemicellulose derived from grassy plants rather than woody plants.

Based on 16S rRNA genes and average nucleotide identity, Geobacillus sp. Y412MC52 and the related Geobacillus sp. Y412MC61 appear to be members of a new species of Geobacillus. The presence of multiple 16S rRNA genes in Geobacillus species, as well as the small differences observed in 16S rRNA gene sequences, makes assignment of strains to new or existing species difficult. Utilization of recN sequences [44] has been proposed as an alternative to 16S rRNA gene sequences, but it is unclear if this leads to a more accurate description of the distinct species. Sequencing of additional genomes and in-depth microbiological characterizations are needed to clarify the relationships among Geobacillus species.

References (10 of 48 shown)

1.  Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes.

Authors:  A Krogh; B Larsson; G von Heijne; E L Sonnhammer
Journal:  J Mol Biol       Date:  2001-01-19       Impact factor: 5.469

2.  Rfam: an RNA family database.

Authors:  Sam Griffiths-Jones; Alex Bateman; Mhairi Marshall; Ajay Khanna; Sean R Eddy
Journal:  Nucleic Acids Res       Date:  2003-01-01       Impact factor: 16.971

3.  MUSCLE: multiple sequence alignment with high accuracy and high throughput.

Authors:  Robert C Edgar
Journal:  Nucleic Acids Res       Date:  2004-03-19       Impact factor: 16.971

4.  Taxonomic study of aerobic thermophilic bacilli: descriptions of Geobacillus subterraneus gen. nov., sp. nov. and Geobacillus uzenensis sp. nov. from petroleum reservoirs and transfer of Bacillus stearothermophilus, Bacillus thermocatenulatus, Bacillus thermoleovorans, Bacillus kaustophilus, Bacillus thermodenitrificans to Geobacillus as the new combinations G. stearothermophilus, G. th.

Authors:  T N Nazina; T P Tourova; A B Poltaraus; E V Novikova; A A Grigoryan; A E Ivanova; A M Lysenko; V V Petrunyaka; G A Osipov; S S Belyaev; M V Ivanov
Journal:  Int J Syst Evol Microbiol       Date:  2001-03       Impact factor: 2.747

5.  The glucuronic acid utilization gene cluster from Bacillus stearothermophilus T-6.

Authors:  S Shulami; O Gat; A L Sonenshein; Y Shoham
Journal:  J Bacteriol       Date:  1999-06       Impact factor: 3.490

6.  Three novel halotolerant and thermophilic Geobacillus strains from shallow marine vents.

Authors:  Teresa L Maugeri; Concetta Gugliandolo; Daniela Caccamo; Erko Stackebrandt
Journal:  Syst Appl Microbiol       Date:  2002-10       Impact factor: 4.022

7.  The Gene Ontology Annotation (GOA) Database: sharing knowledge in Uniprot with Gene Ontology.

Authors:  Evelyn Camon; Michele Magrane; Daniel Barrell; Vivian Lee; Emily Dimmer; John Maslen; David Binns; Nicola Harte; Rodrigo Lopez; Rolf Apweiler
Journal:  Nucleic Acids Res       Date:  2004-01-01       Impact factor: 16.971

8.  [Geobacillus uralicus, a new species of thermophilic bacteria].

Authors:  N A Popova; Iu A Nikolaev; T P Turova; A M Lysenko; G A Osipov; N V Verkhovtseva; N S Panikov
Journal:  Mikrobiologiia       Date:  2002 May-Jun

9.  Thermophilic protease-producing Geobacillus from Buranga hot springs in Western Uganda.

Authors:  Joseph F Hawumba; Jacques Theron; Volker S Brözel
Journal:  Curr Microbiol       Date:  2002-08       Impact factor: 2.188

10.  Geobacillus toebii sp. nov., a novel thermophilic bacterium isolated from hay compost.

Authors:  M H Sung; H Kim; J W Bae; S K Rhee; C O Jeon; K Kim; J J Kim; S P Hong; S G Lee; J H Yoon; Y H Park; D H Baek
Journal:  Int J Syst Evol Microbiol       Date:  2002-11       Impact factor: 2.747
