High Quality Draft Genomes of the Type Strains Geobacillus thermocatenulatus DSM 730T, G. uzenensis DSM 23175T and Parageobacillus galactosidasius DSM 18751T.

Winnie Thabisa Ramaloko1, Nadine Koen1, Shamara Polliack1, Habibu Aliyu1,2, Pedro Humberto Lebre1, Teresa Mohr2, Florian Oswald2, Michaela Zwick2, Daniel Ray Zeigler3, Anke Neumann2, Christoph Syldatk2, Don Arthur Cowan1, Pieter De Maayer4.   

Abstract

The thermophilic 'Geobacilli' are important sources of thermostable enzymes and other biotechnologically relevant macromolecules. The present work reports the high quality draft genome sequences of previously unsequenced type strains of Geobacillus uzenensis (DSM 23175T), G. thermocatenulatus (DSM 730T) and Parageobacillus galactosidasius (DSM 18751T). Phylogenomic analyses revealed that DSM 18751T and DSM 23175T represent later heterotypic synonyms of P. toebii and G. subterraneus, respectively, while DSM 730T represents the type strain for the species G. thermocatenulatus. These genome sequences will contribute towards a deeper understanding of the ecological and biological diversity and the biotechnological exploitation of the 'geobacilli'.

Keywords:  Firmicutes; Geobacillus; Illumina HiSeq sequencing; Parageobacillus; phylogenomics; thermophile

Year:  2018        PMID: 29483968      PMCID: PMC5824068          DOI: 10.7150/jgen.22986

Source DB:  PubMed          Journal:  J Genomics


Introduction

The 'geobacilli' are cosmopolitan, highly adaptable thermophilic Firmicutes that have been isolated from a wide range of environments, including oil wells, deserts, hot springs, compost and soils 1. The taxonomy of these bacteria has recently been re-examined through phylogenomics, resulting in the genus Geobacillus 2 being divided into two genera: Geobacillus and Parageobacillus 3. These genera have attracted increasing interest because they produce a wide range of thermostable enzymes, such as amylases, proteases, lipases and hemicellulolytic enzymes, and other industrially and biotechnologically relevant macromolecules 4-5. The increasing availability and accessibility of complete genome sequences, together with the development of tools that allow for accurate functional annotation of genomic data, are enhancing the ways in which microorganisms can be studied and characterized 6. Furthermore, these genome sequences provide a resource for tapping into the biotechnological potential of microorganisms. Elucidating the genome sequences of type strains is especially important for resolving the taxonomic status of microorganisms 7. Currently, the genome sequences of sixty-eight Geobacillus and sixteen Parageobacillus strains are publicly available. These include the genome sequences of eleven and five validly described type strains of Geobacillus and Parageobacillus, respectively.

The genomes of G. uzenensis DSM 23175T, G. thermocatenulatus DSM 730T 2 and P. galactosidasius DSM 18751T 8 were paired-end sequenced using the Illumina HiSeq platform (Illumina, Inc., San Diego, CA, USA). A total of 3.6 Gb (ca. 1,060x coverage), 3.8 Gb (ca. 1,059x coverage) and 3.7 Gb (ca. 964x coverage) of reads were generated for G. uzenensis DSM 23175T, G. thermocatenulatus DSM 730T and P. galactosidasius DSM 18751T, respectively. The reads were assembled using SPAdes 9, and the resulting contigs were further assembled using Multi-Draft based scaffolder (MeDusa3) 10 and Mauve 2.3.1 11. Finally, the genomes were annotated using RAST 12 and EggNOG 4.5.1 13 and checked for completeness using BUSCO 14.

The genome sequences were assembled to high quality draft status (between two and ten contigs) and range in size between 3.36 and 3.79 Mb, coding for between 3,589 and 4,067 proteins (Table 1). A substantially lower G+C content was observed for the Parageobacillus genome (41.6%) than for the Geobacillus spp. (51.8 and 52.2%, respectively), which represents a distinguishing feature between the two genera 3. BUSCO assessment of the three genomes using the Firmicutes dataset indicated that all the genomes were ca. 99.4% complete. Classification of proteins into their COG functional categories based on EggNOG showed similar proportions of proteins in the different functional groups among the three strains (Figure 1), although a larger proportion of proteins involved in metabolism is present in the two Geobacillus isolates (Figure 1A). In particular, there is a larger proportion of proteins involved in amino acid, carbohydrate and lipid metabolism in the Geobacillus strains (Figure 1B), suggesting that the Geobacillus strains have greater metabolic versatility than P. galactosidasius DSM 18751T. By contrast, the elevated number of proteins involved in DNA replication, recombination and repair (Figure 1B) in P. galactosidasius DSM 18751T (334 proteins; 8.21% of total proteins) compared to the other strains (246 and 244 proteins for DSM 730T and DSM 23175T, respectively) may indicate that a distinct mobilome exists in the former strain.

Table 1

Genome features of the sequenced Geobacillus/Parageobacillus species

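| Species | Strain | Genome size (Mb) | # Contigs | G+C (%) | # encoded proteins | # RNAs | Isolation source | Reference |
|---|---|---|---|---|---|---|---|---|
| G. thermocatenulatus | DSM 730T ¹ | 3.56 | 2 | 51.8 | 3,783 | 109 | Hot gas well (Russia) | 2 |
| G. uzenensis | DSM 23175T ¹ | 3.36 | 10 | 52.2 | 3,589 | 115 | Oil field (Kazakhstan) | 2 |
| P. galactosidasius | DSM 18751T ² | 3.79 | 6 | 41.6 | 4,067 | 127 | Compost (Italy) | 8 |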

1 Obtained from the Bacillus Genetic Stock Centre (BGSC) at Ohio State University, USA.

2 Obtained from Deutsche Sammlung von Mikroorganismen und Zellkulturen (DSMZ), Leibniz, Braunschweig, Germany.
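
The coverage figures quoted in the text follow from dividing the total sequenced bases by the genome size. The sketch below is a rough check, not the authors' calculation: it assumes the rounded read totals from the text and the assembly sizes from Table 1, so it only approximates the reported values. The function name `fold_coverage` and the dictionary layout are illustrative.

```python
def fold_coverage(total_bases_gb: float, genome_size_mb: float) -> float:
    """Approximate fold coverage = sequenced bases / genome length."""
    return (total_bases_gb * 1e9) / (genome_size_mb * 1e6)


# (read total in Gb, assembly size in Mb from Table 1, coverage reported in the text)
datasets = {
    "G. uzenensis DSM 23175T": (3.6, 3.36, 1060),
    "G. thermocatenulatus DSM 730T": (3.8, 3.56, 1059),
    "P. galactosidasius DSM 18751T": (3.7, 3.79, 964),
}

estimates = {
    strain: fold_coverage(gb, mb) for strain, (gb, mb, _reported) in datasets.items()
}

for strain, (gb, mb, reported) in datasets.items():
    print(f"{strain}: estimated ~{estimates[strain]:.0f}x, reported ca. {reported}x")
```

The estimates differ slightly from the reported values because the read totals are rounded to one decimal place in the text.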

Fig 1

EggNOG functional classification of proteins encoded on the three sequenced genomes. (A) Proportions (%) of proteins in each of the EggNOG super-functional categories Information processing and storage (orange), Cellular processing and signalling (yellow), Metabolism (purple) and Poorly characterized (grey). (B) Relative proportions of proteins involved in Energy metabolism (C), Amino acid transport and metabolism (E), Carbohydrate transport and metabolism (G), Lipid metabolism (I) and DNA replication, recombination and repair (L) for G. thermocatenulatus DSM 730T (blue bars), G. uzenensis DSM 23175T (maroon bars) and P. galactosidasius DSM 18751T (green bars).
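
The relative proportion of category L proteins can be recomputed from the counts given in the text and the protein totals in Table 1. This is a minimal sketch: only the 8.21% figure for DSM 18751T is stated in the source, and the percentages for the two Geobacillus strains are derived here on the assumption that the Table 1 totals are the correct denominators.

```python
# (category L protein count from the text, total encoded proteins from Table 1)
category_l = {
    "DSM 730T": (246, 3783),
    "DSM 23175T": (244, 3589),
    "DSM 18751T": (334, 4067),
}


def percent(count: int, total: int) -> float:
    """Share of proteins in a category, as a percentage rounded to 2 decimals."""
    return round(100 * count / total, 2)


shares = {strain: percent(count, total) for strain, (count, total) in category_l.items()}

for strain, share in shares.items():
    print(f"{strain}: {share}% of encoded proteins in category L")
```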

Maximum likelihood phylogenies were constructed on the basis of the core proteins conserved among 11 Geobacillus and 7 Parageobacillus genomes, including the 3 genomes sequenced in this study. A total of 1,355 conserved proteins were identified using Orthofinder 15, aligned using T-coffee 16, then concatenated and trimmed using GBlocks 17. The resulting alignment (296,082 amino acids in length) was used to construct a core genome maximum likelihood phylogeny using PhyML-SMS with the SH-aLRT branch support method 18. The core protein phylogeny showed that G. thermocatenulatus DSM 730T clusters with three strains, namely G. thermocatenulatus GS-1, G. thermocatenulatus BCO2 and G. thermocatenulatus T6, in a clade previously shown to represent a distinct Geobacillus genomospecies 3. G. uzenensis DSM 23175T clusters with the type strain of G. subterraneus (DSM 13552T). Similarly, P. galactosidasius DSM 18751T clusters with the type strain of P. toebii (DSM 14590T) and two other P. toebii strains.

Several phylogenomic methods, including digital DNA-DNA Hybridization (dDDH) and Average Nucleotide Identity (ANI) calculations, have been developed and shown to accurately distinguish between strains at the species level 19-20. Pairwise BLAST-based Average Nucleotide Identity values (ANIb) were obtained using JSpecies 21, and dDDH values were calculated with the Genome-to-Genome Distance Calculator (GGDC 2.1), using formula 2 19. G. thermocatenulatus DSM 730T showed the highest similarity with G. thermocatenulatus T6, with an ANI value of 99.7% and a dDDH value of 93.6%, which far exceed the species cut-off thresholds of 96% and 70% for ANI and dDDH, respectively. Comparison of the 16S rRNA gene sequences indicated that the gene from G. uzenensis DSM 23175T showed 99.9% sequence identity with that of G. subterraneus DSM 13552T, while the two genomes shared 99.6% ANI and 93.1% dDDH values. Furthermore, the 16S rRNA gene of P. galactosidasius DSM 18751T shared 99.3% sequence identity with that of P. toebii DSM 14590T. Phylogenomic analyses indicated that the two strains had ANI and dDDH values of 98.2% and 87.9%, respectively, both of which exceed the threshold values for species circumscription.

Based on these phylogenomic analyses, we can conclude that P. galactosidasius DSM 18751T and G. uzenensis DSM 23175T most likely represent later heterotypic synonyms of P. toebii and G. subterraneus, respectively, rather than type strains of distinct species as previously described. Conversely, we can conclusively characterize G. thermocatenulatus DSM 730T as the type strain for the species G. thermocatenulatus. Regardless of their taxonomic status, these genome sequences will add value to the exploration of the diversity among the geobacilli and of the biotechnological potential of these Geobacillus and Parageobacillus species.
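
The species-level decision rule described above can be written out as a small check. This is an illustrative sketch, not the JSpecies or GGDC software: it simply compares the ANI and dDDH values reported in the text against the 96% and 70% cut-offs. The names `ANI_THRESHOLD`, `DDDH_THRESHOLD` and `same_species` are chosen here for illustration.

```python
ANI_THRESHOLD = 96.0   # % ANI species cut-off quoted in the text
DDDH_THRESHOLD = 70.0  # % dDDH species cut-off quoted in the text

# (query strain, reference strain, ANI %, dDDH %) as reported in the text
comparisons = [
    ("G. thermocatenulatus DSM 730T", "G. thermocatenulatus T6", 99.7, 93.6),
    ("G. uzenensis DSM 23175T", "G. subterraneus DSM 13552T", 99.6, 93.1),
    ("P. galactosidasius DSM 18751T", "P. toebii DSM 14590T", 98.2, 87.9),
]


def same_species(ani: float, dddh: float) -> bool:
    """True when both values meet or exceed their species cut-offs."""
    return ani >= ANI_THRESHOLD and dddh >= DDDH_THRESHOLD


results = {
    (query, reference): same_species(ani, dddh)
    for query, reference, ani, dddh in comparisons
}

for (query, reference), verdict in results.items():
    print(f"{query} vs {reference}: same species = {verdict}")
```

All three pairs pass both cut-offs, which is the basis for the synonymy proposed for DSM 18751T and DSM 23175T.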

Nucleotide sequence accession numbers

The whole genome sequences have been deposited at DDBJ/EMBL/GenBank under the accession numbers NEWK00000000 (G. thermocatenulatus DSM 730T), NEWL00000000 (G. uzenensis DSM 23175T) and NDYL00000000 (P. galactosidasius DSM 18751T). The versions described in this paper are the first versions, NEWK01000000, NEWL01000000 and NDYL01000000, respectively.
  21 in total

1.  Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis.

Authors:  J Castresana
Journal:  Mol Biol Evol       Date:  2000-04       Impact factor: 16.240

Review 2.  Microbial systematics in the post-genomics era.

Authors:  Beile Gao; Radhey S Gupta
Journal:  Antonie Van Leeuwenhoek       Date:  2011-11-03       Impact factor: 2.271

3.  Taxonomic study of aerobic thermophilic bacilli: descriptions of Geobacillus subterraneus gen. nov., sp. nov. and Geobacillus uzenensis sp. nov. from petroleum reservoirs and transfer of Bacillus stearothermophilus, Bacillus thermocatenulatus, Bacillus thermoleovorans, Bacillus kaustophilus, Bacillus thermodenitrificans to Geobacillus as the new combinations G. stearothermophilus, G. th.

Authors:  T N Nazina; T P Tourova; A B Poltaraus; E V Novikova; A A Grigoryan; A E Ivanova; A M Lysenko; V V Petrunyaka; G A Osipov; S S Belyaev; M V Ivanov
Journal:  Int J Syst Evol Microbiol       Date:  2001-03       Impact factor: 2.747

4.  BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs.

Authors:  Felipe A Simão; Robert M Waterhouse; Panagiotis Ioannidis; Evgenia V Kriventseva; Evgeny M Zdobnov
Journal:  Bioinformatics       Date:  2015-06-09       Impact factor: 6.937

5.  Phylogenomic re-assessment of the thermophilic genus Geobacillus.

Authors:  Habibu Aliyu; Pedro Lebre; Jochen Blom; Don Cowan; Pieter De Maayer
Journal:  Syst Appl Microbiol       Date:  2016-10-04       Impact factor: 4.022

6.  Geobacillus galactosidasius sp. nov., a new thermophilic galactosidase-producing bacterium isolated from compost.

Authors:  Annarita Poli; Giusi Laezza; Reyhan Gul-Guven; Pierangelo Orlando; Barbara Nicolaus
Journal:  Syst Appl Microbiol       Date:  2011-06-08       Impact factor: 4.022

7.  progressiveMauve: multiple genome alignment with gene gain, loss and rearrangement.

Authors:  Aaron E Darling; Bob Mau; Nicole T Perna
Journal:  PLoS One       Date:  2010-06-25       Impact factor: 3.240

8.  T-Coffee: a web server for the multiple sequence alignment of protein and RNA sequences using structural information and homology extension.

Authors:  Paolo Di Tommaso; Sebastien Moretti; Ioannis Xenarios; Miquel Orobitg; Alberto Montanyola; Jia-Ming Chang; Jean-François Taly; Cedric Notredame
Journal:  Nucleic Acids Res       Date:  2011-05-09       Impact factor: 16.971

9.  The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST).

Authors:  Ross Overbeek; Robert Olson; Gordon D Pusch; Gary J Olsen; James J Davis; Terry Disz; Robert A Edwards; Svetlana Gerdes; Bruce Parrello; Maulik Shukla; Veronika Vonstein; Alice R Wattam; Fangfang Xia; Rick Stevens
Journal:  Nucleic Acids Res       Date:  2013-11-29       Impact factor: 16.971

10.  JSpeciesWS: a web server for prokaryotic species circumscription based on pairwise genome comparison.

Authors:  Michael Richter; Ramon Rosselló-Móra; Frank Oliver Glöckner; Jörg Peplies
Journal:  Bioinformatics       Date:  2015-11-16       Impact factor: 6.937
