Christian Schäfers, Saskia Blank, Sigrid Wiebusch, Skander Elleuche, Garabed Antranikian.
Abstract
Thermus brockianus strain GE-1 is a thermophilic, Gram-negative, rod-shaped and non-motile bacterium that was isolated from the Geysir geothermal area, Iceland. Like other thermophiles, Thermus species are often used as model organisms to understand the mechanism of action of extremozymes, with a particular focus on their heat activity and thermostability. Genome-specific features of T. brockianus GE-1 and their properties further help to explain how extremophiles adapt to elevated temperatures. Here we analyze the first whole genome sequence of T. brockianus strain GE-1. Insights into the genome sequence and the methodologies applied during de novo assembly and annotation are given in detail. The finished genome has a Phred quality value of QV50. The complete genome size is 2.38 Mb, comprising the chromosome (2,035,182 bp), the megaplasmid pTB1 (342,792 bp) and the smaller plasmid pTB2 (10,299 bp). Gene prediction revealed 2,511 genes in total, including 2,458 protein-encoding genes and 53 RNA genes, as well as 66 pseudo genes. A unique genomic region on megaplasmid pTB1 was identified that encodes key enzymes for xylan depolymerization and xylose metabolism. This is in agreement with growth experiments in which xylan was utilized as the sole source of carbon. Accordingly, we identified sequences encoding the xylanase Xyn10, an endoglucanase, the membrane ABC sugar transporter XylH, the xylose-binding protein XylF, the xylose isomerase XylA, which catalyzes the first step of xylose metabolism, and the xylulokinase XylB, which is responsible for the second step. Our data indicate that an ancestor of T. brockianus obtained the ability to use xylose as an alternative carbon source by horizontal gene transfer.
Keywords: Single molecule real-time sequencing; Thermophiles; Thermus; Thermus brockianus; Whole genome sequence; Xylan degradation; Xylose metabolism; de novo assembly
Year: 2017 PMID: 28174620 PMCID: PMC5292009 DOI: 10.1186/s40793-017-0225-7
Source DB: PubMed Journal: Stand Genomic Sci ISSN: 1944-3277
Fig. 1 Unrooted phylogenetic tree based on 16S rRNA encoding sequences from 14 species of the genus Thermus. The phylogenetic tree was generated using the program package PHYLIP (version 3.695) [58] and TreeView X [59], based on a multiple sequence alignment (1,345 nts) that was generated with ClustalX [60]. The number of nucleotide replacements at each position in the sequence was estimated with the DNADIST program and trees were constructed using NEIGHBOR. Bootstrap analysis was done using 1,000 iterations. CONSENSE was used to produce a majority rule consensus tree. The position of the isolate Thermus brockianus strain GE-1 is indicated in red. The 16S rRNA encoding sequence from Marinithermus hydrothermalis was used as outgroup. Accession numbers of all sequences are indicated in the figure. For the following species, sequenced genomes are available at NCBI (the number of available genome sequences is given in square brackets): T. caliditerrae [1], T. amyloliquefaciens [1], T. antranikianii [1], T. scotoductus [4], T. igniterrae [1], T. brockianus [1, this study], T. aquaticus [3], T. islandicus [1], T. thermophilus [5], T. filiformis [1], T. oshimai [2] and M. hydrothermalis [1]
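The distance and bootstrap steps named in the caption can be illustrated with a minimal Python sketch. It is not the PHYLIP implementation: the Jukes-Cantor correction is used here as a simple stand-in for whatever substitution model DNADIST was run with (the source does not say), and the function names and the toy alignment are hypothetical.

```python
import math
import random

def jc69_distance(a: str, b: str) -> float:
    """Jukes-Cantor distance between two aligned sequences (gap columns skipped).

    Assumption: JC69 stands in for the DNADIST model, which the source does not name.
    """
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable columns")
    p = sum(x != y for x, y in pairs) / len(pairs)  # observed fraction of mismatches
    if p >= 0.75:
        return math.inf  # correction is undefined at saturation
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

def bootstrap_alignment(seqs: dict[str, str], rng: random.Random) -> dict[str, str]:
    """Resample alignment columns with replacement (one bootstrap replicate)."""
    n = len(next(iter(seqs.values())))
    cols = [rng.randrange(n) for _ in range(n)]
    return {name: "".join(s[i] for i in cols) for name, s in seqs.items()}

# Hypothetical toy alignment, not data from the study.
toy = {"A": "ACGTACGTAC", "B": "ACGTACGTAT", "C": "ACGAACGTTT"}
replicate = bootstrap_alignment(toy, random.Random(0))
d_ab = jc69_distance(toy["A"], toy["B"])
```

In a full analysis, each of the 1,000 replicates would yield its own distance matrix and neighbor-joining tree, and the consensus step would count how often each grouping recurs.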
Fig. 2 Photomicrograph of T. brockianus GE-1
Classification and general features of T. brockianus GE-1 according to MIGS [20]
| MIGS ID | Property | Term | Evidence code (a) |
|---|---|---|---|
| | Classification | Domain Bacteria | TAS [ |
| | | Phylum Deinococcus-Thermus | TAS [ |
| | | Class Deinococci | TAS [ |
| | | Order Thermales | TAS [ |
| | | Family Thermaceae | TAS [ |
| | | Genus Thermus | TAS [ |
| | | Species Thermus brockianus | TAS [ |
| | | Strain GE-1 | IDA |
| | Gram stain | Negative | IDA |
| | Cell shape | Rod | IDA |
| | Motility | Non-motile | NAS |
| | Sporulation | Non-sporulating | NAS |
| | Temperature range | 45-83 °C | TAS [ |
| | Optimum temperature | 70 °C | TAS [ |
| | pH range; Optimum | pH 7.0 – pH 8.0 | NAS |
| | Carbon source | Diverse set of sugars | IDA |
| MIGS-6 | Habitat | Terrestrial hot springs | IDA |
| MIGS-6.3 | Salinity | Not reported | |
| MIGS-22 | Oxygen requirement | Aerobic | NAS |
| MIGS-15 | Biotic relationship | Free-living | NAS |
| MIGS-14 | Pathogenicity | Non-pathogen | NAS |
| MIGS-4 | Geographic location | Geysir geothermal area, Iceland | IDA |
| MIGS-5 | Sample collection | 1992 | IDA |
| MIGS-4.1 | Latitude | Not reported | - |
| MIGS-4.2 | Longitude | Not reported | - |
| MIGS-4.4 | Altitude | Not reported | - |
(a) Evidence codes - IDA: Inferred from Direct Assay; TAS: Traceable Author Statement (i.e., a direct report exists in the literature); NAS: Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [57]
Project information
| MIGS ID | Property | Term |
|---|---|---|
| MIGS-31 | Finishing quality | Finished genome |
| MIGS-28 | Libraries used | PacBio RS library |
| MIGS-29 | Sequencing platforms | PacBio RS II |
| MIGS-31.2 | Fold coverage | 156.56x PacBio |
| MIGS-30 | Assemblers | HGAP2 version 2.3.0 |
| MIGS-32 | Gene calling method | Prodigal v2.6 |
| | Locus Tag | A0O31 |
| | GenBank ID | CP016312, CP016313, CP016314 |
| | GenBank Date of Release | November 17, 2016 |
| | GOLD ID | Gp0134387 |
| | BIOPROJECT | PRJNA314486 |
| MIGS-13 | Source Material Identifier | GE_001 |
| | Project relevance | Biotechnological |
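As a rough reading aid for two of the quality figures reported for this project, the sketch below converts the Phred value QV50 into an error probability using the standard definition Q = -10 log10(P), and estimates the total sequenced bases implied by 156.56x coverage. The estimate assumes the coverage is an average over all three replicons (2,388,273 bp in total), which the source does not state explicitly; the function names are illustrative.

```python
def phred_to_error_probability(qv: float) -> float:
    """Standard Phred definition: Q = -10 * log10(P)  =>  P = 10 ** (-Q / 10)."""
    return 10 ** (-qv / 10)

def approx_sequenced_bases(fold_coverage: float, genome_size_bp: int) -> float:
    """Total bases implied by an average fold coverage (assumes uniform averaging)."""
    return fold_coverage * genome_size_bp

error_probability = phred_to_error_probability(50)          # about 1 error per 100,000 bases
total_bases = approx_sequenced_bases(156.56, 2_388_273)     # roughly 374 million bases
print(f"P(error) = {error_probability:.0e}, sequenced bases = {total_bases / 1e6:.1f} Mb")
```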
Fig. 3 Graphical circular maps of T. brockianus GE-1 replicons. The complete genome of T. brockianus GE-1 is composed of a single circular chromosome of 2,035,182 bp (a) and two circular plasmids, pTB1 (b) and pTB2 (c). Megaplasmid pTB1 is 342,792 bp in size and pTB2 is 10,299 bp. The maps were generated using CGView [38]. The data are described from the inside to the outside: the second circle represents the GC skew of both strands (green for the plus strand, purple for the minus strand) and the fourth circle shows the GC content. The sixth and seventh circles show the protein-encoding genes on the plus and minus strands as well as RNA features. All tRNAs are highlighted in orange, rRNAs are shown in light purple and other RNAs are shown in grey
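The GC skew and GC content tracks on such maps are windowed statistics of the sequence. The following sketch uses the standard definitions, skew = (G - C) / (G + C) and content = (G + C) / length, over non-overlapping windows. The window and step sizes are assumptions chosen for illustration; the settings CGView used for this figure are not given in the source.

```python
def gc_skew(seq: str, window: int = 1000, step: int = 1000) -> list[float]:
    """GC skew, (G - C) / (G + C), for each window along the sequence."""
    values = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window].upper()
        g, c = chunk.count("G"), chunk.count("C")
        values.append((g - c) / (g + c) if g + c else 0.0)
    return values

def gc_content(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    s = seq.upper()
    return (s.count("G") + s.count("C")) / len(s) if s else 0.0

# Hypothetical toy sequence: G-rich, then C-rich, then balanced.
skews = gc_skew("GGGGCCCCGCGC", window=4, step=4)
```

A sign change in the skew along a circular replicon is commonly used as a hint for the location of the replication origin or terminus.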
Summary of the genome of Thermus brockianus GE-1: 1 chromosome and 2 plasmids
| Label | Size (Mb) | Topology | INSDC identifier | RefSeq ID |
|---|---|---|---|---|
| Chromosome | 2.035 | circular | CP016312 | - |
| pTB1 | 0.343 | circular | CP016313 | - |
| pTB2 | 0.010 | circular | CP016314 | - |
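The replicon sizes in this table can be cross-checked against the base-pair lengths given in the abstract. The sketch below recomputes the sizes in Mb, the total genome size, and each replicon's share of the genome; the variable names are illustrative.

```python
# Replicon lengths in bp, as reported in the abstract.
replicons = {"Chromosome": 2_035_182, "pTB1": 342_792, "pTB2": 10_299}

total_bp = sum(replicons.values())  # 2,388,273 bp

# Size in Mb rounded to three decimals, as in the table.
size_mb = {name: round(bp / 1e6, 3) for name, bp in replicons.items()}

# Share of the total genome contributed by each replicon, in percent.
share_pct = {name: round(100 * bp / total_bp, 1) for name, bp in replicons.items()}

for name in replicons:
    print(f"{name}: {size_mb[name]:.3f} Mb ({share_pct[name]}% of genome)")
```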
Genome statistics
| Attribute | Value | % of Total (a) |
|---|---|---|
| Genome size (bp) | 2,388,273 | 100.0 |
| DNA coding (bp) | 2,217,408 | 92.9 |
| DNA G + C (bp) | 1,597,811 | 67.0 |
| DNA scaffolds | 3 | 100.0 |
| Total genes | 2,511 | 100.0 |
| Protein coding genes | 2,458 | 97.9 |
| RNA genes | 53 | 2.1 |
| Pseudo genes (b) | 66 | 2.6 |
| Genes in internal clusters | - | - |
| Genes with function prediction | 1,834 | 73.0 |
| Genes assigned to COGs | 1,948 | 77.6 |
| Genes with Pfam domains | 1,736 | 69.1 |
| Genes with signal peptides | 112 | 4.5 |
| Genes with transmembrane helices | 561 | 22.3 |
| CRISPR repeats | 8 | 0.3 |
(a) The total is based on either the size of the genome in base pairs or the total number of genes in the annotated genome
(b) Pseudo genes may also be counted as protein coding or RNA genes, so they are not additive under the total gene count
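The gene-based percentages in the table follow from dividing each count by the 2,511 total genes. A short sketch that recomputes them from the reported counts, covering only the gene-based attributes (the dictionary keys are shortened labels, not official attribute names):

```python
TOTAL_GENES = 2_511

# Counts of gene-based attributes, copied from the table.
gene_counts = {
    "Protein coding genes": 2_458,
    "RNA genes": 53,
    "Pseudo genes": 66,
    "Genes with function prediction": 1_834,
    "Genes assigned to COGs": 1_948,
    "Genes with Pfam domains": 1_736,
    "Genes with signal peptides": 112,
    "Genes with transmembrane helices": 561,
    "CRISPR repeats": 8,
}

# Percentage of the total gene count, rounded to one decimal as in the table.
percent_of_total = {
    name: round(100 * count / TOTAL_GENES, 1) for name, count in gene_counts.items()
}

# Protein coding and RNA genes add up to the total; pseudo genes are not additive.
assert gene_counts["Protein coding genes"] + gene_counts["RNA genes"] == TOTAL_GENES
```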
Number of genes associated with general COG functional categories
| Code | Value | % of total | Description |
|---|---|---|---|
| J | 143 | 5.81 | Translation, ribosomal structure and biogenesis |
| A | 0 | 0.00 | RNA processing and modification |
| K | 87 | 3.54 | Transcription |
| L | 106 | 4.31 | Replication, recombination and repair |
| B | 2 | 0.08 | Chromatin structure and dynamics |
| D | 28 | 1.14 | Cell cycle control, cell division, chromosome partitioning |
| V | 27 | 1.10 | Defense mechanisms |
| T | 71 | 2.89 | Signal transduction mechanisms |
| M | 84 | 3.42 | Cell wall/membrane biogenesis |
| N | 14 | 0.57 | Cell motility |
| U | 18 | 0.73 | Intracellular trafficking and secretion |
| O | 84 | 3.42 | Posttranslational modification, protein turnover, chaperones |
| C | 155 | 6.31 | Energy production and conversion |
| G | 123 | 5.00 | Carbohydrate transport and metabolism |
| E | 207 | 8.42 | Amino acid transport and metabolism |
| F | 70 | 2.85 | Nucleotide transport and metabolism |
| H | 107 | 4.35 | Coenzyme transport and metabolism |
| I | 78 | 3.17 | Lipid transport and metabolism |
| P | 100 | 4.07 | Inorganic ion transport and metabolism |
| Q | 23 | 0.94 | Secondary metabolites biosynthesis, transport and catabolism |
| R | 253 | 10.29 | General function prediction only |
| S | 168 | 6.83 | Function unknown |
| - | 510 | 20.75 | Not in COGs |
The total is based on the total number of protein coding genes in the genome
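The COG counts can be checked for internal consistency: the category counts should add up to the number of genes assigned to COGs (1,948), and together with the genes not in COGs they should equal the number of protein coding genes (2,458). A minimal sketch using the counts from the table, with an illustrative helper for the percentage column:

```python
# Gene counts per COG category, copied from the table.
cog_counts = {
    "J": 143, "A": 0, "K": 87, "L": 106, "B": 2, "D": 28, "V": 27, "T": 71,
    "M": 84, "N": 14, "U": 18, "O": 84, "C": 155, "G": 123, "E": 207, "F": 70,
    "H": 107, "I": 78, "P": 100, "Q": 23, "R": 253, "S": 168,
}
NOT_IN_COGS = 510

genes_in_cogs = sum(cog_counts.values())          # 1,948
protein_coding_genes = genes_in_cogs + NOT_IN_COGS  # 2,458

def cog_percentage(code: str) -> float:
    """Share of protein coding genes in one COG category, in percent."""
    return 100 * cog_counts[code] / protein_coding_genes

print(f"L: {cog_percentage('L'):.2f}%  G: {cog_percentage('G'):.2f}%")
```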
Fig. 4 Genomic organization of genes encoding proteins for xylan and cellulose degradation as well as xylose metabolism located on the megaplasmid pTB1 of T. brockianus GE-1. Sizes, localization and orientation of the genes on this section of megaplasmid pTB1 are displayed proportionally. Genes highlighted with a star are not detectable in any Thermus spp. genome other than T. brockianus GE-1. Genes marked with a diamond are conserved in Thermus spp. The genes associated with the ABC transporter system include a sugar ABC transporter substrate-binding protein and two sugar ABC transporter permeases