Beng Soon Teh, Nyok-Sean Lau, Fui Ling Ng, Ahmad Yamin Abdul Rahman, Xuehua Wan, Jennifer A Saito, Shaobin Hou, Aik-Hong Teh, Nazalan Najimudin, Maqsudul Alam.
Abstract
Thermus sp. strain CCB_US3_UF1 is a thermophilic bacterium of the genus Thermus, a member of the family Thermaceae. Members of the genus Thermus have been widely used as biological models for structural biology studies and for understanding the mechanisms of microbial adaptation to thermal environments. Here, we present the complete genome sequence of Thermus sp. CCB_US3_UF1, isolated from a hot spring in Malaysia, which is the fifth member of the genus Thermus with a completely sequenced and publicly available genome (GenBank date of release: December 2, 2011). Thermus sp. CCB_US3_UF1 has the third largest genome within the genus. The complete genome comprises a chromosome of 2.26 Mb and a plasmid of 19.7 kb. The genome contains 2279 protein-coding and 54 RNA genes. In addition, the genome revealed potential pathways for the synthesis of secondary metabolites (isoprenoid) and pigments (carotenoid).
Keywords: Extremophile; Hot spring; Thermophile; Thermus
Year: 2015 PMID: 26457128 PMCID: PMC4599208 DOI: 10.1186/s40793-015-0053-6
Source DB: PubMed Journal: Stand Genomic Sci ISSN: 1944-3277
Fig. 1 Phylogenetic tree highlighting the position of Thermus sp. CCB_US3_UF1 relative to the other type strains within the families Deinococcaceae and Thermaceae. Strains shown are those within the Deinococcaceae and Thermaceae having the corresponding NCBI genome project ids listed within [53]. The tree used sequences aligned by the Ribosomal Database Project (RDP) aligner and the Jukes-Cantor corrected distance model. The distance matrix was constructed based on alignment model positions without the use of alignment inserts, and a minimum comparable position of 200 was used. The tree was constructed with RDP Tree Builder, which used Weighbor [54] with an alphabet size of 4 and a length size of 1000. The building of the tree involved a bootstrapping process that was repeated 100 times to generate a majority consensus tree [55]
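The Jukes-Cantor correction named in the caption converts an observed proportion of mismatches p between two aligned sequences into an estimated distance d = -(3/4) ln(1 - 4p/3). The following is a minimal sketch of that correction, not the RDP implementation: the function names, the gap characters, and the way the "minimum comparable position" threshold is applied are assumptions made for illustration.

```python
import math


def p_distance(seq_a, seq_b, min_comparable=200, gap_chars="-."):
    """Proportion of mismatches over positions where neither sequence has a gap.

    min_comparable mirrors the caption's threshold of 200 comparable positions;
    how RDP applies it internally is an assumption here.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    comparable = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in gap_chars or b in gap_chars:
            continue  # skip positions that cannot be compared
        comparable += 1
        if a != b:
            mismatches += 1
    if comparable < min_comparable:
        raise ValueError("too few comparable positions")
    return mismatches / comparable


def jukes_cantor(p):
    """Jukes-Cantor corrected distance for an observed mismatch proportion p."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction to be defined")
    if p == 0.0:
        return 0.0
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)
```

For short toy alignments the threshold can be lowered, for example `jukes_cantor(p_distance(a, b, min_comparable=1))`. The corrected distance is always at least as large as p, because it accounts for multiple substitutions at the same site.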
Classification and general features of Thermus sp. CCB_US3_UF1 according to the MIGS recommendations [57]
| MIGS ID | Property | Term | Evidence code^a |
|---|---|---|---|
| | Classification | Domain Bacteria | TAS [ |
| | | Phylum | TAS [ |
| | | Class | TAS [ |
| | | Order | TAS [ |
| | | Family Thermaceae | TAS [ |
| | | Genus Thermus | TAS [ |
| | | Species Unknown | IDA |
| | | Type strain CCB_US3_UF1 | IDA |
| | Gram stain | Negative | IDA |
| | Cell shape | Rod | IDA |
| | Motility | Non-motile | NAS |
| | Sporulation | Non-sporulating | NAS |
| | Temperature range | Thermophile (45-83 °C) | TAS [ |
| | Optimum temperature | 60 °C | IDA |
| | pH range; Optimum | Not reported | |
| | Carbon source | Not reported | |
| MIGS-6 | Habitat | Hot springs | IDA |
| MIGS-6.3 | Salinity | Not reported | |
| MIGS-22 | Oxygen requirement | Aerobic | NAS |
| MIGS-15 | Biotic relationship | Free-living | NAS |
| MIGS-14 | Pathogenicity | Non-pathogen | NAS |
| MIGS-4 | Geographic location | Ulu Slim, Perak, Malaysia | IDA |
| MIGS-5 | Sample collection | 2009 | IDA |
| MIGS-4.1 | Latitude | 3.898822°N | IDA |
| MIGS-4.2 | Longitude | 101.497911°E | IDA |
| MIGS-4.4 | Altitude | 51 m | IDA |
^a Evidence codes: IDA, Inferred from Direct Assay; TAS, Traceable Author Statement (i.e., a direct report exists in the literature); NAS, Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [65]
Fig. 2 Transmission electron micrograph of Thermus sp. CCB_US3_UF1
Project information
| MIGS ID | Property | Term |
|---|---|---|
| MIGS-31 | Finishing quality | Finished |
| MIGS-28 | Libraries used | Two genomic libraries: one 454 PE library (3 kb insert size), one Illumina library (3 kb insert size) |
| MIGS-29 | Sequencing platforms | Illumina GA IIx, 454 GS FLX Titanium |
| MIGS-31.2 | Fold coverage | 115× (Illumina); 21.14× (454) |
| MIGS-30 | Assemblers | Newbler v 2.3, Burrows-Wheeler Aligner (BWA) |
| MIGS-32 | Gene calling method | Glimmer 3.02 |
| | Locus tag | TCCBUS3UF1 |
| | GenBank ID | CP003126, CP003127 |
| | GenBank date of release | December 2, 2011 |
| | GOLD ID | Gp0013444 |
| | BIOPROJECT | PRJNA76491 |
| MIGS-13 | Source material identifier | CCB_US3_UF1 |
| | Project relevance | Biotechnology, pathway, extremophile |
Fig. 3 Graphical circular map of the Thermus sp. CCB_US3_UF1 chromosome and plasmid pTCCB09. a Chromosome. b Plasmid. From the inside to the outside, the second and fourth circles show GC skew and G + C content, respectively. The sixth and seventh circles show protein-coding genes on the positive and negative strands and RNA genes (tRNAs red, rRNAs light purple, other RNAs grey). This figure was generated by CGView [56]
Genome statistics
| Attribute | Value | % of total^a |
|---|---|---|
| Genome size (bp) | 2,263,488 | 100.00 |
| DNA coding (bp) | 2,137,656 | 94.44 |
| DNA G + C (bp) | 1,552,285 | 68.58 |
| DNA scaffolds | 1 | 100.00 |
| Total genes^b | 2,333 | 100.00 |
| Protein coding genes | 2,279 | 97.64 |
| RNA genes | 54 | 2.31 |
| Pseudo genes | 1 | 0.04 |
| Genes in internal clusters | 822 | 36.07 |
| Genes with function prediction | 2,072 | 90.92 |
| Genes assigned to COGs | 2,098 | 89.89 |
| Genes with Pfam domains | 1,469 | 64.46 |
| Genes with signal peptides | 113 | 4.96 |
| Genes with transmembrane helices | 460 | 20.18 |
| CRISPR repeats | 8 | 0.34 |
^a The total is based on either the size of the genome in base pairs or the total number of protein-coding genes in the annotated genome
^b Pseudogenes may also be counted as protein-coding or RNA genes, so their number is not additive under the total gene count
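As footnote a indicates, the percentage column uses different denominators depending on the row: base-pair attributes are divided by the genome size, while the gene-annotation rows are divided by the number of protein-coding genes. The short sketch below recomputes a few rows under that reading. The helper name `percent_of_total` and the constant names are illustrative, not taken from the source.

```python
def percent_of_total(count, total, digits=2):
    """Return count as a percentage of total, rounded to the given digits."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * count / total, digits)


# Values copied from the genome statistics table.
GENOME_SIZE_BP = 2_263_488
PROTEIN_CODING_GENES = 2_279

# Base-pair attributes use the genome size as the denominator.
coding_pct = percent_of_total(2_137_656, GENOME_SIZE_BP)
gc_pct = percent_of_total(1_552_285, GENOME_SIZE_BP)

# Annotation attributes use the protein-coding gene count as the denominator.
function_pct = percent_of_total(2_072, PROTEIN_CODING_GENES)
pfam_pct = percent_of_total(1_469, PROTEIN_CODING_GENES)

print(coding_pct, gc_pct, function_pct, pfam_pct)
```

Under these assumptions the four values come out as 94.44, 68.58, 90.92 and 64.46, matching the corresponding table rows.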
Number of genes associated with general COG functional categories
| Code | Value | % age^a | Description |
|---|---|---|---|
| J | 147 | 6.4 | Translation, ribosomal structure and biogenesis |
| A | 23 | 1.0 | RNA processing and modification |
| K | 97 | 4.2 | Transcription |
| L | 115 | 5.0 | Replication, recombination and repair |
| B | 3 | 0.1 | Chromatin structure and dynamics |
| D | 38 | 1.6 | Cell cycle control, cell division, chromosome partitioning |
| Y | 0 | 0.0 | Nuclear structure |
| V | 27 | 1.2 | Defense mechanisms |
| T | 77 | 3.3 | Signal transduction mechanisms |
| M | 91 | 3.9 | Cell wall/membrane biogenesis |
| N | 63 | 2.7 | Cell motility |
| Z | 0 | 0.0 | Cytoskeleton |
| W | 0 | 0.0 | Extracellular structures |
| U | 50 | 2.2 | Intracellular trafficking and secretion |
| O | 90 | 3.9 | Posttranslational modification, protein turnover, chaperones |
| C | 145 | 6.3 | Energy production and conversion |
| G | 138 | 6.0 | Carbohydrate transport and metabolism |
| E | 247 | 10.7 | Amino acid transport and metabolism |
| F | 71 | 3.1 | Nucleotide transport and metabolism |
| H | 115 | 5.0 | Coenzyme transport and metabolism |
| I | 95 | 4.1 | Lipid transport and metabolism |
| P | 95 | 4.1 | Inorganic ion transport and metabolism |
| Q | 56 | 2.4 | Secondary metabolites biosynthesis, transport and catabolism |
| R | 310 | 13.4 | General function prediction only |
| S | 215 | 9.3 | Function unknown |
| - | 181 | 7.8 | Not in COGs |
^a The total is based on the total number of protein-coding genes in the genome
Comparison of genome features of different species of Thermus
| Species | Thermus sp. CCB_US3_UF1 | | | |
|---|---|---|---|---|
| Genome size (bp) | 2,263,488 | 2,127,482 | 2,116,056 | 2,355,186 |
| G + C content (%) | 68.6 | 69.4 | 69.5 | 64.9 |
| Number of protein coding genes | 2,279 | 2,210 | 2,173 | 2,458 |
| Coding area (%) | 94.4 | 94.8 | 94.9 | 94.0 |
| Total number of genes | 2,333 | 2,263 | 2,226 | 2,511 |
| Hypothetical genes | 742 | 734 | 758 | 619 |
| Proteins with assigned function | 1,537 | 1,476 | 1,415 | 1,839 |
| rRNA | 6 | 6 | 6 | 6 |
| tRNA | 48 | 47 | 47 | 47 |
| Transposase | 13 | 18 | 18 | 22 |
| CRISPR sequences | 8 | 10 | 11 | 3 |
Table adapted from NCBI
Fig. 4 Comparison of competence proteins between Thermus sp. CCB_US3_UF1 and other Thermus-related species using MAUVE alignments
Direct repeat consensus sequences of CRISPR loci
| CRISPR locus | Direct repeat consensus |
|---|---|
| 1 | GTAGTCCCCACGCACGTGGGGATGGACC |
| 2 | GTTTCAAACCCTCATAGGTACGGTCAGAAC |
| 3 | CTTTGAACCGTACCTATAAGGGTTTGAAAC |
| 4 | CTTTGAACCGTACCTATAAGGGTTTGAAAC |
| 5 | GTTGCAAAAGTGGCTTCCCCGCAAGGGGATTGCGAC |
| 6 | GTCGCAATCCCCTTACGGGGAAGCCACTTTTGCAAC |
| 7 | GTCGCAATCCCCTTACGGGGAAGCCACTTTTGCAAC |
| 8 | CGTAGTCCCCACACGCGTGGGGATGGACC |
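Several of the direct repeats in the table look related, either as exact duplicates or as near reverse complements of one another. The sketch below shows one way to check this; it is an illustrative comparison, not the method used to detect the CRISPR loci, and the helper names `reverse_complement` and `hamming` are hypothetical.

```python
# Direct repeat consensus sequences, copied from the table above.
REPEATS = {
    1: "GTAGTCCCCACGCACGTGGGGATGGACC",
    2: "GTTTCAAACCCTCATAGGTACGGTCAGAAC",
    3: "CTTTGAACCGTACCTATAAGGGTTTGAAAC",
    4: "CTTTGAACCGTACCTATAAGGGTTTGAAAC",
    5: "GTTGCAAAAGTGGCTTCCCCGCAAGGGGATTGCGAC",
    6: "GTCGCAATCCCCTTACGGGGAAGCCACTTTTGCAAC",
    7: "GTCGCAATCCCCTTACGGGGAAGCCACTTTTGCAAC",
    8: "CGTAGTCCCCACACGCGTGGGGATGGACC",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def hamming(a, b):
    """Count mismatching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


# Group loci that share exactly the same repeat consensus.
identical = {}
for locus, seq in REPEATS.items():
    identical.setdefault(seq, []).append(locus)
shared = [loci for loci in identical.values() if len(loci) > 1]

# Compare locus 5 on the opposite strand with locus 6.
mismatches_5_vs_6 = hamming(reverse_complement(REPEATS[5]), REPEATS[6])
print(shared, mismatches_5_vs_6)
```

On the sequences as listed, loci 3 and 4 share one consensus, loci 6 and 7 share another, and the reverse complement of the locus 5 repeat differs from the locus 6 repeat at a single position.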