
Complete genome sequence of the thermophilic Thermus sp. CCB_US3_UF1 from a hot spring in Malaysia.

Beng Soon Teh, Nyok-Sean Lau, Fui Ling Ng, Ahmad Yamin Abdul Rahman, Xuehua Wan, Jennifer A Saito, Shaobin Hou, Aik-Hong Teh, Nazalan Najimudin, Maqsudul Alam.

Abstract

Thermus sp. strain CCB_US3_UF1 is a thermophilic bacterium of the genus Thermus, a member of the family Thermaceae. Members of the genus Thermus have been widely used as biological models for structural biology studies and for understanding the mechanisms of microbial adaptation to thermal environments. Here, we present the complete genome sequence of Thermus sp. CCB_US3_UF1, isolated from a hot spring in Malaysia, which is the fifth member of the genus Thermus with a completely sequenced and publicly available genome (GenBank date of release: December 2, 2011). Thermus sp. CCB_US3_UF1 has the third largest genome within the genus. The complete 2.26 Mb genome comprises a chromosome of 2.24 Mb and a plasmid of 19.7 kb. The genome contains 2279 protein-coding and 54 RNA genes. In addition, the genome revealed potential pathways for the synthesis of secondary metabolites (isoprenoid) and pigments (carotenoid).


Keywords:  Extremophile; Hot spring; Thermophile; Thermus

Year:  2015        PMID: 26457128      PMCID: PMC4599208          DOI: 10.1186/s40793-015-0053-6

Source DB:  PubMed          Journal:  Stand Genomic Sci        ISSN: 1944-3277


Introduction

Thermus spp. are Gram-negative, aerobic, non-sporulating, rod-shaped thermophilic bacteria. Thermus aquaticus was the first bacterium of the genus to be discovered, in several of the hot springs in Yellowstone National Park, United States [1]. A few years later, two strains of Thermus thermophilus (HB27 and HB8) were isolated from thermal environments in Japan [2, 3]. To date, many strains of Thermus have been isolated from various geothermal environments such as hot springs and deep-sea hydrothermal vents. In addition to surviving in thermal environments, Thermus can also thrive in environments with extreme pH values, demonstrating a great capacity for adaptation to various environmental conditions. The whole genome sequences of two strains of T. thermophilus, HB8 and HB27, were independently completed in 2004 [4, 5]. The genome of a second species, Thermus scotoductus SA-01, is also available [6].

T. thermophilus has attracted attention as a model organism for structural biology studies because protein complexes from extremophiles are easier to crystallize than their mesophilic counterparts [7]. Breakthrough examples of large complexes from thermophiles that have been crystallized include the 70S ribosome [8], the bacterial RNA polymerase [9, 10] and the respiratory complex I [11] from Thermus spp., whose structures were solved before those of Escherichia coli.

Members of the genus Thermus are also of considerable biotechnological interest as sources of thermophilic enzymes [12, 13]. Thermozymes and proteins from the genus are good candidates for industrial processes because of their high thermal stability and co-solvent compatibility. The best-known enzyme mined from the genus is Taq DNA polymerase, an important enzyme used in PCR. Besides Taq DNA polymerase, thermozymes from this genus are widely used in the food, pharmaceutical and paper-pulp industries [7]. Examples of industrial applications for thermostable enzymes include organic synthesis (e.g. esterases, lipases, proteases), starch processing (e.g. α-amylases, glucose isomerases), pulp and paper manufacturing (e.g. xylanases), and animal feed and human food production (amino acid and vitamin synthesis) [13, 14]. Here, we present a summary of the classification and a set of features for Thermus sp. CCB_US3_UF1, together with a description of the complete genome sequence and annotation.

Organism information

Classification and features

Thermus spp. are suggested to be closely related to the genus Deinococcus based on several comparative studies of 16S rRNA and protein sequences, and together they form a distinct branch known as the Deinococcus-Thermus group [15, 16]. Nevertheless, the exact phylogenetic position of the Deinococcus-Thermus phylum remains to be determined. Based on 16S rRNA sequence comparison, this phylum was proposed to be among the oldest groups of the domain Bacteria [17]. A more in-depth analysis of the phylogeny of the phylum based on conserved orthologs can be carried out now that genome sequences from both genera are available [18].

To better understand the phylogeny of Thermus sp. CCB_US3_UF1, we constructed a phylogenetic tree based on 16S rRNA gene sequences. There are two identical copies of the 16S rRNA gene in the Thermus sp. CCB_US3_UF1 genome. One copy of the gene sequence was used to search the nucleotide database using NCBI BLASTN [19]. The BLASTN result shows that it has the highest sequence identity to Thermus igniterrae RF-4 (97 %, Y18406), Thermus brockianus YS38 (96 %, Z15062), and Thermus scotoductus Se-1 (95 %, AF032127). Figure 1 shows the phylogenetic neighborhood of Thermus sp. CCB_US3_UF1 relative to type strains of the families Deinococcaceae and Thermaceae. The sequence with accession D38365 was used as an outgroup to root the tree.
Fig. 1

Phylogenetic tree highlighting the position of Thermus sp. CCB_US3_UF1 relative to the other type strains within the families Deinococcaceae and Thermaceae. Strains shown are those within the Deinococcaceae and Thermaceae having the corresponding NCBI genome project ids listed within [53]. The tree used sequences aligned by the Ribosomal Database Project (RDP) aligner and the Jukes-Cantor corrected distance model. The distance matrix was constructed based on alignment model positions without the use of alignment inserts, and a minimum comparable position of 200 was used. The tree was constructed with RDP Tree Builder, which used Weighbor [54] with an alphabet size of 4 and a length size of 1000. Tree building involved a bootstrapping process that was repeated 100 times to generate a majority consensus tree [55]
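The Jukes-Cantor correction named in the caption can be illustrated with a short sketch. This is not the RDP implementation; the function names are hypothetical, and the only assumption is the standard JC69 formula d = -(3/4) ln(1 - (4/3) p), where p is the observed proportion of differing sites between two aligned sequences.

```python
import math


def observed_difference(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Positions where either sequence has a gap ('-') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != "-" and b != "-"
    ]
    if not pairs:
        raise ValueError("no comparable positions")
    return sum(a != b for a, b in pairs) / len(pairs)


def jukes_cantor_distance(p: float) -> float:
    """JC69-corrected distance for an observed difference proportion p."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction to exist")
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p)


if __name__ == "__main__":
    # 97 % identity (as reported for the closest BLASTN hit) means p = 0.03.
    print(round(jukes_cantor_distance(0.03), 4))
```

For closely related sequences the corrected distance stays close to the raw difference; the correction grows as p approaches the saturation value of 0.75.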

Thermus sp. CCB_US3_UF1 is a Gram-negative bacterium (Table 1) with a rod-shaped, filamentous structure (Fig. 2). Members of the genus Thermus are capable of growing at temperatures between 45 °C and 83 °C [20]. Most of them have a maximum growth temperature slightly below 80 °C [21, 22]. Interestingly, a few strains of Thermus can grow at 80 °C or above [23]. Thermus sp. CCB_US3_UF1 was isolated from a hot spring in Ulu Slim, Perak, Malaysia. It grows well between 60 °C and 70 °C. Thermus spp. need carbohydrates, amino acids, carboxylic acids and peptides as sources of carbon and energy. Strain CCB_US3_UF1 is an aerobic, non-sporulating, non-motile and yellow-pigmented bacterium. Some members of the genus are capable of growing anaerobically using nitrate as an electron acceptor, and some can even reduce nitrite [22, 23].
Table 1

Classification and general features of Thermus sp. CCB_US3_UF1 according to the MIGS recommendations [57]

| MIGS ID | Property | Term | Evidence code (a) |
| | Classification | Domain Bacteria | TAS [17] |
| | | Phylum Deinococcus-Thermus | TAS [58] |
| | | Class Deinococci | TAS [59, 60] |
| | | Order Thermales | TAS [60, 61] |
| | | Family Thermaceae | TAS [60, 62] |
| | | Genus Thermus | TAS [1, 63, 64] |
| | | Species Unknown | IDA |
| | | Type strain CCB_US3_UF1 | IDA |
| | Gram stain | Negative | IDA |
| | Cell shape | Rod | IDA |
| | Motility | Non-motile | NAS |
| | Sporulation | Non-sporulating | NAS |
| | Temperature range | Thermophile (45-83 °C) | TAS [20] |
| | Optimum temperature | 60 °C | IDA |
| | pH range; Optimum | Not reported | |
| | Carbon source | Not reported | |
| MIGS-6 | Habitat | Hot springs | IDA |
| MIGS-6.3 | Salinity | Not reported | |
| MIGS-22 | Oxygen requirement | Aerobic | NAS |
| MIGS-15 | Biotic relationship | Free-living | NAS |
| MIGS-14 | Pathogenicity | Non-pathogen | NAS |
| MIGS-4 | Geographic location | Ulu Slim, Perak, Malaysia | IDA |
| MIGS-5 | Sample collection | 2009 | IDA |
| MIGS-4.1 | Latitude | 3.898822°N | IDA |
| MIGS-4.2 | Longitude | 101.497911°E | IDA |
| MIGS-4.4 | Altitude | 51 m | IDA |

(a) Evidence codes: IDA, Inferred from Direct Assay; TAS, Traceable Author Statement (i.e., a direct report exists in the literature); NAS, Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [65]

Fig. 2

Transmission electron micrograph of Thermus sp. CCB_US3_UF1


Genome sequencing and annotation

Genome project history

The genus Thermus belongs to one of the oldest evolutionary branches of the domain Bacteria. Genome sequencing of Thermus sp. CCB_US3_UF1 was initiated because the strain can serve as a model bacterium for studying the evolution of thermophilic adaptation. The sequencing and finishing of the genome were completed at the Advanced Studies in Genomics, Proteomics and Bioinformatics (University of Hawaii) and the TEDA School of Biological Sciences and Biotechnology (Nankai University, China). The genome annotation was performed at the Centre for Chemical Biology (Universiti Sains Malaysia). The genome sequence was first published in March 2012 [24]. A summary of the project information is shown in Table 2.
Table 2

Project information

| MIGS ID | Property | Term |
| MIGS-31 | Finishing quality | Finished |
| MIGS-28 | Libraries used | Two genomic libraries: one 454 PE library (3 kb insert size), one Illumina library (3 kb insert size) |
| MIGS-29 | Sequencing platforms | Illumina GA IIx, 454 GS FLX Titanium |
| MIGS-31.2 | Fold coverage | 115× (Illumina); 21.14× (454) |
| MIGS-30 | Assemblers | Newbler v 2.3, Burrows-Wheeler Alignment (BWA) |
| MIGS-32 | Gene calling method | Glimmer 3.02 |
| | Locus tag | TCCBUS3UF1 |
| | GenBank ID | CP003126, CP003127 |
| | GenBank date of release | December 2, 2011 |
| | GOLD ID | Gp0013444 |
| | BIOPROJECT | PRJN76491 |
| MIGS-13 | Source material identifier | CCB_US3_UF1 |
| | Project relevance | Biotechnology, pathway, extremophile |

Growth conditions and genomic DNA preparation

Thermus sp. CCB_US3_UF1 was grown aerobically to late exponential phase in 50 ml of ATCC medium 697 (Thermus medium) [3] at 60 °C. Genomic DNA was isolated using a modified phenol-chloroform extraction protocol [25]. The quality of the DNA was checked by 0.5 % agarose gel electrophoresis and its quantity was measured with a NanoDrop 2000 Spectrophotometer (Thermo Scientific, Wilmington, Delaware, USA). A DNA concentration of 363.4 ng/μl and an OD260/OD280 ratio of 1.90 were obtained.

Genome sequencing and assembly

Whole-genome sequencing of Thermus sp. CCB_US3_UF1 was performed using Roche 454 and Illumina paired-end sequencing technologies. A 3 kb genomic library was constructed, and 97,991 paired-end reads and 54,397 single-end reads were generated using the GS FLX system, providing 21.14-fold genome coverage. Six large scaffolds comprising 51 contigs were assembled from 97.09 % of the reads using the 454 Newbler assembly software (454 Life Sciences, Branford, CT). A total of 3,469,788 reads from a 3 kb library were produced with an Illumina GA IIx (Illumina, San Diego, CA) to reach a depth of 115-fold coverage. These reads were mapped to the scaffolds using the Burrows-Wheeler Alignment (BWA) tool [26]. The majority of the gaps within the scaffolds were filled by local assembly of 454 and Illumina reads. The gaps between the scaffolds were filled by sequencing PCR products using an ABI 3730xl capillary sequencer. PCR products were also sequenced to verify repeats larger than 3 kb. Putative sequencing errors were verified and corrected by consensus of the Roche/454 and Illumina reads.
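The relationship between read count, read length and fold coverage can be checked with a small sketch. The text reports the number of Illumina reads and the resulting coverage but not the read length, so the sketch back-calculates the mean read length those figures imply. The function names are hypothetical, and the genome size is the total from Table 3.

```python
def fold_coverage(n_reads: int, mean_read_len: float, genome_size: int) -> float:
    """Average sequencing depth: total sequenced bases divided by genome size."""
    return n_reads * mean_read_len / genome_size


def implied_mean_read_length(n_reads: int, coverage: float, genome_size: int) -> float:
    """Mean read length needed for n_reads to reach the stated coverage."""
    return coverage * genome_size / n_reads


GENOME_SIZE_BP = 2_263_488   # chromosome + plasmid (Table 3)
ILLUMINA_READS = 3_469_788   # reads reported for the Illumina library
ILLUMINA_COVERAGE = 115.0    # fold coverage reported for Illumina

if __name__ == "__main__":
    read_len = implied_mean_read_length(
        ILLUMINA_READS, ILLUMINA_COVERAGE, GENOME_SIZE_BP
    )
    print(f"implied mean Illumina read length: {read_len:.1f} bp")
    # Round trip: plugging the implied length back in recovers the coverage.
    print(f"coverage check: {fold_coverage(ILLUMINA_READS, read_len, GENOME_SIZE_BP):.1f}x")
```

The same functions could be applied to the 454 data if the mean read length were known; it is not stated in the text, so it is left out here.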

Genome annotation

Automated annotation of the genome was done using the DIYA (Do-It-Yourself Annotator) pipeline [27]. The pipeline uses Glimmer3 to predict open reading frames [28], followed by protein similarity searches using BLAST [19] against UNIREF [29], RPS-BLAST against CDD [30], and Asgard [31]. In addition, RPS-BLAST searches against the COG database were done to enable assignment of COG functional categories to the ORFs. Transfer RNAs were predicted using tRNAscan-SE [32], while ribosomal RNAs were identified using RNAmmer [33].

Genome properties

The complete genome of Thermus sp. CCB_US3_UF1 is composed of a single circular chromosome of 2,243,772 bp and a plasmid of 19,716 bp with G + C contents of 68.6 % and 65.6 %, respectively (Fig. 3). There are 2334 predicted coding sequences (CDS), 2 rRNA operons, and 48 tRNA genes in the chromosome (Table 3). A total of 32 CDS are predicted in the plasmid. The distribution of genes into COG functional categories is presented in Table 4.
Fig. 3

Graphical circular map of the Thermus sp. CCB_US3_UF1 chromosome and plasmid pTCCB09. a Chromosome. b Plasmid. From the inside to outside, the second and fourth circles show GC skew and G + C content respectively. The sixth and seventh circles show protein coding genes in positive and negative strands and RNA genes (tRNAs red, rRNAs light purple, other RNAs grey). This figure was generated by CGView [56]

Table 3

Genome statistics

| Attribute | Value | % of Total (a) |
| Genome size (bp) | 2,263,488 | 100.00 |
| DNA coding (bp) | 2,137,656 | 94.44 |
| DNA G + C (bp) | 1,552,285 | 68.58 |
| DNA scaffolds | 1 | 100.00 |
| Total genes (b) | 2,333 | 100.00 |
| Protein coding genes | 2,279 | 97.64 |
| RNA genes | 54 | 2.31 |
| Pseudo genes | 1 | 0.04 |
| Genes in internal clusters | 822 | 36.07 |
| Genes with function prediction | 2,072 | 90.92 |
| Genes assigned to COGs | 2,098 | 89.89 |
| Genes with Pfam domains | 1,469 | 64.46 |
| Genes with signal peptides | 113 | 4.96 |
| Genes with transmembrane helices | 460 | 20.18 |
| CRISPR repeats | 8 | 0.34 |

(a) The total is based on either the size of the genome in base pairs or the total number of protein coding genes in the annotated genome

(b) Pseudogenes may also be counted as protein coding or RNA genes, so their number is not additive under the total gene count
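Several entries in the genome statistics table can be re-derived from the replicon sizes given in the text. The following is a minimal sketch with hypothetical helper names: it sums the chromosome and plasmid lengths, recomputes a few percentages from the table, and estimates the genome-wide G + C content as a length-weighted mean of the per-replicon values (which are themselves rounded to one decimal place, so the estimate is only approximate).

```python
CHROMOSOME_BP = 2_243_772
PLASMID_BP = 19_716
CHROMOSOME_GC_PCT = 68.6   # rounded value from the text
PLASMID_GC_PCT = 65.6      # rounded value from the text


def percent(part: float, whole: float) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


def weighted_gc(lengths, gc_percents) -> float:
    """Length-weighted mean G + C percentage over several replicons."""
    total = sum(lengths)
    return sum(l * gc for l, gc in zip(lengths, gc_percents)) / total


genome_bp = CHROMOSOME_BP + PLASMID_BP

if __name__ == "__main__":
    print("genome size (bp):", genome_bp)
    print("coding fraction (%):", round(percent(2_137_656, genome_bp), 2))
    print("G + C from base count (%):", round(percent(1_552_285, genome_bp), 2))
    print("RNA genes (% of total genes):", round(percent(54, 2_333), 2))
    print(
        "G + C from replicon values (%):",
        round(weighted_gc([CHROMOSOME_BP, PLASMID_BP],
                          [CHROMOSOME_GC_PCT, PLASMID_GC_PCT]), 2),
    )
```

Because the plasmid is under 1 % of the genome, its lower G + C content shifts the genome-wide value only slightly.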

Table 4

Number of genes associated with general COG functional categories

| Code | Value | % age (a) | Description |
| J | 147 | 6.4 | Translation, ribosomal structure and biogenesis |
| A | 23 | 1.0 | RNA processing and modification |
| K | 97 | 4.2 | Transcription |
| L | 115 | 5.0 | Replication, recombination and repair |
| B | 3 | 0.1 | Chromatin structure and dynamics |
| D | 38 | 1.6 | Cell cycle control, cell division, chromosome partitioning |
| Y | 0 | 0.0 | Nuclear structure |
| V | 27 | 1.2 | Defense mechanisms |
| T | 77 | 3.3 | Signal transduction mechanisms |
| M | 91 | 3.9 | Cell wall/membrane biogenesis |
| N | 63 | 2.7 | Cell motility |
| Z | 0 | 0.0 | Cytoskeleton |
| W | 0 | 0.0 | Extracellular structures |
| U | 50 | 2.2 | Intracellular trafficking and secretion |
| O | 90 | 3.9 | Posttranslational modification, protein turnover, chaperones |
| C | 145 | 6.3 | Energy production and conversion |
| G | 138 | 6.0 | Carbohydrate transport and metabolism |
| E | 247 | 10.7 | Amino acid transport and metabolism |
| F | 71 | 3.1 | Nucleotide transport and metabolism |
| H | 115 | 5.0 | Coenzyme transport and metabolism |
| I | 95 | 4.1 | Lipid transport and metabolism |
| P | 95 | 4.1 | Inorganic ion transport and metabolism |
| Q | 56 | 2.4 | Secondary metabolites biosynthesis, transport and catabolism |
| R | 310 | 13.4 | General function prediction only |
| S | 215 | 9.3 | Function unknown |
| - | 181 | 7.8 | Not in COGs |

(a) The total is based on the total number of protein coding genes in the genome


Comparison with other sequenced genomes

The genome of Thermus sp. CCB_US3_UF1 (2.26 Mb) is larger than those of T. thermophilus HB27 (2.13 Mb) and T. thermophilus HB8 (2.12 Mb), but smaller than that of T. scotoductus SA-01 (2.36 Mb) (Table 5).
Table 5

Comparison of genome features of different species of Thermus

| Species | Thermus sp. CCB_US3_UF1 | Thermus thermophilus HB27 | Thermus thermophilus HB8 | Thermus scotoductus SA-01 |
| Genome size (bp) | 2,263,488 | 2,127,482 | 2,116,056 | 2,355,186 |
| G + C content (%) | 68.6 | 69.4 | 69.5 | 64.9 |
| Number of protein coding genes | 2,279 | 2,210 | 2,173 | 2,458 |
| Coding area (%) | 94.4 | 94.8 | 94.9 | 94.0 |
| Total number of genes | 2,333 | 2,263 | 2,226 | 2,511 |
| Hypothetical genes | 742 | 734 | 758 | 619 |
| Proteins with assigned function | 1,537 | 1,476 | 1,415 | 1,839 |
| rRNA | 6 | 6 | 6 | 6 |
| tRNA | 48 | 47 | 47 | 47 |
| Transposase | 13 | 18 | 18 | 22 |
| CRISPR sequences | 8 | 10 | 11 | 3 |

Table adapted from NCBI

The Thermus sp. CCB_US3_UF1 genome was compared against closely related genomes using BLAST and the Artemis Comparison Tool to identify regions of synteny. The three closest relatives with sequenced genomes (T. thermophilus strains HB27 and HB8, and T. scotoductus SA-01) were selected for the comparison. The genome of strain HB27 consists of a chromosome (1.89 Mb) and a megaplasmid (0.23 Mb). Strain HB8 has a chromosome of 1.85 Mb, a megaplasmid (0.26 Mb) and a plasmid (9.3 kb) [5]. The genome of T. scotoductus SA-01 includes a 2.3 Mb chromosome and a plasmid of 8.4 kb. Thermus sp. CCB_US3_UF1, HB27, HB8 and SA-01 all have a small genome of less than 2.5 Mb. They also display a high G + C content that may correlate with their thermophilic lifestyle. CCB_US3_UF1 has a higher number of predicted protein coding sequences (2279) than HB27 (2210) and HB8 (2173), but a lower number than SA-01 (2458). The four strains also share a similar number of rRNA (16S-23S-5S) operons with a well-balanced high G + C content above 60 %, a common feature of thermophilic bacteria. The number of tRNAs in the four genomes is between 47 and 48. In terms of transposase genes, SA-01 has the highest number (22 genes), followed by HB27 and HB8 (18 genes each) and CCB_US3_UF1 (13 genes). Interestingly, no prophage-related genes are found in these four genomes, implying the occurrence of clustered regularly interspaced short palindromic repeats (CRISPRs). CRISPR is characterized as a type of antiviral immune system found in Bacteria and Archaea [34].

There are 1728 proteins from Thermus sp. CCB_US3_UF1, or 76 % of its total proteins, that are orthologous to proteins in HB27, and a total of 1691 (74 %) orthologs are shared between CCB_US3_UF1 and HB8. Meanwhile, a total of 1885 (83 %) proteins are shared between CCB_US3_UF1 and SA-01, showing greater similarity between these two strains. The protein ortholog mapping was done using protein-protein BLAST (blastp) with an e-value cut-off of 10^-5. Despite the similarity of many of their gene products, genome-wide synteny between Thermus sp. CCB_US3_UF1 and HB27, HB8 and SA-01 could not be detected.

The plasmid of Thermus sp. CCB_US3_UF1 shows no overall similarity to the sequenced plasmids of HB27 and HB8, but it has high similarity to the plasmid of Thermus sp. 4C, designated pL4C [35]. The gene encoding a chromosome segregation ATPase that is found in pL4C is also present in the Thermus sp. CCB_US3_UF1 plasmid (TCCBUS3UF1_p160). This protein has been suggested to play an essential role in plasmid replication and partition [36]. In addition, a putative integrase gene that facilitates gene transfer and chromosome modification can be found in both plasmids.
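The e-value cut-off used for the ortholog mapping can be illustrated with a sketch that filters blastp results. It assumes the results were saved in the standard 12-column BLAST tabular format (query id, subject id, percent identity, alignment length, mismatches, gap opens, query start, query end, subject start, subject end, e-value, bit score). Keeping the single best-scoring hit per query is an assumption made for illustration; the text does not say whether a one-way or reciprocal best-hit criterion was used. Function names and the example identifiers are hypothetical.

```python
from typing import Dict, Iterable, Tuple


def best_hits(lines: Iterable[str], max_evalue: float = 1e-5) -> Dict[str, Tuple[str, float, float]]:
    """Map each query to its best subject hit passing the e-value cut-off.

    Returns {query_id: (subject_id, evalue, bitscore)}; the hit with the
    highest bit score is kept when a query has several passing hits.
    """
    best: Dict[str, Tuple[str, float, float]] = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # not a complete tabular record
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue > max_evalue:
            continue
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best


def shared_fraction(n_shared: int, n_total: int) -> float:
    """Percentage of proteins with an ortholog in the other genome."""
    return 100.0 * n_shared / n_total


if __name__ == "__main__":
    # Hypothetical records, for illustration only.
    demo = [
        "protA\tsubj1\t85.0\t200\t30\t0\t1\t200\t1\t200\t1e-50\t350",
        "protA\tsubj2\t60.0\t180\t72\t0\t1\t180\t5\t184\t1e-20\t150",
        "protB\tsubj3\t30.0\t50\t35\t0\t1\t50\t1\t50\t0.5\t25",
    ]
    print(best_hits(demo))
    # Reported counts: 1728 of 2279 proteins have an ortholog in HB27.
    print(round(shared_fraction(1728, 2279)))
```

With the reported counts, `shared_fraction` reproduces the rounded percentages quoted for HB27, HB8 and SA-01.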

Insights from the genome sequence

The Thermus sp. CCB_US3_UF1 genome encodes genes for a complete tricarboxylic acid cycle, gluconeogenesis, the glyoxylate bypass and the Embden-Meyerhof pathway. Thermus sp. CCB_US3_UF1 and T. thermophilus HB27 share similar sets of genes involved in aerobic respiration. At high temperatures, the solubility of oxygen in water is low. Two terminal cytochrome oxidases are encoded in the CCB_US3_UF1 genome: a caa3-type oxidase (TCCBUS3UF1_540-550) that is expressed under high oxygen levels, and a ba3-type oxidase (TCCBUS3UF1_13990, TCCBUS3UF1_14010) that is expressed under low oxygen supply [37, 38]. Thermus sp. CCB_US3_UF1 is able to synthesize many important compounds, including amino acids, vitamins, cofactors, carriers, purines and pyrimidines. Many of these biosynthetic pathways show a high degree of conservation between CCB_US3_UF1 and HB27. In addition, Thermus sp. CCB_US3_UF1 has branched-chain amino acid ABC transport systems that are important for nutrient acquisition, and ion transporters for the elimination of toxic compounds such as copper and arsenite.

Motility and natural transformation

So far, motility has not been observed in Thermus, and no flagellar biosynthesis gene is present in the genomes. However, genes encoding gliding motility proteins (TCCBUS3UF1_13970, TCCBUS3UF1_13980) and a twitching motility protein (PilT, TCCBUS3UF1_9080) are found in the genome of Thermus sp. CCB_US3_UF1. These proteins are also found in HB27, HB8 and SA-01, which raises the question of whether motility exists in Thermus. Thermus sp. CCB_US3_UF1 also possesses type IV pili, which are crucial for attachment, twitching motility, surface colonization, and natural transformation in bacteria [39].

An efficient DNA uptake system in Thermus is crucial to thermoadaptation and the exchange of genetic material in high-temperature environments. Competence proteins play an important role in natural transformation and can be categorized into three groups: DNA-translocator-specific proteins, type IV pili (Tfp)-related proteins, and nonconserved proteins [40]. Genes encoding DNA-translocator-specific proteins [ComEA (TCCBUS3UF1_22560), ComEC (TCCBUS3UF1_22570), DprA (TCCBUS3UF1_18680)] and Tfp-related proteins [PilA1 (TCCBUS3UF1_8740), PilA2 (TCCBUS3UF1_8720), PilA3 (TCCBUS3UF1_8710)] were found in the Thermus sp. CCB_US3_UF1 genome. Genes encoding a leader peptidase (PilD, TCCBUS3UF1_20930), a traffic NTPase (PilF, TCCBUS3UF1_21340), an inner membrane protein (PilC, TCCBUS3UF1_8100), a PilM homolog and a secretin-like protein (PilQ, TCCBUS3UF1_6320) were also identified. In addition, genes encoding the competence proteins ComZ (TCCBUS3UF1_870), PilN (TCCBUS3UF1_6350), PilO (TCCBUS3UF1_6340), and PilW (TCCBUS3UF1_6330) are present in the chromosome of CCB_US3_UF1. The genes encoding PilM, PilN, PilO, PilW, and PilQ cluster together in the genome (Fig. 4). The arrangement of these genes in Thermus sp. CCB_US3_UF1 differs from that in HB27, HB8 and SA-01, demonstrating the loss of synteny between CCB_US3_UF1 and the other strains. The involvement of pili in DNA uptake has yet to be determined.
Fig. 4

Comparison of competence proteins between Thermus sp. CCB_US3_UF1 and other Thermus-related species using MAUVE alignments


Genomic islands

Potential genomic islands in the Thermus sp. CCB_US3_UF1 genome were predicted using the IslandViewer database [41]. Early studies on genomic islands focused on regions that carry virulence factors, which are termed pathogenicity islands. Genomic islands have also been shown to carry various types of genes associated with many metabolic pathways or biological processes [42]. A total of 11 possible genomic islands were identified in the Thermus sp. CCB_US3_UF1 genome. Several of these genomic islands carry genes encoding proteins involved in transport systems and defense mechanisms. For example, genomic islands 2, 3, 6 and 7 contain numerous transporter genes that may be involved in membrane transport in Thermus sp. CCB_US3_UF1. It is interesting to note that CRISPR-associated Cas proteins, which are associated with phage immunity, are present on Genomic Island 8 (245986 - 276477) and Genomic Island 10 (1323615 - 1334721). Among the other members of Thermus, HB27 harbors 10 genomic islands, while HB8 and SA-01 each carry 13 genomic islands.
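The sizes of the two Cas-bearing islands follow directly from their coordinates. The sketch below assumes the coordinates are 1-based and inclusive on the chromosome (the usual convention for GenBank-style positions, but not stated in the text); the function and variable names are hypothetical.

```python
CHROMOSOME_BP = 2_243_772

# (start, end) coordinates as reported in the text.
ISLANDS = {
    "Genomic Island 8": (245_986, 276_477),
    "Genomic Island 10": (1_323_615, 1_334_721),
}


def island_length(start: int, end: int) -> int:
    """Length of an interval with 1-based, inclusive coordinates."""
    if end < start:
        raise ValueError("end must not precede start")
    return end - start + 1


if __name__ == "__main__":
    for name, (start, end) in ISLANDS.items():
        length = island_length(start, end)
        share = 100.0 * length / CHROMOSOME_BP
        print(f"{name}: {length} bp ({share:.2f} % of the chromosome)")
```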

CRISPR

CRISPR is an RNAi-like system, present in prokaryotes, that provides adaptive immunity against phages and other invading genetic elements [43]. Using the CRISPR Finder tool [44], eight CRISPR repeat regions were detected in the Thermus sp. CCB_US3_UF1 genome (Table 6). The numbers of spacers in these loci are 3, 17, 14, 23, 18, 9, 12, and 2, respectively, for a total of 98 spacers.
Table 6

Direct repeat consensus sequences of CRISPR loci

| CRISPR locus | Direct repeat consensus |
| 1 | GTAGTCCCCACGCACGTGGGGATGGACC |
| 2 | GTTTCAAACCCTCATAGGTACGGTCAGAAC |
| 3 | CTTTGAACCGTACCTATAAGGGTTTGAAAC |
| 4 | CTTTGAACCGTACCTATAAGGGTTTGAAAC |
| 5 | GTTGCAAAAGTGGCTTCCCCGCAAGGGGATTGCGAC |
| 6 | GTCGCAATCCCCTTACGGGGAAGCCACTTTTGCAAC |
| 7 | GTCGCAATCCCCTTACGGGGAAGCCACTTTTGCAAC |
| 8 | CGTAGTCCCCACACGCGTGGGGATGGACC |
A comparison with other Thermus strains revealed a total of 10, 11 and 3 CRISPRs in HB27, HB8 and SA-01, respectively. In terms of the number of spacers, HB8 has the most (112), followed by Thermus sp. CCB_US3_UF1 (98), SA-01 (87) and HB27 (74). The existence of a large number of CRISPRs in these genomes reflects an adaptation strategy employed by Thermus to protect itself from invasion by foreign DNA from the surrounding environment.
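The direct repeats in Table 6 can be compared programmatically, for instance to see which loci share a repeat and to confirm the spacer total. The sketch below is only an illustration: it treats a repeat and its reverse complement as the same sequence, which is an assumption about how one might group loci read from opposite strands, and the function names are hypothetical.

```python
from typing import Dict, List

# Direct repeat consensus sequences, keyed by CRISPR locus (Table 6).
REPEATS: Dict[int, str] = {
    1: "GTAGTCCCCACGCACGTGGGGATGGACC",
    2: "GTTTCAAACCCTCATAGGTACGGTCAGAAC",
    3: "CTTTGAACCGTACCTATAAGGGTTTGAAAC",
    4: "CTTTGAACCGTACCTATAAGGGTTTGAAAC",
    5: "GTTGCAAAAGTGGCTTCCCCGCAAGGGGATTGCGAC",
    6: "GTCGCAATCCCCTTACGGGGAAGCCACTTTTGCAAC",
    7: "GTCGCAATCCCCTTACGGGGAAGCCACTTTTGCAAC",
    8: "CGTAGTCCCCACACGCGTGGGGATGGACC",
}

# Spacer counts per locus, in locus order, as given in the text.
SPACERS_PER_LOCUS = [3, 17, 14, 23, 18, 9, 12, 2]

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def group_shared_repeats(repeats: Dict[int, str]) -> List[List[int]]:
    """Group loci whose repeats are identical on either strand.

    Only groups with more than one locus are returned.
    """
    groups: Dict[str, List[int]] = {}
    for locus, seq in sorted(repeats.items()):
        # Strand-independent key: the smaller of the two orientations.
        key = min(seq.upper(), reverse_complement(seq))
        groups.setdefault(key, []).append(locus)
    return [loci for loci in groups.values() if len(loci) > 1]


if __name__ == "__main__":
    print("total spacers:", sum(SPACERS_PER_LOCUS))
    print("loci sharing a repeat:", group_shared_repeats(REPEATS))
```

Exact matching is deliberately strict; near-identical repeats that differ by a single base would need a mismatch-tolerant comparison instead.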

Isoprenoid biosynthesis

Based on the genome information, Thermus sp. CCB_US3_UF1 synthesizes precursors for isoprenoid compounds from pyruvate and glyceraldehyde 3-phosphate using the deoxyxylulose phosphate (MEP/DOXP) pathway instead of the mevalonate pathway. Isoprenoid compounds are derived from the five-carbon precursor isopentenyl diphosphate (IPP). Genes encoding the enzymes of the complete DOXP pathway were identified in the genome. The DOXP pathway is initiated by the conversion of glyceraldehyde 3-phosphate and pyruvate to 1-deoxy-D-xylulose 5-phosphate (DOXP), catalyzed by DOXP synthase (TCCBUS3UF1_200). Isoprenoid synthesis then proceeds through a series of enzymatic reactions that lead to the formation of 2-C-methyl-D-erythritol-2,4-cyclodiphosphate. The genes encoding the enzymes involved are dxr (TCCBUS3UF1_15410), ispD (TCCBUS3UF1_19830), ispE (TCCBUS3UF1_19820), and ispF (TCCBUS3UF1_380). The genes gcpE (TCCBUS3UF1_18880) and lytB (TCCBUS3UF1_22000), which encode the enzymes for the last two steps of isoprenoid synthesis leading to IPP formation, are also present in the genome (Additional file 1: Figure S1).
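The pathway described above can be written down as an ordered lookup table, which makes it easy to check that every step has a locus assigned. This is a minimal sketch: the locus tags and the step order come from the text, while the gene symbol `dxs` for DOXP synthase and the helper names are assumptions added for illustration.

```python
from typing import List, NamedTuple


class PathwayStep(NamedTuple):
    gene: str
    locus_tag: str


# Steps in pathway order, from DOXP synthesis to IPP formation.
MEP_PATHWAY: List[PathwayStep] = [
    PathwayStep("dxs", "TCCBUS3UF1_200"),     # DOXP synthase; gene symbol assumed
    PathwayStep("dxr", "TCCBUS3UF1_15410"),
    PathwayStep("ispD", "TCCBUS3UF1_19830"),
    PathwayStep("ispE", "TCCBUS3UF1_19820"),
    PathwayStep("ispF", "TCCBUS3UF1_380"),
    PathwayStep("gcpE", "TCCBUS3UF1_18880"),
    PathwayStep("lytB", "TCCBUS3UF1_22000"),
]

EXPECTED_GENES = ["dxs", "dxr", "ispD", "ispE", "ispF", "gcpE", "lytB"]


def missing_steps(pathway: List[PathwayStep], expected: List[str]) -> List[str]:
    """Return expected genes that have no locus tag assigned in the pathway."""
    found = {step.gene for step in pathway if step.locus_tag}
    return [gene for gene in expected if gene not in found]


if __name__ == "__main__":
    gaps = missing_steps(MEP_PATHWAY, EXPECTED_GENES)
    print("pathway complete" if not gaps else f"missing steps: {gaps}")
```

The same structure could hold the carotenoid genes discussed in the next section, with a different expected gene list.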

Carotenoid biosynthesis

Most species of the genus Thermus are characterized by the ability to synthesize yellow carotenoid-like pigments [45]. Carotenoids are natural pigments that have been used commercially as food colorants and nutrient supplements and for pharmaceutical purposes [46]. T. thermophilus has been shown to produce carotenoids known as thermozeaxanthins and thermobiszeaxanthins [47]. As carotenoids are hydrophobic components associated with the cell membrane, it was suggested that they might have an essential role in stabilizing the membrane of Thermus at high temperature. In HB27, genes encoding the terminal steps of carotenoid biosynthesis are found on the large plasmid (pTT27), whereas precursor synthesis involving the formation of geranylgeranyl pyrophosphate (GGPP) is accomplished by enzymes encoded on the chromosome [5]. In Thermus sp. CCB_US3_UF1, genes encoding the enzymes for both the terminal and precursor steps of carotenoid biosynthesis are located on the chromosome (Additional file 2: Figure S2).

In the bacterial carotenoid biosynthetic pathway, phytoene is the first carotenoid synthesized, and it is formed by the condensation of two molecules of GGPP [48]. The GGPP synthase gene (TTHA0013) from HB8 has been identified and functionally characterized [49]. The gene has a homolog (TCCBUS3UF1_18840) in Thermus sp. CCB_US3_UF1. Phytoene is synthesized from GGPP by phytoene synthase (CrtB). In HB27, a gene encoding a homolog of phytoene synthase (TT_P0057) was cloned and identified as crtB. It was suggested that phytoene synthase is the rate-limiting enzyme in carotenoid biosynthesis in T. thermophilus. In addition, crtB of T. thermophilus was found to cluster together with other carotenogenic genes on the large plasmid [50]. Interestingly, the homolog of crtB (TCCBUS3UF1_10160) in Thermus sp. CCB_US3_UF1 is encoded on the chromosome and not on the plasmid.

Phytoene is then converted to lycopene via a series of desaturation steps that are catalyzed by phytoene desaturase (CrtI), cis-carotene isomerase (CrtH) and ζ-carotene desaturase [51]. In bacteria, only one phytoene desaturase, CrtI, has been detected. A gene encoding a CrtI homolog (TCCBUS3UF1_10090) is also detected in the genome of Thermus sp. CCB_US3_UF1 [5, 52]. Following lycopene synthesis, the carotenoid biosynthetic pathway branches into the formation of acyclic and cyclic carotenoids. A possible gene encoding an enzyme that catalyzes the cyclization of lycopene, a CrtY-type lycopene cyclase (TCCBUS3UF1_10120), is found in the genome of Thermus sp. CCB_US3_UF1 and in the other three sequenced genomes [52] (Additional file 2: Figure S2).

Conclusion

Thermus spp. have proven to be useful sources of thermostable enzymes, and their genome sequences provide information for further exploring the biotechnological potential of this genus. Analysis of the Thermus sp. CCB_US3_UF1 genome revealed that it encodes pathways for the synthesis of secondary metabolites (isoprenoid) and pigments (carotenoid). The latter have attracted industrial interest for applications in the food industry. The CRISPR/Cas system found in Thermus could be an interesting tool in molecular biology, particularly for genome editing. Considering the great potential of Thermus in various fields, the complete genome sequence of Thermus sp. CCB_US3_UF1 is a valuable resource for both fundamental research and biotechnological applications.