
Draft genome sequence of Cellulomonas carbonis T26(T) and comparative analysis of six Cellulomonas genomes.

Weiping Zhuang1, Shengzhe Zhang1, Xian Xia1, Gejiao Wang1.   

Abstract

Most Cellulomonas strains are cellulolytic, a feature that may be applied in straw degradation and bioremediation. In this study, Cellulomonas carbonis T26(T), Cellulomonas bogoriensis DSM 16987(T) and Cellulomonas cellasea 20108(T) were sequenced. Here we describe the draft genomic information of C. carbonis T26(T) and compare it to the related Cellulomonas genomes. Strain T26(T) has a genome size of 3,990,666 bp with a G + C content of 73.4 %, containing 3418 protein-coding genes and 59 RNA genes. The results showed good correlation between the genotypes and the physiological phenotypes. This information is useful for the better application of Cellulomonas strains.


Keywords:  Cellulolytic; Cellulomonas; Cellulomonas carbonis; Comparative genomics; Genome sequence

Year:  2015        PMID: 26587181      PMCID: PMC4652355          DOI: 10.1186/s40793-015-0096-8

Source DB:  PubMed          Journal:  Stand Genomic Sci        ISSN: 1944-3277


Introduction

Strain T26T (= CGMCC 1.10786T = KCTC 19824T = CCTCC AB2010450T) is the type strain of Cellulomonas carbonis, which was isolated from coal mine soil [1]. The genus Cellulomonas was first proposed by Bergey et al. in 1923 [2]. To date, the genus contains 27 species, mainly isolated from cellulose-enriched environments such as soil, bark, wood and sugar fields [1-4]. Cellulomonas strains are typically Gram-positive, rod-shaped and cellulolytic, have a high G + C content (69–76 mol%), and contain anteiso-C15:0 and C16:0 as the major fatty acids and menaquinone-9(H4) as the predominant quinone. Most Cellulomonas strains can degrade cellulose and hemicellulose, making them applicable in the paper, textile and food industries, and in soil fertility improvement and bioremediation [5-8]. The characterization of cellobiose phosphorylase, endo-1,4-xylanase, xylanases and endo-1,4-glucanase of Cellulomonas strains has been published previously [9-12]. So far, three Cellulomonas genomes have been published, those of Cellulomonas flavigena DSM 20109 [13], Cellulomonas fimi ATCC 484 [14] and “Cellulomonas gilvus” ATCC 13127T [14], and they revealed a wide variety of cellulases and hemicellulases [13, 14]. In order to provide more genomic information about Cellulomonas strains for potential industrial application, we sequenced the genomes of C. carbonis T26T [1], Cellulomonas cellasea DSM 20118 [2] and Cellulomonas bogoriensis DSM 16987 [15]. Here we present a summary of the genomic features of C. carbonis T26T together with the comparison results of the six available Cellulomonas genomes.

Organism information

Classification and features

The taxonomic classification and general features of C. carbonis T26T are presented in Table 1. A total of 105 single-copy conserved proteins were obtained within the 13 genomes by OrthoMCL with a match cutoff of 50 % and an E-value cutoff of 1e-5 [16, 17]. Figure 1 shows the phylogenetic tree of C. carbonis T26T and 12 related strains based on the conserved protein sequences. The tree was constructed with MEGA 5.05 using the Maximum-Likelihood method to determine the phylogenetic position [18]. The genome-based phylogenetic tree (Fig. 1) is similar to the 16S rRNA gene-based phylogenetic tree [1].
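
The selection of single-copy conserved proteins described above can be sketched as follows. This is only a minimal illustration, assuming OrthoMCL-style group lines of the form "group_id: taxon|protein taxon|protein ..."; the function name, the taxon labels and the toy records are hypothetical and are not taken from the study.

```python
from collections import Counter


def single_copy_groups(lines, taxa):
    """Return IDs of ortholog groups with exactly one protein from every taxon in `taxa`."""
    taxa = set(taxa)
    kept = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        group_id, members = line.split(":", 1)
        # Count how many proteins each taxon contributes to this group.
        counts = Counter(member.split("|", 1)[0] for member in members.split())
        if set(counts) == taxa and all(n == 1 for n in counts.values()):
            kept.append(group_id.strip())
    return kept


# Toy input with three hypothetical taxon labels (not real accessions).
example = [
    "grp1: ccar|p1 cfim|p7 cfla|p3",          # one copy in every genome: kept
    "grp2: ccar|p2 ccar|p9 cfim|p8 cfla|p4",  # two copies in ccar: dropped
    "grp3: ccar|p5 cfim|p6",                  # absent from cfla: dropped
]
print(single_copy_groups(example, ["ccar", "cfim", "cfla"]))
```

Only groups passing this filter would be aligned and concatenated for tree building; the alignment and Maximum-Likelihood steps are outside the scope of the sketch.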

Table 1

Classification and general features of C. carbonis T26T

| MIGS ID | Property | Term | Evidence code (a) |
|  | Classification | Domain Bacteria | TAS [33] |
|  |  | Phylum Actinobacteria | TAS [34] |
|  |  | Class Actinobacteria | TAS [35] |
|  |  | Order Micrococcales | TAS [36] |
|  |  | Family Cellulomonadaceae | TAS [37] |
|  |  | Genus Cellulomonas | TAS [1, 38] |
|  |  | Species Cellulomonas carbonis | TAS [1] |
|  |  | (Type) strain: T26T (= CGMCC 1.10786T = KCTC 19824T = CCTCC AB2010450T) |  |
|  | Gram stain | Positive | TAS [1] |
|  | Cell shape | Rod-shaped | TAS [1] |
|  | Motility | Motile | TAS [1] |
|  | Sporulation | Non-sporulating | NAS |
|  | Temperature range | 4-45 °C | TAS [1] |
|  | Optimum temperature | 28 °C | TAS [1] |
|  | pH range; Optimum | 6-10; 7 | TAS [1] |
|  | Carbon source | D-glucose, L-arabinose, mannose, N-acetyl glucosamine, maltose, gluconate, sucrose, glycogen, salicin, D-melibiose, D-sorbitol, xylose, D-lactose, D-galactose, D-fructose, and raffinose | TAS [1] |
| MIGS-6 | Habitat | Soil | TAS [1] |
| MIGS-6.3 | Salinity | 0-7 % NaCl (w/v) | TAS [1] |
| MIGS-22 | Oxygen requirement | Aerobic | TAS [1] |
| MIGS-15 | Biotic relationship | Free-living | TAS [1] |
| MIGS-14 | Pathogenicity | Non-pathogen | NAS |
| MIGS-4 | Geographic location | Tianjin city, China | TAS [1] |
| MIGS-5 | Sample collection | 2012 | TAS [1] |
| MIGS-4.1 | Latitude | 39°01'49.77" N | TAS [1] |
| MIGS-4.2 | Longitude | 117°11'20.20" E | TAS [1] |
| MIGS-4.4 | Altitude | Not reported | TAS [1] |

(a) Evidence codes - IDA: Inferred from Direct Assay; TAS: Traceable Author Statement (i.e., a direct report exists in the literature); NAS: Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [23]

Fig. 1

Phylogenetic tree showing the position of C. carbonis T26T (shown in bold) based on aligned sequences of 105 single-copy conserved proteins shared among the 13 genomes. The conserved proteins were identified by OrthoMCL with a match cutoff of 50 % and an E-value cutoff of 1e-5 [15, 16]. Phylogenetic analysis was performed using MEGA version 5.05, and the tree was built using the Maximum-Likelihood method [17]; 1000 bootstrap repetitions were computed to estimate the reliability of the tree. The corresponding GenBank accession numbers are displayed in parentheses

Strain T26T is Gram-positive, aerobic, motile and rod-shaped (0.5–0.8 × 2.0–2.4 μm) (Fig. 2). The colonies are yellow-white, convex, circular, smooth, non-transparent and about 1 mm in diameter after 3 days of incubation on R2A agar at 28 °C [1]. Optimal growth occurs at 28 °C (Table 1). The strain is able to hydrolyse CM-cellulose, starch, gelatin and aesculin, and is positive for catalase and nitrate reduction [1]. T26T is capable of utilizing a wide range of sole carbon sources including D-glucose, L-arabinose, mannose, N-acetyl glucosamine, maltose, gluconate, sucrose, glycogen, salicin, D-melibiose, D-sorbitol, xylose, D-lactose, D-galactose, D-fructose and raffinose [1] (Table 1).

Fig. 2

A transmission electron micrograph of strain T26T grown on LB agar at 28 °C for 48 h. The bar indicates 0.5 μm


Chemotaxonomy

T26T contains anteiso-C15:0 (33.6 %), anteiso-C15:1 A (22.1 %), C16:0 (14.4 %) and C14:0 (12.1 %) as the major fatty acids and menaquinone-9(H4) as the predominant respiratory quinone. The major polar lipids of this strain were diphosphatidylglycerol and phosphatidylglycerol [1].

Genome sequencing information

Genome project history

This organism was selected for sequencing particularly due to its cellulolytic activity and other applications. Genome sequencing was performed by Majorbio Bio-pharm Technology in April-June 2013. The raw reads were assembled with SOAPdenovo v1.05. The genome was annotated using the RAST server version 2.0 [19] and the NCBI Prokaryotic Genome Annotation Pipeline, and has been deposited at DDBJ/EMBL/GenBank under accession number AXCY00000000. The version described in this study is the first version, AXCY01000000. The project information is summarized in Table 2.

Table 2

Project information

| MIGS ID | Property | Term |
| MIGS-31 | Finishing quality | Draft |
| MIGS-28 | Libraries used | Illumina Paired-End library (300 bp insert size) |
| MIGS-29 | Sequencing platforms | Illumina HiSeq 2000 |
| MIGS-31.2 | Fold coverage | 343.5× |
| MIGS-30 | Assemblers | SOAPdenovo v1.05 |
| MIGS-32 | Gene calling method | GeneMarkS+ |
|  | Locus tag | N868 |
|  | GenBank ID | AXCY00000000 |
|  | GenBank Date of Release | October 17, 2014 |
|  | GOLD ID | Gi0055591 |
|  | BIOPROJECT | PRJN215138 |
| MIGS-13 | Source material identifier | T26T |
|  | Project relevance | Genome comparison |

Growth conditions and genomic DNA preparation

Strain T26T was grown aerobically in 50 ml LB medium at 28 °C for 36 h with shaking at 160 rpm. Cells were collected by centrifugation, yielding a pellet of about 20 mg. Genomic DNA was extracted, concentrated and purified using the QIAamp kit (Qiagen, Germany). The quality of the DNA was assessed by 1 % agarose gel electrophoresis and its quantity was measured using a NanoDrop 2000 spectrophotometer (Thermo Scientific, USA). About 8.8 μg of genomic DNA was sent to Shanghai Majorbio Bio-pharm Technology Co., Ltd for library preparation and sequencing.

Genome sequencing and assembly

The genome of T26T was sequenced using Illumina HiSeq 2000 paired-end technology at Shanghai Majorbio Bio-pharm Technology Co., Ltd. A 300 bp Illumina standard shotgun library was constructed and generated 7,703,453 × 2 reads totaling 1,556,097,506 bp of Illumina data. Raw reads were filtered using the FastQC toolkit, followed by optimization through local gap filling and base correction with GapCloser. All general aspects of library construction and sequencing can be found at Illumina’s official website [20]. Using SOAPdenovo v1.05 [21], 7,324,578 × 2 paired reads and 349,082 single reads were assembled de novo. Owing to the very high GC content, the final draft assembly yielded 547 contigs arranged in 414 scaffolds with 343.5× coverage. In the final assembly, 97.6 % of the bases were present in larger contigs (>1000 bp), and the contig N50 was 29,777 bp. The draft genome of T26T is presented as a set of contigs ordered against the complete genome of C. flavigena DSM 20109 using Mauve software [22].
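
As a small worked check of the figures above, the following sketch derives the mean raw read length from the reported read count and base total, and shows how a contig N50 is computed. The read length is an inference from the arithmetic rather than a value stated in the text, and the contig lengths are toy values, not the T26T assembly.

```python
def n50(lengths):
    """Largest length L such that contigs of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no contigs given")


# Reported raw data: 7,703,453 read pairs totaling 1,556,097,506 bp.
raw_reads = 7_703_453 * 2
raw_bases = 1_556_097_506
mean_read_length = raw_bases / raw_reads  # inferred from the totals, not stated in the text

# Toy contig lengths in bp (hypothetical values for illustration only).
toy_contigs = [100, 80, 60, 40, 20]

print(raw_reads, mean_read_length, n50(toy_contigs))
```

For the toy contigs the total is 300 bp; the two longest contigs (100 + 80) already cover more than half, so the N50 is 80.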

Genome annotation

The draft genome sequence of T26T was annotated using the RAST server version 2.0 and the National Center for Biotechnology Information Prokaryotic Genome Annotation Pipeline. Genes were identified using the gene caller GeneMarkS+ with the similarity-based gene detection approach [23]. The predicted CDSs were translated and used to search the NCBI Nonredundant Database, Pfam [24], KEGG [25], and the NCBI Conserved Domain Database through the Batch web CD-Search tool [26]. The miscellaneous features were predicted by WebMGA [27], TMHMM [28] and SignalP [29]. The putative cellulose-degrading enzymes were identified through the Carbohydrate-Active enZYmes (CAZy) Database [30].

Genome properties

The whole genome of T26T is 3,990,666 bp in length, with an average GC content of 73.4 %, and comprises 547 contigs. The genome properties and statistics are summarized in Table 3 and Fig. 3. From a total of 3513 genes, 3418 protein-coding genes were identified and 71 % of them were assigned putative functions, while the remainder were annotated as hypothetical proteins. In addition, 36 pseudogenes, 11 rRNAs, 46 tRNAs and 1 ncRNA were identified. The distribution of genes among the COG functional categories is shown in Table 4.

Table 3

Genome statistics

| Attribute | Value | % of total (a) |
| Genome size (bp) | 3,990,666 | 100.00 |
| DNA coding (bp) | 2,927,153 | 73.35 |
| DNA G + C (bp) | 3,368,220 | 84.40 |
| DNA scaffolds | 414 | 100.00 |
| Total genes | 3513 | 100.00 |
| Protein-coding genes | 3418 | 97.30 |
| RNA genes | 59 | 1.68 |
| Pseudo genes | 36 | 1.02 |
| Genes in internal clusters | 1435 | 40.85 |
| Genes with function prediction | 2481 | 71.00 |
| Genes assigned to COGs | 1450 | 41.28 |
| Genes with Pfam domains | 2231 | 63.51 |
| Genes with signal peptides | 253 | 7.20 |
| Genes with transmembrane helices | 764 | 21.75 |
| CRISPR repeats | 0 | - |

(a) The total is based on either the size of the genome in base pairs or the total number of protein coding genes in the annotated genome
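
The percentage column of Table 3 can be recomputed with a few lines of Python. This is a sketch under one stated assumption: base-pair rows are divided by the genome size, and gene rows by the total gene count (3513), which is the denominator that reproduces the listed values for the rows checked here. The helper name is hypothetical.

```python
GENOME_SIZE_BP = 3_990_666
TOTAL_GENES = 3513


def percent(part, whole):
    """Percentage of `whole` represented by `part`, rounded to two decimals."""
    return round(100 * part / whole, 2)


# (value, denominator) pairs copied from Table 3.
rows = {
    "DNA coding (bp)": (2_927_153, GENOME_SIZE_BP),
    "Protein-coding genes": (3418, TOTAL_GENES),
    "RNA genes": (59, TOTAL_GENES),
    "Pseudo genes": (36, TOTAL_GENES),
    "Genes assigned to COGs": (1450, TOTAL_GENES),
    "Genes with Pfam domains": (2231, TOTAL_GENES),
}

for name, (value, whole) in rows.items():
    print(f"{name}: {percent(value, whole):.2f} %")
```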

Fig. 3

A graphical circular map of the C. carbonis T26T genome generated with the CGView Comparison Tool [39]. From outside to center, rings 1 and 4 show protein-coding genes colored by COG categories on the forward and reverse strands; rings 2 and 3 denote genes on the forward and reverse strands; ring 5 shows the G + C content plot, and the innermost ring shows GC skew
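
The two innermost rings plot per-window G + C content and GC skew. A minimal sketch of both quantities is given below, using the conventional definition of GC skew, (G − C)/(G + C). The window size, step and toy sequence are assumptions chosen for illustration; the settings used for Fig. 3 are not given in the text.

```python
def gc_content(seq):
    """Fraction of G and C bases in `seq` (0.0 for an empty sequence)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    return gc / len(seq) if seq else 0.0


def gc_skew(seq):
    """GC skew (G - C) / (G + C); 0.0 when the window has no G or C."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def windows(seq, size, step):
    """Yield (start, subsequence) for each full window along `seq`."""
    for start in range(0, len(seq) - size + 1, step):
        yield start, seq[start:start + size]


# Toy sequence (hypothetical, 16 bp) scanned with non-overlapping 8 bp windows.
toy = "GGGCATGCCCCGATCG"
for start, window in windows(toy, size=8, step=8):
    print(start, gc_content(window), round(gc_skew(window), 3))
```

In the toy sequence both windows have the same G + C content, but the skew changes sign between them, which is the kind of signal the innermost ring is meant to show.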

Table 4

Number of genes associated with general COG functional categories

| Code | Value | % age (a) | Description |
| J | 152 | 4.45 | Translation, ribosomal structure and biogenesis |
| A | 4 | 0.12 | RNA processing and modification |
| K | 244 | 7.14 | Transcription |
| L | 136 | 3.98 | Replication, recombination and repair |
| B | 1 | 0.03 | Chromatin structure and dynamics |
| D | 29 | 0.85 | Cell cycle control, cell division, chromosome partitioning |
| V | 58 | 1.70 | Defense mechanisms |
| T | 195 | 5.71 | Signal transduction mechanisms |
| M | 141 | 4.13 | Cell wall/membrane biogenesis |
| N | 54 | 1.58 | Cell motility |
| U | 61 | 1.78 | Intracellular trafficking and secretion |
| O | 106 | 3.10 | Posttranslational modification, protein turnover, chaperones |
| C | 181 | 5.30 | Energy production and conversion |
| G | 298 | 8.72 | Carbohydrate transport and metabolism |
| E | 198 | 5.79 | Amino acid transport and metabolism |
| F | 72 | 2.11 | Nucleotide transport and metabolism |
| H | 116 | 3.39 | Coenzyme transport and metabolism |
| I | 91 | 2.66 | Lipid transport and metabolism |
| P | 130 | 3.80 | Inorganic ion transport and metabolism |
| Q | 48 | 1.40 | Secondary metabolites biosynthesis, transport and catabolism |
| R | 340 | 9.95 | General function prediction only |
| S | 199 | 5.82 | Function unknown |
| - | 1968 | 57.58 | Not in COGs |

(a) The percentage is based on the total number of protein-coding genes in the annotated genome

Insights from the genome sequence

In order to reveal more genomic information for better application of Cellulomonas strains, the genomic features of C. carbonis T26T together with the comparison results of the six Cellulomonas genomes were analyzed (Table 5). OrthoMCL analysis with a match cutoff of 50 % and an E-value cutoff of 1e-5 identified 1189 single-copy conserved proteins among the six genomes (Fig. 4). Several carbohydrate-active enzymes were identified and classified into different families of glycoside hydrolases, carbohydrate binding modules, carbohydrate esterases, auxiliary activities and polysaccharide lyases [31] (Fig. 5, Additional file 1: Table S1). Some putative glycoside hydrolases may be responsible for the ability of Cellulomonas spp. to utilize various sole carbon sources.

Table 5

General features of the six Cellulomonas genomes

| Strain | Isolation source | Genome size (Mb) | Coverage | CDSs | RNA | G + C content | GenBank No. |
| C. gilvus ATCC 13127T | feces | 3.53 | - | 3164 | 54 | 73.8 % | NC_015671 |
| C. fimi ATCC 484T | soil | 4.27 | - | 3761 | 54 | 74.7 % | NC_015514 |
| C. flavigena DSM 20109T | soil | 4.12 | - | 3678 | 54 | 74.3 % | NC_014151 |
| C. bogoriensis DSM 16987T | sediment and water | 3.19 | 368.2× | 2898 | 51 | 72.2 % | AXCZ00000000 |
| C. carbonis T26T | coal mine soil | 3.99 | 343.5× | 3418 | 59 | 73.3 % | AXCY00000000 |
| C. cellasea DSM 20108T | NR | 4.66 | 724.0× | 3560 | 44 | 74.6 % | AXNJ00000000 |

Fig. 4

Ortholog analysis of the six Cellulomonas genomes conducted using OrthoMCL. The total numbers of shared proteins among the six genomes and unique proteins from each species were tabulated and presented as a Venn diagram

Fig. 5

Comparative analysis of putative proteins of CAZy family of six Cellulomonas genomes. From outside to center, ring 1 is C. flavigena DSM 20109T; ring 2 is C. gilvus ATCC 13127T; ring 3 is C. fimi ATCC 484T; ring 4 is C. cellasea DSM 20108T; ring 5 is C. bogoriensis DSM 16987T; ring 6 is C. carbonis T26T. AA, auxiliary activities; CBM, carbohydrate binding module; CE, carbohydrate esterase; GH, glycoside hydrolases; GT, glycosyltransferase; PL, polysaccharide lyase

Some potential cellulose-degrading enzymes were found and analyzed (Fig. 6, Additional file 1: Table S2). C. fimi ATCC 484 possesses the highest number of putative cellulases, including ten β-glucosidases (GH1 and GH3), six endoglucanases (GH6 and GH9), four endo-β-1,4-glucanases (GH48 and GH5) and one cellobiose phosphorylase (GH94). C. carbonis T26T has the fewest putative cellulases, including one cellobiose phosphorylase (GH94), one endoglucanase (GH6) and five β-glucosidases (GH1 and GH3). Cellulase activity assays were performed on Congo-Red agar media [32], and all six strains yielded a cellulose clearing zone on the media (data not shown). The Kyoto Encyclopedia of Genes and Genomes was used to construct metabolic pathways, and all six strains have complete cellulose degradation pathways (data not shown).

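
The per-strain cellulase counts quoted above can be tallied as follows. This is only an illustrative sketch: the counts are copied from the text for the two strains it describes in detail (ATCC 484 and T26T), and the dictionary layout is an assumption made for the example. Counts for the other four strains are not given in the text and are therefore omitted.

```python
# Putative cellulase counts per enzyme class, as stated in the text.
cellulases = {
    "ATCC 484": {
        "beta-glucosidase (GH1, GH3)": 10,
        "endoglucanase (GH6, GH9)": 6,
        "endo-beta-1,4-glucanase (GH48, GH5)": 4,
        "cellobiose phosphorylase (GH94)": 1,
    },
    "T26T": {
        "beta-glucosidase (GH1, GH3)": 5,
        "endoglucanase (GH6)": 1,
        "cellobiose phosphorylase (GH94)": 1,
    },
}

# Sum the enzyme classes to get the total number of putative cellulases per strain.
totals = {strain: sum(counts.values()) for strain, counts in cellulases.items()}
print(totals)
```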
Fig. 6

The distribution of cellulases in six Cellulomonas genomes. The cellulases are β-glucosidase, endoglucanase, endo-β-1,4-glucanase and cellobiose phosphorylase

In addition to utilizing cellulose, Cellulomonas strains are also known to degrade hemicelluloses. A large number of putative intracellular and extracellular xylan-degrading enzymes were identified in the genomes, such as endo-1,4-β-xylanase, β-xylosidase, α-L-arabinofuranosidase, acetylxylan esterase and α-glucuronidase (Additional file 1: Table S3), which suggests the capacity to degrade hemicelluloses. We also found a large number of α-amylases, which are responsible for the degradation of starch, in the six genomes (Additional file 1: Table S4), suggesting a potential application in the bioremediation of food industry wastewater.

Conclusions

The genomic information of C. carbonis T26T and the comparison of the six Cellulomonas genomes revealed a large number of putative cellulases and hemicellulases. In addition, we found that the genomes also contain α-amylases. This information provides a genomic basis for the better application of Cellulomonas spp. in industry and environmental bioremediation. The genomes also possess many putative carbohydrate-active enzymes, which is in agreement with the physiological ability of the strains to utilize various sole carbon sources.