Phuongan Dam, Irina Kataeva, Sung-Jae Yang, Fengfeng Zhou, Yanbin Yin, Wenchi Chou, Farris L Poole, Janet Westpheling, Robert Hettich, Richard Giannone, Derrick L Lewis, Robert Kelly, Harry J Gilbert, Bernard Henrissat, Ying Xu, Michael W W Adams.
Abstract
Caldicellulosiruptor bescii DSM 6725 utilizes various polysaccharides and grows efficiently on untreated high-lignin grasses and hardwood at an optimum temperature of ~80 °C. It is a promising anaerobic bacterium for studying high-temperature biomass conversion. Its genome contains 2666 protein-coding sequences organized into 1209 operons. Expression of 2196 genes (83%) was confirmed experimentally. At least 322 genes appear to have been obtained by lateral gene transfer (LGT). Putative functions were assigned to 364 conserved/hypothetical protein (C/HP) genes. The genome contains 171 and 88 genes related to carbohydrate transport and utilization, respectively. Growth on cellulose led to the up-regulation of 32 carbohydrate-active (CAZy), 61 sugar transport, 25 transcription factor and 234 C/HP genes. Some C/HPs were overproduced on cellulose or xylan, suggesting their involvement in polysaccharide conversion. A unique feature of the genome is enrichment with genes encoding multi-modular, multi-functional CAZy proteins organized into one large cluster, the products of which are proposed to act synergistically on different components of plant cell walls and to aid the ability of C. bescii to convert plant biomass. The high duplication of CAZy domains coupled with the ability to acquire foreign genes by LGT may have allowed the bacterium to rapidly adapt to changing plant biomass-rich environments.
Year: 2011 PMID: 21227922 PMCID: PMC3082886 DOI: 10.1093/nar/gkq1281
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Diagram of the C. bescii chromosome. From outside to inside, the circles show (i) COG categories (two circles), (ii) mean-centered GC content of the genome, (iii) genes (two circles) with functions related to CAZy (green), sugar transporters (red) and cell adhesion (blue), (iv) GC skew plot (orange-purple circle) and (v) RNA genes (ribosomal in red, tRNA in blue and others in aquamarine). GenomeViz was used to construct the circular chromosome wheel (http://www.uniklinikum-giessen.de/genome/index.html).
General features and comparative genomics of C. bescii DSM 6725
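| General features | C. bescii DSM 6725 | C. saccharolyticus DSM 8903 | T. pseudethanolicus ATCC 33223 | T. tengcongensis MB4 | C. thermocellum ATCC 27405 | T. maritima MSB8 |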
|---|---|---|---|---|---|---|
| Length of chromosome (Mbp) | 2.9 | 3.0 | 2.4 | 2.7 | 3.8 | 1.9 |
| G + C content (%) | 35.2 | 35.3 | 34.5 | 37.6 | 39.0 | 46.2 |
| Coding density (%) | 85.4 | 86.1 | 86.7 | 86.8 | 83.5 | 86.8 |
| Total no. of predicted protein coding genes | 2662 | 2679 | 2243 | 2588 | 3191 | 1858 |
| Average length of protein-coding genes (bp) | 942 | 957 | 915 | 905 | 1008 | 905 |
| Total no. of predicted tRNA | 47 | 46 | 56 | 55 | 56 | 46 |
| Total no. of rRNA genes (no. of operons) | 9 ( | 9 ( | 16 ( | 12 ( | 12 ( | 3 ( |
| Secreted Proteome (SignalP prediction) | 394 | 362 | 244 | 257 | 404 | 207 |
| Membrane Proteins (TMHMM prediction) | 344 | 348 | 282 | 401 | 481 | 239 |
| Percentage secreted proteome (SignalP prediction) | 14.80 | 13.51 | 10.88 | 9.93 | 12.66 | 11.14 |
| Percentage membrane proteins (TMHMM prediction) | 12.92 | 12.99 | 12.57 | 15.49 | 15.07 | 12.86 |
| IS elements | ||||||
| Full copies | 34 | 91 | 42 | 69 | 100 | 3 |
| Partial copies | 132 | 130 | 77 | 104 | 56 | 22 |
| ISmax-full-copy^a | eISCsa4 | eISCsa4 | eISTps4/eISTps5 | eISTps1 | ISCth3 | eISTma2 |
| MaxCopy no.^b | 12 | 33 | 2 | 17 | 18 | 2 |
| Growth on cellulose and xylan^c | Cellulose, xylan | Cellulose, xylan | Does not grow | Does not grow | Cellulose | Xylan, CMC^d |
^a ISmax is the IS element with the largest full copy number.
^b MaxCopy no. is the largest full copy number.
^c See text for references.
^d CMC: carboxymethyl cellulose.
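The two percentage rows in the table follow from the count rows: each is the SignalP or TMHMM count divided by the total number of predicted protein-coding genes. The following is a minimal Python sketch of that arithmetic. It assumes the six data columns follow the organism order used in the COG table (Cbes, Csac, Teth, TTE, Cthe, Tmar); the dictionary layout and the `percent` helper are illustrative and not part of the original analysis.

```python
# Counts copied from the table above. The column labels are an assumption:
# they follow the organism order of the COG table (Cbes ... Tmar).
genomes = {
    "Cbes": {"genes": 2662, "secreted": 394, "membrane": 344},
    "Csac": {"genes": 2679, "secreted": 362, "membrane": 348},
    "Teth": {"genes": 2243, "secreted": 244, "membrane": 282},
    "TTE":  {"genes": 2588, "secreted": 257, "membrane": 401},
    "Cthe": {"genes": 3191, "secreted": 404, "membrane": 481},
    "Tmar": {"genes": 1858, "secreted": 207, "membrane": 239},
}


def percent(part, total):
    """Return part/total as a percentage rounded to two decimals."""
    return round(100.0 * part / total, 2)


# (secreted %, membrane %) for each genome
percentages = {
    name: (percent(g["secreted"], g["genes"]), percent(g["membrane"], g["genes"]))
    for name, g in genomes.items()
}

for name, (secreted, membrane) in percentages.items():
    print(f"{name}: secreted {secreted:.2f}%, membrane {membrane:.2f}%")
```

For the first column this gives 14.80% secreted and 12.92% membrane proteins, matching the values reported in the table.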
Primary extracellular proteins of C. bescii involved in utilization of insoluble components of plant biomass
| Gene | CAZy module architecture | CAZy module activity: CBM | CAZy module activity: Catalytic (main activities) | Transcriptomics: Cell. | Transcriptomics: Signif. | Proteomics: ExtP | Proteomics: Membr. |
|---|---|---|---|---|---|---|---|
| Cbes_0089 | GH11-CBM36 | Xylan | Xylanase | Up | Yes | | |
| Cbes_0182 | GH43-CBM22^a-GH43-CBM6^b | Xylan^(a,b), amorphous cellulose^b | Xylanase, β-xylosidase, arabinanase | Up | Yes | | |
| Cbes_0183 | CBM22-CBM22-GH10 | Xylan | Endo-1,4-, endo-1,3-β-xylanase | Up | Yes | | |
| Cbes_0458 | GH1 | | β-glucosidase, β-galactosidase, β-mannosidase, β-glucuronidase | Up | No | C | |
| Cbes_0594 | GH5-CBM28-SLH-SLH-SLH | Amorphous cellulose, cellooligosaccharides | Mannanase, cellulase, lichenase, xylanase | Up | Yes | C | |
| Cbes_0609 | CBM41^a-CBM48^b-GH13-CBM20^c | Starch^(a,c), glycogen^b, cyclodextrins^c | Starch | Up | Yes | C | C |
| Cbes_0610 | CBM20 | Starch, cyclodextrins | | Up | Yes | CX | |
| Cbes_0618 | CBM22-CBM22-GH10 | Xylan | Endo-1,4-, endo-1,3-β-xylanase | Up | Yes | X | |
| Cbes_1439 | GH23 | | Peptidoglycan lyase | Down | Yes | | |
| Cbes_1462 | CE4 | | Acetyl xylan esterase | Up | Yes | | |
| Cbes_1829 | CE4 | | Acetyl xylan esterase | Up | No | | |
| Cbes_1853 | PL11-CBM3 | Cellulose | Rhamnogalacturonan lyase | Down | No | CX | |
| Cbes_1854 | CBMX-PL3 | | Pectate lyase | Down | Y/N | CX | X |
| Cbes_1855 | CBMX-PL9 | | Pectate lyase, exopolygalacturonate lyase | Down | Yes | CX | |
| Cbes_1857 | GH10^a-CBM3-CBM3-GH48^b | Cellulose | Endo-1,4-, endo-1,3-β-xylanase^a, cellobiohydrolase^b | Up | Yes | CX | X |
| Cbes_1859 | GH5^a-CBM3-CBM3-GH44^b | Cellulose | Mannanase^a; xyloglucanase, endoglucanase^b | Up | Yes | CX | X |
| Cbes_1860 | GH74^a-CBM3-CBM3-GH48^b | Cellulose | Xyloglucanase, endoglucanase^a; cellobiohydrolase^b | Up | Yes | CX | X |
| Cbes_1865 | GH9^a-CBM3-CBM3-CBM3-GH5^b | Cellulose | Endoglucanase^a; mannanase^b | Up | No | CX | |
| Cbes_1866 | GH5^a-CBM3-CBM3-CBM3-GH5^b | Cellulose | Mannanase^a, cellulase^b | Down | Y/N | CX | |
| Cbes_1867 | GH9^a-CBM3-CBM3-CBM3-GH48^b | Cellulose | Endoglucanase^a, cellobiohydrolase^b | Up | Yes | CX | CX |
| Cbes_2593 | GH13 | | Starch | Up | Yes | | |
Primary extracellular proteins are CAZy proteins where each contains a signal peptide and, in most cases, a CBM. The superscripts on the CBM and GH domains (a, b or c) indicate the corresponding CAZy module activity. Transcriptomics and proteomics show regulation of gene transcription on cellulose versus glucose, and protein identification using LC-MS/MS. N-terminal GH5 modules in Cbes_1859, Cbes_1865 and Cbes_1866 are identical to N-terminal GH5 module of Csac_1077, and C-terminal module in Cbes_1866 is identical to the C-terminal module in Csac_1077 which has been experimentally shown to display mannanase and cellulase/lichenase activities, respectively (40).
TMD, transmembrane domain; Membr., membrane protein fraction; C, cellulose; X, xylan; CBMX, an unknown module possibly pectin binding.
Figure 2. Comparison of the two relative gene clusters involved in biomass conversion in C. bescii DSM 6725 (A) and C. saccharolyticus DSM 8903 (B). Abbreviations: GH5, GH9, GH10, GH44, GH48 and GH74, glycoside hydrolases of families 5, 9, 10, 44, 48 and 74, respectively; CBM3, carbohydrate-binding module of family 3, where ‘b’ and ‘c’ are subgroups within CBM3; GT39, glycosyl transferase of family 39; PL3, PL9 and PL11, polysaccharide lyases of families 3, 9 and 11; X, module of unknown function with homology to Pfam CBM_4_9. Signal peptides, linkers and fragments of unknown function are shown in violet, blue and grey, respectively. CelA, CelB and ManA (encoded by Csac_1076, _1077 and _1078) are enzymes with experimentally demonstrated activities.
Figure 3. Scanning electron microscopy (SEM) images of C. bescii cells attached to xylan from oat spelts (A) and to switchgrass (B). The bars indicate 1 μm (A) and 2 μm (B).
Caldicellulosiruptor bescii genes encoding proteins with putative cell adhesion, protein–protein interaction or carbohydrate-binding function
| Gene name | Protein (AAs) | SP | Annotation | Domain structure | Transcriptomics: FilterPaper | Transcriptomics: Significant | Proteomics: ExtP | Proteomics: Membrane | Proteomics: WC |
|---|---|---|---|---|---|---|---|---|---|
| Cbes_0012 | 3027 | Y | Q466C0 Putative uncharacterized protein | Up | Yes | CXn | CXn | ||
| Cbes_0077 | 1710 | Y | A4J714 S-layer domain protein | Up | Yes | CXn | C | ||
| Cbes_0594 | 755 | Y | Q59154 Endoglucanase | GH5- | Up | Yes | C | ||
| Cbes_0608 | 547 | Y | A3DET8 Cellulose 1,4-beta-cellobiosidase | X- | Up | Yes | |||
| Cbes_0438 | 1157 | Y | A4XG20 S-layer domain protein | Up | No | Xn | Yes | ||
| Cbes_1839 | 575 | Y | A4XM24 S-layer domain protein | Up | Yes | ||||
| Cbes_1943 | 277 | Y | A4XI88 S-layer domain protein | Yes | No | Yes | |||
| Cbes_2295 | 1074 | Y | A4XM87 S-layer domain protein | Yes | Yes | Xn | C | ||
| Cbes_2341 | 484 | Y | A4XH32 S-layer domain protein | X- | Y/N | No | |||
| Cbes_1573 | 1055 | Y | A4XH96 Putative uncharacterized protein | Y/N | No | ||||
| Cbes_2303 | 1018 | Y | A4XM93 S-layer domain protein | Y/N | No | CXn | CXn | ||
| Cbes_2342 | 1010 | Y | A4XH31 Putative uncharacterized protein | Y/N | No | Yes | |||
| Cbes_1944 | 1201 | Y | A4XI87 Fibronectin, type III domain protein | X- | Y/N | No | Yes | ||
| Cbes_1945 | 1265 | Y | A4XI87 Fibronectin, type III domain protein | X- | Y/N | No | |||
| Cbes_0190 | 582 | N | A4XM45 Peptidase M23B | X- | Up | Yes | |||
| Cbes_0508 | 203 | Y | A4XIM2 Allergen V5/Tpx-1 family protein | Y/N | No | Yes | |||
| Cbes_0560 | 507 | Y | A4XHM2 Peptidoglycan-binding LysM | Yes | No | Yes | |||
| Cbes_1391 | 109 | Y | A4XKU6 Peptidoglycan-binding LysM | Y/N | No | Yes | |||
| Cbes_2402 | 511 | N | A4XGE4 Peptidoglycan-binding LysM | X- | Y/N | No | Yes | ||
| Cbes_0174 | 951 | Y | Extracellular solute-binding protein family 1 | Up | No | Xn | Xn | ||
| Cbes_0181 | 595 | Y | Extracellular solute-binding protein family 1 | Up | Yes | Xn | CXn | ||
SP, Signal Peptide; SLH, surface layer homology domain; SBP1, solute-binding protein of family 1; X, domain not present in PFAM; CBM_X, Pfam annotation of PF06204; RHS, multiple tandem 22-residue repeats each containing strongly conserved dipeptide YD; WC, cell-extract; C, cells grown on cellulose; Xn, cells grown on xylan.
Distribution of COG within some genomes of anaerobic thermophiles
| COG | Frequency^a: Cbes | Frequency^a: Csac | Frequency^a: Teth | Frequency^a: TTE | Frequency^a: Cthe | Frequency^a: Tmar | P-value^b: Cbes | P-value^b: Csac | P-value^b: Teth | P-value^b: TTE | P-value^b: Cthe | P-value^b: Tmar | Average | Function |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| A | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.240 | 0.240 | 0.240 | 0.240 | 0.240 | 0.240 | 0.056 | RNA processing |
| B | 0.00 | 0.11 | 0.00 | 0.10 | 0.05 | 0.07 | 0.167 | 0.390 | 0.167 | 0.388 | 0.747 | 0.304 | 0.149 | Chromatin structure |
| C | 6.08 | 5.98 | 6.67 | 6.91 | 6.02 | 8.16 | 0.069 | 0.063 | 0.892 | 0.871 | 0.065 | 0.726 | 9.581 | Energy |
| D | 2.75 | 1.96 | 2.60 | 2.25 | 1.80 | 1.54 | 0.018 | 0.768 | 0.967 | 0.108 | 0.679 | 0.499 | 1.537 | Cell division |
| E | 10.18 | 9.69 | 11.98 | 12.41 | 8.42 | 13.54 | 0.202 | 0.141 | 0.480 | 0.395 | 0.044 | 0.207 | 11.876 | Amino acids |
| F | 3.52 | 3.12 | 3.78 | 3.30 | 3.10 | 3.70 | 0.373 | 0.181 | 0.478 | 0.742 | 0.172 | 0.524 | 3.739 | Nucleotides |
| G | 12.29 | 12.18 | 10.32 | 9.22 | 7.40 | 12.28 | 0.099 | 0.895 | 0.750 | 0.629 | 0.406 | 0.099 | 8.169 | Carbohydrates |
| H | 5.63 | 4.87 | 5.66 | 3.40 | 4.58 | 3.98 | 0.394 | 0.245 | 0.600 | 0.068 | 0.198 | 0.881 | 6.126 | Coenzymes |
| I | 1.79 | 2.01 | 2.24 | 3.04 | 2.22 | 2.30 | 0.168 | 0.222 | 0.711 | 0.436 | 0.282 | 0.308 | 2.860 | Lipids |
| J | 8.90 | 7.99 | 9.03 | 8.17 | 7.77 | 9.42 | 0.311 | 0.220 | 0.675 | 0.763 | 0.201 | 0.631 | 10.513 | Translation |
| K | 8.45 | 8.31 | 7.73 | 8.43 | 8.75 | 5.93 | 0.158 | 0.813 | 0.663 | 0.161 | 0.107 | 0.152 | 7.207 | Transcription |
| L | 8.13 | 13.29 | 10.44 | 9.11 | 14.67 | 6.28 | 0.485 | 0.992 | 0.134 | 0.689 | 0.001 | 0.207 | 8.047 | DNA |
| M | 5.89 | 5.88 | 5.72 | 5.81 | 7.59 | 5.16 | 0.378 | 0.619 | 0.584 | 0.605 | 0.096 | 0.454 | 5.359 | Cell membrane |
| N | 8.51 | 3.76 | 3.07 | 3.72 | 4.35 | 4.12 | 0.016 | 0.598 | 0.489 | 0.592 | 0.314 | 0.348 | 3.136 | Cell motility and secretion |
| O | 3.84 | 3.18 | 3.78 | 4.19 | 4.07 | 3.77 | 0.181 | 0.036 | 0.159 | 0.671 | 0.726 | 0.157 | 4.521 | Posttranslational modification |
| P | 4.87 | 4.39 | 5.78 | 6.34 | 4.77 | 8.37 | 0.077 | 0.038 | 0.774 | 0.635 | 0.067 | 0.126 | 6.809 | Inorganic ions |
| Q | 3.59 | 1.32 | 1.24 | 1.78 | 1.48 | 1.47 | 0.075 | 0.202 | 0.179 | 0.646 | 0.750 | 0.245 | 2.153 | Secondary metabolites |
| T | 5.57 | 6.88 | 5.13 | 6.86 | 8.28 | 5.09 | 0.383 | 0.785 | 0.553 | 0.783 | 0.094 | 0.547 | 4.779 | Signal transduction |
| U | 0.00 | 2.12 | 3.01 | 2.20 | 2.31 | 2.51 | 0.033 | 0.695 | 0.067 | 0.273 | 0.767 | 0.171 | 1.658 | Transport |
| V | 0.00 | 2.81 | 1.83 | 2.72 | 2.27 | 2.30 | 0.035 | 0.878 | 0.449 | 0.859 | 0.276 | 0.736 | 1.707 | Defense mechanism |
| Z | 0.00 | 0.16 | 0.00 | 0.00 | 0.09 | 0.00 | 0.310 | 1.000 | 0.310 | 0.310 | 0.022 | 0.310 | 0.018 | Cytoskeleton |
| R | 9.30 | 11.00 | 10.79 | 11.63 | 10.00 | 13.02 | | | | | | | | General prediction only |
| S | 5.20 | 6.61 | 7.94 | 6.88 | 5.77 | 17.00 | | | | | | | | Function unknown |
| Not in COG | 32.11 | 18.48 | 13.64 | 14.61 | 22.23 | 9.85 | | | | | | | | Not assigned |
^a Frequency was computed as the percentage of genes assigned to each COG group among all genes with a COG assignment. A gene assigned to multiple COG groups was counted once for each group.
^b The P-value was calculated on the assumption that the frequency in each COG group follows a normal distribution.
Cbes, Caldicellulosiruptor bescii DSM 6725; Csac, Caldicellulosiruptor saccharolyticus DSM 8903; Teth, T. pseudethanolicus ATCC 33223; TTE, T. tengcongensis MB4; Cthe, C. thermocellum ATCC 27405; Tmar, Thermotoga maritima MSB8.
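The frequency definition in the first table footnote can be illustrated with a short Python sketch. It assumes that the denominator is the total number of gene-to-category assignments, so that a gene in two COG groups contributes twice and the percentages sum to 100; this reading is an assumption, although it is consistent with the Cbes and Csac frequency columns, which sum to roughly 100 over the categories other than R and S. The function name, the input layout and the toy gene IDs are hypothetical.

```python
from collections import Counter


def cog_frequencies(assignments):
    """Percentage of COG assignments falling in each category.

    `assignments` maps a gene ID to the list of COG letters assigned to it.
    Genes without a COG assignment (empty list) contribute nothing, and a
    gene assigned to several categories is counted once per category.
    Assumption: the denominator is the total number of assignments.
    """
    counts = Counter()
    for letters in assignments.values():
        counts.update(letters)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {letter: 100.0 * n / total for letter, n in sorted(counts.items())}


# Toy input with hypothetical gene IDs: gene_2 is counted in both G and E,
# and gene_4 has no COG assignment.
toy = {
    "gene_1": ["G"],
    "gene_2": ["G", "E"],
    "gene_3": ["C"],
    "gene_4": [],
}
freqs = cog_frequencies(toy)
print(freqs)
```

In the toy input there are four assignments in total, two of them in category G, so G receives 50% and C and E receive 25% each.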