Leandro N Lemos1,2, Roberta V Pereira1, Ronaldo B Quaggio1, Layla F Martins1, Livia M S Moura1,2, Amanda R da Silva1,2, Luciana P Antunes1, Aline M da Silva1, João C Setubal1,3.
Abstract
Microbial consortia selected from complex lignocellulolytic microbial communities are promising alternatives to deconstruct plant waste, since synergistic action of different enzymes is required for full degradation of plant biomass in biorefining applications. Culture enrichment also facilitates the study of interactions among consortium members, and can be a good source of novel microbial species. Here, we used a sample from a plant waste composting operation in the São Paulo Zoo (Brazil) as inoculum to obtain a thermophilic aerobic consortium enriched through multiple passages at 60°C in carboxymethylcellulose as sole carbon source. The microbial community composition of this consortium was investigated by shotgun metagenomics and genome-centric analysis. Six near-complete (over 90%) genomes were reconstructed. Similarity and phylogenetic analyses show that four of these six genomes are novel, with the following hypothesized identifications: a new Thermobacillus species; the first Bacillus thermozeamaize genome (for which currently only 16S sequences are available) or else the first representative of a new family in the Bacillales order; the first representative of a new genus in the Paenibacillaceae family; and the first representative of a new deep-branching family in the Clostridia class. The reconstructed genomes from known species were identified as Geobacillus thermoglucosidasius and Caldibacillus debilis. COG and CAZy analyses of the metabolic potential of these recovered genomes show that they encode several glycoside hydrolases (GHs) as well as other genes related to lignocellulose breakdown. The new Thermobacillus species stands out for being the richest in diversity and abundance of GHs, possessing the greatest potential for biomass degradation among the six recovered genomes.
We also investigated the presence and activity of the organisms corresponding to these genomes in the composting operation from which the consortium was built, using compost metagenome and metatranscriptome datasets generated in a previous study. We obtained strong evidence that five of the six recovered genomes are indeed present and active in that composting process. We have thus discovered three (perhaps four) new thermophilic bacterial species that add to the increasing repertoire of known lignocellulose degraders, whose biotechnological potential can now be investigated in further studies.
Keywords: bacterial genome reconstruction; cellulolytic; composting; consortium; glycoside hydrolases; metagenome; thermophilic
Year: 2017 PMID: 28469608 PMCID: PMC5395642 DOI: 10.3389/fmicb.2017.00644
Source DB: PubMed Journal: Front Microbiol ISSN: 1664-302X Impact factor: 5.640
Assembly metrics of shotgun metagenomics of the ZCTH02 consortium.
| Sample description | Thermophilic and cellulolytic composting-derived bacterial consortium |
| Location of São Paulo Zoo composting (source of inoculum) | 23°38′56.9″S 46°37′18.7″W |
| Metagenome ID | ZCTH02 |
| Number of paired-end reads | 3,046,968 |
| Number of assembled contigs | 13,240 |
| N50 (bp) | 17,996 |
| Longest contig (bp) | 509,962 |
| Total assembled length (bp) | 27,862,858 |
| Number of assembled contigs ≥1,600 bp | 1,468 |
| Number of paired-end reads in contigs ≥1,600 bp | 1,263,585 |
| Total assembled contigs length ≥1,600 bp (bp) | 20,608,826 |
Only this dataset was used in the binning analysis.
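The N50 reported in the assembly metrics can be illustrated with a minimal sketch. It assumes the common definition (the contig length at which the cumulative length of contigs, sorted from longest to shortest, first reaches half of the total assembled length); the function name `n50` and the toy contig lengths are hypothetical and are not taken from the ZCTH02 assembly. The last lines use the two totals from the table to show what share of the assembled bases lies in contigs ≥1,600 bp.

```python
def n50(contig_lengths):
    """Return the N50 of a list of contig lengths (assumed common definition)."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        # The first contig that pushes the running sum to half the total is the N50.
        if running >= half_total:
            return length
    raise ValueError("contig_lengths must not be empty")


# Toy lengths for illustration only (hypothetical, not ZCTH02 data).
toy_lengths = [10, 8, 6, 4, 2]
toy_n50 = n50(toy_lengths)

# Totals taken from the assembly metrics table above.
total_bp = 27_862_858    # total assembled length
binned_bp = 20_608_826   # total length of contigs >= 1,600 bp
binned_share = 100 * binned_bp / total_bp

print(f"Toy N50: {toy_n50}")
print(f"{binned_share:.1f}% of assembled bases are in contigs >= 1,600 bp")
```

Under this definition, a few long contigs can dominate the N50, which is why the table also reports the number of contigs and the longest contig.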
Genomic features of bacterial genomes reconstructed from the ZCTH02 consortium shotgun metagenome.
| Feature | BZ1 | BZ2 | BZ3 | BZ4 | BZ5 | BZ6 |
| Hypothesis | New | First | New Paenibacillaceae genus (similar to | New deep-branching family (Clostridia class) genome | ||
| Estimated Genome Size (Mb) | 3.37 | 3.44 | 3.12 | 4.38 | 2.86 | 2.77 |
| Number of contigs | 46 | 143 | 187 | 244 | 244 | 304 |
| Mapped reads (%) | 32.85 | 22.18 | 15.75 | 13.86 | 7.34 | 7.40 |
| GGDC (Difference in % G+C) | 3.77 (distinct species) | 2.05 (interpretation: distinct species with respect to | 3.77 (distinct species) | 0.54 (either distinct or same species) | 0.52 (either distinct or same species) | 15.40 (distinct species) |
| Best hit (16S rRNA) | Uncultured bacteria ( | Uncultured bacterium ( | Uncultured | Uncultured composting bacterium ( | ||
| Coverage/Identity (%) | 97/99 | 99/99 | 95/100 | 100/99 | 93/100 | 95/98 |
| Best hit (DNA Primase) (%) - nt | NA | |||||
| Coverage/Identity (%) | 100/89 | 3/83 | 8/79 | 100/99 | 100/100 | NA |
| Estimated Completeness (%) | 95.05 | 95.38 | 93.35 | 99.44 | 97.96 | 91.58 |
| Estimated Contamination (%) | 0.00 | 1.41 | 4.92 | 2.20 | 0.58 | 4.83 |
| G+C content (%) | 64.34 | 53.68 | 62.51 | 43.41 | 52.15 | 65.53 |
| Maximum scaffold length (bp) | 509,962 | 208,013 | 120,895 | 122,932 | 49,485 | 62,878 |
| N50 contig length | 168,514 | 69,814 | 33,295 | 33,740 | 18,062 | 12,917 |
| CDS number | 3,041 | 3,297 | 2,969 | 4,376 | 3,156 | 2,722 |
GGDC (Genome-to-Genome Distance Calculator) comparison between the thermophilic compost genomes reconstructed in this work and the most similar genome.
16S rRNA comparative analysis with partial bin sequences.
Estimated completeness and contamination of the draft genomes based on single-copy lineage-specific marker genes (Parks et al.).
Direct comparison between the BZ5 and C. debilis (NZ_KB912918.1) genomes.
No hit with nt database.
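The "near-complete (over 90%)" criterion from the abstract can be checked against the completeness and contamination rows of the table with a short sketch. Two things here are assumptions and not statements from the source: the six columns are taken to be BZ1 to BZ6 in order, and the 5% contamination ceiling is a commonly used convention added for illustration. The helper name `is_near_complete` is hypothetical.

```python
# Values copied from the "Estimated Completeness" and "Estimated Contamination" rows.
# ASSUMPTION: the columns correspond to BZ1..BZ6 in order.
completeness = {"BZ1": 95.05, "BZ2": 95.38, "BZ3": 93.35,
                "BZ4": 99.44, "BZ5": 97.96, "BZ6": 91.58}
contamination = {"BZ1": 0.00, "BZ2": 1.41, "BZ3": 4.92,
                 "BZ4": 2.20, "BZ5": 0.58, "BZ6": 4.83}


def is_near_complete(genome, min_completeness=90.0, max_contamination=5.0):
    """Check a genome against the thresholds.

    The 90% completeness cut-off follows the abstract; the 5% contamination
    ceiling is an assumed convention, not a threshold stated in the source.
    """
    return (completeness[genome] > min_completeness
            and contamination[genome] < max_contamination)


near_complete = [g for g in completeness if is_near_complete(g)]
print(near_complete)
```

With the tabulated values, all six genomes satisfy both thresholds, although the BZ3 and BZ6 columns sit close to the assumed contamination ceiling.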
Figure 1. Phylogenetic analysis of the six reconstructed bacterial genomes. The analyses were based on ~400 conserved single-copy protein sequences, selected among microbial type strain genomes phylogenetically close to BZ1 and BZ3 (A), BZ2 and BZ5 (B), BZ4 (C), and BZ6 (D). Black dots indicate bootstrap values of ≥80%.
Figure 2. Functional profile of the six reconstructed genomes based on COG categories and CAZy families. (A) Abundance of COGs of each COG functional category based on relative abundance of genes annotated per genome. (B) Abundance of CAZy families based on relative abundance of genes annotated per genome. The figures were drawn based on data shown in Table S1 (A) and Table 4 (B).
Number of CDSs from reconstructed genomes mapped to enzymes in the CAZy database.
| Class | Family | BZ1 | BZ2 | BZ3 | BZ4 | BZ5 | BZ6 |
| SLH | – | 34 | 13 | 35 | 29 | 0 | 4 |
| GH | 1 | 2 | 0 | 1 | 2 | 3 | 1 |
| | 2 | 3 | 1 | 0 | 1 | 2 | 0 |
| | 3 | 3 | 1 | 1 | 1 | 1 | 0 |
| | 4 | 3 | 2 | 0 | 1 | 3 | 1 |
| | 5 | 0 | 0 | 0 | 1 | 0 | 0 |
| | 8 | 1 | 0 | 1 | 0 | 0 | 0 |
| | 9 | 2 | 0 | 1 | 0 | 0 | 0 |
| | 10 | 3 | 0 | 1 | 2 | 0 | 0 |
| | 11 | 1 | 0 | 0 | 0 | 0 | 0 |
| | 13 | 4 | 0 | 2 | 8 | 3 | 1 |
| | 15 | 1 | 0 | 0 | 0 | 0 | 0 |
| | 16 | 1 | 0 | 0 | 0 | 0 | 0 |
| | 18 | 2 | 0 | 2 | 2 | 2 | 1 |
| | 23 | 2 | 2 | 1 | 2 | 1 | 3 |
| | 26 | 0 | 0 | 1 | 0 | 0 | 0 |
| | 30 | 1 | 1 | 0 | 0 | 0 | 0 |
| | 31 | 2 | 0 | 2 | 0 | 1 | 2 |
| | 32 | 1 | 4 | 0 | 1 | 1 | 0 |
| | 35 | 1 | 0 | 0 | 0 | 0 | 0 |
| | 36 | 2 | 0 | 0 | 1 | 2 | 0 |
| | 38 | 0 | 0 | 1 | 0 | 0 | 1 |
| | 39 | 0 | 0 | 1 | 1 | 0 | 0 |
| | 42 | 3 | 1 | 1 | 0 | 0 | 0 |
| | 43 | 9 | 0 | 3 | 0 | 1 | 0 |
| | 51 | 4 | 1 | 3 | 0 | 0 | 0 |
| | 52 | 0 | 0 | 0 | 1 | 0 | 1 |
| | 53 | 1 | 0 | 0 | 0 | 0 | 0 |
| | 57 | 0 | 0 | 0 | 0 | 0 | 1 |
| | 65 | 0 | 0 | 0 | 0 | 2 | 0 |
| | 67 | 1 | 0 | 0 | 1 | 0 | 0 |
| | 73 | 2 | 0 | 1 | 1 | 0 | 0 |
| | 74 | 0 | 0 | 0 | 1 | 0 | 0 |
| | 76 | 0 | 0 | 1 | 1 | 0 | 0 |
| | 78 | 0 | 0 | 0 | 1 | 0 | 1 |
| | 88 | 0 | 1 | 0 | 0 | 0 | 0 |
| | 94 | 2 | 0 | 0 | 0 | 2 | 0 |
| | 95 | 1 | 0 | 1 | 0 | 0 | 0 |
| | 105 | 4 | 0 | 1 | 0 | 0 | 0 |
| | 108 | 0 | 0 | 0 | 0 | 0 | 2 |
| | 109 | 13 | 2 | 11 | 3 | 5 | 7 |
| | 113 | 1 | 0 | 0 | 0 | 0 | 0 |
| | 115 | 0 | 0 | 2 | 0 | 0 | 0 |
| | 120 | 0 | 1 | 0 | 0 | 0 | 0 |
| | 127 | 2 | 0 | 1 | 0 | 0 | 0 |
| | 129 | 1 | 0 | 0 | 0 | 0 | 0 |
| | 130 | 1 | 2 | 0 | 1 | 1 | 1 |
| GT | 2 | 14 | 7 | 10 | 8 | 5 | 7 |
| | 4 | 12 | 7 | 10 | 5 | 4 | 7 |
| | 5 | 0 | 0 | 0 | 1 | 0 | 0 |
| | 8 | 1 | 0 | 1 | 2 | 0 | 0 |
| | 19 | 0 | 1 | 0 | 1 | 1 | 3 |
| | 26 | 1 | 3 | 2 | 1 | 1 | 2 |
| | 27 | 0 | 1 | 0 | 0 | 0 | 0 |
| | 28 | 2 | 3 | 2 | 4 | 4 | 1 |
| | 30 | 0 | 0 | 1 | 0 | 0 | 0 |
| | 32 | 1 | 0 | 0 | 0 | 0 | 0 |
| | 35 | 0 | 0 | 0 | 1 | 0 | 0 |
| | 39 | 0 | 0 | 1 | 1 | 0 | 0 |
| | 51 | 3 | 2 | 4 | 3 | 3 | 1 |
| | 81 | 0 | 1 | 0 | 2 | 0 | 2 |
| | 83 | 0 | 1 | 0 | 2 | 0 | 0 |
| | 84 | 1 | 0 | 0 | 0 | 1 | 0 |
| | 94 | 1 | 1 | 1 | 1 | 0 | 0 |
| CE | 1 | 14 | 6 | 4 | 6 | 5 | 3 |
| | 3 | 3 | 0 | 1 | 3 | 3 | 0 |
| | 4 | 8 | 7 | 8 | 7 | 4 | 2 |
| | 6 | 0 | 0 | 2 | 0 | 0 | 0 |
| | 7 | 2 | 1 | 0 | 4 | 1 | 0 |
| | 8 | 0 | 1 | 0 | 0 | 0 | 0 |
| | 9 | 3 | 0 | 2 | 1 | 1 | 1 |
| | 10 | 7 | 1 | 3 | 2 | 2 | 5 |
| | 11 | 0 | 0 | 0 | 0 | 0 | 1 |
| | 12 | 1 | 0 | 0 | 0 | 0 | 0 |
| | 14 | 2 | 1 | 2 | 2 | 1 | 2 |
| | 15 | 2 | 0 | 0 | 0 | 0 | 0 |
| PL | 9 | 2 | 0 | 1 | 0 | 0 | 0 |
| | 11 | 0 | 0 | 0 | 0 | 0 | 0 |
| | 12 | 0 | 1 | 0 | 0 | 0 | 1 |
| | 22 | 0 | 0 | 0 | 0 | 0 | 1 |
| AA | 1 | 0 | 1 | 0 | 0 | 0 | 0 |
| | 2 | 3 | 0 | 2 | 0 | 0 | 2 |
| | 4 | 0 | 2 | 2 | 5 | 1 | 2 |
| | 6 | 2 | 1 | 1 | 5 | 0 | 0 |
| | 7 | 0 | 0 | 1 | 0 | 0 | 2 |
| CBM | 6 | 4 | 0 | 1 | 0 | 0 | 0 |
| | 9 | 3 | 0 | 1 | 0 | 0 | 0 |
| | 16 | 0 | 0 | 0 | 0 | 1 | 1 |
| | 22 | 3 | 0 | 2 | 0 | 0 | 0 |
| | 25 | 0 | 0 | 0 | 0 | 0 | 1 |
| | 30 | 2 | 0 | 1 | 0 | 0 | 0 |
| | 32 | 0 | 0 | 5 | 0 | 0 | 0 |
| | 34 | 1 | 0 | 1 | 1 | 1 | 1 |
| | 35 | 1 | 0 | 0 | 0 | 0 | 0 |
| | 38 | 0 | 2 | 0 | 0 | 0 | 0 |
| | 40 | 0 | 1 | 0 | 0 | 0 | 0 |
| | 48 | 0 | 0 | 0 | 2 | 0 | 0 |
| | 50 | 13 | 9 | 15 | 17 | 14 | 12 |
| | 54 | 0 | 1 | 1 | 0 | 0 | 0 |
| | 61 | 2 | 0 | 0 | 0 | 0 | 0 |
| | 66 | 2 | 0 | 0 | 0 | 0 | 0 |
CAZy families: GH, Glycoside Hydrolases; GT, GlycosylTransferases; CE, Carbohydrate Esterases; PL, Polysaccharide Lyases; AA, Auxiliary Activities; CBM, Carbohydrate-Binding Modules.
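The abstract compares genomes by "diversity and abundance" of CAZy families. A minimal sketch of how both quantities could be tallied from the table is given below, using the four PL rows because they are few enough to copy in full. It assumes the six value columns are BZ1 to BZ6 in order, reads "abundance" as the total number of CDSs in the class, and reads "diversity" as the number of families with at least one CDS; these readings and all variable names are illustrative assumptions. The same tally would apply to the GH rows.

```python
# ASSUMPTION: the six value columns correspond to BZ1..BZ6 in order.
genomes = ["BZ1", "BZ2", "BZ3", "BZ4", "BZ5", "BZ6"]

# CDS counts copied from the PL rows of the table above.
pl_counts = {
    "PL9":  [2, 0, 1, 0, 0, 0],
    "PL11": [0, 0, 0, 0, 0, 0],
    "PL12": [0, 1, 0, 0, 0, 1],
    "PL22": [0, 0, 0, 0, 0, 1],
}

# Abundance: total CDSs assigned to the class in each genome.
abundance = {
    genome: sum(row[i] for row in pl_counts.values())
    for i, genome in enumerate(genomes)
}

# Diversity: number of distinct families with at least one CDS in each genome.
diversity = {
    genome: sum(1 for row in pl_counts.values() if row[i] > 0)
    for i, genome in enumerate(genomes)
}

for genome in genomes:
    print(genome, "CDSs:", abundance[genome], "families:", diversity[genome])
```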
Number of CDSs assigned to COGs related to lignocellulose metabolism in reconstructed genomes.
| COG | Function | BZ1 | BZ2 | BZ3 | BZ4 | BZ5 | BZ6 |
| COG0296 | 1,4-alpha-glucan branching enzyme | 0 | 0 | 0 | 1 | 0 | 0 |
| COG0366 | Glycosidases | 4 | 0 | 2 | 6 | 3 | 1 |
| COG0383 | Alpha-mannosidase | 0 | 0 | 0 | 0 | 0 | 2 |
| COG0438 | Glycosyltransferase | 13 | 8 | 11 | 7 | 4 | 10 |
| COG0662 | Mannose-6-phosphate isomerase | 2 | 4 | 1 | 1 | 1 | 1 |
| COG0726 | Predicted xylanase/chitin deacetylase | 8 | 0 | 7 | 7 | 4 | 2 |
| COG0836 | Mannose-1-phosphate guanylyltransferase | 1 | 0 | 2 | 0 | 0 | 1 |
| COG1172 | Xylose/arabinose/galactoside ABC-type transport systems, permease components | 4 | 1 | 3 | 1 | 1 | 10 |
| COG1216 | Predicted glycosyltransferases | 2 | 2 | 3 | 1 | 0 | 0 |
| COG1363 | Cellulase M and related proteins | 0 | 2 | 0 | 4 | 5 | 1 |
| COG1440 | Phosphotransferase system cellobiose-specific component IIB | 0 | 0 | 0 | 0 | 6 | 0 |
| COG1447 | Phosphotransferase system cellobiose-specific component IIA | 0 | 0 | 0 | 1 | 7 | 0 |
| COG1455 | Phosphotransferase system cellobiose-specific component IIC | 0 | 0 | 0 | 1 | 5 | 0 |
| COG1472 | Beta-glucosidase-related glycosidases | 3 | 1 | 1 | 1 | 1 | 0 |
| COG1486 | Alpha-galactosidases/6-phospho-beta-glucosidases, glycosyl hydrolases family 4 | 3 | 0 | 0 | 1 | 3 | 1 |
| COG1501 | Alpha-glucosidases, glycosyl hydrolases family 31 | 2 | 0 | 2 | 0 | 1 | 2 |
| COG1874 | Beta-galactosidase | 4 | 1 | 1 | 0 | 0 | 0 |
| COG2115 | Xylose isomerase | 1 | 0 | 1 | 1 | 1 | 1 |
| COG2132 | Putative multicopper oxidases | 2 | 8 | 7 | 5 | 0 | 3 |
| COG2152 | Predicted glycosylase | 1 | 2 | 0 | 1 | 1 | 1 |
| COG2160 | L-arabinose isomerase | 2 | 0 | 1 | 0 | 0 | 0 |
| COG2273 | Beta-glucanase/Beta-glucan synthetase | 1 | 0 | 0 | 0 | 0 | 0 |
| COG2723 | Beta-glucosidase/6-phospho-beta-glucosidase/beta-galactosidase | 3 | 0 | 1 | 2 | 3 | 1 |
| COG2730 | Endoglucanase | 0 | 0 | 0 | 1 | 0 | 0 |
| COG2814 | Arabinose efflux permease | 21 | 19 | 18 | 18 | 8 | 7 |
| COG3250 | Beta-galactosidase/beta-glucuronidase | 3 | 1 | 0 | 1 | 2 | 0 |
| COG3345 | Alpha-galactosidase | 2 | 0 | 0 | 1 | 2 | 0 |
| COG3405 | Endoglucanase Y | 1 | 0 | 1 | 0 | 0 | 0 |
| COG3459 | Cellobiose phosphorylase | 3 | 0 | 0 | 0 | 5 | 0 |
| COG3507 | Beta-xylosidase | 7 | 0 | 2 | 0 | 1 | 0 |
| COG3534 | Alpha-L-arabinofuranosidase | 4 | 1 | 4 | 0 | 0 | 0 |
| COG3661 | Alpha-glucuronidase | 1 | 0 | 1 | 1 | 0 | 0 |
| COG3664 | Beta-xylosidase | 0 | 0 | 1 | 1 | 0 | 0 |
| COG3693 | Beta-1,4-xylanase | 3 | 0 | 1 | 2 | 0 | 0 |
| COG3858 | Predicted glycosyl hydrolase | 2 | 1 | 2 | 2 | 2 | 1 |
| COG3867 | Arabinogalactan endo-1,4-beta-galactosidase | 1 | 0 | 0 | 0 | 0 | 0 |
| COG3940 | Predicted beta-xylosidase | 1 | 0 | 0 | 0 | 0 | 0 |
| COG4124 | Beta-mannanase | 0 | 0 | 1 | 0 | 0 | 0 |
| COG4213 | ABC-type xylose transport system, periplasmic component | 3 | 0 | 3 | 1 | 0 | 0 |
| COG4214 | ABC-type xylose transport system, permease component | 2 | 0 | 3 | 1 | 0 | 0 |
| COG5520 | O-Glycosyl hydrolase | 1 | 1 | 0 | 0 | 0 | 0 |
| COG5581 | Predicted glycosyltransferase | 2 | 1 | 2 | 1 | 1 | 1 |
Figure 3. Variation of metagenome and metatranscriptome reads mapped to the six reconstructed genomes over days of composting. Relative abundance of reads (%) was calculated as the number of reads mapped to each indicated genome divided by the total number of reads in the metagenomic (DNA) or metatranscriptomic (RNA) datasets from samples collected from the São Paulo Zoo composting process and described previously (Antunes et al., 2016). Shaded bars indicate days of composting where the respective genomes were more abundant.
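The relative abundance described in the Figure 3 caption amounts to a simple ratio, sketched below. The read counts and sample labels are hypothetical placeholders chosen for illustration; they are not values from the composting datasets, and the function name `relative_abundance` is likewise an assumption.

```python
def relative_abundance(mapped_reads, total_reads):
    """Percentage of a sample's reads that map to one reconstructed genome."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return 100.0 * mapped_reads / total_reads


# Hypothetical (mapped, total) read counts per sample, for illustration only.
example_samples = {
    "sample_A_DNA": (1_500, 1_000_000),
    "sample_A_RNA": (4_000, 2_000_000),
}

percentages = {
    name: relative_abundance(mapped, total)
    for name, (mapped, total) in example_samples.items()
}

for name, pct in percentages.items():
    print(f"{name}: {pct:.2f}% of reads")
```

Because each sample is normalized by its own total, DNA and RNA percentages for the same genome can be compared across days even when sequencing depth differs.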
Figure 4. Relative abundance of time-series composting metatranscriptome sequence reads mapped to lignocellulose-degrading genes in the reconstructed genomes. (A) The numbers in each day column give the relative abundance, expressed per thousand, of CDSs representing different enzymatic functions to which metatranscriptomic reads were mapped. (B) Colored pie charts show the amount of normalized reads mapped to lignocellulose-related enzyme genes from the six reconstructed genomes over the days of thermophilic composting indicated in colored boxes. Red arrows indicate the turning step (aeration of the composting pile).
Figure 5. An overview of lignocellulose degradation in the ZCTH02 consortium. The diagram shows the main constituents of lignocellulose (hemicellulose, cellulose and lignin). Each colored square represents a GH family or an AA family that contains one or more CDSs that were annotated in a given BZ genome. The color of each square corresponds to the BZ genome where those CDSs were annotated, according to the key in the figure.