Wei Zou, Guangbin Ye, Chaojie Liu, Kaizheng Zhang, Hehe Li, Jiangang Yang.
Abstract
Clostridium beijerinckii is a well-known anaerobic solventogenic bacterium that inhabits a wide range of niches. Previously, we isolated five butyrate-producing C. beijerinckii strains from the pit mud (PM) of strong-flavor baijiu (SFB) ecosystems. Genome annotation of the five strains showed that they could assimilate various carbon sources as well as ammonium to produce acetate, butyrate, lactate, hydrogen, and esters, but did not produce the undesirable flavor compounds isopropanol and acetone, making them useful for further exploration in SFB production. Our analysis of the genomes of an additional 233 C. beijerinckii strains revealed a pangenome that is open based on current sampling and will likely change as more genomes are added. The core genome, accessory genome, and strain-specific genes comprised 1567, 8851, and 2154 genes, respectively. A total of 298 genes were found only in the five C. beijerinckii strains from PM, of which only 77 were assigned to Clusters of Orthologous Genes categories. In addition, 15 transposase and 12 phage integrase families were found in all five C. beijerinckii strains from PM. Between 18 and 21 genome islands were predicted in each of the five C. beijerinckii genomes. The large number of mobile genetic elements indicates that the genomes of the five C. beijerinckii strains evolved through the loss or insertion of DNA fragments in the PM of SFB ecosystems. This study presents a genomic framework of C. beijerinckii strains from PM that could be used for genetic diversification studies and further exploration of these strains.
Keywords: Clostridium beijerinckii; baijiu; butyrate; mobile genetic elements; pangenome; pit mud
Year: 2021 PMID: 34542586 PMCID: PMC8527462 DOI: 10.1093/g3journal/jkab317
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
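The abstract reports the pangenome as three groups: core, accessory, and strain-specific genes. The following is a minimal sketch of how such a partition is commonly derived from a gene-family presence table, assuming "core" means present in every genome and "strain-specific" means present in exactly one. The record does not state the authors' clustering tool or thresholds, and the function name and toy family identifiers below are hypothetical.

```python
from typing import Dict, Set, Tuple


def partition_pangenome(
    genomes: Dict[str, Set[str]],
) -> Tuple[Set[str], Set[str], Set[str]]:
    """Split gene families into core, accessory, and strain-specific sets.

    Assumed definitions (not taken from the source):
      core            = family found in every genome
      strain-specific = family found in exactly one genome
      accessory       = everything else
    """
    n = len(genomes)
    counts: Dict[str, int] = {}
    for families in genomes.values():
        for fam in families:
            counts[fam] = counts.get(fam, 0) + 1

    core = {f for f, c in counts.items() if c == n}
    # With a single genome every family is core, so nothing is strain-specific.
    specific = {f for f, c in counts.items() if c == 1 and n > 1}
    accessory = set(counts) - core - specific
    return core, accessory, specific


# Toy input with hypothetical family identifiers (not data from the study).
toy = {
    "strainA": {"fam1", "fam2", "fam3"},
    "strainB": {"fam1", "fam2", "fam4"},
    "strainC": {"fam1", "fam5"},
}
core, accessory, specific = partition_pangenome(toy)
```

If the three reported groups are disjoint, as in this sketch, the counts in the abstract (1567, 8851, and 2154) would sum to a pangenome of 12,572 genes.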
Genome features of the five Clostridium beijerinckii strains isolated from pit mud (PM)
| Features | 2-1 | 3-8 | G3-1 | G3-3 | G3-5 |
|---|---|---|---|---|---|
| Genome size (bp) | 5,626,308 | 5,461,616 | 5,637,955 | 5,609,807 | 5,637,825 |
| No. of all scaffolds | 328 | 92 | 129 | 105 | 130 |
| Total reads | 10,554,482 | 6,891,264 | 10,865,966 | 9,147,036 | 8,625,822 |
| Total reads length (bp) | 1,704,906,022 | 1,021,795,131 | 1,615,799,558 | 1,352,042,586 | 1,282,836,405 |
| Largest scaffold length (bp) | 246,790 | 235,144 | 235,510 | 279,509 | 235,510 |
| Scaffold N50 (bp) | 60,705 | 83,607 | 92,783 | 122,976 | 92,771 |
| G+C content (%) | 29.78 | 29.58 | 29.64 | 29.60 | 29.64 |
| No. of protein-coding genes | 5035 | 4932 | 5081 | 5071 | 5083 |
| Proteins annotated in RAST subsystems | 2078 | 1869 | 2083 | 2084 | 2083 |
| rRNA | 6 | 8 | 37 | 17 | 36 |
| tRNA | 59 | 52 | 81 | 58 | 80 |
Figure 1. Biosynthetic pathways of acetone, butanol, and ethanol (ABE) and isopropanol, butanol, and ethanol (IBE) from glucose in the five Clostridium beijerinckii strains isolated from pit mud (PM). Genes shown in gray were absent in all five strains. Genes shown in red were absent in strain 3-8 but present in strains 2-1, G3-1, G3-3, and G3-5.
Figure 2. Curves fitted to the pangenome and core genome sizes as the number of Clostridium beijerinckii genomes increased from 1 to 233. The cumulative curve (blue) indicates an open pangenome.
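The record does not give the fitted formula behind Figure 2. As a hedged illustration, pangenome growth is often modeled with a power law, size = kappa * N^gamma, where N is the number of genomes; under that convention a positive exponent means the pangenome keeps growing as genomes are added, which is read as "open". The sketch below assumes this model and fits it by ordinary least squares in log-log space. The function name, the parameter names `kappa` and `gamma`, and the synthetic data are all assumptions, not values from the study.

```python
import math
from typing import Sequence, Tuple


def fit_power_law(
    n_genomes: Sequence[int], sizes: Sequence[float]
) -> Tuple[float, float]:
    """Fit size = kappa * N**gamma via linear regression on log-transformed data."""
    xs = [math.log(n) for n in n_genomes]
    ys = [math.log(s) for s in sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    gamma = sxy / sxx                            # slope in log-log space
    kappa = math.exp(mean_y - gamma * mean_x)    # intercept, back-transformed
    return kappa, gamma


# Synthetic curve generated with kappa = 5000 and gamma = 0.3
# (illustrative values only, not estimates from the paper).
ns = list(range(1, 21))
sizes = [5000 * n ** 0.3 for n in ns]
kappa, gamma = fit_power_law(ns, sizes)
is_open = gamma > 0  # under the assumed model, growth never plateaus
```

A core-genome curve would typically need a different, decaying model, so this sketch covers only the pangenome curve.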
Figure 3. Phylogenetic trees of Clostridium beijerinckii strains based on concatenated amino acid sequences of the core genome. Red: strains from pit mud (PM); green: strains from fecal material; blue: strains from soil.
Figure 4. Distribution of Clusters of Orthologous Genes (COG) categories among the core genome, accessory genome, and strain-specific genes of Clostridium beijerinckii strains. (B) chromatin structure and dynamics; (C) energy production and conversion; (D) cell cycle control, cell division, chromosome partitioning; (E) amino acid transport and metabolism; (F) nucleotide transport and metabolism; (G) carbohydrate transport and metabolism; (H) coenzyme transport and metabolism; (I) lipid transport and metabolism; (J) translation, ribosomal structure, and biogenesis; (K) transcription; (L) replication, recombination, and repair; (M) cell wall/membrane/envelope biogenesis; (N) cell motility; (O) posttranslational modification, protein turnover, chaperones; (P) inorganic ion transport and metabolism; (Q) secondary metabolite biosynthesis, transport, and catabolism; (S) function unknown; (T) signal transduction mechanisms; (U) intracellular trafficking, secretion, and vesicular transport; (V) defense mechanisms; (W) extracellular structures; and (Z) cytoskeleton.
Clusters of Orthologous Genes (COG) annotation of accessory genes shared only by Clostridium beijerinckii strains isolated from pit mud (PM)
| COG category | Function description | 2-1 | 3-8 | G3-1 | G3-3 | G3-5 |
|---|---|---|---|---|---|---|
| D | Phage tail tape measure protein | 1# | 1 | 1 | 1 | 1 |
| S | von Willebrand factor, type A | 1 | 1 | 1 | 1 | 1 |
| L | Subunit R is required for both nuclease and ATPase activities, but not for modification | 1 | 1 | 1 | 1 | 1 |
| T | Nacht domain | 1 | 1 | 1 | 1 | 1 |
| K | Bacterial RNA polymerase, alpha chain C terminal domain | 1 | 1 | 1 | 1 | 1 |
| L | DNA primase | 1 | 1 | 1 | 1 | 1 |
| D | DNA recombination | 1 | 1 | 1 | 1 | 1 |
| U | Dynamin family | 1 | 1 | 1 | 1 | 1 |
| S | Dynamin family | 1 | 1 | 1 | 1 | 1 |
| S | Dynamin family | 1 | 1 | 1 | 1 | 1 |
| L | Helicase activity | 1 | 1 | 1 | 1 | 1 |
| M | PFAM Glycosyl transferase family 2 | 1 | 1 | 1 | 1 | 1 |
| S | Phage minor structural protein | 1 | 1 | 1 | 1 | 1 |
| L | Domain of unknown function (DUF4277) | 1 | 0# | 1 | 1 | 1 |
| L | Uncharacterized conserved protein (DUF2075) | 1 | 1 | 1 | 1 | 1 |
| L | TIGRFAM type I restriction system adenine methylase (hsdM) | 1 | 1 | 1 | 1 | 1 |
| EGP | Major facilitator superfamily | 1 | 1 | 1 | 1 | 1 |
| H | Catalyzes the cyclization of GTP to (8S)-3′,8-cyclo-7,8-dihydroguanosine 5′-triphosphate | 1 | 1 | 1 | 1 | 1 |
| G | N-Acetylmuramoyl- | 1 | 1 | 1 | 1 | 1 |
| K | DNA binding | 1 | 1 | 1 | 1 | 1 |
| V | Type I restriction modification DNA specificity domain | 1 | 1 | 1 | 1 | 1 |
| V | Type I restriction modification DNA specificity domain | 1 | 1 | 1 | 1 | 1 |
| EGP | Major facilitator superfamily | 1 | 1 | 1 | 1 | 1 |
| GM | Methyltransferase FkbM domain | 1 | 1 | 1 | 1 | 1 |
| GM | Methyltransferase FkbM domain | 1 | 1 | 1 | 1 | 1 |
| M | transferase activity, transferring glycosyl groups | 1 | 1 | 1 | 1 | 1 |
| S | Protein of unknown function DUF262 | 1 | 1 | 1 | 1 | 1 |
| M | transferase activity, transferring glycosyl groups | 1 | 1 | 1 | 1 | 1 |
| L | Belongs to the “phage” integrase family | 1 | 1 | 1 | 1 | 1 |
| S | PFAM transposase YhgA family protein | 1 | 1 | 1 | 1 | 1 |
| L | Psort location Cytoplasmic, score | 1 | 1 | 1 | 1 | 1 |
| L | Transposase | 0 | 0 | 1 | 1 | 1 |
| K | Bacterial regulatory proteins, tetR family | 1 | 1 | 1 | 1 | 1 |
| K | LysR family | 1 | 1 | 1 | 1 | 1 |
| M | Catalyzes the reduction of dTDP-6-deoxy- | 1 | 1 | 1 | 1 | 1 |
| S | Glycosyltransferase like family 2 | 1 | 1 | 1 | 1 | 1 |
| D | Cell division | 1 | 1 | 1 | 1 | 1 |
| S | Protein of unknown function (DUF2971) | 1 | 1 | 1 | 1 | 1 |
| S | PD-(D/E)XK nuclease family transposase | 1 | 1 | 1 | 1 | 1 |
| S | PFAM Abortive infection protein | 1 | 1 | 1 | 1 | 1 |
| K | PFAM Helix-turn-helix | 1 | 1 | 1 | 1 | 1 |
| E | Pfam: DUF955 | 1 | 1 | 1 | 1 | 1 |
| S | head morphogenesis protein, SPP1 gp7 family | 1 | 1 | 1 | 1 | 1 |
| KT | Lecithin retinol acyltransferase | 1 | 1 | 1 | 1 | 1 |
| T | Diguanylate cyclase | 1 | 1 | 1 | 1 | 1 |
| L | Transposase DDE domain | 1 | 1 | 1 | 1 | 1 |
| M | Cell wall binding | 1 | 1 | 1 | 1 | 1 |
| D | Cell wall binding repeat | 1 | 1 | 1 | 1 | 1 |
| L | Transposase DDE domain | 1 | 1 | 1 | 1 | 1 |
| L | Resolvase, N terminal domain | 1 | 1 | 1 | 1 | 1 |
| S | Putative restriction endonuclease | 1 | 1 | 1 | 1 | 1 |
| E | Zn peptidase | 1 | 1 | 1 | 1 | 1 |
| S | Protein of unknown function (DUF2691) | 1 | 1 | 1 | 1 | 1 |
| G | PFAM Polysaccharide deacetylase | 0 | 1 | 1 | 1 | 1 |
| S | NADPH-dependent FMN reductase | 1 | 1 | 1 | 1 | 1 |
| S | Helix-turn-helix domain | 1 | 1 | 1 | 1 | 1 |
| S | Protein of unknown function (DUF3268) | 1 | 1 | 1 | 1 | 1 |
| S | Domain of unknown function (DUF4258) | 1 | 1 | 1 | 1 | 1 |
| L | Belongs to the “phage” integrase family | 1 | 1 | 1 | 1 | 1 |
| L | Psort location Cytoplasmic, score 8.87 | 1 | 1 | 1 | 1 | 1 |
| L | Psort location Cytoplasmic, score | 1 | 1 | 1 | 1 | 1 |
| L | Staphylococcal protein of unknown function (DUF960) | 1 | 1 | 1 | 1 | 1 |
| L | Transposase | 1 | 0 | 1 | 1 | 1 |
| L | Belongs to the “phage” integrase family | 1 | 1 | 1 | 1 | 1 |
| L | Belongs to the “phage” integrase family | 1 | 1 | 1 | 1 | 1 |
| L | Psort location Cytoplasmic, score | 1 | 1 | 1 | 1 | 1 |
| K | Helix-turn-helix XRE-family like proteins | 1 | 1 | 1 | 1 | 1 |
| K | Helix-turn-helix XRE-family like proteins | 1 | 1 | 1 | 1 | 1 |
| K | PFAM helix-turn-helix HxlR type | 1 | 1 | 1 | 1 | 1 |
| S | Domain of unknown function (DUF3797) | 1 | 1 | 1 | 1 | 1 |
| C | Electron transfer flavoprotein | 0 | 0 | 1 | 0 | 1 |
| V | Mate efflux family protein | 1 | 1 | 1 | 1 | 1 |
| L | PFAM transposase, mutator | 1 | 1 | 1 | 1 | 1 |
# 1: present; 0: absent.
Distribution of genome islands in Clostridium beijerinckii strains isolated from pit mud (PM)
| Strain | Total GI size (bp) | Number of GIs | Total length of GIs/genome size (%) | Total proteins | Hypothetical proteins | Phage-related proteins |
|---|---|---|---|---|---|---|
| 2-1 | 406,702 | 20 | 7.2 | 290 | 155 | 23 |
| 3-8 | 199,431 | 19 | 3.7 | 232 | 112 | 25 |
| G3-1 | 269,305 | 21 | 4.8 | 290 | 125 | 20 |
| G3-3 | 266,058 | 21 | 4.7 | 290 | 127 | 20 |
| G3-5 | 185,086 | 18 | 3.3 | 230 | 124 | 16 |
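The percentage column above can be checked against the genome sizes in the genome-features table. The sketch below recomputes it, assuming the "Total size" column is in base pairs (the source gives no unit, but the arithmetic agrees with that reading). The variable and function names are illustrative.

```python
# Genome sizes (bp) from the genome-features table in this record.
genome_size = {
    "2-1": 5_626_308,
    "3-8": 5_461_616,
    "G3-1": 5_637_955,
    "G3-3": 5_609_807,
    "G3-5": 5_637_825,
}

# Total genome-island (GI) sizes from the table above; bp is an assumption.
gi_total = {
    "2-1": 406_702,
    "3-8": 199_431,
    "G3-1": 269_305,
    "G3-3": 266_058,
    "G3-5": 185_086,
}

# Percentages as reported in the table above.
reported_pct = {"2-1": 7.2, "3-8": 3.7, "G3-1": 4.8, "G3-3": 4.7, "G3-5": 3.3}


def gi_fraction_pct(strain: str) -> float:
    """Share of the genome covered by predicted genome islands, in percent."""
    return 100.0 * gi_total[strain] / genome_size[strain]


computed = {s: round(gi_fraction_pct(s), 1) for s in genome_size}
matches = computed == reported_pct
```

Rounded to one decimal place, the recomputed values agree with the reported column for all five strains.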