
Proteiniphilum saccharofermentans str. M3/6^T isolated from a laboratory biogas reactor is versatile in polysaccharide and oligopeptide utilization as deduced from genome-based metabolic reconstructions.

Geizecler Tomazetto¹, Sarah Hahnke², Daniel Wibberg³, Alfred Pühler³, Michael Klocke², Andreas Schlüter³.

Abstract

Proteiniphilum saccharofermentans str. M3/6T is a recently described species within the family Porphyromonadaceae (phylum Bacteroidetes), which was isolated from a mesophilic laboratory-scale biogas reactor. The genome of the strain was completely sequenced and manually annotated to reconstruct its metabolic potential regarding biomass degradation and fermentation pathways. The P. saccharofermentans str. M3/6T genome consists of a 4,414,963 bp chromosome featuring an average GC-content of 43.63%. Genome analyses revealed that the strain possesses 3396 protein-coding sequences. Among them are 158 genes assigned to the carbohydrate-active-enzyme families as defined by the CAZy database, including 116 genes encoding glycosyl hydrolases (GHs) involved in pectin, arabinogalactan, hemicellulose (arabinan, xylan, mannan, β-glucans), starch, fructan and chitin degradation. The strain also features several transporter genes, some of which are located in polysaccharide utilization loci (PUL). PUL gene products are involved in glycan binding, transport and utilization at the cell surface. In the genome of strain M3/6T, 64 PUL are present and most of them in association with genes encoding carbohydrate-active enzymes. Accordingly, the strain was predicted to metabolize several sugars yielding carbon dioxide, hydrogen, acetate, formate, propionate and isovalerate as end-products of the fermentation process. Moreover, P. saccharofermentans str. M3/6T encodes extracellular and intracellular proteases and transporters predicted to be involved in protein and oligopeptide degradation. Comparative analyses between P. saccharofermentans str. M3/6T and its closest described relative P. acetatigenes str. DSM 18083T indicate that both strains share a similar metabolism regarding decomposition of complex carbohydrates and fermentation of sugars.


Keywords: Anaerobic digestion; Bioconversion; Biomethanation; Carbohydrate-active enzymes; Metabolic pathway reconstruction; Polysaccharide utilization loci

Year: 2018 PMID: 29892569 PMCID: PMC5993710 DOI: 10.1016/j.btre.2018.e00254

Source DB: PubMed Journal: Biotechnol Rep (Amst) ISSN: 2215-017X

Introduction

Biogas can be produced by anaerobic digestion (AD) of a wide range of plant materials, organic wastes, and residual organic materials. The biogas-production process is regarded as an eco-friendly technology to generate energy from biomass [1,2]. AD is commonly divided into four phases, i.e. hydrolysis, acidogenesis, acetogenesis and methanogenesis, which are carried out by complex consortia consisting of several hundred microbial species. Although the overall biological process that finally leads to the production of biogas is well known, the majority of microbial biogas community members, and their metabolic activities in particular, are largely unknown [[3], [4], [5], [6], [7]]. In recent years, several bioreactors were taxonomically profiled by high-throughput sequencing of the 16S rRNA marker gene [5,[8], [9], [10]]. These studies reported that members of the classes Clostridia and Bacteroidia frequently dominate biogas communities. Members of both classes are responsible for degradation of complex carbohydrates and proteins to monomers and are able to ferment sugar molecules yielding volatile organic acids [11,12]. To deduce functional profiles of biogas communities, shotgun metagenome sequencing has been done [6,[13], [14], [15], [16], [17], [18]]. The first metagenomic studies were based on non-assembled short reads and on small numbers of short contigs [4,[14], [15], [16],19], providing a gene content overview of microbial communities involved in anaerobic digestion. Recently, deep metagenome sequencing of DNA from biogas communities enabled assembly of sequence reads and binning of contigs to bin-genomes, improving gene prediction and functional interpretation of metagenome sequence data [6,10,20]. However, the reconstruction of complete genome sequences from metagenome sequence data is demanding because highly related sequences originate from different organisms [21,22].
Culture-independent approaches helped to elucidate taxonomic structures and gene contents of many microbiomes [[23], [24], [25], [26]]. Complementing these approaches, traditional cultivation-based microbiological analyses including genome sequencing still yield reliable genome sequence information of individual microorganisms and corresponding phenotypic features, which are also useful for interpretation of bin-genomes. Recently, a number of new bacterial species were isolated from biogas reactors [80]. Subsequent genome sequencing and metabolic reconstructions based on genome sequence information led to the prediction of the role of these microorganisms within the biogas process [[27], [28], [29], [30], [31]]. Among these newly characterized strains, Proteiniphilum saccharofermentans str. M3/6T was isolated from a mesophilic laboratory-scale biogas reactor [31]. Microbiological characterization revealed that, besides utilization of complex proteinaceous substrates such as yeast extract and peptone, the isolate was able to ferment mono- and disaccharides. Moreover, it produced extracellular enzymes involved in degradation of complex carbohydrates, namely β-glucan, xylan, arabinoxylan, starch, arabinogalactan, phosphoric acid-swollen cellulose and carboxymethyl cellulose (CM-cellulose). Considering these phenotypic features indispensable for effective biomass conversion, it was worthwhile to establish and analyze the complete genome sequence of P. saccharofermentans str. M3/6T to uncover its genetic potential regarding carbohydrate-active enzymes involved in AD of biomass. The genome was manually annotated and interpreted to reconstruct metabolic pathways dedicated to biomass degradation and fermentation processes. Gene clusters encoding polysaccharide utilization loci (PUL) were analyzed in detail. The findings obtained are of importance regarding the biotechnological process of biomass conversion to biofuels.

Material and methods

Strain cultivation and DNA isolation

P. saccharofermentans str. M3/6T was cultivated at 37 °C in anoxic basal medium with yeast extract and proteose peptone (5 g l−1 each) as described by Hahnke et al. [31]. The extraction of genomic DNA was performed using the Gentra Puregene Yeast/Bact. Kit (Qiagen, Hilden, Germany) following the manufacturer’s instructions. The obtained DNA was purified using the NucleoSpin® gDNA Clean-up kit (Macherey-Nagel, Düren, Germany).

Sequencing, assembly and annotation

The total genomic DNA was used for the construction of a standard shotgun library applying the Nextera® Mate-Pair Library Preparation Kit (Illumina), according to the manufacturer’s protocol. The genomic library was sequenced on the Illumina MiSeq system. After processing of the raw data, the reads were assembled into contigs using the Newbler De Novo Assembler (version 2.8). Genome finishing was performed using the CONSED software package [32] for ordering and joining the contigs. Genome analyses, interpretation and reconstruction of metabolic pathways were performed as recently described [33]. Briefly, the assembled genome sequence was imported into the annotation platform GenDB [34] for automatic prediction of genes. All predicted genes were analyzed and validated manually by means of BLAST against different databases including Pfam, TIGRFAM, InterPro, SwissProt, and the non-redundant NCBI protein sequence database (NR). Putative tRNA and rRNA genes were identified with RNAmmer [35], RBSfinder [36] and tRNAscan-SE [37]. SignalP [38] and TMHMM [39] were used to predict signal peptides and transmembrane proteins, respectively. To identify phage-related genes and genomic islands (GI), the P. saccharofermentans str. M3/6T genome was uploaded to PHAST [40] and IslandViewer [41], respectively. CRISPRfinder [42] in combination with the CRISPRdb database [43] was applied to identify CRISPR arrays (Clustered Regularly Interspaced Short Palindromic Repeats) in the strain M3/6T genome. The cas genes predicted by GenDB were manually verified by means of BLAST against the databases cited above. Finally, comparisons between the genome of strain M3/6T and that of the type strain of its most closely related species, Proteiniphilum acetatigenes DSM 18083T, were carried out using the EDGAR tool [44]. Synteny analyses, identification of orthologous genes and classification of genes as core genes or singletons were done within EDGAR.

Reconstruction of metabolic pathways

To determine the diversity of carbohydrate-active enzyme (CAZyme) families present in the P. saccharofermentans str. M3/6T genome, all predicted gene products were compared against the HMM profile-based database dbCAN [45] using hmmsearch in the HMMER software package [46]. Predicted CAZymes were also analyzed using PRIAM profiles [47]. The metabolic pathways of biomass degradation represented in the genome of strain M3/6T were reconstructed based on EC numbers of predicted enzymes in combination with Pathway Tools [48]. Finally, ABC transporters were predicted and classified by comparing the predicted proteins to the TCDB database [49].
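As an illustration of the dbCAN screening step, the following minimal Python sketch filters hmmsearch per-domain tabular output (the `--domtblout` format) and tallies hits per CAZy class. It is not the authors' pipeline: the cutoffs (`MAX_EVALUE`, `MIN_HMM_COVERAGE`), the assumption that profile names look like `GH43.hmm`, and the protein identifiers in the toy records are all illustrative assumptions.

```python
import re
from collections import Counter

# Assumed cutoffs (not stated in the paper): per-domain E-value and
# fraction of the HMM profile covered by the alignment.
MAX_EVALUE = 1e-15
MIN_HMM_COVERAGE = 0.35

def cazy_hits(domtblout_lines, max_evalue=MAX_EVALUE, min_cov=MIN_HMM_COVERAGE):
    """Return the set of (protein_id, family) pairs that pass both cutoffs."""
    hits = set()
    for line in domtblout_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and HMMER comment lines
        f = line.split()
        # hmmsearch: target = protein sequence, query = HMM profile
        protein_id, hmm_name, hmm_len = f[0], f[3], int(f[5])
        i_evalue = float(f[12])                   # independent domain E-value
        hmm_from, hmm_to = int(f[15]), int(f[16])
        coverage = (hmm_to - hmm_from + 1) / hmm_len
        if i_evalue <= max_evalue and coverage >= min_cov:
            family = hmm_name[:-4] if hmm_name.endswith(".hmm") else hmm_name
            hits.add((protein_id, family))
    return hits

def count_by_class(hits):
    """Count (protein, family) pairs per CAZy class (GH, GT, CE, PL, CBM)."""
    classes = Counter()
    for _protein_id, family in hits:
        classes[re.match(r"[A-Za-z]+", family).group()] += 1
    return classes

# Toy records; protein IDs and scores are hypothetical.
example = [
    "# toy domtblout records",
    "prot_0001 - 520 GH43.hmm - 300 1e-80 250.0 0.1 1 1 1e-82 1e-80 250.0 0.1 5 290 30 320 28 325 0.95 -",
    "prot_0002 - 410 GH2.hmm - 750 1e-05 30.0 0.1 1 1 1e-07 1e-05 30.0 0.1 10 700 5 400 3 405 0.90 -",
    "prot_0003 - 600 CE8.hmm - 280 1e-40 140.0 0.1 1 1 1e-42 1e-40 140.0 0.1 10 50 100 140 98 145 0.92 -",
    "prot_0004 - 480 GH43.hmm - 300 1e-60 200.0 0.1 1 1 1e-62 1e-60 200.0 0.1 20 280 15 275 12 280 0.93 -",
]
hits = cazy_hits(example)
class_counts = count_by_class(hits)
print(sorted(hits), dict(class_counts))
```

In this toy input the second record fails the E-value cutoff and the third fails the coverage cutoff, so only the two GH43 hits remain. Automated hits of this kind would still need the manual curation described in the text.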

Nucleotide sequence accession number

The genome of P. saccharofermentans str. M3/6T was deposited in the EMBL-EBI database (European Bioinformatics Institute database) under the accession number LT605205.

Results and discussion

General features of the Proteiniphilum saccharofermentans str. M3/6T genome

The strain M3/6T was isolated from a two-phase Upflow Anaerobic Solid-State (UASS) reactor fed with 95% maize silage and 5% wheat straw as substrates [31]. The bacterium belongs to the family Porphyromonadaceae within the phylum Bacteroidetes and it was characterized as an acidogenic microorganism producing acetate, propionate and isovalerate [31]. To uncover the genetic potential of the strain in the context of the biogas production process, its genome sequence was established, manually annotated and analyzed including reconstruction of metabolic pathways and comparative examination. Sequencing on the Illumina MiSeq system resulted in 3,174,424 sequence reads corresponding to a 165-fold coverage of the 4.4 Mb genome. The Newbler assembler (version 2.8) was used to assemble obtained reads into 60 (>100 bp) contigs. In silico finishing applying the platform CONSED led to closure of all gaps between contigs and circularization of the genome. The finished chromosomal sequence consists of 4,414,963 bp and has a GC content of 43.63% (Table 1, Fig. 1). Gene prediction resulted in identification of 3396 protein coding sequences (CDS), 48 tRNA genes, and six ribosomal RNA (rrn) operons. Among the CDSs, 58.5% could be classified according to COG categories comprising 20 higher-ranking functional groups (Table 1, Supplementary file 1: Table S1).
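The coverage figure can be related to the read count and genome size through the usual estimate coverage = (number of reads × mean read length) / genome size. The read length is not stated here, so the short Python sketch below inverts the relation to obtain the mean read length implied by the reported numbers; the function names are illustrative.

```python
def fold_coverage(n_reads, mean_read_len_bp, genome_size_bp):
    """Expected fold coverage: total sequenced bases divided by genome size."""
    return n_reads * mean_read_len_bp / genome_size_bp

def implied_mean_read_length(n_reads, coverage, genome_size_bp):
    """Mean read length (bp) implied by a given fold coverage."""
    return coverage * genome_size_bp / n_reads

# Values reported in the text
N_READS = 3_174_424
GENOME_BP = 4_414_963
COVERAGE = 165

mean_len = implied_mean_read_length(N_READS, COVERAGE, GENOME_BP)
print(f"implied mean read length: {mean_len:.1f} bp")
print(f"round trip coverage: {fold_coverage(N_READS, mean_len, GENOME_BP):.0f}x")
```

Under these numbers the implied mean read length is roughly 229 bp. This is a back-calculation from rounded values, not a measured quantity.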

Table 1

General features of the P. saccharofermentans str. M3/6T genome.

Feature	Chromosome
Genome size (bp)	4,414,963
GC content (%)	43.63
Total genes	3450
Protein coding genes	3396
Genes with functional prediction	2385
Genes assigned to COGs (%)	58.5
Genes encoding proteins with signal peptides	472
Genes encoding proteins with transmembrane helices	847
rRNA operons	6
tRNA genes	48

Fig. 1

Genome plot of P. saccharofermentans str. M3/6T. The first outer circle represents the genome scale in kb. The origin of replication was determined based on GC skew analyses. Second and third circle: predicted protein-coding sequences (CDS) on the forward and the reverse strands colored according to the assigned COG categories. Fourth and fifth inner circles represent the GC-content and GC-skew, respectively. Red squares on the outer circle indicate CRISPR-cas systems.

The strain M3/6T genome contains some putative phage genes, but no complete prophage cluster could be identified. Regarding mobile genetic elements (MGE), 89 transposase genes were identified. Moreover, the strain possesses two CRISPR-cas systems which commonly play a role in preventing invasion of phages and mobile genetic elements [50]. No virulence, antibiotic resistance or pathogenicity-related genes were identified in the strain M3/6T genome by IslandViewer [41].
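The Fig. 1 caption states that the origin of replication was placed by GC-skew analysis. The tool and parameters used are not given, so the Python sketch below shows only one common approach as an assumption: the skew (G − C)/(G + C) is summed over consecutive windows, and the position of the cumulative minimum is taken as the candidate origin. The window size and the toy sequence are illustrative.

```python
def cumulative_gc_skew(seq, window=1000):
    """Return [(window_end_position, cumulative_skew), ...] along the sequence."""
    seq = seq.upper()
    total = 0.0
    profile = []
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        g, c = chunk.count("G"), chunk.count("C")
        if g + c:  # avoid division by zero in windows without G or C
            total += (g - c) / (g + c)
        profile.append((start + len(chunk), total))
    return profile

def predicted_origin(seq, window=1000):
    """Position where the cumulative GC skew reaches its minimum."""
    return min(cumulative_gc_skew(seq, window), key=lambda p: p[1])[0]

# Toy sequence: C-rich first half, G-rich second half, so the skew
# switches sign (and the cumulative curve bottoms out) at position 3000.
toy = "C" * 3000 + "G" * 3000
print(predicted_origin(toy, window=1000))
```

On a real chromosome the curve is noisy, so the minimum gives a region rather than an exact coordinate and is typically checked against nearby features such as dnaA.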

Genes involved in complex carbohydrate or protein degradation

A number of species of the phylum Firmicutes, namely of the families Clostridiaceae and Ruminococcaceae, are able to hydrolyze long-chained carbohydrates such as cellulose [51,52]. Most prominent and well-studied is the thermophilic species Clostridium thermocellum, but other members of the genera Clostridium, Acetivibrio, and Ruminococcus adapted to mesophilic or thermophilic process regimes are also involved in cellulose degradation [53]. In some biogas reactors operated at mesophilic temperatures, a higher abundance of members of the phylum Bacteroidetes was also observed, indicating their participation in hydrolysis of organic compounds [54,55]. In contrast, in thermophilic biogas reactors operated at 50–60 °C, Bacteroidetes were not detectable or were detectable only in minor amounts [7]. Interestingly, upon further increase of the process temperature to 65 °C and above, Bacteroidetes members proliferate again, leading to a reduced abundance of Firmicutes [56]. However, most of the Bacteroidetes species detected in biogas reactors are not sufficiently characterized yet to interpret their role in AD. To overcome these limitations, the genome of the newly isolated P. saccharofermentans str. M3/6T was screened for the presence of carbohydrate-active enzymes listed in the CAZy database (www.cazy.org) [57]. According to the CAZy classification scheme, the genome of the strain M3/6T encodes 116 glycoside hydrolases (GH), 25 glycosyl-transferases (GT), seven carbohydrate esterases (CE), two polysaccharide lyases (PL), and seven gene products featuring carbohydrate binding motifs (CBM). A summarizing overview is given in Fig. 2 and Table 2 and in Supplementary file 1 (Tables S2 and S3). Among them, the strain M3/6T encodes a variety of carbohydrate-active enzymes required for degradation of different complex carbohydrates, including pectin, arabinogalactan, hemicellulose (arabinan, xylan, mannan, β-glucans), starch, fructan, and chitin.
A recent study reported that the microorganism showed only weak extracellular enzyme activities against β-glucan, starch, arabinogalactan, xylan, arabinoxylan, CM-cellulose, and phosphoric acid-swollen cellulose [31]. However, it is important to note that different environmental conditions may cause different physiological responses and that the pure-culture conditions applied to determine extracellular enzyme activities did not correspond to the environmental conditions in the biogas reactor from which the strain M3/6T was isolated. Thus, regarding its repertoire of carbohydrate-active enzymes, the potential of the strain to degrade carbohydrates might be more versatile than previously reported. The repertoire of carbohydrate-active enzymes encoded in the genome of strain M3/6T is described below.

Fig. 2

Table 2

Carbohydrate active enzymes (CAZy) encoded in the P. saccharofermentans str. M3/6T genome.

Substrate target	Enzyme name	EC number^a	Family^b
Pectin	Pectinesterase	3.1.1.11	CE8
	Exo-poly-α-d-galacturonosidase	3.2.1.82	GH28
	α-l-rhamnosidase	3.2.1.40	GH33
	α-l-rhamnosidase	3.2.1.40	GH78
	Rhamnogalacturonyl hydrolase	3.2.1.172	GH105
	Pectate lyase superfamily protein		GH28

Arabinogalactan	Arabinogalactan endo-β-1,4-galactanase	3.2.1.89	GH53
Arabinogalactan	Putative Galactan endo-1,6-β-galactosidase	3.2.1.164	GH30

Arabinan (Hemicellulose)	Putative α-l-arabinofuranosidase	3.2.1.55	GH43
	α-l-arabinofuranosidase	3.2.1.55	GH51
	β-l-arabinofuranosidase	3.2.1.185	GH127
	Arabinan endo-1,5-α-l-arabinosidase	3.2.1.99	GH43

Xylan (Hemicellulose)	Xylan 1,4-β-xylosidase	3.2.1.37	GH43
	Xylan 1,3-β-xylosidase	3.2.1.72
	Acetyl xylan esterase	3.1.1.72

Mannan (Hemicellulose)	β-mannanase	3.2.1.78	GH26
Mannan (Hemicellulose)	Putative mannan endo-1,6-α-mannosidase	3.2.1.101	GH76

Lichenan	Licheninase	3.2.1.73	GH16

Cellulose	Cellulase	3.2.1.4	GH5
	Cellulose 1,4-β-cellobiosidase	3.2.1.91	GH9
	β-glucosidase	3.2.1.21	GH3

Chitin	Chitinase	3.2.1.14	GH18
	β-N-acetylhexosaminidase	3.2.1.52	GH20
	N-acetylglucosamine kinase	2.7.1.59
	Putative β-hexosaminidase	3.2.1.52	GH3

Starch	α-amylase	3.2.1.1	GH13
	Putative glucan 1,4-α-glucosidase	3.2.1.3	GH15
	Putative α-amylase	3.2.1.1	GH57
	4-α-glucanotransferase	2.4.1.25
	Glycogen Phosphorylase	2.4.1.1	GT35

Pullulan	Neopullulanase	3.2.1.135	GH13

Fructan	Fructan β-fructosidase	3.2.1.80	GH32
Fructan	β-fructofuranosidase	3.2.1.26	GH32

Enzymes encoded in the P. saccharofermentans str. M3/6T genome were compared against the HMM profile-based database dbCAN using hmmsearch in the HMMER software package. Results were manually curated considering best BLAST hits, alignments, and e-values. In addition, PRIAM profiles were used for classifications.

^a Enzyme Commission number.

^b CAZy families.

Fig. 2. Schematic overview of PUL (Polysaccharide Utilization Loci) predicted in the P. saccharofermentans str. M3/6T genome. To facilitate visualization of gene arrangements, these are colored according to the function of the encoded proteins: SusC (blue), SusD (purple), SusE (light blue), HP (hypothetical protein, gray), GHs (Glycoside Hydrolase, pink), CBM (Carbohydrate-Binding Module, green), PL (Polysaccharide Lyases, light orange), CE (Carbohydrate Esterase, dark orange), TonB (TonB-linked outer membrane protein, light pink), Pept (Peptidase, light green), MFS (Major Facilitator Superfamily, light pink), regulators (AraC, LacI, HTCS, ECF, GntR, Anti-sigma, yellow). Genes that do not encode PUL components were marked with ‘non-PUL genes’ (nPULg, dark brown).

Pectin

Pectin is present in the primary cell wall of higher plants and is responsible for its structural integrity. It is composed of homogalacturonan (HG), rhamnogalacturonan I (RGI) and rhamnogalacturonan II (RGII), all containing d-galacturonic acid in their backbone. The P. saccharofermentans str. M3/6T genome encodes the enzymes pectinesterase (EC 3.1.1.11), polygalacturonase (EC 3.2.1.15), exo-poly-α-d-galacturonosidase (EC 3.2.1.82) and pectate lyases. Pectin degradation starts with the action of pectinesterases, which catalyze deesterification of pectin, followed by polygalacturonases and pectate lyases. Pectic acid is released, which is then broken down by endo- or exo-polygalacturonases producing mono- or di-galacturonate. Genes involved in degradation of RGI, encoding rhamnogalacturonan acetylesterase (EC 3.1.1.86), rhamnogalacturonyl hydrolase (EC 3.2.1.172) and α-l-rhamnosidase (EC 3.2.1.40), were also identified in the genome of strain M3/6T. These enzymes act on RGI and produce rhamnose residues. Furthermore, two genes for polysaccharide lyases (PL), which catalyze random cleavage of pectin, were predicted. One gene encodes a family PL8 enzyme whereas the second one represents family PL11.

Arabinogalactan

Arabinogalactan (AG) is a polysaccharide consisting of galactose and arabinose residues. AGs are associated with pectin in plant cell walls and occur in two structurally different forms denominated as type I and type II. Two genes encoding endo-β-1,4-galactanase (EC 3.2.1.89) responsible for degradation of AG type I were predicted in the P. saccharofermentans str. M3/6T genome. Endo-β-1,4-galactanases act on d-galactosidic linkages in AGs releasing galactotetraose. Moreover, a putative galactan endo-1,6-β-galactosidase (EC 3.2.1.164) was predicted.

Hemicellulose

Hemicellulose is a heteropolymer of the plant cell wall closely associated with cellulose and lignin. It is composed of a variety of polysaccharides derived from sugars such as d-xylose, d-galactose, d-mannose, d-glucose, and l-arabinose. The P. saccharofermentans str. M3/6T genome possesses 47 genes encoding enzymes predicted to be responsible for arabinan, xylan, mannan and β-glucan degradation. Arabinan is a polysaccharide composed of arabinose residues, which are linked by α-1,5 glycosidic bonds in its backbone and α-1,2 and α-1,3 connections in its side chains. The strain M3/6T genome encodes arabinan endo-1,5-α-l-arabinanase (EC 3.2.1.99) that catalyzes the endo-hydrolysis of arabinan into arabinan oligosaccharides, which are further decomposed to l-arabinose residues by α-l-arabinofuranosidase (EC 3.2.1.55). Genes encoding β-l-arabinofuranosidase (EC 3.2.1.185) are also present in the genome. The enzyme hydrolyzes side chain α-(1,2) glycosidic bonds. Xylan is composed of d-xylose residues. Complete hydrolysis of xylan requires the synergistic action of GH and CE. In strain M3/6T, three of seven enzymes commonly described as enzymes needed for complete xylan hydrolysis were predicted. Acetyl xylan esterase (EC 3.1.1.72) catalyzes release of acetyl groups from polymeric xylan and two other enzymes successively cleave xylose residues from the non-reducing termini of xylan molecules: xylan 1,4-β-xylosidase (EC 3.2.1.37) and xylan 1,3-β-xylosidase (EC 3.2.1.72). Mannans are polysaccharides composed of mannose, glucose and galactose residues. The strain M3/6T genome harbors genes encoding β-mannanases (EC 3.2.1.78) that hydrolyze linear mannan by randomly cleaving its backbone, releasing (1,4)-β-manno-oligomers. The strain may also catalyze cleavage of α-(1,6) mannosidic bonds in unbranched mannan oligosaccharides, producing mannose residues, by a putative endo-1,6-α-mannosidase (EC 3.2.1.101).
β-Glucans are polysaccharides consisting of repeated glucose residues linked by β-1,3 and β-1,4 glycosidic bonds. Hydrolysis of β-glucans is mainly carried out by four types of β-glucanases: β-1,3(4)-glucanase (EC 3.2.1.6), laminarinase (EC 3.2.1.39), cellulase (EC 3.2.1.4), and lichenase (EC 3.2.1.73). The strain M3/6T genome possesses genetic determinants encoding a cellulase and a lichenase. Lichenases catalyze cleavage of β-1,4-d-glucosidic bonds of β-glucans yielding shorter glucans. Subsequently, the resulting glucans can be hydrolyzed by a cellulase releasing glucose units.

Cellulose

Cellulose is the major polysaccharide of plant cell walls consisting of repeated d-glucose residues linked by β-1,4 glycosidic bonds. The P. saccharofermentans str. M3/6T genome encodes an endoglucanase (EC 3.2.1.4) that catalyzes the cleavage of β-(1-4) glucosidic bonds releasing cellodextrin, which can be further hydrolyzed by 1,4-β-cellobiosidase (EC 3.2.1.91) producing cellobiose. Subsequently, the latter disaccharide is cleaved into glucose monomers by a β-glucosidase (EC 3.2.1.21).

Chitin

Chitin is frequently considered to represent the second most abundant polysaccharide in nature after cellulose. It is present in several organisms such as shrimps, crabs and insects as well as in the cell walls of fungi, yeast and green algae. The polysaccharide is composed of repeating units of β-(1,4)-linked N-acetyl-β-d-glucosamine. Even if chitin is not present in biogas reactors in high amounts, a total of nine genes encoding putative chitinolytic enzymes were identified in P. saccharofermentans str. M3/6T, including a chitinase (EC 3.2.1.14), β-N-acetyl-hexosaminidase (EC 3.2.1.52) and N-acetyl-glucosamine kinase (EC 2.7.1.59). Chitinases act on N-acetyl-β-d-glucosamine in chitin yielding chitodextrins, which can be further hydrolyzed by chitinases releasing chitotriose. Subsequently, chitotriose is hydrolyzed to N,N’-diacetylchitobiose by β-N-acetyl-hexosaminidase, producing N-acetyl-d-glucosamine.

Starch

Starch is the main storage polysaccharide of higher plants. It is a mixture of two polysaccharides, namely α-amylose and amylopectin. α-Amylose consists of several thousand glucose residues commonly connected via α-(1,4) glycosidic bonds. Amylopectin, on the other hand, is made up of hundreds of shorter linear chains (α-1,4-glucan) with branches of α-(1,6) glycosidic bonds occurring every 24–30 glucose residues. CAZy analyses of the P. saccharofermentans str. M3/6T genome revealed seven genes encoding enzymes involved in starch degradation, including α-amylase (EC 3.2.1.1), a putative glucan 1,4-α-glucosidase (EC 3.2.1.3), 4-α-glucanotransferase (EC 2.4.1.25), and glycogen phosphorylase (EC 2.4.1.1). The α-amylase acts on the inner part of the starch chain cleaving α-(1,4) glycosidic linkages to release maltose or maltodextrin. The α-(1,4) and α-(1,6) glycosidic bonds of the external glucose residues of the polysaccharide are cleaved by glucan 1,4-α-glucosidase yielding glucose. 4-α-Glucanotransferase cleaves and transfers a 1,4 glycosidic bond present in maltose to a new position in an acceptor molecule releasing a glucose residue. Finally, α-1,4-glycosidic bonds of maltodextrin are cleaved by glycogen phosphorylase to release glucose residues.

Pullulan

Pullulan is a linear polysaccharide frequently described as repeating unit of maltotriose linked by α-(1,6) glycosidic bonds. Nevertheless, also α-(1,4) linkages can occur. Pullulan is one of the exo-polysaccharides produced by the yeast-like fungus Aureobasidium pullulans [58,59]. Pullulanases are classified according to their substrate specificities and reaction products. There are five groups: pullulanases type I (PULI), pullulanases type II (PULII), amylopullulanase, isopullulanase and neopullulanase [60]. A gene encoding neopullulanase (EC 3.2.1.135) was identified in P. saccharofermentans str. M3/6T. The enzyme catalyzes the break-down of α-(1,4) glycosidic linkages in pullulan to produce maltose, glucose and panose as main end-products [61].

Fructan

Fructan is a polysaccharide composed of fructose residues linked by β-(2,1)- or β-(2,6)-glycosidic bonds. Fructan is classified in three types: inulin (linear structure linked by β-(2,1)-glycosidic bonds), levan (linear structure linked by β-(2,6)-glycosidic bonds) and graminan (branched structure linked by β-(2,1)- and β-(2,6)-glycosidic bonds). Analyses indicated that the P. saccharofermentans str. M3/6T genome encodes two enzymes involved in fructan degradation, namely fructan β-fructosidase (EC 3.2.1.80) and β-fructofuranosidase (EC 3.2.1.26). Fructan β-fructosidase is able to cleave β-(2,1)- and β-(2,6)-glycosidic bonds in inulin and levan at the non-reducing end releasing sucrose (or fructose residues). The disaccharide sucrose can be hydrolyzed into fructose and glucose by β-fructofuranosidase.

Moreover, thirty-two other genes of strain M3/6T were predicted to encode glycoside hydrolases. These enzymes could not be further specified.

With respect to the increasing demand on microbial capabilities regarding the degradation of organic substrates rich in protein components, the potential of P. saccharofermentans str. M3/6T to decompose proteins was also examined based on its functional genome annotation. The proteolytic system of strain M3/6T consists of intracellular and extracellular proteases and transporters (Supplementary file 1: Table S4). The intracellular proteolytic system is composed of Lon- and Clp serine proteases and prolyl oligo-peptidases, which play an essential role in degrading abnormal proteins or proteins participating in regulatory processes. Among the extracellular proteases, strain M3/6T encodes aminopeptidases, which are exopeptidases acting on N-terminal amino acid residues of small oligopeptides or proteins. These findings are in accordance with the earlier observation that peptone supports anaerobic growth of P. saccharofermentans str. M3/6T [31].

Polysaccharide utilization loci (PUL)

Polysaccharide utilization loci (PUL) were identified in the genomes of Bacteroidetes members recovered from rumen or human gut samples [[62], [63], [64], [65]]. Commonly, the genes susD and susC are central elements of PUL clusters. They encode an outer membrane glycan binding protein (SusD) and an outer membrane protein (SusC) involved in transfer of the maltooligosaccharide to the periplasmic space for complete degradation [62,63,65,66]. Genes coding for carbohydrate-active enzymes and transcriptional regulators (e.g. HTCS, ECF-σ/Anti-σ regulators, SusR, AraC, GntR and LacI) are frequently located in close proximity to PUL clusters. Corresponding gene products contribute to the transcriptional regulation and functionality of the multi-component complex [67]. In P. saccharofermentans str. M3/6T, 64 distinct PUL were identified (Fig. 2). Among them, 38 PUL comprise 115 out of 116 GH genes predicted in the genome of the strain. GHs encoded in PUL represent 33 different CAZyme families. The most frequent ones are GH43 and GH2, indicating that corresponding PUL are mainly dedicated to xylan degradation and metabolism of galactose, respectively. Using Gramella flava JLKT2011 as an example, a recent study exploiting multi-omics data confirmed involvement of PUL and associated GH genes in xylan and homogalacturonan utilization [68]. PUL of P. saccharofermentans str. M3/6T are also associated with PL, CE and peptidase genes. Likewise, PUL of Prevotella species are linked to peptidase genes, suggesting that the encoded complex is involved in protein degradation and peptide transfer across the membrane [64]. Moreover, 25 PUL of strain M3/6T are linked to transcriptional regulator genes, indicating that the encoded regulators are involved in control of PUL expression. Overall, the strain M3/6T seems to be capable of utilizing a wide spectrum of complex polysaccharides and proteins.
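Because susC and susD are described above as the central elements of PUL clusters, a simple way to seed PUL candidates is to scan an ordered gene list for adjacent susC/susD pairs on the same strand. The Python sketch below illustrates that heuristic only. It is an assumption for illustration, not the procedure used in this study, and the locus tags and annotations are hypothetical.

```python
def find_susCD_pairs(genes):
    """Find adjacent SusC/SusD gene pairs encoded on the same strand.

    genes: list of (locus_tag, strand, product) tuples in chromosomal order.
    Returns a list of (first_tag, second_tag) tuples.
    """
    pairs = []
    for (tag_a, strand_a, prod_a), (tag_b, strand_b, prod_b) in zip(genes, genes[1:]):
        if strand_a != strand_b:
            continue  # a tandem pair is expected on one strand
        if {prod_a, prod_b} == {"SusC", "SusD"}:
            pairs.append((tag_a, tag_b))
    return pairs

# Hypothetical gene order used only to exercise the function
toy_genes = [
    ("g1", "+", "GH43"),
    ("g2", "+", "SusC"),
    ("g3", "+", "SusD"),
    ("g4", "+", "GH2"),
    ("g5", "-", "SusD"),
    ("g6", "-", "SusC"),
    ("g7", "+", "SusC"),
    ("g8", "-", "SusD"),
]
pul_seeds = find_susCD_pairs(toy_genes)
print(pul_seeds)
```

Each seed pair would then be extended to neighbouring CAZyme, peptidase and regulator genes to delineate a full locus, a step that in practice relies on manual inspection.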

Putative fermentation metabolic pathways

P. saccharofermentans str. M3/6T encodes a versatile set of carbohydrate-active enzymes indicating that the strain is able to generate energy by fermentation of sugar molecules. To obtain deeper insights into fermentation pathways, the central metabolism of strain M3/6T was reconstructed from genome sequence data. In this context, the corresponding transport mechanisms were also considered. In total, 225 genes encoding transporters for amino acids, peptides, monosaccharides, inorganic and metal ions are present in the genome of strain M3/6T. A summary on monosaccharide, amino acid and peptide transporters is given in Supplementary file 2 (Table S1). Five different transporter families encoded in the genome may be involved in uptake of xylose, fucose, arabinose, and maltose. Four other transporter families were predicted to facilitate peptide and/or amino acid transport into the cell. P. saccharofermentans strain M3/6T is able to ferment a variety of mono- and disaccharides producing carbon dioxide, hydrogen, acetate, formate, propionate and isovalerate as end-products of the fermentation process [31]. Glucose, arabinose, cellobiose, fructose, galactose, lactose, maltose, rhamnose, sucrose, and trehalose were among the substrates that supported growth of strain M3/6T. Except for rhamnose and fucose degradation, all genes encoding relevant enzymes involved in utilization of these simple sugar molecules were predicted (Fig. 3 and Supplementary file 1: Table S5). Moreover, metabolic reconstruction also indicated that strain M3/6T is able to degrade xylose, lyxose, mannose, and melibiose as well as some amino acids (e.g. proline, alanine, asparagine). The majority of metabolites of corresponding degradation pathways are converted to pyruvate, and subsequently yield fermentation end-products (see above and Supplementary file 1: Table S5).
Hence, the strain M3/6T is able to produce acids from sugar fermentation suggesting that its function in the biogas production process is associated with the acidogenic phase.

Fig. 3

Schematic overview of carbohydrate and protein degradation pathways based on enzymes predicted from the P. saccharofermentans str. M3/6T genome. Intracellular and extracellular reactions are separated by the cell envelope. Names in blue denote the hemicellulose polysaccharides. The following intracellular metabolic pathways are shown: glycolysis, pentose phosphate pathway, central pyruvate metabolism and the tricarboxylic acid cycle (TCA). Arrows symbolize enzymatic reactions. Crossed red arrows mark enzymatic reactions for which corresponding enzymes were not predicted in the P. saccharofermentans str. M3/6T genome. Green metabolites represent fermentation pathway end-products (acetate, propionate, isovalerate, formate, and molecular hydrogen). Abbreviations: Asp, aspartic acid; Gln, glutamine; His, histidine; Ile, isoleucine; Leu, leucine; Thr, threonine; Val, valine; OPT, Oligopeptide Transporter Family proteins; ABC, ABC transporter; SSS, Solute-Sodium Symporter Family proteins; MFS, Major Facilitator Superfamily proteins; NP, no prediction.

P. saccharofermentans M3/6T harbors two CRISPR-cas systems

Bacteriophages and viruses of micro-eukaryotes are considered to represent the most abundant and genetically diverse group on earth [69–71]. Viral infections are responsible for lysis of microbial cells, thereby affecting microbial community structures [70,71]. A recent study indicated that phages can affect biogas microbial communities, thus causing profound effects on the performance of the anaerobic digestion process [72]. The results suggested that more than 40% of the variation within biogas community compositions might be caused by phages. Clustered regularly interspaced short palindromic repeats (CRISPR-cas) systems are considered to represent prokaryotic immune systems against foreign DNA molecules and, in some cases, also against RNA molecules [73–76]. Briefly, they are composed of a cas operon followed by a leader sequence and arrays of highly conserved short repeat sequences, which are separated by variable sequences (called ‘spacers’) originating from phage or plasmid DNA. Moreover, cas operons are classified into three major types and subtypes according to their architecture [76,77]. P. saccharofermentans strain M3/6T harbors two CRISPR-cas systems (Fig. 4 and Supplementary file S2). The first CRISPR-cas system was classified as type II since it comprises the cas9 gene near cas1 and cas2. A predicted leader region (promoter region) is located downstream of the cas2 gene and features an AT-content of 63.3%. The CRISPR array is composed of 39 direct repeats of 47 bp and 38 spacers of 29–30 bp. The second CRISPR-cas system, belonging to subtype I-B, is composed of an operon of eight genes, a leader region (211 bp featuring an AT-content of 66.98%), and a large CRISPR array of 114 direct repeats of 29 bp and 113 spacers of 34–39 bp (Fig. 4).
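The repeat and spacer counts given above can be checked against the array lengths stated in the Fig. 4 caption (2967 bp and 7307 bp). The short Python sketch below assumes that an array consists only of n repeats alternating with n − 1 spacers; the function name `array_length_range` is introduced here purely for illustration.

```python
from typing import Tuple


def array_length_range(n_repeats: int, repeat_len: int,
                       spacer_min: int, spacer_max: int) -> Tuple[int, int]:
    """Minimum and maximum array length (bp) for n repeats and n - 1 spacers."""
    n_spacers = n_repeats - 1
    repeats_bp = n_repeats * repeat_len
    return (repeats_bp + n_spacers * spacer_min,
            repeats_bp + n_spacers * spacer_max)


# Values taken from the text; reported lengths taken from the Fig. 4 caption.
type_ii = array_length_range(39, 47, 29, 30)
type_ib = array_length_range(114, 29, 34, 39)

for name, (lo, hi), reported in [("type II", type_ii, 2967),
                                 ("type I-B", type_ib, 7307)]:
    consistent = lo <= reported <= hi
    print(f"{name}: expected {lo}-{hi} bp, reported {reported} bp, "
          f"consistent={consistent}")
```

Both reported array lengths fall within the ranges implied by the repeat and spacer sizes (2935–2973 bp and 7148–7713 bp, respectively).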

Fig. 4

CRISPR-cas systems of P. saccharofermentans str. M3/6T. CRISPR-cas system type II-C: cas operon, eight genes (7460 bp); Leader, 109 bp; CRISPR array (2967 bp), 39 direct repeats of 47 bp and 38 spacers of 29–30 bp. CRISPR-cas system type I-B: cas operon, eight genes (9131 bp); Leader, 211 bp; CRISPR array (7307 bp), 114 direct repeats of 29 bp and 113 spacers of 34–39 bp. Coding sequences (CDS) colored in grey encode hypothetical proteins. The coordinates and sequences of the CRISPR-cas systems are provided as Supplemental material (File S2).

After phage infection, the host retains genome signatures from phages as spacer sequences in its CRISPR arrays. These can be used to identify the corresponding phages by database searches. All spacer sequences located in the P. saccharofermentans str. M3/6T CRISPR arrays were compared against all phage genomes in the NCBI nucleotide sequence database applying the BLASTn-short function (BLAST+ package) with an e-value threshold of 1.0 × 10^-10. Spacer sequences could not be assigned to any known bacteriophage genome stored in the database, indicating that phage diversity is not sufficiently covered in the NCBI database.
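A spacer search of this kind could be set up roughly as follows. The Python sketch only assembles a BLAST+ command line using the `blastn-short` task and the e-value threshold stated above, and parses tabular output; the file names, the database name `phage_genomes` and both function names are assumptions for illustration and do not reproduce the exact invocation used in this study.

```python
import shlex
from typing import Iterable, List, Set


def blastn_short_cmd(query_fasta: str, db: str, evalue: float = 1e-10,
                     out: str = "spacer_hits.tsv") -> List[str]:
    """Build a blastn command for short queries such as CRISPR spacers."""
    return [
        "blastn",
        "-task", "blastn-short",   # parameter preset for short query sequences
        "-query", query_fasta,
        "-db", db,
        "-evalue", str(evalue),
        "-outfmt", "6",            # tabular output, one hit per line
        "-out", out,
    ]


def spacers_with_hits(tabular_lines: Iterable[str]) -> Set[str]:
    """Collect query IDs (first column of -outfmt 6) that have at least one hit."""
    hits = set()
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        hits.add(line.split("\t")[0])
    return hits


# Hypothetical file and database names.
cmd = blastn_short_cmd("spacers.fasta", "phage_genomes")
print(shlex.join(cmd))
# To execute (requires BLAST+ and a formatted database):
#   subprocess.run(cmd, check=True)
```

An empty result file, for which `spacers_with_hits` returns an empty set, would correspond to the outcome reported here: no spacer matched a known phage genome.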

Comparative genomic analyses

The P. saccharofermentans str. M3/6T genome was compared to that of its closest described relative, P. acetatigenes str. DSM 18083T, which is available in the NCBI nucleotide sequence database. According to the Integrated Microbial Genomes pipeline (IMG/EM) [78], the strain DSM 18083T is able to degrade the same types of complex carbohydrates and mono- and disaccharides as predicted for strain M3/6T. Genes for utilization of pectin, xylan and pullulan are not present in strain DSM 18083T. Automatic IMG/EM annotation indicated that strain DSM 18083T possesses 90 PUL gene clusters compared to 64 that are present in strain M3/6T. However, both strains are able to degrade β-glycan compounds. The comparative genomics tool EDGAR [44] was applied to calculate the shared core-genome between P. saccharofermentans str. M3/6T and P. acetatigenes str. DSM 18083T. The latter strain was isolated from a reactor treating brewery wastewater (Leibniz-Institute DSMZ GmbH, Braunschweig, Germany), while P. saccharofermentans str. M3/6T originates from a biogas reactor. Therefore, both strains were isolated from biotechnological environments. Nothing is known about the occurrence and dissemination of these strains in natural habitats. Moreover, P. acetatigenes str. DSM 18083T and P. saccharofermentans str. M3/6T represent the only type strains of the genus Proteiniphilum. Therefore, comparative genomic analyses are limited. It appeared that both strains share 2482 orthologous genes, representing 73.5% and 68.2% of all genes predicted for strains M3/6T and DSM 18083T, respectively. Moreover, these strains feature an Average Nucleotide Identity (ANI) value of 91.4%, confirming that these bacteria belong to different species [79]. To complement genomic analyses regarding genome architectures, a synteny plot was calculated. As illustrated in Fig. S1 (Supplementary file S1), there are considerable rearrangements, deletions and insertions within the strain M3/6T genome compared to that of strain DSM 18083T, suggesting that both strains have undergone diverging evolution for a longer period of time, probably promoting adaptation to their specific ecological niches. However, these adaptations cannot be completely evaluated due to missing metadata on natural habitats of these Proteiniphilum species.
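The figures quoted above can be related to each other with simple arithmetic. The Python sketch below back-calculates the gene totals implied by the shared-gene percentages and compares the ANI value with a species cutoff. The cutoff of 95% is an assumption (a commonly used value of roughly 95–96%) and is not stated in this section; because the percentages are rounded, the back-calculated totals are approximate and need not equal gene counts reported elsewhere in the paper.

```python
SHARED_ORTHOLOGS = 2482          # from the text
PCT_M3_6 = 73.5                  # share of strain M3/6T genes, from the text
PCT_DSM_18083 = 68.2             # share of strain DSM 18083T genes, from the text
ANI = 91.4                       # percent, from the text
ANI_SPECIES_CUTOFF = 95.0        # assumption: commonly used species threshold


def implied_total(shared: int, percent: float) -> int:
    """Gene total implied by a shared-gene count and its rounded percentage."""
    return round(shared / (percent / 100.0))


total_m3_6 = implied_total(SHARED_ORTHOLOGS, PCT_M3_6)
total_dsm = implied_total(SHARED_ORTHOLOGS, PCT_DSM_18083)
same_species = ANI >= ANI_SPECIES_CUTOFF

print(f"Implied gene total, M3/6T: ~{total_m3_6}")
print(f"Implied gene total, DSM 18083T: ~{total_dsm}")
print(f"ANI {ANI}% >= {ANI_SPECIES_CUTOFF}%: {same_species}")
```

Under this assumed cutoff the ANI of 91.4% falls clearly below the species boundary, in line with the conclusion drawn in the text.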

Concluding remarks

Proteiniphilum saccharofermentans str. M3/6T features a versatile metabolism contributing to substrate hydrolysis and acidogenesis within the biogas process, as revealed by systematic genome analyses. The strain possesses a set of genes encoding carbohydrate-active enzymes needed for decomposition of pectin, arabinogalactan, hemicellulose (arabinan, xylan, mannan, β-glucans), cellulose, starch, fructan, chitin and pullulan. Most of these genes, as well as peptidase genes, are associated with PUL, suggesting their involvement in degradation of a broad spectrum of substrates, especially polysaccharides. Metabolites of carbohydrate decomposition feed fermentation pathways, finally yielding volatile organic acids. Therefore, this strain is a promising candidate for the development of inoculant cultures for enhanced biomass decomposition, for bioaugmentation processes and for production of volatile organic acids in anaerobic digestion, and it may also contribute to increased biogas yields. Finally, two CRISPR-cas systems were detected in the genome of strain M3/6T, indicating that the bacterium possesses protection mechanisms against invasive foreign DNA elements (e.g., phages or plasmids). CRISPR-spacer sequences of M3/6T do not match sequences within databases, indicating that Proteiniphilum phages and/or plasmids are not well represented in the corresponding databases. However, the presence of a CRISPR-cas ‘immune system’ may render the strain insensitive to infections by specific bacteriophages, which also has implications for the performance of the strain in anaerobic digestion processes.

Competing interests

The authors declare that they do not have any competing interests.
