Decoding Biomass-Sensing Regulons of Clostridium thermocellum Alternative Sigma-I Factors in a Heterologous Bacillus subtilis Host System.

Iván Muñoz-Gutiérrez, Lizett Ortiz de Ora, Inna Rozman Grinberg, Yuval Garty, Edward A Bayer, Yuval Shoham, Raphael Lamed, Ilya Borovok.

Abstract

The Gram-positive, anaerobic, cellulolytic, thermophile Clostridium (Ruminiclostridium) thermocellum secretes a multi-enzyme system called the cellulosome to solubilize plant cell wall polysaccharides. During the saccharolytic process, the enzymatic composition of the cellulosome is modulated according to the type of polysaccharide(s) present in the environment. C. thermocellum has a set of eight alternative RNA polymerase sigma (σ) factors that are activated in response to extracellular polysaccharides and share sequence similarity to the Bacillus subtilis σI factor. The aim of the present work was to demonstrate whether individual C. thermocellum σI-like factors regulate specific cellulosomal genes, focusing on C. thermocellum σI6 and σI3 factors. To search for putative σI6- and σI3-dependent promoters, bioinformatic analysis of the upstream regions of the cellulosomal genes was performed. Because of the limited genetic tools available for C. thermocellum, the functionality of the predicted σI6- and σI3-dependent promoters was studied in B. subtilis as a heterologous host. This system enabled observation of the activation of 10 predicted σI6-dependent promoters associated with the C. thermocellum genes: sigI6 (itself, Clo1313_2778), xyn11B (Clo1313_0522), xyn10D (Clo1313_0177), xyn10Z (Clo1313_2635), xyn10Y (Clo1313_1305), cel9V (Clo1313_0349), cseP (Clo1313_2188), sigI1 (Clo1313_2174), cipA (Clo1313_0627), and rsgI5 (Clo1313_0985). Additionally, we observed the activation of 4 predicted σI3-dependent promoters associated with the C. thermocellum genes: sigI3 (itself, Clo1313_1911), pl11 (Clo1313_1983), ce12 (Clo1313_0693) and cipA. Our results suggest possible regulons of σI6 and σI3 in C. thermocellum, as well as the σI6 and σI3 promoter consensus sequences. The proposed -35 and -10 promoter consensus elements of σI6 are CNNAAA and CGAA, respectively. 
Additionally, a less conserved CGA sequence next to the C in the -35 element and a highly conserved AT sequence three bases downstream of the -10 element were also identified as important nucleotides for promoter recognition. Regarding σI3, the proposed -35 and -10 promoter consensus elements are CCCYYAAA and CGWA, respectively. The present study provides new clues for understanding these recently discovered alternative σI factors.

Year:  2016        PMID: 26731480      PMCID: PMC4711584          DOI: 10.1371/journal.pone.0146316

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

Clostridium (Ruminiclostridium) thermocellum is a Gram-positive, anaerobic, cellulolytic thermophile that produces one of the most efficient enzymatic systems to digest cellulose [1]. The cellulolytic capacities of C. thermocellum have been the subject of study for many years [2], and the main motivation in these efforts has been the production of high-value products, such as ethanol, from cellulosic wastes [3]. To solubilize such carbohydrates, C. thermocellum secretes a multi-enzyme complex termed the cellulosome that is anchored to the cell surface [4,5]. Although during the exponential phase of growth most of the cellulosomes are cell-associated, part of them are released from the cells into the milieu [4,6,7]. The C. thermocellum cellulosome consists of a nonhydrolytic scaffoldin subunit CipA that integrates various catalytic subunits into the complex [8,9]. Depending on the C. thermocellum strain, the scaffoldin can attach 8 or 9 catalytic subunits; e.g., the CipA scaffoldin of strain DSM 1313 attaches 8 catalytic subunits, whereas that of ATCC 27405 attaches 9 catalytic subunits [10]. Additionally, the scaffoldin subunit has a family 3 carbohydrate-binding module (CBM3) that binds the cellulosome to cellulose [8,11]. C. thermocellum can express over 80 different cellulosomal components encoded in its genome, which include an arsenal of different saccharolytic enzymes, such as, cellulases, hemicellulases, pectin-degrading enzymes and a chitinase [12,13]. This battery of enzymes helps C. thermocellum to unwrap its preferred substrate, cellulose, that is covered with different types of polysaccharides in the plant cell wall [5]. During the saccharolytic process, the enzymatic content of the cellulosome is adjusted to suit the type of polysaccharide present in the biomass [14-16]. Hence, C. 
thermocellum should possess biomass-sensing mechanisms that allow the cells to detect which polysaccharide(s) is(are) present in the environment and to regulate the relevant genes according to the enzymatic requirements. At present, however, the regulation of cellulosomal genes is poorly understood. During the course of our efforts to gain knowledge about the biomass-sensing mechanisms in C. thermocellum, our research group discovered a collection of eight alternative σ factors and their cognate membrane-associated anti-σ factors that may play a role in regulating genes encoding cellulosomal enzymes and other proteins [17]. In the C. thermocellum genome, these alternative σ factor genes are positioned adjacent to their anti-σ factor genes in an operon-like organization [17]. This set of eight alternative σ factors (C. thermocellum σI1 to σI8) is related to the B. subtilis σI [17,18], and the expression of six of them (σI1 to σI6) was shown to be influenced by the presence of polysaccharides (e.g., cellulose and xylan) in the growth medium [19]. Furthermore, a recent study performed by Wei and colleagues [16] showed that the C. thermocellum sigI3-rsgI3, sigI4-rsgI4 and sigI7-rsgI7 operons were up-regulated when the bacterium was grown on dilute acid-pretreated yellow poplar. Additionally, in vitro experiments showed that σI1 directed transcription from the sigI1 promoter and from the promoter of the gene cel48S [19], which encodes the most abundant cellulosomal enzyme, Cel48S [12,20]. The C. thermocellum anti-σI factors of σI1 to σI6 (RsgI1 to RsgI6) comprise three domains: (I) a C-terminal carbohydrate-binding module (CBM) localized on the outer cell surface, (II) an internal transmembrane/wall-spanning segment, and (III) an N-terminal cytoplasmic portion (RsgI-N) which would bind the cognate σI factor [17,18]. The N-terminal segments (~165 residues) of the C. thermocellum RsgI proteins resemble B. subtilis RsgI, a negative regulator of its cognate σI factor [17,18]. 
Moreover, the binding capacities of the N-terminal cytoplasmic portions of RsgI1, RsgI2 and RsgI6 to their corresponding σI factors were demonstrated in vitro [19]. The C-terminal domains of the RsgIs showed binding capacities to different polysaccharides, including cellulose (RsgI1, RsgI2, RsgI4 and RsgI6), xylan (RsgI6), and pectin (RsgI3) [17,21,22]. Additionally, the crystal structures of the C-terminal CBMs of RsgI1, RsgI2 and RsgI4 were solved, showing a high degree of similarity to the family 3 CBMs [22]. In the case of RsgI3, its C-terminal CBM is constituted by two tandem PA14-superfamily motifs (pfam07691, smart00758) that are found in a wide variety of other bacterial and eukaryotic proteins, which include the anthrax protective antigen (PA) [23], and the PA14 modular dyad was predicted to be a putative CBM by virtue of its binding to pectin-like polysaccharides [17]. Interestingly, the C-terminal domain of RsgI6 belongs to the glycoside hydrolase family 10 (GH10); however, its catalytic activity was shown to be very low [17,21]. Nevertheless, RsgI6-GH10 retains its binding capacity to its corresponding carbohydrates, suggesting an evolutionary adaptation to function as a polysaccharide-binding domain rather than an authentic enzymatic component [21]. The multiple C. thermocellum alternative σI factors resemble to some extent the ECF (extracytoplasmic function) σ factors [24-26], since they share common characteristics which include the following: (I) both kinds of σ factors autoregulate their own expression; (II) both kinds of σ factors are usually co-transcribed with another ORF encoding a transmembrane anti-σ factor that controls the activity of its cognate σ factor; (III) the anti-σ factor is composed of an extracytoplasmic sensory domain and an intracellular inhibitory domain that binds the σ factor; (IV) the activity of the σ factor is induced by inhibiting the activity of the anti-σ factor [18,25]. 
We assume that the main difference between σI-like factors and ECF σ factors is related to their "architectures" (Muñoz-Gutierrez et al, unpublished). While the ECF σ factors are formed with only two of the four domains of the σ70 family of proteins (σ2 and σ4) [25,26], the σI-like factors have only one predictable functional domain associated with the amino-terminal sequence, σ2, and the sigma domain σ4 is absent [17]. In lieu of the sigma domain σ4, the σI factors contain a novel 100-residue conserved C-terminal domain termed σI-C [17], that might serve to recognize -35 sequences of the σI promoters. Until now, the knowledge we have regarding the regulation of cellulosomal genes by C. thermocellum σI-like factors is a recent report of Sand and co-workers [27] which showed that the xylanase genes xyn10Z (or Clo1313_2635 according to the DSM 1313 genome annotation), xyn11B (Clo1313_0522) and xyn10D (Clo1313_0177) were under the control of σI6. Previously, Nataf and co-workers [19] showed that the cellulase gene celS (cel48S) was likely under the control of σI1. Therefore, the present work was devoted to demonstrating whether individual C. thermocellum σI factors regulate specific cellulosomal target genes. Taking advantage of the fact that the transcription start sites of C. thermocellum sigI6, xyn10Z and xyn11B were mapped previously in our research group [19,27], we performed a bioinformatics analysis to identify σI-dependent promoters in the genome of C. thermocellum DSM 1313. This analysis allowed us to identify 40 possible σI-dependent promoters upstream of the sigI-like genes and certain cellulosomal genes of C. thermocellum. To corroborate the functionality of the 40 predicted promoters, we fused their DNA sequences to a promoterless lacZ reporter gene. To overcome the lack of genetic tools in C. thermocellum, we used a B. subtilis Δ(sigI-rsgI) strain as a heterologous host and studied the activation of the 40 predicted promoters by C. 
thermocellum σI6 and σI3. This strategy allowed us to show that C. thermocellum σI6 could recognize the predicted promoters associated with sigI6, sigI1, rsgI5, xyn11B, xyn10D, xyn10Z, xyn10Y, cel9V, cseP and the major scaffoldin cipA. Additionally, C. thermocellum σI3 could recognize the predicted promoters detected upstream of sigI3, pl11 (Clo1313_1983, which encodes a family 11 polysaccharide lyase (PL11) containing a CBM35 and a dockerin), ce12 (Clo1313_0693, which encodes a protein that contains two family 12 carbohydrate esterase (CE12) modules, a CBM35 and a dockerin) and cipA. The combination of these methodologies revealed putative C. thermocellum σI6 and σI3 promoter consensus sequences. Our results show that the C. thermocellum σI6 and σI3 factors expressed in B. subtilis can recognize their potential promoters, supporting our hypothesis that the multiple C. thermocellum σI-like factors might regulate cellulosomal genes.

Materials and Methods

Bacterial strains, growth media and culture conditions

C. thermocellum strain DSM 1313 (LQ8) was obtained from the DSMZ (German Collection of Microorganisms and Cell Cultures, Braunschweig, Germany). The B. subtilis strains constructed in this work are isogenic derivatives of the B. subtilis strain PY79 (laboratory stock) [28]. Additional information regarding all derivatives of B. subtilis PY79 that were constructed in this work is shown in S3 Table. B. subtilis BKE13460 was obtained from the BGSC (Bacillus Genetic Stock Center, Ohio, USA). Escherichia coli DH5α (BioSuper Competent Cells, Bio-Lab Ltd, Jerusalem, Israel) was used for plasmid propagation during plasmid construction. E. coli and B. subtilis were grown routinely at 37°C in liquid LB broth (at 250 rpm) or on solid LB agar (Lennox, Difco, BD Diagnostics, Maryland, USA). During β-galactosidase activity assays, B. subtilis was grown in Spizizen's minimal medium (SMM) employing 5 g/L fructose as carbon source and supplemented with trace elements. The SMM contained (per liter) 2 g (NH4)2SO4, 14 g K2HPO4, 6 g KH2PO4, 1 g Na3Citrate·2H2O, and 0.2 g MgSO4·7H2O. The trace elements used were (per liter) 125 mg MgCl2·6H2O, 5.5 mg CaCl2, 13.5 mg FeCl2·6H2O, 1 mg MnCl2·4H2O, 1.7 mg ZnCl2, 0.43 mg CuCl2·2H2O, 0.6 mg CoCl2·6H2O, and 0.6 mg Na2MoO4·2H2O. When appropriate, antibiotics were included at the following final concentrations: 100 μg/mL ampicillin (Amp), 50 μg/mL kanamycin (Kan), 100 μg/mL spectinomycin (Spt), 5 μg/mL chloramphenicol (Cam) or 3 μg/mL erythromycin (Erm). The induction of genes under the xylose-inducible promoter was carried out with D-xylose (10 g/L final). All chemicals were purchased from Sigma-Aldrich (Missouri, USA).

DNA manipulation techniques

The oligonucleotide primers used in the present study are shown in S1 Table. Standard procedures were employed for DNA isolation, polymerase chain reaction (PCR), restriction-enzyme digestion, dephosphorylation, transformations, and gel electrophoresis as described elsewhere [29]. Plasmids were built using a combination of standard molecular cloning techniques [29] and ligase-independent cloning using the In-Fusion HD Cloning Kit (Clontech Laboratories, Inc., California, USA). C. thermocellum DNA sequences were PCR-amplified using C. thermocellum DSM 1313 genomic DNA as template. The upstream and downstream regions of the B. subtilis sigI-rsgI operon were PCR-amplified using B. subtilis PY79 genomic DNA as template. The lox71-erm-lox66 cassette was PCR-amplified using B. subtilis BKE13460 genomic DNA as template. Amplification of DNA for cloning was performed using TaKaRa Ex Taq (Takara Bio Inc., Shiga, Japan). Colony PCR was performed using Hy-Taq Ready Mix (Hy Laboratories Ltd, Rehovot, Israel). PCR primers were purchased from hy·labs (Hy Laboratories Ltd). Restriction enzymes, alkaline phosphatase, and ligase were purchased from Fermentas (Thermo Fisher Scientific Inc., Massachusetts, USA). PCR and agarose-gel products were isolated and purified using the hy·labs Gel/PCR Extraction Kit (Hy Laboratories Ltd). Purification of plasmids was carried out using the Presto™ Mini Plasmid Kit (Geneaid Biotech Ltd., Shijr, Taiwan). All clones were verified by PCR and sequencing in the Instrumentation and Service Center of the Life Sciences Faculty at Tel Aviv University.

Construction of plasmids

Plasmids constructed in the present work are listed in S2 Table. The pLOXErysigIrsgIBs plasmid was constructed to knock out the B. subtilis sigI-rsgI operon, including its promoter, using resistance to Erm as a selective marker. The upstream (464 bp) and downstream (505 bp) regions of the B. subtilis sigI-rsgI operon were PCR-amplified using primer pairs P1-P2 and P3-P4, respectively. The lox71-erm-lox66 cassette was amplified using primer pair P5-P6. Subsequently, the three PCR products were cloned simultaneously with the In-Fusion HD Cloning Kit into a linearized pUC19 vector (generated by PCR and provided with the kit) following the kit protocol, obtaining the pLOXErysigIrsgIBs plasmid (S1 Fig). To express the C. thermocellum σI6 and σI3 factors in B. subtilis, we used the pAX01 plasmid [30]. This vector was designed for integration at the B. subtilis lacA chromosomal locus, carries an erm resistance cassette as a selectable marker, and carries a xylose-inducible promoter. First, pAX01 was linearized with the restriction enzyme BamHI. Subsequently, the DNA sequences of C. thermocellum sigI6 and sigI3 were PCR-amplified using primer pairs P7-P8 and P9-P10, respectively (S1 Table). Finally, the PCR products were cloned using the In-Fusion HD Cloning Kit into the linearized pAX01 vector, obtaining the pAX01-sigI6 and pAX01-sigI3 plasmids. To study the promoters that are under the control of C. thermocellum σI6 and σI3, we used the pBS1ClacZ plasmid that contains a promoterless lacZ reporter gene [31]. This vector was designed to integrate at the B. subtilis amyE locus and carries a cat-resistance cassette as a selectable marker. The upstream regions of the C. thermocellum σI-factor genes that contain the predicted promoters and the upstream regions of some cellulosomal genes that contain predicted σI-dependent promoters were PCR-amplified, using the primer pairs listed in S1 Table (primers P11 to P90). 
Subsequently, each PCR product was digested with restriction enzymes EcoRI and BamHI. Finally, each digested PCR product was cloned into the pBS1ClacZ plasmid that was digested previously with the same restriction enzymes, thus obtaining the pBS1ClacZ-derived plasmids listed in S2 Table. In order to study the important bases for promoter recognition by C. thermocellum σI6, mutant versions of the xyn10Z σI6-dependent promoter were created by site-directed mutagenesis. To introduce mutations in the conserved bases of the -35 element, the forward primers from P93 to P99, which contain the mutated nucleotides, were used with the reverse primer P92 (S1 Table). To introduce individual mutations in the conserved bases of the -10 element, the reverse primers from P100 to P105, which contain the mutated nucleotides, were used with the forward primer P91 (S1 Table). As a reference for the mutant versions, a short version of the xyn10Z σI6-dependent promoter of the same length as the mutant versions was PCR-amplified using the primer pair P91-P92 (S1 Table). Subsequently, each PCR product was digested with restriction enzymes EcoRI and BamHI. Finally, each digested PCR product was cloned into the pBS1ClacZ plasmid that was digested previously with the same restriction enzymes, thereby obtaining the pBS1ClacZ-derived plasmids listed in S2 Table.

Construction of B. subtilis strains

B. subtilis was transformed by using the natural competence method [32]. Chromosomal integration of plasmids by a double-crossover event was confirmed by colony PCR using the primer pairs listed in S1 Table (primers P106 to P115). The different B. subtilis strains obtained were stored at -80°C in 20% (v/v) glycerol. The strains constructed in the present work are listed in S3 Table. To construct a B. subtilis PY79 devoid of its sigI-rsgI operon, B. subtilis PY79 was first transformed with the pLOXErysigIrsgIBs plasmid, and the cells were selected with Erm, obtaining the B. subtilis CO01 strain. Subsequently, B. subtilis CO01 was transformed with the pDR244 plasmid (obtained from the BGSC), which encodes the Cre recombinase and has a thermosensitive origin of replication. The cells were plated on LB-agar containing Spt and were incubated at 30°C. Several individual colonies were then streaked on a plain LB-agar plate and incubated overnight at 42°C to cure pDR244. The resulting colonies were screened for plasmid curing (Spt sensitivity) and the loss of the lox71-erm-lox66 cassette (Erm sensitivity). Finally, a single colony was streaked on a plain LB-agar plate and grown at 37°C. The loss of the lox71-erm-lox66 cassette was confirmed by PCR with the primer pair P118-P119 (S1 Table), thus obtaining the B. subtilis CO02 strain (S3 Table).

β-Galactosidase activity assays

To measure the β-galactosidase activity, strain samples were taken from the -80°C glycerol stock and inoculated in 5 mL of SMM with Cam. Subsequently, the cells were grown overnight at 37°C with shaking (250 rpm). The next day, the cells were inoculated in 2.5 mL of SMM to an OD600 of 0.1–0.2 and grown at 37°C (250 rpm). When the cells reached mid-log growth phase (approx. 0.4–0.5 OD600), the culture was separated into two tubes, and one tube was supplemented with xylose (1% final concentration) whereas the other was used as a blank. Then, the cells were allowed to grow for another hour at 37°C (250 rpm). Finally, the cells were recovered by centrifuging at 16,000 × g for 5 min, washed twice with Z-buffer (60 mM Na2HPO4·7H2O, 40 mM NaH2PO4, 10 mM KCl, 1 mM MgSO4·7H2O, and 50 mM β-mercaptoethanol, pH 7.0) and recovered in 0.5 mL of Z-buffer. Enzymatic activity was measured with the fluorogenic substrate 4-methylumbelliferyl β-D-galactopyranoside (4-MUG, 4 mg/mL; Sigma-Aldrich) in a microplate reader (Biotek Synergy HT, Vermont, USA). The cells, recovered in Z-buffer (150 μL), were placed in a 96-well plate and 2 mL of 4-MUG (4 mg/mL) were added to initiate the enzymatic reaction. The release of the fluorescent compound 4-methylumbelliferone (4-MU) was measured (using the excitation filter 360/40 and the emission filter 460/40) every 10 min at 30°C with medium agitation for one hour. The reaction was stopped by adding 100 μL of 1 M Na2CO3. To calculate the β-galactosidase activity, a standard curve with 4-MU was prepared. One unit of enzyme activity was defined as the amount of β-galactosidase that releases 1 μmol of 4-MU per minute. All the β-galactosidase activities were normalized to cell density (OD600).
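
The unit definition above can be sketched as a short calculation. The following Python example is a minimal illustration and not the authors' analysis script: it assumes that the fluorescence readings are linear over the measured time window and that the 4-MU standard curve can be summarized by a single slope (fluorescence units per μmol of 4-MU). All function and parameter names are hypothetical, and the numbers are illustrative only.

```python
def linear_slope(x, y):
    """Least-squares slope of y against x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def beta_gal_activity(times_min, fluorescence, std_slope_rfu_per_umol, od600):
    """Return beta-galactosidase activity in units per OD600.

    One unit = 1 umol of 4-MU released per minute (definition from the text).
    std_slope_rfu_per_umol is the slope of a 4-MU standard curve
    (assumed linear; hypothetical parameter name).
    """
    rate_rfu_per_min = linear_slope(times_min, fluorescence)
    units = rate_rfu_per_min / std_slope_rfu_per_umol  # umol 4-MU per min
    return units / od600


# Illustrative numbers only (not data from the study).
times = [0, 10, 20, 30, 40, 50, 60]               # min, one reading every 10 min
signal = [100, 300, 500, 700, 900, 1100, 1300]    # fluorescence units
activity = beta_gal_activity(times, signal, std_slope_rfu_per_umol=1000.0, od600=0.5)
print(f"{activity:.3f} U per OD600")
```

Fitting a slope over all time points, rather than using a single end-point reading, is one possible design choice; it makes the estimate less sensitive to noise in any individual measurement.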

Bioinformatics analysis

Primary DNA sequence analyses and DNA motif searches were performed using the Clone Manager 9 Professional Edition software (Scientific & Educational Software, Durham, NC). The B. subtilis 168 and C. thermocellum DSM 1313 sigI genes and their promoter sequences (extracted from GenBank NZ_CP010052.1 and NC_017304.1, respectively) were used as BLAST [33] queries to mine public databases including that at the National Center for Biotechnology Information (NCBI) (http://www.ncbi.nlm.nih.gov/). In order to prevent a possible loss of promoter candidates during BLAST mining, we used both discontiguous megablast ("more dissimilar sequences") as well as blastn ("somewhat similar sequences") as implemented at NCBI (http://blast.ncbi.nlm.nih.gov/Blast.cgi). Pairwise and multiple sequence alignments were performed with the CLUSTALW program [34] using either the Network Protein Sequence Analysis server (http://npsa-pbil.ibcp.fr/cgi-bin/npsa_automat.pl?page=/NPSA/npsa_clustalw.html), or the ClustalW2 at the EMBL-EBI (http://www.ebi.ac.uk/Tools/msa/clustalw2/). WebLogos [35] were generated by using a public logo generator web application (http://weblogo.berkeley.edu/).
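
The idea of reading conserved positions off a multiple sequence alignment can be sketched in a few lines of Python. This is an assumption-laden illustration, not the procedure used with ClustalW or WebLogo: it takes already-aligned, equal-length sequences and reports a base only where it occurs in more than a chosen fraction of the sequences. The 0.85 default mirrors the "> 85%" conservation cut-off stated in the table legends; the function name and the toy sequences are hypothetical.

```python
from collections import Counter


def conserved_consensus(aligned_seqs, threshold=0.85):
    """Return a consensus string for equal-length aligned sequences.

    A column is reported (in uppercase) only if one base occurs in more than
    `threshold` of the sequences; other columns are shown as '-'.
    Alignment gaps ('-') count toward the column total but are never reported.
    """
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("all aligned sequences must have the same length")
    consensus = []
    for column in zip(*(s.upper() for s in aligned_seqs)):
        base, count = Counter(column).most_common(1)[0]
        if base != "-" and count / len(column) > threshold:
            consensus.append(base)
        else:
            consensus.append("-")
    return "".join(consensus)


# Toy alignment (hypothetical sequences, not taken from the tables).
toy = ["tacgaat", "gacgtat", "cacgaat", "tacgaac"]
print(conserved_consensus(toy))
```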

Results

Bioinformatics comparison of promoter sequences of alternative σI factor from various cellulosome-producing bacteria and Bacillales species

In order to identify the conserved sequence motifs that could be used for the analysis of putative C. thermocellum sigI-like gene promoters, we performed multiple sequence alignments of the experimentally detected σI-dependent promoter sequences. The initial analysis was performed using the sigI promoter sequences experimentally identified in B. subtilis, Bacillus licheniformis ATCC 14580, Bacillus thuringiensis serovar israelensis ATCC 35646 and Bacillus sp. strain NRRL B-14911 [18,36]. Additionally, the experimentally identified promoters of the σI-dependent genes bcrC and mreBH, which are involved in cell envelope integrity and homeostasis during heat stress in B. subtilis [36], were also included. The analysis was improved by including DNA sequences located immediately upstream of sigI-like genes in various species of the order Bacillales. The alignment is shown in S4 Table, and a high conservation of two short DNA sequences upstream of the Bacillales sigI-rsgI operons can be observed. These basic putative promoter motifs can also be observed in Fig 1A that shows a WebLogo generated with the Bacillales σI-dependent promoters shown in S4 Table. As already proposed by Tseng and Shaw [36], the suggested Bacillales σI promoter consensus sequence is ACCCCC for the -35 element and CGAA for the -10 element (Fig 1A and S4 Table). Interestingly, a conserved sequence AA downstream of the -35 element (already mentioned by Tseng and Shaw [36]), and a conserved T downstream of the -10 element can also be observed (Fig 1A and S4 Table). For future comparisons, we named the conserved sequence AA as "extended -35".
Fig 1

Identification of conserved elements of σI-dependent promoter sequences.

(A) WebLogo generated with the Bacillales sigI promoters shown in S4 Table. (B) WebLogo generated with the C. thermocellum and C. straminisolvens sigI promoters shown in Table 1, and the C. clariflavum, A. cellulolyticus and Pseudobacteroides cellulosolvens sigI promoters shown in S5 Table.

Table 1

Alignment of putative C. thermocellum and C. straminisolvens sigI promoters.

Gene (a) | Locus tags (b) | RsgI C-terminal domain | Promoter region 5'→3' | 5' UTR
Ct sigI1-rsgI1 | Clo1313_2174–2173, Cthe_0058–0059 | CBM3 | taatatacacaaAAAaa-gcagatgtataCGaAgtaatctactG | 16
Cs sigI1-rsgI1 | JCM21531_2023–2024 | CBM3 | taatatacataaAAAaa-gcaggcttgaaCGaAgtaatctactg | 17
Ct sigI2-rsgI2 | Clo1313_1961–1962, Cthe_0268–0267 | CBM3 | tggtatCccccgAAAaaatgttccctttaCGaAataactagtaa | 147
Cs sigI2-rsgI2 | JCM21531_2790–2789 | CBM3 | tgatatCcccctAAAatttgttcctcttaCGaAataacttatta | 159
Ct sigI3-rsgI3 | Clo1313_1911–1910, Cthe_0315–0316 | 2xPA14 | tatgaaCccctcAAAaaaatcatttggtgCGtAcaagtattgaa | 13
Cs sigI3-rsgI3 | JCM21531_2365–2366 | 2xPA14 | tgtaaaCccctcAAAaaa-taactttgtgCGtAcaagtattaaa | 15
Ct sigI4-rsgI4 | Clo1313_1818–1817, Cthe_0403–0404 | CBM3 | aacgtcCagctgAAAattttctgccacgcCGcAttaattttttt | 13
Cs sigI4-rsgI4 | JCM21531_1812–1813 | CBM3 | aacgtcCaactaAAAgtttgttgccacatCGcAttaatctattt | 13
Ct sigI6-rsgI6 | Clo1313_2778–2777, Cthe_2120–2119 | GH10 | acaatgCgacatAAAaccattccggtataCGaAtcgatataaGa | 20
Cs sigI6-rsgI6 | JCM21531_4109–4108 | GH10 | atgatgCgacatAAAgctattccagtctaCGaAttcatatagga | 22
Ct sigI7-rsgI7 | Clo1313_0104–0105, Cthe_2521–2522 | UNK | attcgaCtgatgttAtt-taaatttgtgtCGaActttgctgatg | 52
Cs sigI7-rsgI7 | JCM21531_3721–3720 | UNK | attcgaCtagtggtttg-tagatttatgtCGaActttgctgaca | 61
Ct sigI8-rsgI8 | Clo1313_0525–0524, Cthe_2975–2974 | UNK | actttcCgaatcAAAatgaaatccatataCGaAttttctatagt | 16
Cs sigI8-rsgI8 | JCM21531_4043–4045 | UNK | ttttacCgaattAAAatagaagtcatataCGaAtcctctatagc | 18
Consensus | | | C-----AAA 13-14(N) CGWA |
Bacillales consensus | | | ACCCCC--AA 15(N) CGAA----T |

The most conserved bases (> 85%) are shown in bold capital fonts. Transcription start sites in Ct sigI1-rsgI1 and Ct sigI6-rsgI6, identified by Nataf and co-workers [19], are shown in uppercase bold italics; and the promoter regions proposed by the same authors are underlined [19]. Ct, Clostridium thermocellum; Cs, Clostridium straminisolvens. W represents A or T.

(a) C. thermocellum and C. straminisolvens sigI5 are not included in the table. During the analysis we failed to predict a σI-dependent promoter sequence for both C. thermocellum and C. straminisolvens sigI5, because of the low similarity of the upstream intergenic regions of these genes.

(b) Clo1313 and Cthe are the locus tag prefixes of C. thermocellum strains DSM 1313 and ATCC 27405, respectively.

The deduced consensus sequences of the Bacillales sigI promoter elements were used to find sequence similarities between the predicted promoters of the different C. thermocellum sigI genes. The upstream intergenic regions of the eight C. thermocellum sigI genes were manually analyzed, focusing on potential conservation of the consensus sequences of the Bacillales sigI promoter elements -35 and -10. Selected promoter candidate sequences were then used for multiple sequence alignments using the ClustalW algorithm [34]. The analysis was facilitated by the fact that the transcription start sites of C. thermocellum sigI1 and sigI6 genes were previously identified by Nataf and co-workers [19]. Additionally, the analysis was improved by comparing the C. thermocellum predicted promoter sequences to those of another cellulosome-producing species, C. straminisolvens, whose genome (NCBI Reference Sequence: NZ_BAVR00000000.1) has a very high similarity to that of C. thermocellum (96.2% similar based on 16S rDNA) [37,38]. The multiple promoter sequence alignment is presented in Table 1. The putative promoters of C. thermocellum sigI2, sigI3, sigI4, sigI7, sigI8, and those of C. straminisolvens sigI1, sigI2, sigI3, sigI4, sigI6, sigI7 and sigI8 were predicted. As shown in Table 1, a conserved AAA sequence for the "extended -35" element and a highly conserved CGWA for the -10 element were identified. Moreover, a highly conserved C upstream of the "extended -35" was also identified (Table 1). It is important to mention that during this analysis we failed to predict a σI-dependent promoter sequence for both C. thermocellum and C. straminisolvens sigI5 genes, owing to the low similarity of their upstream intergenic regions. Interestingly, whereas most of the sequences of the rsgI genes overlap with sequences of their cognate sigI genes, sigI5 and rsgI5 genes are separated by an intergenic region of 97 nucleotides, which contains a predicted σI-dependent promoter (Table 2). This suggests a different type of gene organization and regulation of sigI5 and rsgI5 in both C. thermocellum and C. straminisolvens.
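
The consensus line of Table 1 (a conserved C, five variable bases, AAA, a spacer of 13 to 14 nucleotides, then CGWA) can be turned into a simple pattern search. The Python sketch below is only an illustration of that idea under stated assumptions: the study analyzed the upstream regions manually and by alignment, whereas this example uses a regular expression, treats every non-conserved position as fully degenerate, and scans one strand only. The pattern and the function name are hypothetical.

```python
import re

# C, 5 variable bases, AAA, a 13-14 nt spacer, then CGWA (W = A or T).
# The lookahead wrapper lets overlapping candidate sites be reported.
SIGI_PATTERN = re.compile(r"(?=(C[ACGT]{5}AAA[ACGT]{13,14}CG[AT]A))")


def find_sigi_like_promoters(sequence):
    """Return (start, matched_sequence) pairs for consensus-like sites.

    Alignment gaps ('-') are removed and the sequence is uppercased first,
    so start positions are 0-based offsets in the ungapped sequence.
    """
    seq = sequence.replace("-", "").upper()
    return [(m.start(), m.group(1)) for m in SIGI_PATTERN.finditer(seq)]


# Ct sigI6-rsgI6 and Ct sigI7-rsgI7 promoter regions as listed in Table 1.
sigi6 = "acaatgCgacatAAAaccattccggtataCGaAtcgatataaGa"
sigi7 = "attcgaCtgatgttAtt-taaatttgtgtCGaActttgctgatg"

for start, site in find_sigi_like_promoters(sigi6):
    print(start, site)
print(find_sigi_like_promoters(sigi7))
```

A strict pattern of this kind misses promoters that deviate from the consensus, such as the sigI7 region, which lacks the AAA triplet, so it should be read as a first-pass filter rather than a replacement for alignment-based inspection.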
Table 2

Alignment of predicted C. thermocellum σI-dependent promoters.

Gene (a) | Locus tags (b) | Promoter region 5'→3' | 5' UTR
xyn10Z (c) | Clo1313_2635, Cthe_1963 | accgacacAAAaatgtgagcgttcaCGaAacaaTaaatAt | 96
xyn11B (c,d) | Clo1313_0522, Cthe_2972 | agcgacttAAAaaattatatttttgCGaAtagaTaataTg | 164
ce8 | Clo1313_0500, Cthe_2949 | ccccgctcAAAtgttgcataaacctCGaAtcttaaaataa | 32
ce12 | Clo1313_0693, Cthe_3141 | tacccttaAAAaaacttgct-tctaCGtAttttaatatta | 51
cel5E | Clo1313_1425, Cthe_0797 | gctgtccaAAAgaaaa-tgatttttCGaAttaaTataata | 156
cel8A | Clo1313_1960, Cthe_0269 | accctatcAAAtaacccattcaattCGcAtttaTtttaag | 254
cel9J | Clo1313_1604, Cthe_0624 | gcccccttAAAaaatttta-aaattCGaAattaatttttg | 477
cel9P | Clo1313_1955, Cthe_0274 | aacgtctatAAtttttt-atgataaCGataaaaTtaaatt | 19
cel9Q | Clo1313_1603, Cthe_0625 | acccacttAAAaatgtgtatgtgcaCGgAtttcTatttgg | 375
cel9U (e) | Clo1313_3023, Cthe_2360 | agcccctcAAAaattttttccctttCGaAtataTatagat | 394
cel9V | Clo1313_0349, Cthe_2760 | atacccatAAAatttttatgttctaCGaAtataTaatata | 124
cel48S | Clo1313_2747, Cthe_2089 | gccccctcAAAaagtatatttttttCGaAgataTatatat | 498
cenC | Clo1313_0420, Cthe_2879 | cccaatcgAAAaaagaacatgtcatCGaAtctaTatatca | 102
cipA | Clo1313_0627, Cthe_3077 | tgcccctcAAAttccgtttatatatCGaAtataTattaca | 846
cseP | Clo1313_2188, Cthe_0044 | taagccacAAAattattt-tttctaCGaAtataTattgaa | 132
pelB2 | Clo1313_0501, Cthe_2950 | tcccaatgAAAtacgacccttgataCGtAttatTaatata | 67
pilZ | Clo1313_1490, Cthe_0733 | gccccctcAAAata-tgagaacattCGaAatatTataata | 321
pl11 | Clo1313_1983, Cthe_0246 | ctacccctAAAaaaa-ttagaatttCGtAtttaTaaaaag | 39
rsgI5 | Clo1313_0985, Cthe_1273 | atggaccaAAAagtactttcaaacaCGaAattaTtaaata | 43
rsgI9 | Clo1313_1969, Cthe_0260 | ctctaaaaAtAtcgggatttttttcCGaAataactaatag | 31
sdbA | Clo1313_0950, Cthe_1307 | caacgctcAAtacgaactctttctcCGaAtttaTtctatt | 157
xgh74A | Clo1313_0851, Cthe_1398 | ggtacatcAAAggaaagtacaggtcCGaAtttaTatagcg | 147
xyn10D | Clo1313_0177, Cthe_2590 | tgcgaccaAAAggcgtcaaatttcaCGaAatacatataaa | 33
xyn10Y | Clo1313_1305, Cthe_0912 | cccaacgtAAAaattcaatacctttCGaAtaaaTaacata | 277
GH30-CBM6-Doc | Clo1313_0563, Cthe_3012 | ccgtacatAAAaagaagttttgattCGaAtaatTaacaca | 67
GH43-2xCBM6-Doc | Clo1313_0987, Cthe_1271 | cccaacccAAActtgccatatgtttCGtAcaaaTaaattg | 67
HP | Clo1313_1436, Cthe_0785 | atcccctttAAgaattgacataaaaCGcAttaactattat | 106
HP-Doc | Clo1313_1494, Cthe_0729 | acggaaatAAAaacaactccaattaCGaAtaaaTatacca | 35
GH43-CBM42-Doc | Clo1313_2216, Cthe_0015 | cccactccAAAaaacatttaattctCGtAttatTataaca | 46
GH39-2xCBM35-Doc | Clo1313_2793, Cthe_2137 (f) | ctcaacttAAAaaatacattcttctCGtAtatgTaagtta | 160
GH43-CBM42-Doc | Clo1313_2794, Cthe_2138 | tacgtcacAAAccaaaaacccagaaCGaAccaaTtaataa | 121
GH2-CBM6-Doc | Clo1313_2861, Cthe_2197 (f) | cccaactaAAAaaaataggtacttcCGtAaaagTaaaaca | 163
ABC transporter | Clo1313_2866 (f) | taaccctaAAAatttaatgccgattCGaAtaaaaaagcct | 149
Consensus | | AAA 13-14(N) CGWA----TW |
Bacillales consensus | | ACCCCC--AA 15(N) CGAA----T |

The most conserved bases (> 85%) are shown in bold capital fonts. W represents A or T.

(a) The genes without trivial names are denoted with their main protein product domain(s). GH, glycoside hydrolase; CBM, carbohydrate-binding module; Doc, dockerin; HP, hypothetical protein.

(b) Clo1313 and Cthe are the locus tag prefixes of C. thermocellum strains DSM 1313 and ATCC 27405, respectively.

(c) Transcription start sites identified by Sand and co-workers [27] are indicated in uppercase bold italics, and the underlined sequences are the promoter sequence proposed by the same authors [27].

(d) In C. thermocellum DSM 1313 xynB is part of the operon xynB-xynA, whereas xynB is absent in other strains, such as ATCC 27405 and JW20.

(e) All the promoter sequences shown in the table are exactly the same for both C. thermocellum strains, DSM 1313 and ATCC 27405. The only small difference is in the gene cel9U at the 5' position. Whereas strain DSM 1313 has the sequence 5' AGCCCCTCAAA 3' (bold fonts are part of the consensus in the -35 element), the sequence of strain ATCC 27405 is 5' AGCTCCCTCAAA 3' (underline designates an inserted T).

(f) In C. thermocellum strain ATCC 27405 the genes Cthe_2137 and Cthe_2197 are interrupted by an IS element. Additionally, strain ATCC 27405 does not contain a gene orthologous to Clo1313_2866 (an ABC transporter ATP-binding protein).

To investigate how conserved the "extended -35" and -10 elements of the C. thermocellum and C. straminisolvens sigI-like gene promoters are, we searched for sigI-rsgI operons in the publicly available genomic sequences of the known cellulosome-producing bacteria. S5 Table shows the putative promoters upstream of the sigI-like genes found during the mining. Multiple sigI-like genes were found in Clostridium clariflavum, Acetivibrio cellulolyticus and Pseudobacteroides cellulosolvens, and most of their cognate rsgI-like genes encode proteins containing a C-terminal CBM.
As shown in S5 Table, a conserved AAA sequence for the "extended -35" element and a highly conserved CGWA for the -10 element were identified in C. clariflavum, A. cellulolyticus and P. cellulosolvens. These results confirm the high conservation of the -35 and -10 elements in sigI-like promoters of taxonomically divergent cellulosome-producing bacteria. The high conservation of the "extended -35" and -10 promoter elements of cellulosome-producing bacteria is more evident in the WebLogo generated with the predicted promoter sequences of C. thermocellum, C. straminisolvens, C. clariflavum, A. cellulolyticus and P. cellulosolvens shown in Fig 1B. Comparison between the predicted promoters of the sigI-like genes of cellulosome-producing bacteria and the promoter consensus sequence of Bacillales sigI genes shows a different level of similarities (Table 1 and S5 Table). For example, while the putative promoters of C. thermocellum and C. straminisolvens sigI2 and sigI3 are most similar to the promoter consensus sequence of Bacillales sigI genes, the putative promoters of C. thermocellum and C. straminisolvens sigI7 are less similar (Table 1). This observation is quite interesting because the deduced amino acid sequence of C. thermocellum σI7 has the highest similarity to B. subtilis σI (data not shown). Our analysis shows that the cellulosome-producing bacteria which use multiple sigI-like genes probably maintain different levels of similarity in promoter sequences to fine-tune the regulation of individual sigI-like genes, as well as cellulosomal target genes. With the predicted promoter sequences of the multiple sigI-like genes of C. thermocellum, C. straminisolvens, C. clariflavum, A. cellulolyticus and P. cellulosolvens we suggest that AAA of the "extended -35" and CGWA of the -10 elements represent the general motifs for σI-dependent promoters of cellulosome-producing bacteria.
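The conservation criterion used for the promoter alignments (bases present in more than 85% of the sequences) can be illustrated with a simple per-column frequency count. The following is a minimal Python sketch and not the authors' procedure (the logos in the paper were produced with WebLogo); the function name `conserved_columns` and the choice of five ungapped promoter regions copied from Table 2 are assumptions made purely for illustration.

```python
from collections import Counter

# Five ungapped 40-nt promoter regions copied from Table 2 (illustrative subset).
EXAMPLE_PROMOTERS = [
    "accgacacAAAaatgtgagcgttcaCGaAacaaTaaatAt",  # xyn10Z
    "agcgacttAAAaaattatatttttgCGaAtagaTaataTg",  # xyn11B
    "atacccatAAAatttttatgttctaCGaAtataTaatata",  # cel9V
    "tgcccctcAAAttccgtttatatatCGaAtataTattaca",  # cipA
    "cccaacgtAAAaattcaatacctttCGaAtaaaTaacata",  # xyn10Y
]


def conserved_columns(aligned, threshold=0.85):
    """Return {column index: base} for columns whose most frequent base
    occurs in more than `threshold` of the aligned sequences."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")
    conserved = {}
    for i in range(length):
        column = [seq[i].upper() for seq in aligned]
        base, count = Counter(column).most_common(1)[0]
        # Alignment gaps ('-') never count as a conserved base.
        if base != "-" and count / len(column) > threshold:
            conserved[i] = base
    return conserved


if __name__ == "__main__":
    for index, base in sorted(conserved_columns(EXAMPLE_PROMOTERS).items()):
        print(index, base)
```

With only five sequences, the 85% threshold requires all five to agree at a position, so a small subset like this one can flag columns that would not pass in the full alignment; the sketch is meant to show the calculation, not to reproduce the published consensus.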

Searching for σI–dependent promoter sequences of cellulosomal genes in C. thermocellum

Based on the assumption that σI factors autoregulate their own expression, and hence that the genes under their control should have similar promoter sequences, we searched for putative promoter sequences of the cellulosomal genes of C. thermocellum. The search exploited the conserved sequences in the "extended -35" (AAA) and -10 (CGWA) elements of the general motifs in the sigI-like gene promoters of cellulosome-producing bacteria. The analysis was facilitated by the recent identification of the transcriptional start sites of C. thermocellum xyn10Z and xyn11B by Sand and co-workers [27]. The 33 predicted promoters identified during the analysis are listed in Table 2. Additionally, Table 2 shows the conserved AAA sequence for the "extended -35" element and the conserved CGWA for the -10 element. Interestingly, a highly conserved TW dinucleotide (W represents A or T), downstream of the -10 element, was also identified (Table 2).
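A search of this kind can be sketched as a regular-expression scan for the general motif: an AAA triad, a 13 to 14 nucleotide spacer, and a CGWA tetrad. This is only an illustrative Python sketch under stated assumptions; the text does not describe the software actually used, and the name `find_candidate_promoters` and the pattern itself are hypothetical constructions derived from the consensus line of Table 2.

```python
import re

# AAA, then a 13-14 nt spacer, then CGWA (W = A or T).
# The lookahead lets overlapping candidates be reported; at each start
# position only the first spacer length that fits (14 is tried first) is kept.
GENERAL_MOTIF = re.compile(r"(?=(AAA([ACGT]{13,14})CG[AT]A))")


def find_candidate_promoters(sequence):
    """Return (start, spacer_length, matched_site) for every candidate
    AAA-N(13-14)-CGWA site in an ungapped DNA sequence."""
    seq = sequence.upper()
    return [
        (match.start(), len(match.group(2)), match.group(1))
        for match in GENERAL_MOTIF.finditer(seq)
    ]


if __name__ == "__main__":
    # xyn10Z promoter region as listed in Table 2.
    xyn10z = "accgacacAAAaatgtgagcgttcaCGaAacaaTaaatAt"
    for start, spacer, site in find_candidate_promoters(xyn10z):
        print(start, spacer, site)
```

Because the A-rich stretch after the AAA triad allows the triad to be placed at more than one offset, a scan like this can report overlapping candidates with different spacer lengths for the same promoter. Such hits are only candidates; in the work described here, recognition by a given σI factor was tested experimentally.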

Searching for σI6–dependent promoter sequences of cellulosomal genes in C. thermocellum

Given the limited genetic tool set available for C. thermocellum, we used B. subtilis as a heterologous host to test the ability of the C. thermocellum σI6 and σI3 factors to recognize the predicted C. thermocellum promoters. A similar strategy has been successfully used by several research groups to analyze regulatory proteins from different Firmicutes species, such as Clostridium difficile, Enterococcus faecalis and Oceanobacillus iheyensis [39-42]. Additionally, the high homology between the RNAPs of B. subtilis and C. thermocellum (e.g., more than 67% identical residues for subunits α, β and β'; see S2 Fig) gave further support to this approach. To avoid interference from the native B. subtilis σI during the study of the C. thermocellum σI factors, we constructed the B. subtilis CO02 strain, which is devoid of its sigI-rsgI operon (S3 Table). The present work first focused on the activation of putative promoter sequences by the C. thermocellum σI6 factor. This initial analysis was facilitated by the fact that the C. thermocellum σI6 promoter was previously identified by Nataf and co-workers [19] and that the xylanase genes xyn10Z, xyn11B and xyn10D were shown to be under the control of σI6 by Sand and co-workers [27]. The 7 predicted promoters of C. thermocellum sigI-like genes (Table 1) and the 33 C. thermocellum σI-dependent predicted promoters (Table 2) were fused to a lacZ reporter gene (S2 Table) and integrated into the B. subtilis amyE locus (S3 Table). Subsequently, the C. thermocellum σI6 factor was integrated into the B. subtilis lacA locus, and the recognition of the predicted promoters by C. thermocellum σI6 was analyzed by measuring LacZ activity (Table 3). Our analysis showed that the C. thermocellum σI6 factor recognized 10 predicted promoters, which correspond to the C. thermocellum genes xyn10Z, xyn11B, cipA, sigI6, xyn10Y, cseP (Clo1313_2188, Cthe_0044), sigI1, rsgI5, xyn10D and cel9V (Clo1313_0349, Cthe_2760). The LacZ activities of these 10 predicted σI6-dependent promoters are shown in Table 3.
Table 3

Quantitative evaluation of possible promoters under the control of C. thermocellum σI6.

Gene | Locus tags (a) | Gene product (e) | Ref. | Activity (U) (f)
xyn10Z | Clo1313_2635, Cthe_1963 | Cellulosomal xylanase: CE1-CBM6-Doc-GH10 | [43] | 1126 ± 149
xyn11B (b) | Clo1313_0522, Cthe_2972 (a) | Cellulosomal xylanase: GH11-CBM6-Doc | [44] | 919 ± 72
cipA (c) | Clo1313_0627, Cthe_3077 | Cellulosomal scaffoldin subunit: 2(Coh)-CBM3-6(Coh)-X-Doc | [45] | 266 ± 13
cel9V | Clo1313_0349, Cthe_2760 | Cellulosomal endoglucanase: GH9-2(CBM3)-Doc | [46] | 223 ± 49
sigI6 | Clo1313_2778, Cthe_2120 | Alternative σI6 factor | [17] | 139 ± 6
xyn10Y | Clo1313_1305, Cthe_0912 | Cellulosomal xylanase: CBM22-GH10-CBM22-Doc-CE1 | [47] | 113 ± 10
cseP | Clo1313_2188, Cthe_0044 | Cellulosomal component: CotH-Doc | [48] | 93 ± 17
sigI1 | Clo1313_2174, Cthe_0058 | Alternative σI1 factor | [17] | 28 ± 4
rsgI5 | Clo1313_0985, Cthe_1273 | Anti-σI5 factor: RsgI_N-UNK-CBM42 | [17] | 25 ± 5
xyn10D | Clo1313_0177, Cthe_2590 | Cellulosomal xylanase: CBM22-GH10-Doc | [49] | 20 ± 3
sigI3 (d) | Clo1313_1911, Cthe_0315 | Alternative σI3 factor | [17] | ND

(a) Clo1313 and Cthe are the locus tag prefixes of C. thermocellum strains DSM 1313 and ATCC 27405, respectively.

(b) In C. thermocellum DSM 1313 xyn11B is part of the operon xyn11B-xyn11A, whereas xyn11B is absent in other strains, such as ATCC 27405 and JW20.

(c) In C. thermocellum DSM 1313 cipA encodes a scaffoldin with 8 cohesins, whereas in other strains, such as ATCC 27405, cipA encodes a scaffoldin with 9 cohesins [10].

(d) The predicted promoter of C. thermocellum sigI3 was used as negative control.

(e) CE, carbohydrate esterase; CBM, carbohydrate binding module; Doc, dockerin; GH, glycoside hydrolase; Coh, cohesin; X, CipA X-module; CotH, spore coat protein H; UNK, unknown domain.

(f) The activity was measured using MUG as substrate. One unit of enzyme activity (U) was defined as the amount of β-galactosidase that releases 1 μmol of 4-MU per minute. Numbers in parentheses represent the standard deviation of at least three independent experiments. ND means not detected.

As expected, the heterologous expression of C. thermocellum σI6 in B. subtilis allowed the recognition of its own promoter. This result is in agreement with the recent report of Sand and co-workers [27], which shows that C. thermocellum σI6 is autoregulated. Interestingly, five of the 10 activated promoters correspond to genes encoding the cellulosome-associated GH9, GH10 and GH11 glycoside hydrolases (xyn10Z, xyn11B, xyn10Y, xyn10D and cel9V). Furthermore, the highest β-galactosidase (LacZ) activities were obtained with the predicted promoters of the two xylanase genes xyn10Z and xyn11B (Table 3). Moreover, xyn11B is the first gene of the bicistronic operon xyn11B-xyn11A in C. thermocellum DSM 1313, whereas the xyn11B gene is lacking in other C. thermocellum strains, such as ATCC 27405 and JW20 (S6 Table). Bioinformatics analysis, performed with seven strains of C. thermocellum (DSM 1313, ATCC 27405, DSM 2360, YS, AD2, JW20 and BC1) and C. straminisolvens, showed an identical predicted σI6 promoter upstream of the single xyn11A gene of strains ATCC 27405 and JW20 (S6 Table). This strongly suggests that both xylanases, Xyn11B and Xyn11A, are regulated by the C. thermocellum σI6 factor. Finally, the recognition of predicted promoters of genes encoding non-enzymatic proteins (cipA, cseP, sigI1 and rsgI5) by the C. thermocellum σI6 factor suggests a more complex regulon for this kind of alternative σ factor. As already mentioned, sigI1 and rsgI5 are also proposed to be involved in the regulation of cellulosomal genes [17]. Hence, the recognition of the sigI1 and rsgI5 promoters by the C. thermocellum σI6 factor suggests the possibility of crosstalk between different C. thermocellum σI factors and an overlap of their respective regulons.

Identification of conserved promoter elements for σI6 recognition

In order to identify the essential bases for σI6 recognition, we performed an alignment using the experimentally validated promoter sequences recognized by C. thermocellum σI6. To improve the analysis, the 10 experimentally validated σI6-dependent promoter sequences of C. thermocellum DSM 1313 were compared with orthologous promoter sequences of C. straminisolvens JCM 21531. The result is shown in Fig 2. It can be observed that the σI6-dependent promoters share two highly conserved sequences. The suggested C. thermocellum σI6-promoter consensus motifs are CNNAAA for the -35 element and CGAA for the -10 element (where N represents any base). The spacing between the suggested -35 and -10 elements is between 13 and 14 nucleotides (Fig 2). It is interesting to note that downstream of the -10 element there is a highly conserved AT sequence.
Fig 2

Identification of conserved elements of σI6-dependent promoter sequences.

WebLogo generated with σI6-dependent promoter sequences of C. thermocellum and orthologous promoter sequences of C. straminisolvens.

In the -35 element, next to the highly conserved C, a less conserved CGA sequence can be observed in the WebLogo generated with the 10 σI6-dependent promoter sequences of C. thermocellum and C. straminisolvens (Fig 2). Interestingly, this CGA sequence is present in the promoter sequences of sigI6, xyn10D, and in the sequences of the two strongest promoters identified, xyn10Z and xyn11B (Fig 2 and Table 3). In order to analyze the importance of these bases for recognition by C. thermocellum σI6, we performed site-directed mutagenesis analysis using the promoter sequence of xyn10Z. Additionally, we evaluated the most conserved bases, which are suggested as the C. thermocellum σI6-promoter consensus, including the highly conserved AT sequence downstream of the -10 element. The analysis is shown in Fig 3. Individual changes in the nucleotides C and G of the CGA sequence (xyn10Z mut1 and mut2) at the 5' end of the -35 element dramatically reduced the LacZ activity. Interestingly, the mutation from A to T in the CGA sequence (xyn10Z mut3) increased the LacZ activity by 33%. This result shows that, although the nucleotides C and G in the CGA sequence of the -35 element are less conserved, they play an important role in recognition by C. thermocellum σI6.
Fig 3

Evaluation of σI6 promoter sequence validity by mutagenesis.

The activities are shown as relative activities, with the control promoter xyn10Zshort without mutations set to 100%. ND means not detected.

As expected, individual mutations in the highly conserved nucleotide C (xyn10Z mut4) and in the AAA sequence (xyn10Z mut5, mut6 and mut7) of the -35 element abolished or dramatically reduced the LacZ activity. The least "sensitive" nucleotide was the middle A (xyn10Z mut6) in the AAA triplet of the -35 element. Regarding the -10 element, the most conserved bases were also highly sensitive to mutations. Individual changes in the highly conserved CGAA sequence showed that the first 3 nucleotides CGA (xyn10Z mut10, mut11 and mut13) are more sensitive to changes than the last A at the 3' end of the CGAA sequence (xyn10Z mut14). Finally, changes of the highly conserved AT sequence at the 3' side of the -10 element showed a dramatic reduction of LacZ activity when the sequence was changed to TA (xyn10Z mut12). However, when the AT sequence was changed to CC, LacZ activity was not detected (xyn10Z mut13). All these results confirm the importance of the highly conserved sequences of the -35 and -10 elements, as well as the less conserved nucleotides C and G of the CGA sequence at the 5' end of the -35 element.

Searching for σI3–dependent promoter sequences of cellulosomal genes in C. thermocellum

In this work, we developed a new methodology that employs B. subtilis as a heterologous host to verify C. thermocellum σI-dependent promoters. By exploiting this approach, we extended the promoter analysis to other C. thermocellum σI factors. The above-mentioned 40 predicted σI-dependent promoters (Tables 1 and 2) were also analyzed with C. thermocellum σI3 expressed in B. subtilis. Table 4 shows the LacZ activity of the four predicted promoters that were recognized by C. thermocellum σI3. These promoters were deduced upstream of the C. thermocellum genes sigI3, pl11, ce12 and cipA. As expected, C. thermocellum σI3 was able to recognize its own promoter, again suggesting autoregulation in C. thermocellum. Two of the predicted promoters recognized by C. thermocellum σI3 belong to genes coding for pectin-degrading enzymes (pl11 and ce12). Furthermore, during the quantification of σI3-dependent promoter activities, the promoters of pl11 and ce12 showed the highest activities (Table 4). Interestingly, the previously verified σI6-dependent promoter of cipA was also recognized by C. thermocellum σI3 (but to a lesser degree), suggesting a possible overlap between the regulons of C. thermocellum σI3 and σI6 (Tables 3 and 4).
Table 4

Quantitative evaluation of possible promoters under the control of C. thermocellum σI3.

Gene | Locus tags (a) | Gene product (b) | Ref. | Activity (U) (d)
pl11 | Clo1313_1983, Cthe_0246 | Cellulosomal pectinase: Doc-CBM35-RGL11 | ---- | 195 ± 13
ce12 | Clo1313_0693, Cthe_3141 | Cellulosomal pectinase: RGAE-Doc-CBM35-RGAE | ---- | 62 ± 8
cipA | Clo1313_0627, Cthe_3077 | Cellulosomal scaffoldin subunit: 2(Coh)-CBM3-6(Coh)-X-Doc | [45] | 58 ± 7
sigI3 | Clo1313_1911, Cthe_0315 | Alternative σI3 factor | [17] | 28 ± 13
sigI6 (c) | Clo1313_2778, Cthe_2120 | Alternative σI6 factor | [17] | ND

(a) Clo1313 and Cthe are the locus tag prefixes of C. thermocellum strains DSM 1313 and ATCC 27405, respectively.

(b) CE, carbohydrate esterase; CBM, carbohydrate binding module; Doc, dockerin; RGL, rhamnogalacturonan lyase; RGAE, rhamnogalacturonan acetylesterase; Coh, cohesin; X, CipA X-module; UNK, unknown domain.

(c) The predicted promoter of C. thermocellum sigI6 was used as negative control.

(d) The activity was measured using MUG as substrate. One unit of enzyme activity (U) was defined as the amount of β-galactosidase that releases 1 μmol of 4-MU per minute. Numbers in parentheses represent the standard deviation of at least three independent experiments. ND means not detected.

In order to identify the important promoter nucleotides for σI3 recognition, we performed an alignment using the experimentally validated promoter sequences recognized by C. thermocellum σI3. To improve the analysis, the four experimentally validated σI3-dependent promoter sequences were compared with orthologous promoter sequences of C. straminisolvens. The result is shown in Fig 4. The σI3-dependent promoters have two highly conserved sequences. The suggested C. thermocellum σI3-promoter consensus is CCCYYAAA for the -35 element and CGWA for the -10 element (where Y represents C or T, and W represents A or T). The spacing between the -35 and -10 elements is between 13 and 14 nucleotides, resembling the organization of the σI6 promoter (Fig 2).
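The two suggested consensus motifs can be written as regular expressions and compared against promoter sequences from Table 2. The sketch below is illustrative only: the patterns are a direct transcription of the consensus strings given in the text (CNNAAA and CGAA for σI6; CCCYYAAA and CGWA for σI3, each with a 13 to 14 nucleotide spacer), while the names `CONSENSUS` and `matching_consensus` are hypothetical. A pattern match is not evidence of recognition; several Table 2 promoters that resemble these motifs were not activated in the reporter assays.

```python
import re

# Consensus motifs transcribed from the text; N = any base, Y = C or T, W = A or T.
CONSENSUS = {
    "sigI6": re.compile(r"C[ACGT]{2}AAA[ACGT]{13,14}CGAA"),
    "sigI3": re.compile(r"CCC[CT]{2}AAA[ACGT]{13,14}CG[AT]A"),
}

# Promoter regions copied from Table 2 ('-' marks an alignment gap).
PROMOTERS = {
    "xyn10Z": "accgacacAAAaatgtgagcgttcaCGaAacaaTaaatAt",
    "pl11": "ctacccctAAAaaaa-ttagaatttCGtAtttaTaaaaag",
    "cipA": "tgcccctcAAAttccgtttatatatCGaAtataTattaca",
}


def matching_consensus(promoter):
    """Return the set of consensus names whose pattern occurs in the promoter."""
    seq = promoter.replace("-", "").upper()  # drop alignment gaps before matching
    return {name for name, pattern in CONSENSUS.items() if pattern.search(seq)}


if __name__ == "__main__":
    for gene, promoter in PROMOTERS.items():
        print(gene, sorted(matching_consensus(promoter)))
```

For these three examples the pattern matches agree with the reporter data in Tables 3 and 4: xyn10Z fits only the σI6 motif, pl11 only the σI3 motif, and cipA fits both.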
Fig 4

Identification of conserved elements of σI3-dependent promoter sequences.

WebLogo generated with σI3-dependent promoter sequences of C. thermocellum and orthologous promoter sequences of C. straminisolvens.

Discussion

Since the original discovery of the cellulosome, numerous observations indicated that its production and composition is influenced by the nature of the carbon source present in the growth media [7,50]. However, until now, there are only a few reports in the literature regarding the regulation of cellulosomal genes [51-54]. Most of these studies were focused on key cellulosomal genes, such as cipA [51], cel48S [52], cel8A [53], cel9D [54] and cel9F [54]. In the majority of these works, the authors were able to identify σA-dependent promoters upstream of the analyzed genes [51-54]. Additionally, these genes appeared to be regulated by alternative σ factors [51-54]. Nonetheless, the assignment of alternative σ factors was problematic, and in the case of cel9D [54] and cel9F [54], the authors could not suggest a convincing alternative σ factor. This obstacle surfaced since basic knowledge about C. thermocellum was limited, and the DNA sequence upstream of the start sites did not contain homologies with described consensus promoters [54]. In an effort to gather knowledge about the regulation of cellulosomal genes, our research group discovered a set of eight alternative σI factors [17] where six of them showed up-regulation by environmental polysaccharides [19]. This set of C. thermocellum σI factors is homologous to the B. subtilis σI factor [17,18]. Hence, in order to identify the important C. thermocellum σI promoter elements, in the present work, we decided to compare the upstream regions of C. thermocellum σI factor genes with the Bacillales σI-dependent promoter sequences. The high conservation of the Bacillales σI-dependent promoter sequences identified in this study indicated that the C. thermocellum sigI-rsgI operons may likely have similar -35 and -10 promoter elements. Additionally, it was interesting that the Bacillales σI-dependent promoters have a C "enrichment" in their -35 elements (Fig 1A). 
This observation suggested an easy way to search for σI-dependent promoter candidates in other species of the Firmicutes phylum, and notably, for our purposes, the cellulosome-producing bacteria, especially since one striking characteristic of the Firmicutes phylum is the low G/C content of their genomes [55]. However, during the search of putative σI-dependent promoters in C. thermocellum, only sigI2 and sigI3 showed the anticipated high C-enrichment in the -35 element motif (Table 1). Subsequently, we compared the C. thermocellum sigI2 and sigI3 predicted promoter sequences with those of C. thermocellum sigI1 and sigI6 genes proposed by Nataf and co-workers [19]. This comparison revealed that the promoter regions of sigI1, sigI2, sigI3 and sigI6 have specific "signatures", such as an AAA triad at the -35 region and a CGWA tetrad at the -10 region (Table 1). Using these specific "signatures", we were able to predict the putative promoters of C. thermocellum sigI4, sigI7 and sigI8. Moreover, the conservation of these specific "signatures" within σI-dependent promoters was additionally supported by the orthologous sigI-promoter sequences of the closely related cellulosome-producing bacterium, C. straminisolvens JCM 21531 (Table 1). Furthermore, these specific "signatures" were corroborated with the predicted promoter sequences of sigI-like genes of C. clariflavum, A. cellulolyticus and P. cellulosolvens and the currently identified 33 σI-dependent promoters of C. thermocellum cellulosomal genes (Table 2 and S5 Table). The predicted promoter sequences of the σI-dependent promoters of cellulosome-producing bacteria can be divided into three regions. Two regions are highly conserved and contain the proposed specific "signatures" of the σI-dependent promoters, namely the AAA triad and the CGWA tetrad sequences of the "extended -35" and -10 element, respectively (Tables 1 and 2, and Fig 1). 
The third region is highly divergent and is located in the 5'-terminal sequence of the -35 elements (corresponding to the B. subtilis ACCCC sequence, Tables 1 and 2 and Fig 1). We predict that, whereas the most conserved sequences of the "extended -35" and -10 elements are implicated in the "general" recognition of promoters by their cognate σI factors, the most divergent 5'-terminal sequence of the -35 elements is likely implicated in the specificity of the different σI factors. This phenomenon could reflect the general strategy followed by σI factors in order to recognize their target promoters in cellulosome-producing bacteria, whose genomes encode multiple σI factors. Our hypothesis was herein supported by experimental identification of the putative C. thermocellum σI6- and σI3-dependent promoters (Figs 2 and 4). Whereas the C. thermocellum σI6-dependent promoters have a highly conserved C nucleotide upstream of the AAA in the -35 element, the C. thermocellum σI3-dependent promoters have a highly conserved CCC triad (Figs 2 and 4). Additionally, the analysis of the C. thermocellum σI6- and σI3-dependent promoters indicated that some nucleotides in the -10 element probably also have an important role in the specificity of the different σI factors of cellulosome-producing bacteria. For example, all of the identified C. thermocellum σI6-dependent promoters have the 5'-located CGAA sequence in their -10 elements, and most of the promoters have an AT sequence three bases downstream (Fig 2). The identified C. thermocellum σI3-dependent promoters, in contrast, have a less conserved -10 element with the sequence CGWA, where W could be A or T (Fig 4).

Four of the 10 promoters that were activated by C. thermocellum σI6 belong to genes implicated in the hydrolysis of xylan (xyn10Z, xyn11B-xyn11A operon, xyn10Y and xyn10D). This observation is in accordance with previous experiments performed by Nataf and co-workers [19], which showed that when C. thermocellum was grown on cellulose, the expression of the sigI6 gene was up-regulated 2.5-fold; and when the cells were grown on cellulose and xylan, sigI6 was up-regulated at least 10-fold. Moreover, the anti-σI6 factor, RsgI6, bears an extracytoplasmic C-terminal sensing module that belongs to the glycoside hydrolase family 10 (GH10). Interestingly, the RsgI6 GH10-family module is highly similar to Xyn10D (it is 57% identical and has 79% similar residues in the 381-aa gapless alignment; data not shown). Bahari and co-workers [21] showed that the RsgI6 GH10-like domain binds to oat-spelt xylan and Avicel (cellulose). Furthermore, a recent study performed by Wei and co-workers [16] showed that the genes xyn10Y and xyn10D were up-regulated when C. thermocellum was grown on dilute acid-pretreated yellow poplar, containing 65% cellulose, 4% xylan and 31% lignin. Regarding C. thermocellum σI3, two of the four promoters that were activated by C. thermocellum σI3 belong to genes implicated in the solubilization of pectin (pl11/Clo1313_1983 and ce12/Clo1313_0693). Interestingly, the anti-σI3 factor, RsgI3, has an extracytoplasmic sensing module that is composed of two tandem PA14 superfamily motifs that were shown to bind pectin by Kahel-Raifer and co-workers [17]. Taken together, these results suggest that while σI6 likely plays a role in the regulation of xylan-degrading enzymes, σI3 likely plays a role in the regulation of pectin-degrading enzymes.

The C. thermocellum sigI1 and rsgI5 genes are among the genes proposed to be involved in the regulation of cellulosomal genes in response to environmental polysaccharides [17]. In the present work, the two predicted σI-dependent promoters for sigI1 and rsgI5 were recognized by C. thermocellum σI6 (Table 3). Furthermore, the predicted σI-dependent promoter for cipA was recognized by both C. thermocellum σI6 and σI3 (Tables 3 and 4). These results suggest possible crosstalk between different C. thermocellum σI factors and a possible overlap of their respective regulons. Interestingly, this phenomenon is common in ECF sigma factors. For example, Huang and co-workers [56] found in B. subtilis that σW recognizes a subset of promoters that are partially dependent on σX for expression. Additionally, Mascher and co-workers [57] found 7 ECF sigma factors in B. subtilis that regulate partially overlapping regulons related to cell envelope homeostasis and antibiotic resistance.

To unwrap its preferred substrate, cellulose, which is covered with different types of polysaccharides in the plant cell wall, C. thermocellum would presumably produce an array of different hydrolytic cellulosomal components. Hence, partially overlapping regulons for the multiple σI factors could be advantageous, because expression of a variety of cellulosomal components is crucial for efficient solubilization of the different types of polysaccharides that conceal the cellulose fibers in their native state.

Our results and observations reveal several promising options to improve the performance of the industrially prominent bacterium C. thermocellum. First, by changing promoter designs through metabolic engineering, we may try to modify the expression of selected cellulosomal genes that might be crucial for production of natural forms of designer cellulosomes. Second, by using strong sigma-dependent promoters (e.g., those of xyn10Z, xyn11B, pl11, etc.) one can introduce additional, synthetic cellulosomal genes in C. thermocellum and use their products for improvement of either saccharolytic activity or, alternatively, ethanol production. Continued analysis and harnessing of the various σ and anti-σ factors in C. thermocellum will allow us to control and enhance the capacity of this ecologically prominent and industrially relevant bacterium for deconstruction of plant-derived polysaccharides en route to the production of biofuels.

Schematic depiction of plasmid pLOXErysigIrsgIBs.

ClustalW alignment of the RNAP subunits sequences of Bacillus subtilis strain 168 and Clostridium thermocellum strain DSM 1313.

Primers used in the present work.

Plasmids constructed in the present work.

Bacillus subtilis strains constructed in the present work.

Alignment of experimentally confirmed and putative sigI promoters from different Bacillales species, including the experimentally confirmed sigI-dependent promoters of the B. subtilis bcrC and mreBH genes.

Alignments of predicted sigI promoters from Clostridium clariflavum, Acetivibrio cellulolyticus, Pseudobacteroides cellulosolvens, Clostridium thermocellum and Clostridium straminisolvens.

σI6-dependent promoter sequence alignment of xyn11B-xyn11A operon and xyn11A of available sequences of C. thermocellum and C. straminisolvens JCM 21531.

  53 in total

1.  Chi18A, the endochitinase in the cellulosome of the thermophilic, cellulolytic bacterium Clostridium thermocellum.

Authors:  Vladimir V Zverlov; Klaus-Peter Fuchs; Wolfgang H Schwarz
Journal:  Appl Environ Microbiol       Date:  2002-06       Impact factor: 4.792

2.  Global view of the Clostridium thermocellum cellulosome revealed by quantitative proteomic analysis.

Authors:  Nicholas D Gold; Vincent J J Martin
Journal:  J Bacteriol       Date:  2007-07-20       Impact factor: 3.490

3.  Structure of a family 3a carbohydrate-binding module from the cellulosomal scaffoldin CipA of Clostridium thermocellum with flanking linkers: implications for cellulosome structure.

Authors:  Oren Yaniv; Ely Morag; Ilya Borovok; Edward A Bayer; Raphael Lamed; Felix Frolow; Linda J W Shimon
Journal:  Acta Crystallogr Sect F Struct Biol Cryst Commun       Date:  2013-06-27

4.  Two new major subunits in the cellulosome of Clostridium thermocellum: xyloglucanase Xgh74A and endoxylanase Xyn10D.

Authors:  Vladimir V Zverlov; Nicolaus Schantz; Philippe Schmitt-Kopplin; Wolfgang H Schwarz
Journal:  Microbiology       Date:  2005-10       Impact factor: 2.777

5.  Promoter recognition by Bacillus subtilis sigmaW: autoregulation and partial overlap with the sigmaX regulon.

Authors:  X Huang; K L Fredrick; J D Helmann
Journal:  J Bacteriol       Date:  1998-08       Impact factor: 3.490

6.  Using CLUSTAL for multiple sequence alignments.

Authors:  D G Higgins; J D Thompson; T J Gibson
Journal:  Methods Enzymol       Date:  1996       Impact factor: 1.600

7.  Regulatory role of RsgI in sigI expression in Bacillus subtilis.

Authors:  Kei Asai; Takafumi Ootsuji; Kazue Obata; Takashi Matsumoto; Yasutaro Fujita; Yoshito Sadaie
Journal:  Microbiology       Date:  2007-01       Impact factor: 2.777

8.  Mapping of mRNA encoding endoglucanase A from Clostridium thermocellum.

Authors:  P Béguin; M Rocancourt; M C Chebrou; J P Aubert
Journal:  Mol Gen Genet       Date:  1986-02

9.  Heterologous expression of the Oceanobacillus iheyensis SigW and its anti-protein RsiW in Bacillus subtilis.

Authors:  Koichi Yano; Hiromi Inoue; Hirokazu Mori; Lii Mien Yee; Satoshi Matsuoka; Yoshito Sadaie; Kei Asai
Journal:  Biosci Biotechnol Biochem       Date:  2011-05-20       Impact factor: 2.043

10.  Evidence for a general role for non-catalytic thermostabilizing domains in xylanases from thermophilic bacteria.

Authors:  C M Fontes; G P Hazlewood; E Morag; J Hall; B H Hirst; H J Gilbert
Journal:  Biochem J       Date:  1995-04-01       Impact factor: 3.857
