Szabolcs Semsey. Center for Models of Life, Niels Bohr Institute, University of Copenhagen, Copenhagen, Denmark (semsey@nbi.dk).
Abstract
Bacterial cells monitor their environment by sensing a set of signals. Typically, these environmental signals affect promoter activities by altering the activity of transcription regulatory proteins. Promoters are often regulated by more than one regulatory protein, and in these cases the relevant signals are integrated by certain logic. In this work, we study how single amino acid substitutions in a regulatory protein (GalR) affect transcriptional regulation and signal integration logic at a set of engineered promoters. Our results suggest that point mutations in regulatory genes allow independent evolution of regulatory logic at different promoters.
IMPORTANCE: Gene regulatory networks are built from simple building blocks, such as promoters, transcription regulatory proteins, and their binding sites on DNA. Many promoters are regulated by more than one regulatory input. In these cases, the inputs are integrated and allow transcription only in certain combinations of input signals. Gene regulatory networks can be easily rewired, because the function of cis-regulatory elements and promoters can be altered by point mutations. In this work, we tested how point mutations in transcription regulatory proteins can affect signal integration logic. We found that such mutations allow context-dependent engineering of signal integration logic at promoters, further contributing to the plasticity of gene regulatory networks.
Many transcription regulatory proteins sense the level of small molecule signals and bind specific sites in the cis-regulatory region of a defined set of genes, modifying their transcription. The logic of regulation at these genes depends on whether the transcription regulatory protein is an activator (positive control) or a repressor (negative control) and whether the small molecule signal enhances or inhibits the protein’s activity. However, expression of many genes depends on more than one transcription regulatory protein-signal molecule pair. In these cases, the combined effect of the incoming signals depends on how the transcription regulatory proteins affect each other’s binding to DNA and interaction with RNA polymerase. That is, the signal integration logic may be different from the sum of the logic observed in the case of the individual signals (1–5). The simplest cases for studying signal integration are the two-input systems. Sugar regulatory systems are classical examples for integration of two signals, one of which is a global signal for carbon starvation (cyclic AMP [cAMP]), and the other is the specific sugar transported and metabolized by the system (5–9). In the presence of cAMP, the cAMP receptor protein (CRP) can specifically bind to a 16-bp sequence and activate or repress transcription depending on the location of the binding site (3, 10). The intracellular concentration of a given sugar is sensed by a specific transcription regulatory protein. Binding of the sugar to the regulator typically induces an allosteric change in the protein’s structure, altering its DNA binding properties (11, 12).
Transcription regulatory proteins in these systems typically possess surfaces for protein-protein, protein-DNA, and protein-small molecule interactions. Single amino acid substitutions allow engineering the binding characteristics of individual surfaces, leaving the other surfaces unchanged.
Such changes can affect dimerization, tetramerization (13, 14), DNA binding specificity (15, 16), RNAP contact (17, 18), and inducer binding (19, 20). A special class of signal molecule binding mutants in the case of lactose repressor can bind to DNA only in the presence of the signal, as opposed to the wild-type (WT) protein which is inactivated by signal molecule binding (19). Such point mutations allow reversion of the regulatory logic of the lac operon, which would otherwise require extensive rearrangement of the lac regulatory region (21).
In this work, we used the galactose regulon of Escherichia coli as a model system to study how single amino acid substitutions in regulatory proteins affect signal integration logic. The gal regulon consists of five operons (galETKM, galP, mglBAC, galR, and galS) which are controlled by the galactose repressor (GalR). Promoters of these operons are also controlled by cAMP-CRP, except P, which is not affected by cAMP-CRP in vitro. Only two operons are transcribed in the absence of regulatory proteins in vitro (5, 22), galR and galETKM. The galETKM operon contains genes required for d-galactose (d-gal) metabolism. These genes are transcribed from two promoters, P1 and P2, which are repressed by the Gal repressosome. Repressosome assembly requires (i) binding of two dimeric GalR proteins to two spatially separated operator elements, O and O, (ii) negatively supercoiled DNA, (iii) optimal angular orientation of the two operator sites, and (iv) specific binding of the architectural protein HU to a DNA site (hbs) in the interoperator region (23, 24).
Here, we analyze the effect of 44 single amino acid substitutions on the performance of GalR. The majority of these substitutions are neutral to repressosome-mediated transcription inhibition. However, we find that even such “neutral” substitutions can affect signal integration logic in a regulatory context-dependent manner.
RESULTS AND DISCUSSION
Construction of the experimental system.
The experimental system utilized to study signal integration logic at different promoters (Fig. 1) is similar to the system used by Hunziker et al. (3) except that the chromosomal galR gene was deleted and GalR is supplied using a multicopy plasmid. The cells used are unable to produce cAMP due to a deletion in the cyaA gene; therefore, intracellular cAMP and galactose levels can be controlled by the addition of these molecules to the growth medium.
FIG 1
Schematic drawing of the experimental system used for testing signal integration at promoters. The two inputs of the system are cAMP and d-galactose, which interact with CRP and GalR, respectively. GalR and cAMP-CRP can bind specific sequences in the regulatory regions of the promoters and influence transcription of the reporter gene (uidA). The logic of signal integration, i.e., the activity of the promoter at different combinations of signals, depends on the activity and on the combined action of the regulatory proteins. GalR is expressed constitutively from a multicopy plasmid.
Structures of the three promoters at which integration of the two external signals, cAMP and d-galactose, was studied are shown in Fig. 2.
FIG 2
Structures of promoters and regulatory regions used in this study. The sequences shown were inserted between the EcoRI and PstI sites located upstream of the uidA reporter gene and downstream of the rrnBT1T2 terminators (T). GalR and cAMP-CRP binding sites are marked blue and red, respectively. Sites which are recognized by both GalR and cAMP-CRP are marked by both colors. Arrowheads indicate transcriptional start sites. Gate A contains part of the galP promoter region and performs AND logic. Gate B performs d-gal NIMPLIES cAMP logic, while Gate C functions as a d-gal gate in the presence of wild-type GalR expressed from a multicopy plasmid. Signal integration at these three promoters (Fig. 5) is represented schematically on the top right. Blue color indicates that the reporter gene is expressed in a given combination of the two input signals, cAMP and d-galactose. White color indicates that the promoter of the reporter gene is inactive.
FIG 5
Effect of seven single amino acid substitutions on the signal integration logic at three different promoters. Structures of the promoters are shown in Fig. 2. The activity of the reporter gene (uidA) was monitored in the four possible combinations of d-gal and cAMP, following the arrangement shown in the top left corner. Blue color results from successful conversion of the chromogenic substrate (X-gluc) by the UidA protein, indicating that the reporter gene is expressed (ON) in a given combination of input signals. The lack of blue color indicates that expression is OFF. In case of WT GalR, the lower right corner (d-gal) represents the basal promoter activity (none of the regulators bind to DNA), whereas the lower left corner (no signals) represents promoter activity when GalR is bound. In the presence of cAMP (top left), both GalR and cAMP-CRP can bind DNA, whereas in the presence of both signals (top right), only cAMP-CRP can regulate the promoter.
Effect of single amino acid substitutions on repressosome-mediated repression.
In previous studies, we mutagenized the galR coding region in plasmid pSEM1077 to obtain mutations that result in single amino acid substitutions in the N-terminal third of GalR (15, 17). A collection of 44 substitutions was tested for repressosome-mediated repression of the Gate C promoter (Fig. 3). This promoter is active in the absence of GalR (Fig. 3, GalR−). It can be repressed by DNA looping through repressosome formation but not by binding of individual GalR dimers to the operators. This is demonstrated by the activity of the promoter in the presence of GalR T322R (Fig. 3, T322R), a mutant that binds operators as the wild type does but is defective in tetramerization and thus unable to form a repressosome (13, 15, 25).
FIG 3
Repression of the Gate C promoter by GalR mutants. Blue color results from successful conversion of the chromogenic substrate (X-gluc) by the UidA protein, indicating that the reporter gene is expressed, i.e., not repressed efficiently by a given mutant.
Most of the substitutions allowed efficient repression of the reporter gene, indicated by the lack of blue color in the colonies in Fig. 3. Six amino acid substitutions allowed substantial reporter gene expression (T3I, D6G, V7A, A16T, S29D, and N48I). These are all located in the DNA binding headpiece of GalR (13) (Fig. 4). Because GalR is expressed constitutively from a multicopy plasmid, the lack of repression most likely results from weaker DNA binding and not from decreased expression or stability of the mutant GalR proteins. Substitutions located close to or at the dimerization interface did not interfere with repression of the Gate C promoter, probably because the dimer is stabilized by a large number of interactions.
FIG 4
Location of the tested amino acid substitutions in the GalR dimer. Positions of substitutions which allowed strong repression of the Gate C promoter are colored yellow, while the ones which allowed substantial transcription of the promoter are marked blue. The red and green colors indicate positions of the Y244F and T322R substitutions, respectively.
Effect of single amino acid substitutions on signal integration logic.
Five substitutions which repressed the Gate C promoter efficiently were selected and studied in different regulatory contexts (Fig. 5). Two of these (K5E and V21A) are located in the DNA binding headpiece, while three are situated at the dimerization interface (D68E, D71E, and Q83P). Wild-type GalR and two previously characterized mutants were used as controls. These were Y244F, which is insensitive to d-galactose (20), and the tetramerization mutant T322R.
Two different signal integration patterns were observed in the case of Gate A, AND and FALSE (Fig. 5, left). This promoter is based on the galP promoter, which was previously shown to perform AND logic (5, 8). The galP promoter is inactive in the absence of cAMP-CRP. Because GalR binding inhibits cAMP-CRP-mediated activation, both signals are required for transcription. Similar signal integration logic was observed in the presence of K5E, V21A, Q83P, and T322R as with WT GalR, indicating that these mutations allow normal DNA binding, d-galactose binding, and inhibition of cAMP-CRP.
However, FALSE logic was observed in the presence of D68E and D71E, similar to Y244F. Repression of the promoter in the presence of both d-galactose and cAMP indicates that these mutants bind operators normally but DNA binding is not inhibited by d-galactose. Y244 was previously associated with one of the inducer binding segments (20). D68 and D71 are located relatively far from the predicted d-galactose binding cleft. Therefore, we speculate that substitutions at these positions interfere with the allosteric transition between the DNA binding and d-galactose binding states.
Three different signal integration patterns were observed in the case of Gate B, which performs the d-gal NIMPLIES cAMP (equivalent to d-gal AND NOT cAMP) logic operation in the presence of WT GalR (Fig. 5, middle column). This construct is based on the galETKM regulatory region. The −10 element of the P2 promoter was replaced by the consensus sequence (TATAAT), and the P1 promoter was inactivated. The promoter is strongly repressed by both GalR-mediated DNA looping and cAMP-CRP binding. Therefore, transcription occurs only in the presence of d-galactose (inhibition of DNA looping) and absence of cAMP (cAMP-CRP binding is not allowed). K5E, V21A, and Q83P showed signal integration logic similar to that of WT GalR, although higher reporter expression was observed in the cases of K5E and V21A in the presence of d-galactose and absence of cAMP (Fig. 5, middle column, right bottom corners). TRUE logic was observed in the presence of T322R, i.e., the reporter showed strong expression at all combinations of input signals. This substitution does not allow DNA loop formation but allows binding of individual dimers to the operators. In the Gate B construct, binding of a GalR dimer to the upstream operator activates the promoter, resulting in a high level of reporter expression.
Such activation can occur even in the presence of d-galactose (25), because the stability of the GalR-operator complex is reduced only 7-fold by the presence of d-galactose in the complex (11). FALSE logic was observed with Y244F, D68E, and D71E due to the uninducible nature of these proteins.
Signal integration at the Gate C promoter displayed the highest diversity (Fig. 5, right). This construct was made by introducing a set of mutations into the galETKM regulatory region (G-10A, Λ-7T, Δ2T, A5C, A7T, A24G, G54A, T55A, G60A, T62G, A63T, T66C, relative to the P2 transcription start site). These mutations inactivate the P1 promoter and allow cAMP-CRP binding to the downstream GalR operator site (Fig. 2). In the presence of wild-type GalR provided from a multicopy plasmid, the promoter is active only in the presence of d-galactose, regardless of the presence of cAMP, i.e., it functions as a single input (d-gal) logic gate. Repression of the promoter requires DNA looping, because the P2 promoter is activated by GalR when DNA loop formation is not allowed (26). Therefore, as expected, TRUE logic was observed in the case of the nonlooping mutant T322R, and FALSE logic was found in the case of the noninducible mutant Y244F (Fig. 5, right). Cells carrying K5E showed an expression pattern similar to that of wild-type GalR, but in the absence of d-galactose the reporter levels were higher than what was observed with WT GalR. However, OR logic was observed in the cases of V21A and Q83P, i.e., the promoter was active in the presence of any one of the signals and also when both signals were present. This observation suggests that cAMP-CRP can destabilize the GalR-mediated DNA loop by interfering with the binding of GalR to the downstream operator. Binding of these mutants to the upstream operator seems to be unaffected by cAMP-CRP, because in the absence of GalR binding to the upstream operator cAMP-CRP would repress the P2 promoter (26).
The D68E- and D71E-mediated DNA loops were also sensitive to the presence of cAMP-CRP; however, in these cases, the signal integration logic resembled a single input switch that responds only to cAMP and not to d-gal. This result further confirms that these mutants are not inducible by d-galactose and also suggests that a single amino acid substitution can affect two distant ligand binding interfaces.
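The gate assignments reported above can be collected in a small lookup table to see which substitutions change the logic at which promoter. The following Python sketch is illustrative only: the dictionary layout and the helper name `altered_gates` are assumptions made for this example, and the entries simply transcribe the logic names stated in the text. K5E at Gate C is recorded as `None` because its high- and low-expression states could not be clearly distinguished.

```python
# Logic observed with wild-type GalR at each engineered promoter (from the text).
WT = {"Gate A": "AND", "Gate B": "d-gal NIMPLIES cAMP", "Gate C": "d-gal"}

# Logic observed with each GalR variant; None marks a readout that was not
# cleanly Boolean (K5E at Gate C).
OBSERVED = {
    "K5E":   {"Gate A": "AND",   "Gate B": "d-gal NIMPLIES cAMP", "Gate C": None},
    "V21A":  {"Gate A": "AND",   "Gate B": "d-gal NIMPLIES cAMP", "Gate C": "OR"},
    "Q83P":  {"Gate A": "AND",   "Gate B": "d-gal NIMPLIES cAMP", "Gate C": "OR"},
    "D68E":  {"Gate A": "FALSE", "Gate B": "FALSE",               "Gate C": "cAMP"},
    "D71E":  {"Gate A": "FALSE", "Gate B": "FALSE",               "Gate C": "cAMP"},
    "Y244F": {"Gate A": "FALSE", "Gate B": "FALSE",               "Gate C": "FALSE"},
    "T322R": {"Gate A": "AND",   "Gate B": "TRUE",                "Gate C": "TRUE"},
}


def altered_gates(mutant):
    """Return the promoters at which a variant's logic differs from wild type.

    Readouts recorded as None (not cleanly Boolean) are skipped.
    """
    return [
        gate
        for gate, logic in OBSERVED[mutant].items()
        if logic is not None and logic != WT[gate]
    ]


if __name__ == "__main__":
    for name in OBSERVED:
        print(name, altered_gates(name))
```

In this table, V21A and Q83P differ from wild type only at Gate C, whereas Y244F differs at all three promoters.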
Evolution of regulatory proteins and regulatory networks.
Bacteria can tolerate radical changes in their gene regulatory networks, which can be easily rewired by gain or loss of cis-regulatory elements and thus can evolve rapidly (1–3, 27–30). The strength of specific protein-DNA interactions can be fine-tuned by mutations in the binding sites (31) to optimize the performance of the network. Analysis of evolutionary dynamics in prokaryotic networks revealed that transcription regulatory proteins and their target genes evolve relatively independently. Major phenotypical differences between organisms are the result of changes in the regulatory proteins rather than in the regulated gene repertoire. Also, the structure of the regulatory network reflects the lifestyle of the organism better than its phylogenetic relations (32). These observations are in line with a recent study which shows that accumulation of intermediary metabolites can cause cellular stress (33), suggesting that metabolic networks may have less plasticity than regulatory networks.
Results presented in this work suggest that gene regulatory networks can also be rewired by point mutations in the genes encoding transcription regulatory proteins. These mutations can change how metabolites are handled in certain conditions and can allow fast optimization of the network for a different lifestyle.
Although regulatory proteins can become nonfunctional due to point mutations, certain regions of these proteins are tolerant of substitutions. For example, more than 44% of the amino acid positions are tolerant to substitutions in the Lac repressor (LacI) (19), which has a structure similar to GalR. The examples of the V21A and Q83P substitutions in GalR suggest that changes in such positions can alter the signal integration logic in certain regulatory contexts without perturbing the logic in other contexts. These mutants regulate all three promoters shown in Fig. 2 the same way as wild-type GalR in the absence of cAMP-CRP, i.e., the promoters are repressed in the absence of d-galactose and transcription is allowed in the presence of d-galactose (Fig. 5). However, in the presence of cAMP-CRP, the regulatory logic becomes qualitatively different from the wild type in the case of Gate C and remains the same for Gate A and Gate B. That is, a small quantitative difference in the affinity of a regulator to a given operator site can result in a qualitative change in the overall function of a network.
Concluding remarks.
In summary, mutations in transcription regulatory proteins allow context-dependent engineering of signal integration logic at promoters, contributing to the plasticity of gene regulatory networks. This plasticity adds a layer of complexity to bioinformatics-based network reconstruction, because sequences of transcription regulatory proteins are rarely identical in different organisms.
MATERIALS AND METHODS
Plasmid and strain construction.
Synthetic regulatory regions were created by PCR and inserted upstream of the uidA open reading frame (ORF) on the chromosome of E. coli CH1200 cells (Δcya854) (13) by the method described earlier (22). The galR gene in the obtained cells was replaced by a chloramphenicol resistance gene (Cm) according to the protocol described by Datsenko and Wanner (34). The Cm gene was PCR amplified using the primers galRCmup (5′ CCAACGGGCGTTTTCCGTAACACTGAAAGAATGTAAGCGTTTACCCACTAAGGTATTTTCATGCCGTTACGCACCACCCCGTC 3′) and galRCmdn (5′ TCAGGCGCGGTTGATTCGCCGTCGCCAGACCATCGAAGAATTACTGGCGCTGGAATTACGCCCCGCCCTGCCACTC 3′) and pRFB122 as the template (35). Sequences of the constructed regions and their flanking DNAs were verified (Eurofins MWG Operon).
Derivatives of plasmid pSEM1077 carrying mutations in the galR gene were obtained in a previous mutagenesis study (17). Plasmid pSEM1077 was created from pSEM1029 (14) by engineering a PvuII site in the GalR coding region (17). In these high-copy-number plasmids, the galR gene is transcribed constitutively from the synthetic pEM7 promoter.
Screening the logic of gene regulation.
Signal integration in the constructed circuits was characterized by monitoring gene expression at four different combinations of d-galactose and cAMP, representing the four extremes of the two-dimensional input functions. We prepared four LB agar plates containing 100 µg/ml ampicillin, 30 µg/ml chloramphenicol, 50 µg/ml X-gluc (5-bromo-4-chloro-3-indolyl-beta-d-glucuronic acid; Fermentas), and (i) no d-galactose and no cAMP (no input signals), (ii) 8 mM d-galactose (d-gal = 1), (iii) 0.16 mM cAMP (cAMP = 1), and (iv) 8 mM d-gal and 0.16 mM cAMP (d-gal = 1 and cAMP = 1). We spotted 1 µl of cell suspension on each LB agar plate.
This characterization resembles Boolean logic, where inputs and outputs are 0 (absent) or 1 (present at high concentration). The output was evaluated based on the color of the colonies. The criterion of Boolean-type integration is that the high and low states of reporter gene expression can be clearly distinguished. The presence of blue color (1) reflects expression of the reporter operon (uidABC), which is responsible for transport and metabolism of the chromogenic substrate X-gluc. In the absence of blue color, the output is 0.
The logic gates were chosen to satisfy these criteria with wild-type GalR. The signal integration could be described by Boolean algebra in the presence of GalR mutants as well, except in the case of K5E and Gate C, where the high- and low-expression states could not be clearly distinguished.
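The mapping from the four plate readouts to a named logic gate can be written out explicitly. The sketch below is a minimal illustration and not part of the original analysis: the function name `classify_gate`, the input ordering (d-gal, cAMP), and the `"other"` fallback are assumptions made for the example. Only the gate names used in this study are listed.

```python
from typing import Dict, Tuple

# Each input condition is (d_gal, cAMP); 1 = signal added to the plate, 0 = absent.
Condition = Tuple[int, int]

# Fixed order in which the four plate conditions are read.
CONDITIONS = [(0, 0), (1, 0), (0, 1), (1, 1)]

# Output pattern over CONDITIONS -> gate name (1 = blue colony, 0 = no color).
GATE_NAMES = {
    (0, 0, 0, 0): "FALSE",
    (1, 1, 1, 1): "TRUE",
    (0, 0, 0, 1): "AND",
    (0, 1, 1, 1): "OR",
    (0, 1, 0, 0): "d-gal NIMPLIES cAMP",
    (0, 1, 0, 1): "d-gal",   # single-input gate responding only to d-galactose
    (0, 0, 1, 1): "cAMP",    # single-input gate responding only to cAMP
}


def classify_gate(readout: Dict[Condition, int]) -> str:
    """Name the logic gate implied by ON/OFF readouts at the four conditions."""
    pattern = tuple(readout[c] for c in CONDITIONS)
    return GATE_NAMES.get(pattern, "other")


if __name__ == "__main__":
    # Example: reporter expressed only when both signals are present.
    print(classify_gate({(0, 0): 0, (1, 0): 0, (0, 1): 0, (1, 1): 1}))
```

A readout that is ON only when d-galactose is present and cAMP is absent is classified as d-gal NIMPLIES cAMP, matching the description of Gate B with wild-type GalR.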