DNA Input Classification by a Riboregulator-Based Cell-Free Perceptron.

Ardjan J van der Linden, Pascal A Pieters, Mart W Bartelds, Bryan L Nathalia, Peng Yin, Wilhelm T S Huck, Jongmin Kim, Tom F A de Greef.

Abstract

The ability to recognize molecular patterns is essential for the continued survival of biological organisms, allowing them to sense and respond to their immediate environment. The design of synthetic gene-based classifiers has been explored previously; however, prior strategies have focused primarily on DNA strand-displacement reactions. Here, we present a synthetic in vitro transcription and translation (TXTL)-based perceptron consisting of a weighted sum operation (WSO) coupled to a downstream thresholding function. We demonstrate the application of toehold switch riboregulators to construct a TXTL-based WSO circuit that converts DNA inputs into a GFP output, the concentration of which correlates to the input pattern and the corresponding weights. We exploit the modular nature of the WSO circuit by changing the output protein to the Escherichia coli σ28-factor, facilitating the coupling of the WSO output to a downstream reporter network. The subsequent introduction of a σ28 inhibitor enabled thresholding of the WSO output such that the expression of the downstream reporter protein occurs only when the produced σ28 exceeds this threshold. In this manner, we demonstrate a genetically implemented perceptron capable of binary classification, i.e., the expression of a single output protein only when the desired minimum number of inputs is exceeded.

Keywords:  cell-free systems; genetic classifier; perceptron; synthetic biology; synthetic genetic networks; weighted sum operations (WSO)

Year:  2022        PMID: 35381174      PMCID: PMC9016768          DOI: 10.1021/acssynbio.1c00596

Source DB:  PubMed          Journal:  ACS Synth Biol        ISSN: 2161-5063            Impact factor:   5.249


Introduction

Fundamental to the survival of living organisms is their ability to process a wide variety of information, continuously sensing and, in turn, adapting to their surroundings. Synthetic biology uses naturally occurring biological parts, implemented in synthetic networks, to either mimic existing functionalities or introduce novel ones in both living cells and artificial platforms.[1−5] The parallels in information processing between living systems and electronic devices have led to the development of numerous synthetic biological logic circuits inspired by their counterparts in the field of electrical engineering.[6−13] While the development of such circuits can be seen as a purely scientific exercise,[14−16] these systems are increasingly being put to practical use in areas such as biomarker recognition for medical diagnostics,[17−21] the detection of pollutants,[22,23] and a variety of cell therapies.[24−27] Here, the ability to differentiate between various unique biomarkers, as well as combinations thereof, and the subsequent categorization of biomarker patterns into identifiable classes is critical. Because neural networks innately function as classifiers, there has been broad interest in the development of synthetic biological neural networks.[28−35] Perceptrons (Figure 1a), the most basic building block found in neural networks, act as linear classifiers, accepting a range of inputs, each with a corresponding weight, and providing a single binary output.[36,37] Perceptrons compute a weighted sum of the inputs, which undergoes thresholding using an activation function to return a single output value corresponding to the classification of the inputs provided.
In addition to the simple design of a perceptron, it is their ability to accept analog input signals and return a single digital output to perform binary classifications that has led to the prominence of perceptron-based classifiers within the field of synthetic biology.[20,31,33,38] Furthermore, the possibility to expand the basic perceptron circuit by introducing additional layers and including memory devices[39−42] in the system is pushing the development of biological neural networks that combine molecular pattern classification and molecular data storage.[43]
Figure 1

(a) Overview of a perceptron. A range of inputs, each with a unique weight, are summed, with the weighted sum output serving as the input of an activation function, which determines the perceptron output. (b) RNA toehold switch. The transcription of RNA from DNA inputs yields trans-acting RNAs that activate the RNA riboregulator, resulting in the translation of a gene. The DNA templates encoding the trans-acting and transducer RNA strands are designated as the input and weight, respectively. Binding of the trans-acting strand to the toehold of the transducer strand initiates toehold-mediated RNA–RNA strand displacement, whereby the hairpin sequence is unfolded and the ribosome-binding site is exposed, allowing for the expression of the output gene: deGFP. TA and H refer to the trans-acting strand and the hairpin sequence of the transducer strand, respectively. TX: transcription and TL: translation. (c) deGFP expression levels achieved when combining a single input and weight pair, for a range of concentrations. The bar height corresponds to the average of two data points (dots). (d) Normalized relative expression for each input in the presence of on- and off-target toehold switches. Each unique input (5 nM) is exposed to each of the weights (with the concentration corresponding to the weight pattern) in isolation, and the end-point expression level is recorded following a batch expression experiment, as described in the Methods section. The bar height corresponds to the average of three data points (dots). For each input, the data were normalized by dividing the average expression level by the average expression level determined for the on-target (i.e., cognate input and weight pair) reaction. * p < 0.001, ** p < 0.006, and *** p < 0.02. (e) Expression of each unique input (5 nM) in the presence of all three weights (1, 5, and 7.5 nM, respectively, for weights 1, 2, and 3). 
The control was performed in the presence of the weight pattern, without the addition of any inputs. Input and weight concentrations were optimized to ensure approximately equal expression levels for all inputs. Bars depict the average expression level of at least three experiments (dots). ns = not significant and * p < 0.0001. All experiments were conducted using linear DNA constructs in a self-made cell lysate solution under batch conditions, as specified in the Methods section.

The versatile and highly programmable nature of DNA and RNA has proven to be exceptionally useful when performing biomolecular computations.[25,43−53] This widespread usage stems from the ability to rationally design novel DNA and RNA strands, which hybridize predictably with complementary constructs. A majority of the previously published classification networks utilize single-stranded DNA templates and rely upon binding competition and DNA strand displacement reactions.[38,44,54] However, designs based solely on DNA strand displacement reactions, where DNA strands are prepared synthetically and subsequently combined, lack the information storage capabilities and the broader ability to integrate the circuits into larger, more complex biological systems due to a lack of transcriptional and translational control. Here, we present a genetically implemented perceptron based on in vitro transcription and translation (TXTL) reactions.[55−59] The cell-free genetically implemented perceptron acts as an ON/OFF binary classifier, with the expression of a reporter protein correlating to an ON signal. The final perceptron design consists of two distinct elements: a weighted sum operation (WSO) implemented at the RNA level and a post-translational thresholding reaction. The TXTL-based WSO utilizes toehold switch riboregulators, a class of de novo designed translational riboregulators comprising a cognate pair of RNAs[60−62] (Figure 1b).
A transducer strand is implemented to regulate translation, with a cognate trans-acting RNA serving to modulate its biological activity.[60] In the context of our implemented WSO, the transducer strands and trans-acting strands serve as the weights and inputs, respectively. Inputs initiate toehold-mediated RNA–RNA strand displacement reactions upon the weights, which have been designed to form a hairpin such that bases in the regions surrounding the ribosome-binding site (RBS) and start codon are sequestered. The expression of genes encoded on the weight construct is thereby inhibited until a complementary input is provided, which first binds to a single-stranded toehold sequence at the 5′ end of the hairpin before completing a branch migration process exposing the RBS and start codon, enabling ribosome binding and subsequent gene expression. The WSO output is determined as the sum of the total concentration of protein expressed by all of the unique input and weight pairs. The toehold switch technology allows us to couple the output of a WSO at the RNA level to the production of a protein. To realize a TXTL-based perceptron, we implemented an ultrasensitive sink at the protein level based on molecular titration, resulting in a tunable threshold of the WSO output.[63,64] In this way, the perceptron only presents an output signal (ON state) when the WSO output concentration exceeds that of the threshold set by the concentration of the titrant. By tuning the concentration of the titrant, the classification boundary of the perceptron can be controlled, with higher titrant concentrations requiring greater concentrations of the WSO output to be produced before the ON state of the perceptron is reached. Increasing the WSO output is achieved by increasing the overall concentration of inputs provided to the WSO, either by increasing the concentration of individual inputs or by increasing the number of unique inputs provided. 
The determination and selection of a specific threshold allow the perceptron to perform a classification of the number of inputs provided to the WSO, displaying the ON state only when a specified minimum number of inputs has been provided. Here, we present the successful in vitro implementation of a novel three-input TXTL-based perceptron, utilizing both RNA- and protein-level regulation technologies to construct a DNA input classifier. Although the addition of transcriptional and translational processes increases the overall complexity of the network design when compared to systems relying solely upon DNA strand displacement reactions,[38,44,54] it also offers significant advantages. The application of genes encoding protein sequences, as opposed to designing DNA templates solely for strand displacement, enables the up- and downstream usage of these proteins, greatly expanding the functionality of these circuits. Furthermore, the interchangeability of both the DNA inputs and the protein outputs facilitates the implementation of this perceptron in larger complex synthetic genetic networks.
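
The perceptron logic described above (a weighted sum followed by a threshold) can be written down abstractly. The following is a minimal sketch in Python, not the experimental system itself: the function names, the equal weights, and the threshold value are illustrative assumptions chosen only to show how a threshold turns a weighted sum into a binary ON/OFF output.

```python
from typing import Sequence


def weighted_sum(inputs: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum operation (WSO): sum of input * weight over all pairs."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    return sum(x * w for x, w in zip(inputs, weights))


def perceptron(inputs: Sequence[float], weights: Sequence[float],
               threshold: float) -> bool:
    """Return True (ON) only when the WSO output exceeds the threshold."""
    return weighted_sum(inputs, weights) > threshold


# Illustrative values only: equal weights, so every present input
# contributes the same amount to the weighted sum.
example_weights = [1.0, 1.0, 1.0]
print(perceptron([1, 0, 0], example_weights, threshold=1.5))  # False
print(perceptron([1, 1, 0], example_weights, threshold=1.5))  # True
```

Raising the threshold in this sketch plays the role that increasing the titrant concentration plays in the text: more inputs are needed before the output switches to ON.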

Results

Construction of the genetic perceptron occurred in several distinct phases, with initial research focused on the development of two WSO networks: the first incorporating the reporter protein directly on the weight templates and the second utilizing the WSO output to regulate the expression of a downstream reporter construct. The latter of these WSO circuits was subsequently used to implement the perceptron via addition of a downstream ultrasensitive sink. The genetic perceptron was designed for in vitro implementation, with all experiments occurring under batch conditions using a self-made cell lysate derived from bacteria (Escherichia coli, see the Methods section).[55,56] All of the genetic constructs for the inputs, weights, and the independent reporter were constructed using a Golden Gate assembly-based cloning method.[57] The DNA constructs used were optimized for RNA stability as well as the efficacy of the input–weight pair, as reported by Pieters et al.[62] Polymerase chain reactions (PCR) were used to prepare linear DNA templates for use in the experiments (see the Methods section).

Input–Weight Pair Characterization

Three unique input–weight pairs were constructed to perform WSOs. Initially, to investigate the behavior of input–weight pairs and their ability to function in a WSO, a fluorescent reporter protein (deGFP) was encoded downstream of the transducer construct hairpin sequence. In this manner, binding of the trans-acting RNA strand (obtained via transcription of the input DNA template) to the toehold of the transducer RNA strand (obtained via transcription of the weight DNA template) allowed the direct expression of the deGFP reporter (Figure 1b). The independent expression levels for each of the three inputs, solely in the presence of their cognate weight, were determined for a range of both input and weight concentrations (Figure 1c). Despite using identical promoter sites for all inputs and weights, large differences in the reporter end-point concentration of the different input–weight pairs can be seen, revealing disparities in their relative expression strengths. However, the end-point expression levels increased predictably as the concentrations of both the input and weight were increased (Figure S1). To enable the classification of the number of inputs provided, as opposed to the specific combination of inputs, the end-point expression levels of each input–weight pair should be approximately equal, such that the addition of any input results in an equal and predictable increase in the overall WSO output. Guided by the results provided in Figure 1c, where the output expression for a range of input and weight concentrations is analyzed, a standard input concentration (5 nM) was determined. Subsequently, a weight pattern was determined, wherein the concentration of each of the unique weights was adjusted according to the relative expression strength of each input and weight pair. A weight pattern of weight 1: 1 nM, weight 2: 5 nM, and weight 3: 7.5 nM was selected.
Upon conducting WSOs, as well as when performing input classification with the perceptron, all weights will be present in concentrations equal to the weight pattern. As such, the orthogonality of the unique input–weight pairs is critical, ensuring that each input is only able to activate the reporter expression of its own complementary weight. The orthogonality of the input–weight pairs was determined by exposing each input to each of the three weights in isolation. For each input, the end-point expression levels of both on- and off-target, where on-target indicates a cognate input–weight pair, were divided by the end-point expression level of the on-target reaction, revealing the relative expression levels of on- and off-target input and weight combinations (Figure 1d). While inputs 1 and 3 show moderate orthogonality, with both off-target weights exhibiting at least a fivefold reduction in their relative expression, input 2 reveals high background expression levels in the presence of off-target weights, albeit approximately halved with regard to the on-target expression level. The prior determination of the orthogonality of the input–weight pairs was conducted in isolation, as opposed to the combination of weights provided by the weight pattern applied during WSO experiments. Therefore, each of the inputs was individually exposed to the weight pattern, demonstrating the ability of each input to activate reporter expression while simultaneously being exposed to off-target weights (Figure 1e). Furthermore, the end-point expression levels of the reporter highlighted the efficacy of the chosen input concentration and weight pattern, with each input eliciting satisfactory levels of reporter protein expression, albeit with input 3-induced expression presenting reduced end-point expression levels.
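
The normalization used to assess orthogonality (dividing each end-point level by the on-target level for the same input) can be sketched as follows. The expression values here are hypothetical placeholders in arbitrary units, not measured data, and the helper names and the fivefold cutoff parameter are assumptions introduced only for illustration.

```python
def normalize_to_on_target(expression: dict, on_target: str) -> dict:
    """Divide each mean end-point level by the on-target (cognate) level."""
    reference = expression[on_target]
    if reference <= 0:
        raise ValueError("on-target expression must be positive")
    return {weight: level / reference for weight, level in expression.items()}


def off_targets_reduced(relative: dict, on_target: str,
                        min_fold: float = 5.0) -> bool:
    """True if every off-target level is reduced at least min_fold-fold."""
    return all(level <= 1.0 / min_fold
               for weight, level in relative.items() if weight != on_target)


# Hypothetical mean end-point levels for one input against all three weights.
input_1 = {"weight 1": 2.0, "weight 2": 0.3, "weight 3": 0.2}
relative = normalize_to_on_target(input_1, "weight 1")
print(relative["weight 1"])                       # 1.0
print(off_targets_reduced(relative, "weight 1"))  # True
```

By construction the on-target entry is always 1, so the off-target entries read directly as the fraction of cognate expression that leaks through.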

Direct-Expression WSO

Completing the development of the WSO, combinations of inputs were provided to reactions comprising all three weights in their respective concentrations. Each additional input provided is expected to supplement the overall pool of deGFP expressed, with the total expressed concentration serving as a readout of the WSO (Figure 2a). With the weight concentrations scaled relative to the expression strength, it is expected that each additional input will increase the total expressed concentration by an identical amount, with the number of inputs provided determining the expression level. Expression data revealed that the average expression levels for single-, double-, and triple-input reactions could be differentiated; however, the increases between these averages were not uniform (Figure 2b). The resource burden imposed by each additional input could explain the nonlinear increase in expression levels.[65−67] Based on the reporter expression levels for single-input reactions, a rudimentary model was developed (Supporting Methods 1). Each input added to a WSO is analogous to adding an additional term to the summation, with a value equal to the input concentration (x) multiplied by the weight concentration (w). By further multiplying each term with a scaling factor, representing biological processes such as transcription, translation, and RNA binding, it is possible to predict expression levels. To determine the scaling factors α, β, and γ, the rudimentary model function was fit to the single-input expression data of the WSO experiments (Figure 2c, Supporting Methods 1).
Figure 2

(a) Schematic overview of a weighted sum operation (WSO) utilizing three unique inputs and their corresponding weights. Inputs take the form of DNA constructs encoding trans-acting RNAs and can either be added or omitted from experiments. Transducer strand encoding DNA constructs form the weights, with all three switches being present during all experiments. Addition of an input to the system will result in the expression of deGFP contributing to the total sum of deGFP produced. The increase in expression because of input addition is dependent on the weight concentration. TA and H refer to the trans-acting strand and the hairpin sequence of the transducer strand, respectively. TX: transcription and TL: translation. (b) Average expression curves of all single-, double-, and triple-input reactions. Bold lines indicate the average of these experiments, with the shaded regions indicating the standard deviation. For each unique combination of inputs, (3× single input, 3× double input, and 1× triple input) a minimum of two batch experiments were conducted, as described in the Methods section. All inputs provided were 5 nM in concentration, and all three weights were present during all reactions with the weight pattern as follows: weight 1: 1 nM, weight 2: 5 nM, and weight 3: 7.5 nM. The expression levels were recorded every 5 min for a duration of 15 h. The average expression level per input class was determined as the average of all reactions conducted with the specified number of inputs for each class. * p = 0.02, ** p < 0.02, and *** p = 0.001. Figure S2 specifies the expression curve for each of the unique input combinations. (c) Model-predicted expression levels plotted against experimentally determined values. Model parameters α, β, and γ were fit to single-input end-point expression data and subsequently used to predict the system behavior for additional inputs (see Supporting Methods 1). 
Expression data were obtained from batch experiments conducted as per the Methods section. All inputs were provided at a concentration of 5 nM, with all weights present according to the specified weight pattern. At least three batch experiments were conducted for each unique combination of inputs.

Due to the simplicity of the model, with a single variable for each of the input–weight pairs, the model fit aligned with the average expression determined from a series of triplicate experiments. Using the model-optimized scaling factors, predictions of the expression levels for multi-input experiments were made. In each case, the model prediction exceeds the experimentally determined expression levels, reiterating the nonlinear behavior of the WSO upon increasing the number of inputs. The model assumes expression levels identical to those achieved with single-input reactions, regardless of the burden placed on the system via the introduction of additional inputs.

Coupling WSO to a Downstream Network

Computationally, it is possible to classify single-, double-, and triple-inputs from the results of the described WSO; however, a biological implementation of a perceptron or a classifier using this network is nontrivial. Instead, to facilitate a gene-based classifier network, the deGFP reporter protein serving as the WSO output was exchanged for the E. coli sigma factor σ28.[68] In doing so, the WSO output could be coupled to the expression of downstream genes while also allowing thresholding of the WSO output, providing perceptron-like behavior. To construct the σ28-coupled WSO network, each of the weight constructs was altered to encode the σ28 gene. Additionally, a reporter construct was designed, coding for the deGFP reporter protein, which was placed under transcriptional control of the P28a promoter. In the resulting system, σ28 expressed as the WSO output competes with the σ70 present in the cell lysate for binding with core RNA polymerase (RNAP), each forming their respective holoenzymes necessary for the transcription of genes.[69] Subsequently, the σ28 holoenzyme can initiate transcription of the reporter construct, providing a fluorescent readout, which is both dependent on and correlated to the outcome of the WSO (Figure 3a).
Figure 3

(a) Coupling of the WSO is achieved by replacing the WSO output with the σ28 sigma factor. Here, the WSO output is the total amount of produced σ28, which serves as an activator for the downstream expression of the deGFP reporter protein. TA and H refer to the trans-acting strand and the hairpin sequence of the transducer strand, respectively. TX: transcription and TL: translation. (b) End-point deGFP expression levels of the reporter protein for each of the unique input and weight pairs, across a range of concentrations. The reporter DNA template concentration was 7.5 nM. Bars indicate an average measurement of two reactions (dots). (c) End-point expression levels of each unique input in the presence of all three weights (0.6, 0.6, and 0.8 nM for weights 1, 2, and 3, respectively). Two input concentrations (1 and 5 nM) were tested, and a reporter template concentration of 15 nM was used. In each case, the bar represents the average expression of two individual experiments (dots). A Grubbs outlier test was used to eliminate a statistical outlier present in the 5 nM input 1 data set. ns = not significant. (d) Experimentally determined deGFP expression levels plotted against the model-predicted values (see Supporting Methods 2). The rudimentary model parameters were fit to single-input end-point expression data and subsequently applied to predict expression levels for multiple-input reactions. Experimental expression data were acquired via batch reactions where the end-point expression levels of all unique combinations of inputs (5 nM) were determined in the presence of all weights (corresponding to the specified weight pattern) and the reporter construct (15 nM). At least three batch experiments were conducted for each unique combination of inputs. * p < 0.0001, ** p < 0.007, and *** p < 0.02. All experiments were conducted using linear DNA templates in a self-made cell lysate solution under batch conditions, as specified in the Methods section.

Each of the inputs was exposed to its complementary weight in isolation across a range of input and weight concentrations to confirm the downstream coupling pathway while also indicating the relative expression strengths of each of the redesigned input–weight pairs (Figure 3b). Again, each input–weight pair possessed a unique expression strength while displaying predictable responses to increases in the weight concentration. The weight concentrations applied here were significantly reduced compared to the previous system, with end-point expression levels decreasing at weight concentrations above 0.8 nM in almost all cases. Similarly, the expression levels achieved using an input concentration of 5 nM were often comparable to those achieved using a higher 10 nM input concentration, indicating diminishing returns when using elevated DNA input concentrations (Figure S3). Guided by data from Figure 3b, an updated weight pattern was determined so as to ensure similar expression levels for each input: weights 1 and 2: 0.6 nM and weight 3: 0.8 nM. Exposing each input individually to the aforementioned weight pattern resulted in statistically equal levels of reporter expression (Figure 3c). Here, an input concentration of 5 nM resulted in superior expression levels when compared to 1 nM input concentrations, further indicating optimal expression conditions when using 5 nM inputs. Conducting WSOs with the coupled network showed predictable increases in deGFP expression levels when increasing the number of inputs provided (Figure 3d). Appending the rudimentary model, such that the WSO output is used as an activator for the expression of the reporter, once more enables the computational prediction of expression levels for multiple inputs. As with the direct-expression WSO, the model was first fit to expression data from single-input experiments to determine the system-specific scaling factors (see Supporting Methods 2).
Hereafter, predictions of the expression levels were compared to the experimentally acquired results of multi-input reactions. The model predictions closely replicate the experimentally determined expression levels, although the measured values were marginally lower than predicted in all cases. The clear grouping of single-, double-, and triple-input expression levels allows each reaction to be assigned to a single-, double-, or triple-input class by analyzing only the reporter expression levels.

Genetically Implemented Perceptron

To demonstrate a TXTL-based perceptron, thresholding of the WSO output was implemented. The binding of σ28 with the core RNAP was inhibited via the addition of the anti-σ28 protein to the system, which competitively binds to free σ28. Furthermore, the addition of anti-σ28 can also promote the dissociation of σ28 from the core RNAP.[70] The addition of sufficient anti-σ28 is therefore able to inhibit the expression of the deGFP reporter (Figure 4a). By tuning the concentration of anti-σ28 supplied to each reaction, it is possible to tune the perceptron threshold such that the expression of the reporter requires at least one, two, or three inputs. As shown in Figure 4b, this is analogous to implementing “OR”, “MAJORITY”, and “AND” functions with the perceptron. Furthermore, it highlights the ability of the perceptron to act as a molecular classifier, which can distinguish the number of inputs, only returning an output when the number of provided inputs satisfies the desired logic function. Experimentally, each of these classification functions was realized by varying the concentration of anti-σ28 added to each of the reactions (Figure 4c). As expected, by omitting the anti-σ28, deGFP reporter expression occurred in all cases where at least a single input was provided, correlating with the desired OR function. Increasing the anti-σ28 concentration to 0.5 μM resulted in reactions with fewer than two unique inputs being unable to exceed 0.5 μM deGFP expression, the minimum expression level required for the network output to qualify as ON. At the same anti-σ28 concentration, reactions with at least two inputs (i.e., the majority of inputs being present) were able to express sufficient deGFP to classify the output signal as ON and thereby realize the MAJORITY function. Upon increasing the anti-σ28 concentration to 1 μM, only the three-input reaction was able to express over 0.5 μM deGFP to provide the ON signal, in accordance with the AND function. 
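The relationship between the threshold and the three logic functions can be illustrated with a minimal, idealized sketch. It assumes equal unit weights for all three inputs and uses hypothetical bias values (-0.5, -1.5, -2.5) as stand-ins for increasing anti-σ28 concentrations; these numbers are not measured quantities and the code is not the model from Supporting Methods 2.

```python
from itertools import product

def perceptron(inputs, weights, bias):
    """Return 1 (ON) if the weighted sum plus bias is positive, else 0 (OFF)."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return 1 if weighted_sum + bias > 0 else 0

# Assumption: idealized equal weights, mimicking a weight pattern tuned so
# that every input-weight pair yields a similar expression level.
WEIGHTS = (1.0, 1.0, 1.0)

# Assumption: hypothetical biases; a more negative bias plays the role of a
# higher inhibitor concentration.
BIASES = {"OR": -0.5, "MAJORITY": -1.5, "AND": -2.5}

def truth_table(bias):
    """Map every combination of three binary inputs to the perceptron output."""
    return {bits: perceptron(bits, WEIGHTS, bias)
            for bits in product((0, 1), repeat=3)}

if __name__ == "__main__":
    for name, bias in BIASES.items():
        on_patterns = [bits for bits, out in truth_table(bias).items() if out]
        print(name, "is ON for", on_patterns)
```

In this sketch only the bias changes between the three classifiers, which mirrors the experimental design in which the weight pattern is held fixed and only the inhibitor concentration is varied.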
The concentration of anti-σ28 required to demonstrate each of these classification functions was found experimentally, initially by increasing the concentration of anti-σ28 added to three-input reactions (Figure S4), providing an upper limit for the inhibitor concentration beyond which no deGFP expression would occur regardless of the number of inputs. Hereafter, the concentration was lowered incrementally until the one- and two-input reactions were successfully inhibited. Due to the variations in expression strength for each of the unique input and weight pairs, it was possible to achieve scenarios wherein only one or two of the single-input reactions was successfully repressed (Figure S5), highlighting the importance of adjusting the weight pattern such that each of the input and weight pairs expressed similar levels of the reporter, thereby minimizing the effort required to find a suitable anti-σ28 concentration.
Figure 4

(a) Application of a thresholding mechanism, in the form of anti-σ28, allows for the creation of a TXTL-based perceptron. The anti-σ28 competitively binds to the σ28 produced during the WSO, inhibiting it from activating the expression of the reporter protein deGFP. Altering the concentration of the anti-σ28 changes the WSO output concentration required for reporter expression to occur. (b) For each unique input combination, the desired perceptron output or target can be determined for a range of logic functions. A ‘0’ target indicates no expression of the reporter and a ‘1’ target indicates reporter expression. (c) Variations in the anti-σ28 concentration allowed for the experimental realization of each of the classification functions. Expression is said to be ‘OFF’ when the expressed deGFP concentration is below 0.5 μM. The inputs (5 nM) added to each of the batch reactions are indicated by the green circles. All three weights were present in each reaction with the following weight pattern: weights 1 and 2: 0.6 nM and weight 3: 0.8 nM. Additionally, a 12.5 nM deGFP reporter construct was added to each reaction. The end-point expression was determined following a batch reaction performed as described in the Methods section. The negative control (white bar) was obtained in the presence of all weights and the reporter but lacked any of the inputs. The height of the bars corresponds to the average of at least three experiments (dots). Significance was determined based on a one-sided t-test to determine if the mean of each unique reaction was significantly greater than the threshold value of 0.5 μM. * p < 0.05, ** p < 0.003, *** p < 0.002, and ns = not significant.

Discussion and Conclusions

With this work, we demonstrate a TXTL-based perceptron, utilizing toehold switch regulators to compute the necessary weighted sum operations. Central to the behavior of a perceptron is the ability to perform a binary classification of the given inputs by applying an activation function to the output of the WSO; in our case, a thresholding function. This was achieved experimentally via the introduction of a σ28 inhibitor, the concentration of which determined the classification boundary of the perceptron. The mathematical equivalent hereof is the adaptation of the bias term within the perceptron threshold function. From this function, it follows that the addition of anti-σ28 to the reactions functions as the implementation of the bias term. Without the inclusion of this term, the weighted sum output of any input with its corresponding weight results in a greater than zero outcome, and thus a “1” or ON output of the thresholding function. To obtain three unique classifiers, the bias term of each perceptron was altered by varying the anti-σ28 concentration, whereby the overall output of the WSO required to induce an ON state output was also altered. As such, we have demonstrated the ability to engineer a TXTL-based perceptron, capable of conducting three unique classifications via changes in the inhibitor concentration. Unique to our genetic perceptron design is the incorporation of both transcription and translation reactions. Alternative designs have shown the effectiveness of toehold strand displacement reactions for conducting WSOs; however, such approaches are limited with regard to the outputs they can offer. The addition of translation enables the production of fluorescent proteins for the straightforward readout of the classification. 
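For orientation, the role of the bias can be written out using the generic textbook form of a three-input perceptron. The notation below is an assumption chosen for illustration and may differ from the symbols used in the paper's own equations: x_i ∈ {0, 1} marks the presence of input i, w_i > 0 is the corresponding weight, and b ≤ 0 is the bias. The block assumes the amsmath package.

```latex
\begin{equation*}
y = f\!\left(\sum_{i=1}^{3} w_i x_i + b\right),
\qquad
f(z) =
\begin{cases}
1, & z > 0 \\
0, & z \le 0
\end{cases}
\end{equation*}
```

With b = 0, any single present input gives z = w_i > 0 and therefore y = 1. Making b more negative, which in this reading corresponds to adding more anti-σ28, raises the weighted sum required before the output switches ON.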
Alternatively, when applied to theragnostics, the system can be utilized to directly couple the classification of biomarkers with the production of therapeutics specific to the detected input pattern, providing immediate personalized treatment for each unique patient.[71] Furthermore, the TXTL-based approach presented here offers the ability to incorporate the perceptron design into larger genetic networks, in particular, due to the versatility of the input and output constructs. By substituting the current promoter site, any upstream networks expressing transcription factors or transcription inhibitors could be used to (in-)activate the transcription of the DNA inputs. Similarly, when incorporating the perceptron into larger complex networks, the fluorescent reporter protein used here to quantify the perceptron output can be replaced with an alternative, functional protein. Likewise, both the WSO output and the reporter output protein can be replaced with a functional RNA sequence.[72] The broad range of functionalities available to RNA, including the activation and repression of transcription, the control of riboswitches, and fluorescence labeling via aptamers such as Spinach,[73] allows for the design of transcription-only networks. In doing so, the metabolic burden of the system can be reduced while retaining the ability to couple multiple perceptrons, either using RNAs to serve as the input of a downstream perceptron, binding directly to the weights, or via the use of RNA transcriptional activators.[74−76] In addition to the versatility of our gene-based approach, the application of RNA toehold reactions allows for the introduction of additional input–weight pairs, with the highly programmable nature of DNA ensuring that orthogonality between the unique pairs is maintained. 
However, the inherent variability in the transcriptional and translational efficiency between unique input and weight pairs makes the correct identification of distinct input classes a nontrivial exercise. In an attempt to minimize the variations in transcriptional efficiency, the DNA constructs used within this study were kept identical outside of the specific input and weight sequences; however, due to the placement of the RBS and start codon within the hairpin of the weight construct, this was difficult to achieve on a translational level, an issue further compounded by the inherent differences in the kinetics of each of the unique toehold–switch pairs. Moreover, with each additional input–weight pair, the resource depletion increases, reducing the overall output of the system,[67] in turn, minimizing the differences in the overall output concentration between classes. In both cases, careful tuning of the weight pattern to ensure near-identical expression levels for each pair can minimize this issue, while reducing the individual expression levels of all input–weight pairs can further reduce the resource burden. Furthermore, continuous flow reactions can be implemented to provide sufficient transcription and translation resources over prolonged durations.[62,77] Ultimately, the perceptron is regarded as a basic building block of larger neural networks that are capable of learning. The cell-free, genetically implemented perceptron presented in this work can implement a linear classification boundary between two groups of input sets, distinguishing between input sets with the desired minimum number of inputs and those without. Manually, we were able to alter the position of this classification boundary by varying the anti-σ28 concentration. However, for our perceptron to be capable of learning, a means of autonomously altering the weight pattern based on the perceptron output and the given inputs is required. 
A combination of experimental and computational methods has previously been used to train synthetic networks.[33,35] Here, the experimental output was compared to a desired target output, with the difference being used to determine the updated weight pattern for each iteration, until the difference between the desired and measured outcomes is minimized. In a similar manner, continuous flow reactions could be implemented to update the weight pattern at set intervals following the real-time monitoring of the output concentration, providing semi-autonomous perceptron behavior. Alternatively, to achieve a fully biological implementation of the perceptron, an enzymatic approach can be applied, where specific RNA sequences are amplified or cleaved based on the current perceptron output.[78] By engineering such sequences to competitively bind to the transducer strands, and in doing so, blocking the toehold site, the effective weight pattern can be adapted without altering the concentration of weights added to the system. The current implementation of the perceptron presented here lacks this learning functionality, instead relying on the manual tuning of the classification threshold. As such, our perceptron functions as a tunable classifier, targeted toward implementation within larger complex genetic networks. However, the versatile design of our system should allow for the introduction of learning, thus enabling the development of multilayer neural networks capable of responding to perturbations of the environment while retaining functionality. Furthermore, autonomous learning alleviates the efforts required to optimize the perceptron weight patterns, facilitating the usage of a single system toward multiple applications, wherein the system trains itself to function on a per-application basis.
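The iterative scheme described above, in which the difference between target and measured output drives the weight update, corresponds in its simplest computational form to the classic perceptron learning rule. The following is a minimal in silico sketch under stated assumptions: it trains on the ideal MAJORITY truth table rather than on experimental data, the learning rate and epoch limit are arbitrary, and it is not the training procedure used in the cited studies.

```python
from itertools import product

def predict(inputs, weights, bias):
    """Threshold the weighted sum: 1 (ON) if positive, else 0 (OFF)."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if total > 0 else 0

def train(samples, n_inputs, rate=1.0, epochs=1000):
    """Perceptron learning rule: shift weights and bias by rate * error * input."""
    weights = [0.0] * n_inputs
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for inputs, target in samples:
            error = target - predict(inputs, weights, bias)
            if error != 0:
                mistakes += 1
                for i, x in enumerate(inputs):
                    weights[i] += rate * error * x
                bias += rate * error
        if mistakes == 0:
            break  # every sample is classified correctly
    return weights, bias

# Assumption: ideal MAJORITY targets (ON when at least two of three inputs are present).
MAJORITY = [(bits, int(sum(bits) >= 2)) for bits in product((0, 1), repeat=3)]

weights, bias = train(MAJORITY, n_inputs=3)

if __name__ == "__main__":
    print("learned weights:", weights, "bias:", bias)
```

Because the MAJORITY function is linearly separable, this rule is guaranteed to converge; in a wet-lab setting each update step would instead correspond to a new reaction with adjusted weight or inhibitor concentrations.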

Methods

DNA Template Preparation

DNA constructs were assembled via the Golden Gate Assembly (GGA) method, using overlapping sequences previously described by Sun et al.[57] The assembly vector used (pBEST vector) was gifted by Richard Murray and Vincent Noireaux (Addgene plasmid #45779). The vector was adapted for GGA using Gibson assembly (NEB Gibson Assembly Master Mix) with PCR products of the vector (NEB Phusion High-Fidelity DNA Polymerase) using primers pBEST_GA_1_F, pBEST_GA_1_R, pBEST_GA_2_F, and pBEST_GA_2_R (Table S1). The transducer (switch) and trans-acting (trigger) sequences used were obtained from previous studies by the group of Dr. P. Yin (switches 1 and 2 are unpublished, and switch 3 is identical to switch 1 of the second generation of switches published by Green et al.[60]). All switch sequences are around 67 nucleotides in length; however, triggers 1 and 2 are significantly shorter (55 nucleotides) than trigger 3 (105 nucleotides), which was designed to include a hairpin architecture to enable NOT-gate computations. This additional functionality was not investigated within the scope of this study, with trigger 3 being applied in an identical manner to triggers 1 and 2. Transcription of all inputs and weights was regulated by a σ70 sigma factor-specific P70a promoter derived from the lambda phage,[79] which was edited to remove the OR3 binding site as per the sequence provided by Richard Murray and Vincent Noireaux (Addgene plasmid #45779). σ28 transcription was regulated using an E. coli ptar promoter sequence,[80] where the noncritical −44 to −37 region was adapted from the wild type to match the sequence provided by Richard Murray and Vincent Noireaux (Addgene plasmid #45780). All other additions to the vector construct, such as promoters, coding sequences, and terminators, were ordered as gBlocks from IDT or amplified from the pBEST vector using PCR. Gene sequences can be found in Table S1, alongside the complete vector sequence of the σ28-producing weight 1 construct. 
PCR products of all of the required components were purified using the QIAquick Gel Extraction Kit (Qiagen) and equimolar amounts of each were added to GGA reactions together with BsaI-HF (NEB), T4 ligase (Promega), and T4 ligase buffer (Promega). The GGA reactions were conducted in a thermocycler following a standard GGA protocol.[81] Completed vectors were transformed into NovaBlue cells (Merck). Plasmid purification was performed using the QIAprep Spin Miniprep Kit (Qiagen), and DNA sequences were confirmed using Sanger sequencing. For TXTL reactions, linear DNA templates were used throughout. The prepared vectors were linearized and amplified by PCR (Phusion High-Fidelity DNA Polymerase, NEB) using the pBEST_LinL2_F and pBEST_LinL2_R primers (Table S1). Final purification of the DNA templates was conducted using the QIAquick PCR Purification Kit (Qiagen).

Preparation of TXTL Reactions

The cell lysate, energy mixture, and amino acid solutions used to prepare the TXTL reaction solution were prepared identically to the protocol described by Pieters et al.[62] To prepare a master mix of all reaction components, excluding the DNA templates, the following were combined (with the final concentration given in brackets): cell lysate (33% of the final reaction volume), energy mixture[55] (7.1% of the final reaction volume), a constant distribution amino acid solution[82] (37.5 mM), magnesium l-glutamate (10 mM), potassium l-glutamate (40 mM), PEG-8000 (2%), and GamS protein (3 μM). When combined, this solution comprised 69% of the total reaction volume, with the linear DNA templates supplemented with Milli-Q water accounting for the remaining 31%. For experiments requiring the use of anti-σ28, the required concentration thereof was incorporated into the remaining 31% of the reaction volume alongside the DNA templates. The anti-σ28 was purchased (Gentaur) in purified form following expression by an E. coli host, and was provided with an N-terminal 10× His-tag and a C-terminal Myc-tag.
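The volume bookkeeping implied by this recipe can be sketched as follows. The 69%/31% split and the final template concentrations come from the text; the 10 μL total volume is taken from the batch reaction description, and the 100 nM stock concentration is a purely hypothetical value chosen for illustration.

```python
def split_reaction(total_ul=10.0, master_mix_fraction=0.69):
    """Return (master mix volume, DNA/water volume) in microliters."""
    master_mix = total_ul * master_mix_fraction
    return master_mix, total_ul - master_mix

def template_volume(final_nm, stock_nm, total_ul=10.0):
    """Stock volume needed for a final concentration (C1*V1 = C2*V2)."""
    return final_nm * total_ul / stock_nm

def water_volume(template_volumes_ul, dna_fraction_ul):
    """Milli-Q water needed to fill the DNA fraction of the reaction."""
    water = dna_fraction_ul - sum(template_volumes_ul)
    if water < -1e-9:
        raise ValueError("templates exceed the volume reserved for DNA")
    return max(water, 0.0)

# Assumption: every template stock is at a hypothetical 100 nM.
STOCK_NM = 100.0
# Final concentrations from the text: three inputs (5 nM), weights 1-3
# (0.6, 0.6, 0.8 nM), and the deGFP reporter (12.5 nM).
FINAL_NM = [5.0, 5.0, 5.0, 0.6, 0.6, 0.8, 12.5]

master_mix_ul, dna_ul = split_reaction()
volumes = [template_volume(c, STOCK_NM) for c in FINAL_NM]
water_ul = water_volume(volumes, dna_ul)

if __name__ == "__main__":
    print(f"master mix: {master_mix_ul:.2f} uL, DNA + water: {dna_ul:.2f} uL")
    print(f"templates: {sum(volumes):.2f} uL, water: {water_ul:.2f} uL")
```

The check in `water_volume` makes explicit that the stock concentrations must be high enough for all templates, and any anti-σ28, to fit within the 31% of the volume not occupied by the master mix.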

Batch Reactions

All experiments presented here were performed as batch TXTL reactions, with a total volume of 9.5 μL. A total of 10 μL of each TXTL reaction was prepared (6.9 μL of the master mix solution and 3.1 μL of the DNA template solutions), of which 9.5 μL was transferred to a 384-well, round-bottom, NBS-treated microplate (VWR). A Synergy H1M (Biotek) plate reader was used to incubate the microplate at 29 °C, while the deGFP expression was measured (excitation: 470 nm, emission: 510 nm) every 5 min, for a duration of 16 h. The plate reader was calibrated using a titration range of purified deGFP protein. The concentration of the expressed reporter was calculated using the calibration data.
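Converting plate-reader fluorescence into a deGFP concentration via a calibration titration can be sketched as a linear fit followed by inversion. The calibration points below are hypothetical placeholder values, and the assumption of a linear response over the measured range is ours; the text does not specify the form of the calibration curve.

```python
def fit_line(x_values, y_values):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x_values)
    mean_x = sum(x_values) / n
    mean_y = sum(y_values) / n
    sxx = sum((x - mean_x) ** 2 for x in x_values)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(x_values, y_values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def to_concentration(fluorescence, slope, intercept):
    """Invert the calibration line to obtain a concentration."""
    return (fluorescence - intercept) / slope

# Assumption: hypothetical titration of purified deGFP (uM) against
# fluorescence (arbitrary units); not data from the study.
CAL_CONC_UM = [0.0, 0.5, 1.0, 2.0]
CAL_FLUOR = [100.0, 1100.0, 2100.0, 4100.0]

slope, intercept = fit_line(CAL_CONC_UM, CAL_FLUOR)

def is_on(fluorescence, threshold_um=0.5):
    """Classify an end-point reading as ON when it exceeds the threshold."""
    return to_concentration(fluorescence, slope, intercept) > threshold_um

if __name__ == "__main__":
    reading = 1500.0  # hypothetical end-point reading
    conc = to_concentration(reading, slope, intercept)
    print(f"{conc:.2f} uM deGFP, ON: {is_on(reading)}")
```

The default threshold of 0.5 μM matches the ON/OFF boundary used for the perceptron output in the text.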

Statistical Analysis

A two-tailed Welch’s t-test was used to compare data sets for which the hypothesis was that the data sets were significantly different (Figures d, 2b, and 3d). A one-way ANOVA test was used in cases where the hypothesis was that the data sets would be equal (Figures e and 3c). A one-sided t-test was used to determine if the mean of a data set was significantly greater than a given threshold value (Figure 4c).
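The test statistics behind two of these analyses can be sketched with the Python standard library alone. The sketch computes only the t statistics and the Welch–Satterthwaite degrees of freedom; obtaining p-values additionally requires the t-distribution, for which a statistics package would normally be used. The sample values in the example are hypothetical, not data from the study.

```python
import math
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    se_a = variance(sample_a) / n_a  # squared standard error of mean A
    se_b = variance(sample_b) / n_b  # squared standard error of mean B
    t_stat = (mean(sample_a) - mean(sample_b)) / math.sqrt(se_a + se_b)
    dof = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t_stat, dof

def one_sample_t(sample, threshold):
    """t statistic for testing whether the sample mean exceeds a threshold."""
    n = len(sample)
    standard_error = math.sqrt(variance(sample) / n)
    return (mean(sample) - threshold) / standard_error, n - 1

if __name__ == "__main__":
    # Hypothetical deGFP end-point values (uM) from three replicate reactions.
    replicates = [0.6, 0.7, 0.8]
    t_stat, dof = one_sample_t(replicates, threshold=0.5)
    print(f"t = {t_stat:.3f} with {dof} degrees of freedom")
```

For the one-sided test, a large positive t statistic supports the conclusion that the mean expression exceeds the 0.5 μM threshold.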
  78 in total

1.  Molecular computation: RNA solutions to chess problems.

Authors:  D Faulhammer; A R Cukras; R J Lipton; L F Landweber
Journal:  Proc Natl Acad Sci U S A       Date:  2000-02-15       Impact factor: 11.205

2.  Scaling down DNA circuits with competitive neural networks.

Authors:  Anthony J Genot; Teruo Fujii; Yannick Rondelez
Journal:  J R Soc Interface       Date:  2013-06-12       Impact factor: 4.118

Review 3.  DNAzymes for sensing, nanobiotechnology and logic gate applications.

Authors:  Itamar Willner; Bella Shlyahovsky; Maya Zayats; Bilha Willner
Journal:  Chem Soc Rev       Date:  2008-04-24       Impact factor: 54.564

Review 4.  Biocomputing Based on DNA Strand Displacement Reactions.

Authors:  Hui Lv; Qian Li; Jiye Shi; Chunhai Fan; Fei Wang
Journal:  Chemphyschem       Date:  2021-04-19       Impact factor: 3.102

Review 5.  Overloaded and stressed: whole-cell considerations for bacterial synthetic biology.

Authors:  Olivier Borkowski; Francesca Ceroni; Guy-Bart Stan; Tom Ellis
Journal:  Curr Opin Microbiol       Date:  2016-08-02       Impact factor: 7.934

6.  Use of the 'Perceptron' algorithm to distinguish translational initiation sites in E. coli.

Authors:  G D Stormo; T D Schneider; L Gold; A Ehrenfeucht
Journal:  Nucleic Acids Res       Date:  1982-05-11       Impact factor: 16.971

7.  Competition among seven Escherichia coli sigma subunits: relative binding affinities to the core RNA polymerase.

Authors:  H Maeda; N Fujita; A Ishihama
Journal:  Nucleic Acids Res       Date:  2000-09-15       Impact factor: 16.971

8.  Cell-free prediction of protein expression costs for growing cells.

Authors:  Olivier Borkowski; Carlos Bricio; Michela Murgiano; Brooke Rothschild-Mancinelli; Guy-Bart Stan; Tom Ellis
Journal:  Nat Commun       Date:  2018-04-13       Impact factor: 14.919

9.  Sigma Factor-Mediated Tuning of Bacterial Cell-Free Synthetic Genetic Oscillators.

Authors:  Maaruthy Yelleswarapu; Ardjan J van der Linden; Bob van Sluijs; Pascal A Pieters; Emilien Dubuc; Tom F A de Greef; Wilhelm T S Huck
Journal:  ACS Synth Biol       Date:  2018-11-19       Impact factor: 5.110

Review 10.  Synthetic biology.

Authors:  Steven A Benner; A Michael Sismour
Journal:  Nat Rev Genet       Date:  2005-07       Impact factor: 53.242
