An Automated Design Framework for Multicellular Recombinase Logic.

Sarah Guiziou¹, Federico Ulliana², Violaine Moreau¹, Michel Leclere², Jerome Bonnet¹.

Abstract

Tools to systematically reprogram cellular behavior are crucial to address pressing challenges in manufacturing, environment, or healthcare. Recombinases can very efficiently encode Boolean and history-dependent logic in many species, yet current designs are performed on a case-by-case basis, limiting their scalability and requiring time-consuming optimization. Here we present an automated workflow for designing recombinase logic devices executing Boolean functions. Our theoretical framework uses a reduced library of computational devices distributed into different cellular subpopulations, which are then composed in various manners to implement all desired logic functions at the multicellular level. Our design platform called CALIN (Composable Asynchronous Logic using Integrase Networks) is broadly accessible via a web server, taking truth tables as inputs and providing corresponding DNA designs and sequences as outputs (available at http://synbio.cbs.cnrs.fr/calin ). We anticipate that this automated design workflow will streamline the implementation of Boolean functions in many organisms and for various applications.

Keywords: automated genetic design; biological computing; distributed multicellular computing; logic gates; recombinases; synthetic biology

Substances: Recombinases

Year: 2018 PMID: 29641183 PMCID: PMC5962929 DOI: 10.1021/acssynbio.8b00016

Source DB: PubMed Journal: ACS Synth Biol ISSN: 2161-5063 Impact factor: 5.110

Reprogramming the response of living cells to chemical or physical signals is a key goal of synthetic biology and would support the development of complex manufacturing processes, sophisticated diagnostics, or cellular therapies.[1] In order to control cellular behavior, researchers have engineered many types of Boolean logic gates operating in single cells by using transcriptional regulators,[2−8] RNA molecules,[9−11] or site-specific recombinases.[12−14] However, scaling up single-cell logic systems requires solving multiple engineering challenges. First, when program complexity increases (number of inputs ≥3), the high number of parts needed can cause metabolic burden and affect cellular viability. Second, current design methods are mostly ad hoc, and each Boolean function is implemented using a different genetic architecture that needs to be fully characterized and optimized. Despite recent progress toward predictable gate design,[7] some gates simply do not work or are too complex to be implemented within a single cell. Finally, in order to avoid cross-talk, single-cell logic systems need to use different components for every novel signal to be detected. While libraries of orthogonal regulatory components have greatly expanded,[3,6,15,16] their deployment can be challenging and requires time-consuming optimization.
In nature, division of labor between cellular subpopulations is a ubiquitous mechanism allowing cellular communities to accomplish complex functions.[17,18] Early efforts to engineer synthetic multicellular systems led to the construction of pattern-forming communities,[19] predator–prey ecosystems,[20] synchronized oscillators,[21,22] or distributed metabolic pathways.[23] Researchers also realized that problems faced by logic circuits operating in single cells could be addressed by distributing the logic program between different cells.[24] Because of the spatial separation allowed by cellular compartments, optimized regulatory components can be reused in different subpopulations. As the circuit is divided into smaller subcircuits, metabolic burden is reduced. Finally, simple cellular computing modules can be composed in different manners and wired via cell–cell communication channels to obtain different logic functions. For example, Tamsir et al. used multilayered circuit designs inspired by electronics to construct all 2-input logic gates by combining spatially separated E. coli colonies encoding NOR gates wired via quorum-sensing molecules.[25] Specific features of biology can also be used to our advantage to engineer logic systems in a more efficient manner than by strictly transposing electronic designs.[12,24,26] One particularly promising approach is distributed multicellular computation (DMC).[24,27−29] DMC is based on the decomposition of a Boolean function into various subfunctions, each performed by a particular subpopulation of cells. Different subpopulations can then be combined in different manners to realize any given Boolean function of interest. Importantly, multiple cells are capable of producing the output, which is therefore distributed among the cellular subpopulations.
Recently, Macia and colleagues implemented DMC within a multicellular consortium by using cellular computing units performing elementary IDENTITY or NOT operations.[30] While highly scalable, the need for spatial separation between each subpopulation prevents these systems from operating autonomously. Here we present a composable framework enabling the systematic design of logic gates performing Boolean logic within an autonomous multicellular consortium. We designed our system to operate using site-specific recombinases, more specifically serine integrases, which allow robust and flexible engineering of complex logic gates.[12,13] Serine integrases are members of the large serine recombinase family[31] and catalyze site-specific recombination between attachment sites attB and attP. Recombination operates via double-strand breaks located at the central dinucleotides followed by the generation of hybrid sites attL and attR. Depending on the relative orientation of attB and attP, the recombination reaction leads to excision (parallel orientation) or inversion (antiparallel orientation) of the DNA sequence flanked by the attachment sites.[32] Recombinase devices can implement complex logic functions without the need to cascade multiple logic gates as in electronics.[12,13,26] Integrase recombination is irreversible in the absence of cofactors, so that recombinase logic gates exhibit memory, are single use (one-shot), and therefore belong to the family of asynchronous logic devices (i.e., the system can respond to multiple signals even if they are not present simultaneously). Our design for Boolean logic is based on a reduced library of cellular computing units responding to one or multiple inputs that can be composed at will to implement all desired Boolean functions (Figure 1). Our logic system is single-layer and requires neither cell–cell communication nor spatial separation, which greatly facilitates its implementation.
In order to make our design framework broadly accessible, we provide a fully automated web platform called CALIN (Composable Asynchronous Logic using Integrase Networks) taking truth tables as inputs and providing corresponding DNA designs and sequences as outputs.

Figure 1

Distribution of a Boolean function within a multicellular consortium. The Boolean function of interest is decomposed as a disjunction (i.e., sum) of subfunctions (or clauses). Here, as an example, a given function, f, is decomposed into functions f1, f2, and f3. The strains performing f1, f2 and f3 are selected from the strain library to assemble a multicellular consortium computing the desired Boolean function.

Results

A Hierarchical Composition Framework for Multicellular Boolean Logic Using Integrase Switches

In order to implement a Boolean function within a multicellular consortium, we decomposed the function into several independent subfunctions, or clauses,[30] each executed by a different cellular subpopulation chosen from a library containing a reduced number of cellular computing units (Figure 1). To facilitate multicellular system composition, we designed our system so that each cellular subpopulation computes independently of the others, without cell–cell communication needed. As a consequence, if one cellular subpopulation is ON (expression of the output gene), the global output of the system is considered to be ON. Because of their reduced number and of the absence of cell–cell communication, cellular computing units can be extensively characterized and optimized to predictably implement all Boolean functions at the multicellular level. Boolean functions encode the output state of the logic gate. The variables of the function are the inputs of the gate, which are equal to 1 if the signal has been present and otherwise to 0. We express Boolean functions using the disjunctive normal form.[33] The Boolean function f is a disjunction: $f = \beta_1 \text{ OR } \dots \text{ OR } \beta_M$, where M is the number of clauses present in f, and each $\beta_i$ is a conjunctive clause: $\beta_i = \theta_1 \text{ AND } \dots \text{ AND } \theta_n$, where each $\theta_j$ is a literal of the variable $x_j$ (either the identity of the variable or its negation), with j being an integer between 1 and n. n corresponds to the number of variables in this conjunction (an integer between 1 and N). N is the number of variables in the function f. Each cellular computing unit executes a particular “subfunction” corresponding to a conjunctive clause. Then, the full function is performed by combining multiple cellular computing units (Figure 2A).
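The composition rule described above (each strain computes one conjunctive clause, and the consortium is ON as soon as one strain is ON) can be illustrated with a minimal Python sketch. The clause encoding, the function names, and the example function are assumptions made for illustration only; they are not taken from the CALIN source code.

```python
from itertools import product

# Hypothetical encoding: a clause maps an input name to True (IDENTITY literal)
# or False (NOT literal). Example function: f = (A AND NOT(B)) OR C.
consortium = [
    {"A": True, "B": False},  # strain 1 computes A AND NOT(B)
    {"C": True},              # strain 2 computes C
]

def strain_output(clause, seen):
    """A strain is ON when every literal of its conjunctive clause is satisfied."""
    return all(seen[name] == wanted for name, wanted in clause.items())

def consortium_output(clauses, seen):
    """The consortium is ON if at least one strain is ON (disjunction of clauses)."""
    return any(strain_output(clause, seen) for clause in clauses)

def truth_table(clauses, names):
    """Enumerate every input combination and the resulting consortium output."""
    rows = []
    for bits in product([False, True], repeat=len(names)):
        seen = dict(zip(names, bits))
        rows.append((bits, consortium_output(clauses, seen)))
    return rows

if __name__ == "__main__":
    for bits, out in truth_table(consortium, ["A", "B", "C"]):
        print(bits, "->", "ON" if out else "OFF")
```

Here `seen[name]` records whether a signal has been present at any time, which matches the definition of the input variables given in the text.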

Figure 2

A hierarchical composition framework for asynchronous Boolean recombinase logic. (A) Distribution of a Boolean function within a multicellular consortium by decomposition into conjunctions of literals (variables or their negations). Here an example is depicted in which a Boolean function is decomposed into three subfunctions and implemented in three separate cellular computing units. (B) attB and attP disposed in parallel orientation. (C) Elements implementing IDENTITY and NOT functions. To obtain an IDENTITY function, a transcriptional terminator is flanked by parallel attachment sites, blocking transcription of the gene of interest. When the signal is present, the terminator is excised and the output gene is expressed. To obtain a NOT function, a promoter is flanked by parallel attachment sites. When the signal is present, the promoter is excised, and the gene is no longer expressed. (D) Functional composition of ID-elements into ID-modules, by placing elements in series to obtain the conjunction of IDENTITY functions. For a 2-input ID-module, the output gene is expressed only when both inputs have been present and both terminators have been excised (corresponding to an AND gate: A AND B). (E) Functional composition of NOT-elements into NOT-modules, by nesting elements to obtain the conjunction of NOT functions. For a 2-input NOT-module, the output gene is expressed only when none of the inputs has been present (corresponding to a NOR gate: NOT(A) AND NOT(B)). (F) Hierarchical composition framework for Boolean recombinase logic. ID- and NOT-modules are composed in series, following a priority rule in which the NOT-module is placed upstream of the ID-module. The device shown here can be scaled to perform all functions based on conjunction of NOT and IDENTITY functions.
We designed a hierarchical composition framework in which two elements encoding the NOT and IDENTITY functions (called ID-element and NOT-element) are composed into computational modules which are then combined to generate computational devices executing a particular clause within a cellular subpopulation. For the sake of simplicity and robustness, we designed switches controlled by integrase-mediated excision (Figure 2B). Excision-based design reduces the distance between the gate promoter and the gene of interest. Moreover, as no asymmetric terminator is needed, this design might be easier to deploy into many organisms.[14] The ID-element consists of a transcriptional terminator flanked by recombination sites and placed between the promoter and the output gene. In the presence of the signal, the terminator is excised and the output gene is expressed (Figure 2C, left panel). The NOT-element consists of a promoter driving the output gene and flanked by recombination sites. In the presence of the signal, the promoter is excised and the gene is no longer expressed (Figure 2C, right panel). Computational modules performing conjunctions of NOT or conjunctions of IDENTITY functions are respectively realized by nesting NOT-elements or by placing ID-elements in series (Figure 2D,E). Finally, NOT- and ID-modules are composed in series to obtain the final computational devices: in this case the NOT-module containing the promoter is positioned 5′ of the ID-module, with the output gene positioned downstream (Figure 2F). Following this hierarchical composition framework, all conjunctive clauses are implementable within a cellular computing unit. The full Boolean function is then executed by a multicellular consortium containing different cellular computing units. To reduce the number of computational devices, we implemented only one computational device per set of symmetric Boolean functions and interchanged the connections between integrases and control signals.
For example, the two Boolean functions NOT(A) AND B and A AND NOT(B) are executed using the same computational device (Figure S1). Consequently, only 14 computational devices are needed to realize all 4-input Boolean functions (65 536 functions) (Figure 3A). For every additional input (from N – 1 to N), only N + 1 novel computational devices are needed while the number of Boolean functions increases drastically. For example, 7 additional devices are needed to transition from 5 to 6 inputs (27 devices in total), enabling a $10^{10}$-fold increase in the number of Boolean functions (for a total of ∼$10^{19}$) (Figure 3B). Of note, the different cellular computing units do not always include N integrases and computational devices responding to N inputs. As an example, the 4-input Boolean equation shown in Figure 3D can be executed using 3 strains containing respectively 4, 3, and 2 integrases and with different signal-integrase connectivities.
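The behavior of a single excision-based device can be sketched as a small state model: a promoter that is lost when any NOT-module integrase has been expressed, and a set of terminators that are each removed by their own ID-module integrase. This is an idealized illustration, assuming complete and irreversible excision; the class name and the integrase labels `Int1` and `Int2` are hypothetical.

```python
class Device:
    """Idealized device: a NOT-module (promoter) upstream of an ID-module (terminators)."""

    def __init__(self, not_integrases, id_integrases):
        # Integrases whose sites flank the promoter (nested NOT-elements).
        self.not_integrases = set(not_integrases)
        # One terminator per ID-element, each keyed by the integrase that excises it.
        self.terminators = set(id_integrases)
        self.promoter_present = True

    def expose(self, integrase):
        """Record that an integrase has been expressed; excision is never undone."""
        if integrase in self.not_integrases:
            self.promoter_present = False
        self.terminators.discard(integrase)

    def output_on(self):
        """The output gene is expressed if the promoter remains and no terminator blocks it."""
        return self.promoter_present and not self.terminators


if __name__ == "__main__":
    # Device for NOT(A) AND B, assuming A drives Int1 and B drives Int2.
    device = Device(not_integrases={"Int1"}, id_integrases={"Int2"})
    print(device.output_on())  # terminator still present
    device.expose("Int2")      # signal B has been seen
    print(device.output_on())
    device.expose("Int1")      # signal A has been seen
    print(device.output_on())
```

Because the state is stored in the DNA, the final output depends only on which signals have been seen, not on whether they were present at the same time, which is the asynchronous behavior described in the text.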

Figure 3

Implementing all Boolean logic functions using a reduced number of computational devices. (A) Schematics of all devices needed to implement up to 4-input functions. (B) Maximum number of strains and number of computational devices needed to compute all Boolean functions for a given number of inputs. See Methods for details. (C) Proportion of Boolean functions implementable with a specific number of strains for 3 and 4 inputs (obtained by generating all the biological designs for 3- and 4-input Boolean functions, see Table S1 for numbers). (D) Example of a biological implementation for a 4-input Boolean function. The function shown here is divided into a disjunction of conjunctive clauses (see Figure 2A). Each conjunctive clause is executed using a particular computational device (defined in panel A), each placed into a separate cellular computing unit. By combining the different units, the full logic function is obtained. If at least one of the cellular units is ON, the output is considered to be ON. Of note, inputs are not always connected to the same integrase (as for input D in Cell 1 and Cell 2), and not all integrases and inputs are present in all cells. To implement an N-input Boolean function, a maximum of $2^{N-1}$ different cellular computing units have to be composed, corresponding to a culture of $2^{N-1}$ different strains: 4 for 3 inputs and 8 for 4 inputs (Figure 3B). However, most logic functions can be performed using fewer cellular computing units (an average of 2.3 strains for 3-input and 3.6 strains for 4-input Boolean functions, Figure 3C). In summary, we provide a hierarchical composition framework using a reduced library of computational devices to systematically implement all N-input Boolean logic functions within a multicellular consortium.

An Automated Design Platform for Recombinase Logic

We then aimed to develop software for automating the design of cellular consortia performing asynchronous Boolean logic. Software tools enabling such automated genetic circuit design are necessary and extremely useful when the design space becomes too large for humans to explore efficiently.[7,34−36] We thus designed an algorithm called CALIN (Composable Asynchronous Logic using Integrase Networks) based on two main steps (Figure 4A). First, the Boolean function of interest is decomposed into a disjunction of conjunctive clauses using the Quine–McCluskey algorithm (see Methods). Then, each clause is converted into a given computational device for which particular connections between integrases and inputs are generated.
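The first step, decomposing a truth table into conjunctive clauses, can be sketched with a compact Quine–McCluskey-style routine. This is an illustrative sketch and not the CALIN implementation: the function names are hypothetical, the first input is assumed to be the most significant bit of each minterm index, and the final cover is chosen greedily after the essential prime implicants, so it is not guaranteed to be minimal in every case.

```python
from itertools import combinations

def combine(a, b):
    """Merge two implicants (strings over '0', '1', '-') differing in exactly one fixed bit."""
    diff = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    if len(diff) == 1 and "-" not in (a[diff[0]], b[diff[0]]):
        i = diff[0]
        return a[:i] + "-" + a[i + 1:]
    return None

def prime_implicants(minterms, n):
    """Merge implicants repeatedly; those that can no longer be merged are prime."""
    current = {format(m, f"0{n}b") for m in minterms}
    primes = set()
    while current:
        merged, used = set(), set()
        for a, b in combinations(sorted(current), 2):
            c = combine(a, b)
            if c is not None:
                merged.add(c)
                used.update((a, b))
        primes |= current - used
        current = merged
    return primes

def covers(implicant, minterm, n):
    """True if the implicant matches the binary form of the minterm."""
    bits = format(minterm, f"0{n}b")
    return all(p in ("-", b) for p, b in zip(implicant, bits))

def decompose(minterms, n):
    """Pick prime implicants covering all minterms: essential ones first, then greedily."""
    primes = prime_implicants(minterms, n)
    chosen = []
    for m in sorted(minterms):
        owners = [p for p in primes if covers(p, m, n)]
        if len(owners) == 1 and owners[0] not in chosen:
            chosen.append(owners[0])
    remaining = {m for m in minterms if not any(covers(p, m, n) for p in chosen)}
    while remaining:
        best = max(sorted(primes), key=lambda p: sum(covers(p, m, n) for m in remaining))
        chosen.append(best)
        remaining -= {m for m in remaining if covers(best, m, n)}
    return chosen

def clause_text(implicant, names):
    """Render an implicant as a conjunction of IDENTITY and NOT literals."""
    literals = [nm if b == "1" else f"NOT({nm})"
                for nm, b in zip(names, implicant) if b != "-"]
    return " AND ".join(literals) or "TRUE"

if __name__ == "__main__":
    for implicant in decompose([1, 2, 3], 2):  # truth table of A OR B
        print(clause_text(implicant, ["A", "B"]))
```

Each returned implicant corresponds to one clause, and therefore to one strain in the consortium. For the 3-input parity function (minterms 1, 2, 4, and 7) no merging is possible, so four clauses are returned, which matches the maximum of 4 strains quoted for 3 inputs.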

Figure 4

Automated design of multicellular recombinase logic. (A) The CALIN algorithm enables the systematic design of asynchronous Boolean logic. (B) The CALIN web interface takes a Boolean truth table as input and generates as outputs: the connection map between inputs and integrases, the DNA architectures of the computational devices, and the corresponding DNA sequences. The CALIN script written in Python is available on GitHub and can be directly used for high-throughput generation of biological designs. Furthermore, the CALIN Python script can design logic devices customized for specific organisms (E. coli, B. subtilis, and S. cerevisiae) and can be tailored by the user to generate devices using fully customized DNA sequences. In order to enable broader access to our design framework, we also provide a CALIN website accessible at http://synbio.cbs.cnrs.fr/calin. In the CALIN web interface, the user enters the number of inputs to process (up to 5) and the desired Boolean truth table or corresponding binary number. The website provides as outputs the DNA architectures of the computational devices, the connection map between signals and integrases, and the corresponding DNA sequences (Figure 4B).

Discussion

In this work we present a scalable composition framework for implementing asynchronous Boolean logic within a multicellular consortium. We provide an online design tool for the systematic design of recombinase logic circuits called CALIN (Composable Asynchronous Logic using Integrase Networks). While these designs are currently theoretical, the robustness of integrase-mediated recombination against various site permutations and orientations[12,13,34,37] should support straightforward experimental implementation. By taking advantage of the single-layer architecture of recombinase logic, we encapsulated complex Boolean functions into various cellular subpopulations. Because of its compact architecture, our design exhibits two significant improvements over previous DMC systems: (i) no cell–cell communication channels are needed, and (ii) cells do not need to be spatially separated, thereby supporting the implementation of fully autonomous multicellular consortia operating without an external physical device. Another difference between our system and other DMC systems is the use of recombinase switches that provide memory.[34,38,39] Recombinase-mediated data storage could be useful for applications requiring endpoint measurements or delayed readout, such as diagnostics. Also, because the state of the logic system is written within DNA, it can be addressed via PCR or DNA sequencing,[13,38,40] even if the cells die, providing other robust readout modalities. As with other DMC systems, for a given number of inputs, the number of elementary computational devices needed to compose all logic functions compares very favorably with the number of possible functions. For example, implementing all 65 536 4-input, or all ∼$4.3 \times 10^{9}$ 5-input Boolean functions only requires respectively 14 and 20 computational devices. As serine recombinases do not require host-specific cofactors and can operate in several species, the designs presented here could be implemented in many organisms.
Logic functions could also be distributed between different species operating in concert. In such schemes, researchers could take advantage of the particular capacities of different organisms to detect different signals and/or perform specific tasks. Examples of applications include environmental remediation[41,42] or microbiome engineering for therapeutic applications.[43] A possible challenge for our system is the high number of strains that have to operate together when the number of inputs increases (Figure 3B). Cultivating many strains together could lead to counter-selection of some subpopulations, but this problem could be addressed by encapsulating the different strains into hydrogel beads.[40] Also, as the number of strains increases, the output of one subpopulation representing a small fraction of the whole consortium could become difficult to measure. The output level in the ON state will also differ depending on whether one or multiple cellular subpopulations are turned ON. However, adding a single cell–cell communication channel could address this problem by propagating the output to the whole population (Figure S2). Finally, for some applications, “real-time” response could be achieved via a similar composition framework using synchronous recombinase logic gates based on reversible recombination reactions performed by integrases coupled with recombination directionality factors (RDFs) (Figure S3).[12,26]

Methods

Equations for Determining the Numbers of Functions/Strains/Devices

The number of Boolean functions corresponds to 2 to the power of the number of possible states. As each input can be equal to 1 or to 0, the number of possible states is equal to 2 to the power of N, where N is the number of inputs. Consequently, the number of Boolean functions is equal to $2^{2^N}$. The maximum number of strains needed to implement any Boolean logic function with N inputs is equal to $2^{N-1}$, as all N-input Boolean equations can be written in the disjunctive normal form, and then as a disjunction of a maximum of $2^{N-1}$ conjunctive clauses.[33] The number of different conjunctive clauses (corresponding to a conjunction of literals) is equal to $3^N - 1$, as each input appears in a clause as its identity, as its negation, or not at all, and the empty clause is excluded. If we implement all these functions within cells, the number of standard devices needed is equal to the number of conjunctive clauses. This method leads to a high number of devices. Therefore, we decided to construct only one device per set of symmetric Boolean functions (e.g., A AND NOT(B) is the symmetric function of NOT(A) AND B). This approach reduces the number of standard devices. In consequence, for an N-input Boolean function, devices computing from 1 to N inputs are needed and k + 1 nonsymmetric Boolean functions computing the conjunction of k literals exist: $N_{\text{devices}} = \sum_{k=1}^{N} (k + 1) = \frac{N(N+3)}{2}$. Of note, the number of devices follows the arithmetic series $S_N = \frac{N}{2}\left[2a_1 + (N-1)d\right]$, where $a_1 = 2$, $d = 1$, and N is the number of inputs. In a first approximation, N sensor-modules in which a control signal (i.e., a sensor device responding to an input of interest) is connected to an integrase are needed for the construction of an N-input system. However, as we reduced the number of devices to a set composed of nonsymmetric Boolean functions, we need to connect all control signals to all integrases to compute all Boolean functions. Therefore, $N^2$ sensor-modules are needed.
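The counts described in this section can be tabulated with a short script. This is a sketch based on the definitions in the text: 2 to the power of $2^N$ functions, a maximum of $2^{N-1}$ strains (consistent with the stated 4 and 8 strains for 3 and 4 inputs), k + 1 devices for each clause size k from 1 to N, and N squared sensor-modules. The function names are illustrative.

```python
def n_functions(n):
    """Number of N-input Boolean functions: 2 to the power of the number of states."""
    return 2 ** (2 ** n)

def max_strains(n):
    """Maximum number of conjunctive clauses (strains) needed for an N-input function."""
    return 2 ** (n - 1)

def n_devices(n):
    """One device per set of symmetric clauses: k + 1 devices for each clause size k."""
    return sum(k + 1 for k in range(1, n + 1))

def n_sensor_modules(n):
    """Every control signal may need to be connected to every integrase."""
    return n ** 2

if __name__ == "__main__":
    print("N  functions  max_strains  devices  sensor_modules")
    for n in range(1, 7):
        print(n, n_functions(n), max_strains(n), n_devices(n), n_sensor_modules(n))
```

The device count agrees with the values quoted in the text (14, 20, and 27 devices for 4, 5, and 6 inputs) and with the closed form N(N + 3)/2 of the arithmetic series.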

Automated Generation of Genetic Designs

We encoded an algorithm generating genetic designs executing N-input Boolean functions using Python (Figure S4). The algorithm takes as input a Boolean truth table or the binary number corresponding to the function. The output is the biological implementation of the Boolean function, comprising, for each strain, a graphical representation of the genetic circuit and its associated DNA sequences. The truth table is transformed into a Boolean function in the disjunctive normal form using the Quine–McCluskey algorithm[33] (Figure 4A). The Boolean function is decomposed into conjunctive clauses (conjunction of literals). In this scheme, each clause can be regarded as a “subfunction”. From each conjunctive clause, we extract two types of information. First, based on the number of IDENTITY and NOT functions, we identify which logic device is needed. Second, based on the association of inputs to either IDENTITY or NOT functions, we identify which sensor-modules are needed among the different connection possibilities between control signals and integrases. Finally, we combine the designs executing the different conjunctive clauses to obtain the global design for implementing the desired truth table. To simplify the construction process, the DNA sequence of the computational devices is generated by our Python code. In CALIN, sequences are adapted for E. coli, but sequence generation can be adapted to other organisms (database available for B. subtilis and Saccharomyces cerevisiae) or customized using the source Python code available on GitHub.
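The two pieces of information extracted from each clause can be sketched as follows. The data layout, the `Int1`, `Int2`, ... integrase labels, and the convention of numbering integrase slots along the device (NOT-module first, then ID-module) are assumptions made for illustration; the actual CALIN code may organize this differently.

```python
def design_clause(clause):
    """Map a clause (input name -> True for IDENTITY, False for NOT) to a device and wiring."""
    not_inputs = sorted(name for name, positive in clause.items() if not positive)
    id_inputs = sorted(name for name, positive in clause.items() if positive)
    # The device is identified only by how many NOT and IDENTITY literals it computes.
    device = (len(not_inputs), len(id_inputs))
    # Hypothetical slot numbering: NOT-module integrases first, then ID-module integrases.
    slots = [f"Int{k}" for k in range(1, len(clause) + 1)]
    connections = dict(zip(not_inputs + id_inputs, slots))
    return {"device": device, "connections": connections}

def design_function(clauses):
    """One strain per conjunctive clause; together they implement the full function."""
    return [design_clause(clause) for clause in clauses]

if __name__ == "__main__":
    # f = (NOT(A) AND B) OR (A AND NOT(B)): two strains sharing one device type.
    for strain in design_function([{"A": False, "B": True}, {"A": True, "B": False}]):
        print(strain)
```

In this example both strains use the same device, with one NOT-element and one ID-element, and differ only in which signal is connected to which integrase. This is the reuse of a single device for a set of symmetric functions described in the text.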
