Yizhi Cai, Matthew W Lux, Laura Adam, Jean Peccoud.
Abstract
Recognizing that certain biological functions can be associated with specific DNA sequences has led various fields of biology to adopt the notion of the genetic part. This concept provides a finer level of granularity than the traditional notion of the gene. However, no formal method for relating a set of parts to a function has yet emerged. Synthetic biology both demands such a formalism and provides an ideal setting for testing hypotheses about relationships between DNA sequences and phenotypes beyond the gene-centric methods used in genetics. Attribute grammars are used in computer science to translate the text of a program's source code into the computational operations it represents. By associating attributes with parts, modifying the values of these attributes using rules that describe the structure of DNA sequences, and using a multi-pass compilation process, it is possible to translate DNA sequences into molecular interaction network models. These capabilities are illustrated by simple example grammars expressing how gene expression rates depend on single or multiple parts. The translation process is validated by systematically generating, translating, and simulating the phenotype of all the sequences in the design space generated by a small library of genetic parts. Attribute grammars represent a flexible framework connecting parts with models of biological function. They will be instrumental for building mathematical models of libraries of genetic constructs synthesized to characterize the function of genetic parts. This formalism is also expected to provide a solid foundation for the development of computer-assisted design applications for synthetic biology.
Year: 2009 PMID: 19816554 PMCID: PMC2748682 DOI: 10.1371/journal.pcbi.1000529
Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.475
Glossary of specialized terms used throughout this article.
| Term | Definition |
| Attribute grammar | An attribute grammar is a context-free grammar augmented with attributes, semantic rules, and conditions. Attribute grammars were developed as a means of formalizing the semantics of a context-free grammar. |
| Context-free grammar | A context-free grammar is a quadruple (V, Σ, P, S) where V is a finite set of non-terminal symbols, Σ (the alphabet) is a finite set of terminal symbols, P is a finite set of rules, and S is a distinguished element of V called the start symbol. Each rule in P has the form A→ω, where A is a single non-terminal symbol and ω is a string of terminals and/or non-terminals (possibly empty). The term “context-free” expresses the fact that non-terminals are rewritten without regard to the context in which they occur. |
| Cusp bifurcation | A codimension 2 bifurcation formed by the tangential meeting of two loci of saddle-node bifurcations. In other words, a cusp bifurcation traces the path of the points bounding a bistable region as two parameters change. Bistability is implied within the cusp bounds. |
| Left recursion | A direct left recursion in a context-free grammar refers to rules of the form A→Aω. Parsing a left-recursive rule can lead the parser down an infinite branch of the search tree in the corresponding logic program. |
| PoPS | Polymerase per second: the number of RNA polymerase molecules that transcribe past a defined point on the DNA each second. |
| SBML | The Systems Biology Markup Language (SBML) is a machine-readable language, based on XML, for representing models of biochemical reaction networks. |
| Semantics | Semantics reveals the meaning of syntactically valid strings in a language. For natural languages, this means correlating sentences and phrases with the objects, thoughts, and feelings of our experiences. For programming languages, semantics describes the behavior that a computer follows when executing a program in the language. |
| Syntax | Syntax refers to the ways symbols may be combined to create well-formed sentences (or programs) in a language. Syntax defines the formal relations between the constituents of a language, thereby providing a structural description of the various expressions that make up legal strings in the language. Syntax deals solely with the form and structure of symbols in a language without any consideration given to their meaning. |
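The glossary definitions of a context-free grammar and of direct left recursion can be made concrete with a minimal Python sketch. The two toy grammars below are assumptions made for illustration: they borrow the non-terminal names `constructs`, `cassette`, and `restConstructs`, but their rules are hypothetical and are not taken from the article's grammar.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    """A context-free grammar as the quadruple (V, Sigma, P, S)."""
    nonterminals: frozenset  # V
    terminals: frozenset     # Sigma
    rules: tuple             # P: pairs (A, omega), omega is a tuple of symbols
    start: str               # S


def directly_left_recursive(grammar):
    """Return the non-terminals A that have a rule of the form A -> A omega."""
    return {lhs for lhs, rhs in grammar.rules if rhs and rhs[0] == lhs}


CASSETTE_RULE = ("cassette", ("prom", "rbs", "gene", "term"))

# Hypothetical left-recursive grammar: constructs -> constructs cassette
left = Grammar(
    nonterminals=frozenset({"constructs", "cassette"}),
    terminals=frozenset({"prom", "rbs", "gene", "term"}),
    rules=(
        ("constructs", ("constructs", "cassette")),
        ("constructs", ("cassette",)),
        CASSETTE_RULE,
    ),
    start="constructs",
)

# Hypothetical right-recursive grammar generating the same language.
right = Grammar(
    nonterminals=frozenset({"constructs", "cassette", "restConstructs"}),
    terminals=frozenset({"prom", "rbs", "gene", "term"}),
    rules=(
        ("constructs", ("cassette", "restConstructs")),
        ("restConstructs", ("cassette", "restConstructs")),
        ("restConstructs", ()),  # empty string
        CASSETTE_RULE,
    ),
    start="constructs",
)
```

Calling `directly_left_recursive(left)` flags `constructs`, while the right-recursive variant is not flagged. A top-down parser can process the second form without looping on its own left-hand side.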
Figure 1. Workflow for generating the gene network model encoded in a DNA sequence.
The input for this process is a DNA sequence that is first broken down into parts by the scanner. The combination of the parts is validated by the parser according to a syntactic model. After validation by the parser, the sequence is translated by applying semantic actions attached to the rules to transform the series of parts into a set of chemical equations. The resulting equations can then be solved using existing simulation engines. Each step takes the output of the previous step as input, so the workflow can start from any step if the appropriate input is provided.
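The scanner, parser, and translation steps described in this caption can be sketched as three small Python functions. Everything specific here is an assumption for illustration: the six-base "sequences", the part names, the fixed promoter-RBS-gene-terminator cassette layout, and the reaction strings are hypothetical and do not reproduce the article's grammar or its equation generators.

```python
# Hypothetical part library: sequence -> (category, name). Sequences are made up.
PARTS = {
    "TTGACA": ("promoter", "pA"),
    "AGGAGG": ("rbs", "rbsA"),
    "ATGAAA": ("gene", "geneA"),
    "TTTTTT": ("terminator", "tA"),
}
CASSETTE_ORDER = ("promoter", "rbs", "gene", "terminator")


def scan(dna):
    """Scanner: split a DNA string into known parts (fixed 6-base tokens here)."""
    tokens = []
    for i in range(0, len(dna), 6):
        chunk = dna[i:i + 6]
        if chunk not in PARTS:
            raise ValueError(f"unknown part at position {i}: {chunk!r}")
        tokens.append(PARTS[chunk])
    return tokens


def parse(tokens):
    """Parser: accept one or more promoter-rbs-gene-terminator cassettes."""
    if not tokens or len(tokens) % 4 != 0:
        raise SyntaxError("expected a whole number of cassettes")
    cassettes = []
    for i in range(0, len(tokens), 4):
        group = tokens[i:i + 4]
        if tuple(category for category, _ in group) != CASSETTE_ORDER:
            raise SyntaxError(f"malformed cassette starting at part {i}")
        cassettes.append({category: name for category, name in group})
    return cassettes


def translate(cassettes):
    """Semantic actions: emit reaction strings for each validated cassette."""
    equations = []
    for c in cassettes:
        mrna, protein = f"mRNA_{c['gene']}", f"protein_{c['gene']}"
        equations += [
            f"{c['promoter']} -> {c['promoter']} + {mrna}",  # transcription
            f"{mrna} -> {mrna} + {protein}",                 # translation
            f"{mrna} -> 0",                                  # mRNA degradation
            f"{protein} -> 0",                               # protein degradation
        ]
    return equations
```

Because each function consumes the previous function's output, the chain `translate(parse(scan(dna)))` runs the whole workflow, and `translate(parse(tokens))` starts from an already scanned list of parts.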
Figure 2. Parse tree showing the derivation process of a two-cassette genetic construct.
In the derivation tree, terms in <> correspond to the non-terminals in the grammar, terms in [ ] are terminals, and dashed lines indicate the transformation to terminals. Subscripts distinguish different instances of the same category.
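The notation in this caption can be illustrated with a short Python sketch that prints a derivation tree as indented text, with `<>` around non-terminals and `[ ]` around terminals. The tree shape, the rule `restConstructs` → empty, and the part names (`pLac`, `rbsA`, `t1`, and so on) are assumptions chosen for illustration, not a transcription of the figure.

```python
def render(node, depth=0):
    """Return indented lines: <name> for non-terminals, [name] for terminals."""
    pad = "  " * depth
    if isinstance(node, str):      # a terminal is stored as a plain string
        return [f"{pad}[{node}]"]
    name, children = node          # a non-terminal is (name, list_of_children)
    lines = [f"{pad}<{name}>"]
    for child in children:
        lines.extend(render(child, depth + 1))
    return lines


def cassette(i, promoter, rbs, gene, terminator):
    """Build one cassette subtree; the suffix _i plays the role of a subscript."""
    return (f"cassette_{i}", [
        (f"promoter_{i}", [promoter]),
        (f"RBS_{i}", [rbs]),
        (f"gene_{i}", [gene]),
        (f"terminator_{i}", [terminator]),
    ])


# Hypothetical two-cassette construct.
tree = ("constructs", [
    cassette(1, "pLac", "rbsA", "tetR", "t1"),
    ("restConstructs", [
        cassette(2, "pTet", "rbsB", "lacI", "t1"),
        ("restConstructs", []),    # derives the empty string
    ]),
])

if __name__ == "__main__":
    print("\n".join(render(tree)))
```

Reading the bracketed leaves from top to bottom recovers the sequence of parts in the construct.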
Figure 3. An example of an attribute grammar.
Attributes associated with non-terminals.
| Non-terminals | Inherited Attributes | Synthesized Attributes |
| constructs | protein_list | promoter_list, equation_list |
| cassette | protein_list | promoter_list, equation_list |
| restConstructs | protein_list | promoter_list, equation_list |
| cistron | protein_list | transcript, equation_list |
| promoter | - | name, transcription_rate, leakiness_rate, repressor_list |
| RBS | - | name, translation_rate |
| gene | - | name, mRNA_degradation_rate, protein_degradation_rate |
| terminator | - | name |
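The split between inherited and synthesized attributes in this table can be sketched as a two-pass evaluation in Python. This is a simplified illustration under stated assumptions: the rate values, the reaction strings, and the function names are hypothetical, the RBS, terminator, and cistron levels are omitted, and the article's actual semantic rules are not reproduced.

```python
from dataclasses import dataclass, field


@dataclass
class Promoter:
    name: str
    transcription_rate: float
    leakiness_rate: float
    repressor_list: list = field(default_factory=list)


@dataclass
class Gene:
    name: str
    mRNA_degradation_rate: float
    protein_degradation_rate: float


@dataclass
class Cassette:
    promoter: Promoter
    gene: Gene


def collect_proteins(cassettes):
    """First pass: list every protein encoded anywhere in the construct."""
    return [c.gene.name for c in cassettes]


def evaluate_cassette(cassette, protein_list):
    """Second pass: protein_list is inherited (passed down from the parent);
    promoter_list and equation_list are synthesized (returned upward)."""
    p = cassette.promoter
    equations = [f"{p.name} -> {p.name} + mRNA_{cassette.gene.name}"]
    for repressor in p.repressor_list:
        if repressor in protein_list:  # emit repression only if the protein exists
            equations.append(f"{repressor} + {p.name} <-> {repressor}.{p.name}")
    return [p.name], equations


def evaluate_constructs(cassettes):
    """Combine the synthesized attributes of all cassettes."""
    protein_list = collect_proteins(cassettes)
    promoter_list, equation_list = [], []
    for c in cassettes:
        promoters, equations = evaluate_cassette(c, protein_list)
        promoter_list += promoters
        equation_list += equations
    return promoter_list, equation_list


# Hypothetical toggle switch: each promoter is repressed by the other gene's product.
c1 = Cassette(Promoter("pLac", 0.5, 0.01, ["lacI"]), Gene("tetR", 0.1, 0.05))
c2 = Cassette(Promoter("pTet", 0.5, 0.01, ["tetR"]), Gene("lacI", 0.1, 0.05))
```

With both cassettes present, `evaluate_constructs([c1, c2])` yields a repression reaction for each promoter. With `c1` alone, no repression reaction is produced because `lacI` is missing from the inherited `protein_list`. This is why a second pass is needed: whether a promoter is repressed depends on genes that may appear later in the sequence.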
Figure 4. Equation generators.
Figure 5. Chemical equations translated from a DNA sequence.
Context-dependency of experimentally determined translation rates.
| Mutant | RBS | ORF | Expression | Translation rate function |
| 1 | RBS WT | ORF WT | 100 | translation_rate(RBS WT) |
| 6 | RBS WT | ORF2 | 100 | translation_rate(RBS WT) |
| 7 | RBS WT | ORF3 | 100 | translation_rate(RBS WT) |
| 17 | RBS WT | ORF4 | 3 | translation_rate(RBS WT, ORF4) |
| 20 | RBS WT | ORF5 | 6 | translation_rate(RBS WT, ORF5) |
| 23 | RBS WT | ORF6 | 0.3 | translation_rate(RBS WT, ORF6) |
| 4 | RBS1 | ORF WT | 100 | translation_rate(RBS1) |
| 2 | RBS1 | ORF1 | 100 | translation_rate(RBS1) |
| 3 | RBS1 | ORF2 | 100 | translation_rate(RBS1) |
| 5 | RBS1 | ORF3 | 4 | translation_rate(RBS1, ORF3) |
| 14 | RBS1 | ORF4 | <0.003 | translation_rate(RBS1, ORF4) |
| 9 | RBS2 | ORF WT | 100 | translation_rate(RBS2) |
| 8 | RBS2 | ORF1 | 100 | translation_rate(RBS2) |
| 10 | RBS2 | ORF3 | 100 | translation_rate(RBS2) |
| 12 | RBS3 | ORF WT | 100 | translation_rate(RBS3) |
| 11 | RBS3 | ORF1 | 20 | translation_rate(RBS3, ORF1) |
| 13 | RBS3 | ORF3 | 100 | translation_rate(RBS3) |
| 15 | RBS4 | ORF4 | 0.1 | translation_rate(RBS4) |
| 16 | RBS5 | ORF4 | 0.05 | translation_rate(RBS5) |
| 22 | RBS6 | ORF WT | 0.2 | translation_rate(RBS6, ORF WT) |
| 18 | RBS6 | ORF4 | 80 | translation_rate(RBS6) |
| 21 | RBS7 | ORF WT | 100 | translation_rate(RBS7) |
| 19 | RBS7 | ORF4 | 100 | translation_rate(RBS7) |
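The last column of this table suggests a rate function that depends on the RBS alone unless a specific RBS and ORF pair has its own entry. A minimal Python sketch of that lookup follows. It assumes, purely for illustration, that the Expression column can stand in for the rate, and it includes only a few rows of the table. The dictionary names are hypothetical.

```python
# RBS-only entries, taken from rows whose rate function is translation_rate(RBS).
# Assumption: the table's Expression value is used as a stand-in for the rate.
RBS_ONLY = {
    "RBS WT": 100.0,
    "RBS1": 100.0,
    "RBS2": 100.0,
    "RBS3": 100.0,
}

# Context-dependent entries, from rows whose rate function is translation_rate(RBS, ORF).
RBS_ORF = {
    ("RBS WT", "ORF4"): 3.0,
    ("RBS WT", "ORF5"): 6.0,
    ("RBS WT", "ORF6"): 0.3,
    ("RBS1", "ORF3"): 4.0,
    ("RBS3", "ORF1"): 20.0,
}


def translation_rate(rbs, orf=None):
    """Return the pair-specific rate if one is recorded, else the RBS-only rate."""
    if orf is not None and (rbs, orf) in RBS_ORF:
        return RBS_ORF[(rbs, orf)]
    return RBS_ONLY[rbs]
```

For example, `translation_rate("RBS WT", "ORF2")` falls back to the RBS-only value, whereas `translation_rate("RBS WT", "ORF4")` returns the pair-specific value. In an attribute grammar this corresponds to a semantic rule that reads attributes of both the RBS and the gene rather than of the RBS alone.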
Figure 6. Mapping the behavior of 384 genetic constructs.
Each panel (A to F) corresponds to a different pair of repressors in a toggle switch: (A) tetR and lacI, (B) lacI and tetR, (C) lacI and cI, (D) cI and lacI, (E) cI and tetR, and (F) tetR and cI. Networks that cannot give rise to bistability (e.g. a construct with tetR as both genes) are excluded, as are designs that vary only the GFP RBS (see text). Each pair is explored by varying the RBSs, ordered by translational efficiency from low (RBS H) to high (RBS B) as determined by a qualitative fit of the results of Gardner et al. [24] with consistent letter-based labels, and calculating the detectability ratio, defined as the steady-state GFP concentration in the “on” state divided by the concentration in the “off” state. These ratios are displayed using a color map as indicated by the legend to the right. Monostable constructs have a ratio of 1 and are indicated by gray boxes. The ratio measures how easily the two steady states can be distinguished, which matters because experimental noise is high. Each panel also outlines the traditional two-parameter bifurcation diagram of the gene pair as the translation rates are varied by changing RBSs. Constructs near the edge of the cusp operate near saddle-node bifurcations and are more prone to noise-induced switching. Thus, constructs from the cusp interior are preferred for robust behavior.
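The detectability ratio defined in this caption can be sketched numerically in Python. The model below is an assumption: it is a generic dimensionless two-repressor toggle switch integrated with forward Euler, not the model generated by the article's grammar. The parameters `a1` and `a2` stand in for the effective synthesis strengths that changing an RBS would alter, and the GFP level is assumed to be proportional to repressor `u`.

```python
def steady_state(a1, a2, n, u0, v0, dt=0.01, steps=5000):
    """Integrate du/dt = a1/(1+v^n) - u, dv/dt = a2/(1+u^n) - v with forward Euler
    and return the state reached after `steps` steps."""
    u, v = u0, v0
    for _ in range(steps):
        du = a1 / (1.0 + v ** n) - u
        dv = a2 / (1.0 + u ** n) - v
        u, v = u + dt * du, v + dt * dv
    return u, v


def detectability_ratio(a1, a2, n=2.0):
    """Ratio of the reporter level in the 'on' state to that in the 'off' state.
    Assumption: the reporter tracks u. Monostable systems give a ratio near 1."""
    u_on, _ = steady_state(a1, a2, n, u0=a1, v0=0.0)   # start with u dominant
    u_off, _ = steady_state(a1, a2, n, u0=0.0, v0=a2)  # start with v dominant
    return u_on / u_off


if __name__ == "__main__":
    # Hypothetical sweep over synthesis strengths, analogous to one panel of the map.
    for a1 in (1.0, 5.0, 10.0):
        print([round(detectability_ratio(a1, a2), 2) for a2 in (1.0, 5.0, 10.0)])
```

With strong, balanced synthesis (`a1 = a2 = 10`) the two initial conditions settle into different states and the ratio is large. With weak synthesis (`a1 = a2 = 1`) both runs converge to the same state and the ratio is close to 1, matching the gray monostable boxes.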