Ayca Cankorur-Cetinkaya, Joao M L Dias, Jana Kludas, Nigel K H Slater, Juho Rousu, Stephen G Oliver, Duygu Dikicioglu.
Abstract
Multiple interacting factors affect the performance of engineered biological systems in synthetic biology projects. The complexity of these biological systems means that experimental design should often be treated as a multiparametric optimization problem. However, the available methodologies are either impractical, due to a combinatorial explosion in the number of experiments to be performed, or are inaccessible to most experimentalists due to the lack of publicly available, user-friendly software. Although evolutionary algorithms may be employed as alternative approaches to optimize experimental design, the lack of simple-to-use software again restricts their use to specialist practitioners. In addition, the lack of subsidiary approaches to further investigate critical factors and their interactions prevents the full analysis and exploitation of the biotechnological system. We have addressed these problems and, here, provide a simple-to-use and freely available graphical user interface to empower a broad range of experimental biologists to employ complex evolutionary algorithms to optimize their experimental designs. Our approach exploits a Genetic Algorithm (GA) to discover the subspace containing the optimal combination of parameters, and Symbolic Regression (SR) to construct a model to evaluate the sensitivity of the experiment to each parameter under investigation. We demonstrate the utility of this method using an example in which the culture conditions for the microbial production of a bioactive human protein are optimized. CamOptimus is available at https://doi.org/10.17863/CAM.10257.
Year: 2017 PMID: 28635591 PMCID: PMC5817226 DOI: 10.1099/mic.0.000477
Source DB: PubMed Journal: Microbiology ISSN: 1350-0872 Impact factor: 2.777
GA – experimental protocol conversion table for commonly employed terminology
| GA term | Equivalent in the current experimental setup |
|---|---|
| Gene | Individual experimental factor |
| Number of bits (b) | Number of binary digits (0 or 1) assigned to describe the value (i.e. the ‘length’) of each ‘gene’ |
| 2^b | Number of levels to which each ‘gene’ can be assigned |
| Chromosome | Individual set of conditions to be tested experimentally |
| Generation | Each round of experiments |
| Population | Number of ‘chromosomes’ to be experimentally tested in each ‘generation’ |
| Evolution | Narrowing down the range of experimental conditions to reach the required objective through a course of consecutive rounds of experiments (‘generations’) |
| Score | Evaluation of how well suited the condition is to achieving the required objective (how ‘fit’ the ‘chromosome’ is) |
| Parent | One of the two ‘chromosomes’ to undergo genetic hybridization |
| Child/Offspring | One of the two new ‘chromosomes’ generated |
| Mating | Process by which two ‘parent chromosomes’ recombine to yield the two ‘children’ |
| Cross-over | Point where the recombination event occurs |
| Mutation | A random change in the bit value (0 to 1 or 1 to 0) introduced with an assigned probability |
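The terminology in the table above can be made concrete with a minimal Python sketch. It is not the CamOptimus implementation: the bit-string representation, single-point cross-over, the function names `decode`, `mate` and `mutate`, and the values b = 2 and mutation probability 0.01 are all assumptions chosen for illustration. Nine ‘genes’ are used only because nine factors are investigated in the example system.

```python
import random

def decode(chromosome, bits_per_gene):
    """Split a bit string into 'genes' and return each gene's level index (0 .. 2**b - 1)."""
    return [
        int(chromosome[i:i + bits_per_gene], 2)
        for i in range(0, len(chromosome), bits_per_gene)
    ]

def mate(parent_a, parent_b, rng):
    """Single-point cross-over (assumed): two 'parents' recombine to yield two 'children'."""
    point = rng.randrange(1, len(parent_a))  # cross-over point, never at the ends
    return (parent_a[:point] + parent_b[point:],
            parent_b[:point] + parent_a[point:])

def mutate(chromosome, probability, rng):
    """Flip each bit (0 to 1 or 1 to 0) independently with the assigned probability."""
    return "".join(
        ("1" if bit == "0" else "0") if rng.random() < probability else bit
        for bit in chromosome
    )

rng = random.Random(0)   # fixed seed so the sketch is repeatable
b = 2                    # bits per gene, so each gene has 2**b = 4 levels (assumed)
n_genes = 9              # one gene per experimental factor

# A small 'population' of random 'chromosomes' (sets of conditions to test).
population = ["".join(rng.choice("01") for _ in range(b * n_genes)) for _ in range(4)]

child_1, child_2 = mate(population[0], population[1], rng)
child_1 = mutate(child_1, 0.01, rng)
levels = decode(child_1, b)  # level index of each factor for the next round of experiments
```

Each entry of `levels` would then be mapped to an actual concentration or pH value through a user-defined lookup, which is omitted here.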
Fig. 1. Metrics of the convergence score in finding the optimized sub-space of environmental parameters for improving recombinant production yield. (a) The average enzyme activity scores for all the individuals in the ‘population’ (PA) are displayed in blue for each ‘generation’, with generations along the abscissa. The average scores for the better-performing half (50 %) of the ‘population’ (BPHA) are displayed similarly in red. The error bars represent the variation among the individuals in each ‘population’. The black solid line connecting the black markers represents the trajectory of the best-performing individual (BPI) in each ‘generation’. (b) The contraction of the optimal sub-space through the course of this heuristic search is represented by the relative standard deviation (RSD) in the better-performing fraction of each ‘generation’.
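The four quantities named in this caption (PA, BPHA, BPI and RSD) could be computed per ‘generation’ roughly as follows. This is a sketch under stated assumptions: a higher score is taken to be better, scores are positive, the RSD uses the sample standard deviation, and the function name `generation_metrics` and the example scores are hypothetical rather than taken from the study.

```python
import statistics

def generation_metrics(scores):
    """Summarise one generation: PA, BPHA, BPI and the RSD (%) of the better-performing half."""
    ranked = sorted(scores, reverse=True)             # assumes higher score = better
    better_half = ranked[:max(1, len(ranked) // 2)]   # top 50 % of the population
    bpha = statistics.mean(better_half)
    # Sample standard deviation; defined as 0 when only one individual remains.
    spread = statistics.stdev(better_half) if len(better_half) > 1 else 0.0
    return {
        "PA": statistics.mean(scores),    # population average
        "BPHA": bpha,                     # better-performing-half average
        "BPI": ranked[0],                 # best-performing individual
        "RSD": 100.0 * spread / bpha,     # relative standard deviation, in percent
    }

# Illustrative scores only, not data from the study.
metrics = generation_metrics([2.0, 4.0, 6.0, 8.0])
```

A falling RSD across successive generations would indicate that the better-performing conditions are clustering into a narrower sub-space.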
Fig. 2. Population profiling for monitoring the absolute frequency of the levels of each factor tested in the population over generations. Absolute frequency denotes the number of times that level has been assigned to the ‘individuals’ in each ‘generation’. The factors represented here are: methanol (a), sorbitol (b), pH (c), (NH4)2PO4 (d), KCl (e), glycerol (f), FeSO4·7H2O (g), CaCl2·2H2O (h) and MgSO4·7H2O (i). The levels that appeared at least once in the search are displayed along the abscissa, and the absolute frequency is displayed along the ordinate. In each plot, the tone of the blue bars becomes darker for further ‘generations’, with the lightest shade representing the first ‘generation’ and the darkest tone representing the third, and last, ‘generation’ of the evolution experiments.
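The absolute frequency described in this caption is a plain count of how often each level of a factor occurs among the ‘individuals’ of a ‘generation’. A minimal sketch follows, assuming each individual is stored as a dictionary of factor levels; the function name `level_frequencies` and the pH values are hypothetical.

```python
from collections import Counter

def level_frequencies(generations, factor):
    """Return one Counter per generation: level of `factor` -> number of individuals."""
    return [
        Counter(individual[factor] for individual in generation)
        for generation in generations
    ]

# Illustrative data only: two generations of three individuals each.
generations = [
    [{"pH": 5.0}, {"pH": 6.0}, {"pH": 6.0}],
    [{"pH": 6.0}, {"pH": 6.0}, {"pH": 7.0}],
]
profile = level_frequencies(generations, "pH")
```

Levels that never appear in a generation simply count as zero, which matches plotting only the levels that occurred at least once in the search.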
Evaluation of the SR models for each individual objective
| Objective | Model | R2 | Adjusted R2* |
|---|---|---|---|
| ODbi | SR | 0.645 | 0.583 |
| ODbi | MLR | 0.131 | −0.021 |
| OD(ai−bi) | SR | 0.745 | 0.685 |
| OD(ai−bi) | MLR | 0.332 | 0.173 |
| Enzyme activity (Ea) | SR | 0.804 | 0.758 |
| Enzyme activity (Ea) | MLR | 0.595 | 0.499 |
| Specific productivity (P) | SR | 0.881 | 0.853 |
| Specific productivity (P) | MLR | 0.610 | 0.518 |
*R2 that has been adjusted for the number of predictors in the model.
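The adjustment mentioned in the footnote is conventionally computed as 1 − (1 − R2)(n − 1)/(n − p − 1), where n is the number of observations and p the number of predictors. A short sketch of that standard formula is given below. Neither n nor p is stated in this record, so the values in the usage line (n = 48, p = 9) are assumptions for illustration only.

```python
def adjusted_r2(r2, n_observations, n_predictors):
    """Standard adjusted R2: penalises R2 for the number of predictors in the model."""
    dof = n_observations - n_predictors - 1
    if dof <= 0:
        raise ValueError("need more observations than predictors + 1")
    return 1.0 - (1.0 - r2) * (n_observations - 1) / dof

# Assumed n and p (not given in this record); R2 taken from the MLR row for enzyme activity.
example = adjusted_r2(0.595, n_observations=48, n_predictors=9)
```

Because the penalty grows with the number of predictors, a weak model can have a negative adjusted R2 even though R2 itself is never negative for a least-squares fit with an intercept.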
Summary of major contributor factors for each individual objective
| Factor | ODbi | OD(ai−bi) | Ea | P |
|---|---|---|---|---|
| pH | ||||
| Glycerol | ||||
| Ammonium | ||||
| Methanol | ||||
| Sorbitol | ||||
| Calcium | ||||
| Potassium | ||||
| Iron | ||||
| Magnesium |
*n/a, not applicable since cultivation medium did not contain methanol and sorbitol during the pre-induction phase.
Fig. 3. Example input/output entries for the CamOptimus GUI and the process algorithm. A test system investigating nine parameters for optimizing four objectives is provided above. The optimization search is conducted for three generations using the GA (a) and the data generated are modelled by SR (b). The algorithms for conducting the optimization search and analyzing the data are shown as annotations to the GUI screenshots.