
gcFront: a tool for determining a Pareto front of growth-coupled cell factory designs.

Laurence Legon1,2, Christophe Corre2, Declan G Bates1, Ahmad A Mannan1.   

Abstract

MOTIVATION: A widely applicable strategy to create cell factories is to knock out (KO) genes or reactions to redirect cell metabolism so that chemical synthesis is made obligatory when the cell grows at its maximum rate. Synthesis is thus growth-coupled, and the stronger the coupling the more deleterious any impediments in synthesis are to cell growth, making high producer phenotypes evolutionarily robust. Additionally, we desire that these strains grow and synthesise at high rates. Genome-scale metabolic models can be used to explore and identify KOs that growth-couple synthesis, but these are rare in an immense design space, making the search difficult and slow.
RESULTS: To address this multi-objective optimization problem, we developed a software tool named gcFront, which uses a genetic algorithm to explore KOs that maximise cell growth, product synthesis and coupling strength. Moreover, our measure of coupling strength facilitates the search, so that gcFront not only finds a growth-coupled design in minutes but also outputs many alternative Pareto optimal designs from a single run, granting users flexibility in selecting designs to take to the lab.
AVAILABILITY: gcFront, with documentation and a workable tutorial, is freely available at GitHub: https://github.com/lLegon/gcFront and archived at Zenodo, DOI: 10.5281/zenodo.5557755 (Legon et al., 2022).
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author(s) 2022. Published by Oxford University Press.


Year:  2022        PMID: 35642935      PMCID: PMC9272801          DOI: 10.1093/bioinformatics/btac376

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.931


1 Introduction

Genome-scale constraint-based models (GSMs) are used to explore gene or reaction knockouts (KOs) that redirect cell metabolism to chemical overproduction (Maia ). A promising strategy for enabling robust production seeks KO combinations that couple chemical synthesis with cell growth, so that synthesis is made obligatory at the maximum growth rate (Feist ). KOs can disrupt metabolism and result in poorer performance than predicted, but growth coupling enables the selection of higher-producing phenotypes by selecting faster-growing cells through adaptive laboratory evolution (ALE). KOs by gene deletion are easily implemented in the lab and, unlike engineered changes in gene expression, remain fixed in the face of evolution; accordingly, ALE has been shown to find strains with synthesis and growth rates near the optimal values predicted from GSMs (Tokuyama ). However, if the coupling is weak, cells will not synthesize the product unless they grow close to their theoretical maximum. In contrast, KOs that create a strong coupling result in evolutionarily robust phenotypes with robust synthesis, and so are particularly appealing. Specifically, under stronger coupling even small impediments in product synthesis strongly impair growth, so higher producers will be reselected over evolutionary time; stronger coupling also helps conserve synthesis rates even if cells grow at suboptimal rates, for instance in large fermenters (Supplementary Fig. S1). In addition to strong coupling, we also desire that these strains both grow and synthesize rapidly. Identifying the KO sets, i.e. designs, that maximize these criteria is a multi-objective optimization problem. However, there are inherent trade-offs between some of these objectives, so solving this problem gives a set of alternative optimal designs, in each of which no objective can be improved without sacrificing at least one of the others. This is known as a Pareto front of optimal designs.
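The notion of a Pareto front can be illustrated with a minimal Python sketch. It is not gcFront's code (gcFront runs in MATLAB), and the design names and objective values below are hypothetical, chosen only to show how a dominated design is excluded when all three objectives are maximized.

```python
from typing import Dict, List, Tuple

# One score per objective: (growth rate, synthesis rate, coupling strength).
Objectives = Tuple[float, float, float]


def dominates(a: Objectives, b: Objectives) -> bool:
    """True if a is at least as good as b in every objective and better in one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(designs: Dict[str, Objectives]) -> List[str]:
    """Return the names of designs that no other design dominates."""
    return sorted(
        name
        for name, score in designs.items()
        if not any(dominates(other, score) for other in designs.values())
    )


# Hypothetical designs and objective values, for illustration only.
designs = {
    "A": (0.80, 2.0, 0.10),
    "B": (0.60, 5.0, 0.40),
    "C": (0.55, 4.0, 0.30),  # worse than B in every objective
    "D": (0.30, 8.0, 0.90),
}
front = pareto_front(designs)
print(front)
```

Design C is dropped because B is better in all three objectives, whereas A, B and D each trade one objective against another and therefore all remain on the front.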
Multi-objective optimization has been applied in metabolic engineering, for instance to kinetic models to find Pareto optimal reaction kinetics that maximize synthesis (Sendín ; Vera ), and tools have been developed for use on GSMs to determine genetic manipulations that maximize growth and synthesis (Andrade et al., 2020; Patané ). Other tools have been developed to find growth-coupled designs (Alter and Ebert, 2019; Feist ; Ohno ), yet there is no tool to determine optimal designs that maximize coupling strength, growth and synthesis together, in order to create evolutionarily robust strains with high productivity and robust synthesis, which is critical for industrial application. Moreover, though growth coupling is a widely applicable strategy (von Kamp and Klamt, 2017), KOs enabling it are rare, making the search for them difficult and slow (Ohno ). To address this gap, we developed a user-friendly software tool named gcFront that uses a genetic algorithm to search for KOs that maximize these three objectives, for any chemical and host of interest. Moreover, our proposed measure of coupling strength facilitates the search through the design space, so a run of gcFront outputs many Pareto optimal designs in reasonable timeframes.

2 The gcFront workflow

gcFront works in MATLAB, with dependencies on the COBRA toolbox (Heirendt ) for analysis of a compatible GSM, and on the MATLAB Global Optimization toolbox for solving the multi-objective optimization problem (Supplementary Note S1A). The workflow, detailed in Supplementary Note S1B and Fig. S2, entails four key steps.
Step 1: Two interactive windows allow the user to define the GSM, the target metabolite product or its exchange reaction, and optional inputs (Supplementary Table S1), such as the maximum number of KOs and the search time.
Step 2: To reduce the search space of reactions, gcFront automatically identifies and removes dead reactions, lumps unbranched pathways into composite reactions and excludes single KOs that are essential in silico for growth or synthesis.
Step 3: To determine growth-coupled designs, gcFront solves the multi-objective optimization problem defined in Supplementary Note S2A. Our measure of coupling strength shapes the search landscape; it defines weak and strong coupling but also distinguishes between uncoupled designs (Supplementary Note S2B and Fig. S3a). It assigns higher values to KOs that reduce the cost to growth for increases in the maximum allowable synthesis, thus biasing the search towards growth-coupled designs (Supplementary Fig. S3b and c) and easing it.
Step 4: On termination (conditions in Supplementary Table S1), many Pareto optimal KO sets are found from a single run. Some proposed designs may contain redundant KOs, so to minimize the number of KOs in each design, any KO that can be removed without any loss in performance is removed. The Pareto front of all designs (KOs) and their performance is then output to an interactive plot, a table in the command window and a .csv file. Users can select designs they deem suitable for their chemical and host of interest, based on bespoke combinations of the performance metrics. A tutorial is given in Supplementary Note S3.
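The removal of redundant KOs described in the final step can be sketched as a greedy pruning loop. This is an illustrative Python sketch under stated assumptions, not gcFront's MATLAB implementation: the `evaluate` callback, the reaction names and the toy performance values are all hypothetical stand-ins for a simulation of the GSM.

```python
from typing import Callable, FrozenSet, Tuple

Performance = Tuple[float, float, float]  # (growth, synthesis, coupling strength)


def prune_redundant_kos(
    design: FrozenSet[str],
    evaluate: Callable[[FrozenSet[str]], Performance],
) -> FrozenSet[str]:
    """Greedily drop KOs whose removal loses nothing relative to the original design."""
    current = design
    baseline = evaluate(design)
    changed = True
    while changed:
        changed = False
        for ko in sorted(current):
            trial = current - {ko}
            score = evaluate(trial)
            # Keep the smaller design only if no objective got worse.
            if all(s >= b for s, b in zip(score, baseline)):
                current, changed = trial, True
                break
    return current


def toy_evaluate(kos: FrozenSet[str]) -> Performance:
    # Hypothetical stand-in for a model simulation: only R1 and R2 matter.
    if {"R1", "R2"} <= kos:
        return (0.5, 4.0, 0.6)
    return (0.9, 0.0, 0.0)


pruned = prune_redundant_kos(frozenset({"R1", "R2", "R3", "R4"}), toy_evaluate)
print(sorted(pruned))
```

In this toy case R3 and R4 contribute nothing, so the four-KO design reduces to the two KOs that actually produce the coupling. A greedy pass like this depends on the order in which KOs are tried, so it yields a minimal design but not necessarily the smallest possible one.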

3 Comparative performance assessment

To test gcFront’s performance, we compared it to other MATLAB-based procedures that identify growth-coupled (gc-)designs, including RobustKnock (Tepper and Shlomi, 2010) as implemented in OptPipe (Hartmann ); gcOpt (Alter and Ebert, 2019); FastPros (Ohno ); and OptGene (Patil ) as implemented in COBRA (Heirendt ). We ran each for 6 h and recorded the gc-designs found and the time needed to find the first gc-design; for gcFront and OptGene this was repeated three times because of the stochastic nature of searching with a genetic algorithm. For a fair comparison, we ran each algorithm on the Escherichia coli GSM iML1515 (Monk ), for non-essential reaction KOs [in silico and based on Goodall ], for synthesis of succinate, tyrosine and pyruvate as example products (detailed in Supplementary Note S4 and Fig. S1). gcFront found the first gc-design in 38% less time than gcOpt for succinate synthesis, 98% less time than RobustKnock for tyrosine synthesis, and orders of magnitude less time for the other methods and products (Fig. 1a, Supplementary Data). Its power was especially apparent when searching for designs for tyrosine and pyruvate synthesis: it still found designs in minutes even though these designs, which require at least six KOs, are rarer than the three-KO designs found for succinate (Supplementary Fig. S4). Furthermore, though the single gc-design found with other methods lay near the Pareto front of gc-designs from gcFront, gcFront offered many designs that were superior in at least coupling strength (Fig. 1b, Supplementary Data).
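The timing statistics quoted above (mean and standard deviation over three runs, and "X% less time" relative to another method) follow from simple arithmetic, sketched here in Python. The run times are hypothetical placeholders, not values from the paper, and the use of the sample standard deviation is an assumption, since the text does not say which estimator was used.

```python
from statistics import mean, stdev


def percent_less_time(t_new: float, t_ref: float) -> float:
    """Relative time saving of t_new against a reference time t_ref, in percent."""
    return 100.0 * (t_ref - t_new) / t_ref


# Hypothetical times (minutes) to the first gc-design in three stochastic runs.
run_minutes = [4.0, 5.0, 6.0]
avg = mean(run_minutes)
sd = stdev(run_minutes)  # sample standard deviation (n - 1), an assumption

# Hypothetical single-run time for a deterministic reference method.
reference_minutes = 8.0
reduction = percent_less_time(avg, reference_minutes)
print(f"{avg:.1f} +/- {sd:.1f} min, {reduction:.1f}% less time than the reference")
```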
Fig. 1.

gcFront finds many Pareto optimal growth-coupled designs, faster and with superior performance versus other algorithms. The speed and designs found from 6-h runs of gcFront were compared to those of RobustKnock, gcOpt, FastPros and OptGene (see Supplementary Note S4) on a MacBook Pro (2.3 GHz Quad-Core Intel core i5 processor, 8 GB 2133 MHz LPDDR3 RAM). Designs were based on KOs of only non-essential, gene-associated reactions, for the synthesis of three example products: succinate, tyrosine and pyruvate from the E.coli iML1515 GSM model, in aerobic, minimal media with glucose. (a) Time to identify the first gc-design from each procedure. Due to the stochastic nature of searching using the genetic algorithm in OptGene and gcFront, the average (bars) and standard deviation (error bars) of times are reported from three runs (N = 3, ±SD). (b) Pareto fronts of all gc-designs found from three 6-h runs


4 Discussion

gcFront can find a multitude of Pareto optimal growth-coupled designs for evolutionarily robust cell factories, from a single run, in a computationally efficient manner. Because the key input is the genome-scale metabolic network model of the cell host, together with the biochemistry of the engineered product synthesis pathway, gcFront should be widely applicable for designing growth-coupled synthesis of any compound, from any host, and so drive the design step in the design-build-test-learn cycle (Carbonell ). Since each design provides a different balance between the maximized objectives, the user has the flexibility to select designs with the balance they deem most suitable to the cell host and chemical product of interest. For example, the user may sacrifice growth for stronger coupling and synthesis, for instance for more robust pyruvate synthesis (Fig. 1b), or sacrifice synthesis for higher growth and stronger coupling, for instance for higher volumetric productivity with robust synthesis of succinate (Fig. 1b). This makes gcFront widely applicable to different contexts. gcFront is also user-friendly yet versatile: the interactive user interface means no coding is required, making gcFront easy to use out of the box, yet because it is a function in the MATLAB environment it can be easily integrated downstream of pathway design tools, such as the COBRA toolbox (Heirendt ) and RetroPath2.0 (Delépine ). Importantly, since gcFront proposes KOs for growth coupling and not changes in gene expression, strain construction and evolution are more easily automated. With recent technical advances in synthetic biology and lab robotics, we envision that, following user-led design selection, gcFront can be integrated in pipelines upstream of robotics platforms for automated plasmid construction and transformation, with a robot performing CRISPR-Cas9-based KOs (Suckling ), and for automated ALE with liquid-handling robotics, e.g. RoboLector (Radek ), making gcFront a tool for the future of creating microbial cell factories.
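One way a user might express such a preference between objectives is a weighted score over min-max scaled metrics. The Python sketch below is only an illustration of that idea: the weighting scheme, the design names and the values are hypothetical and are not part of gcFront, which leaves design selection to the user.

```python
from typing import Dict, List, Sequence, Tuple


def rank_designs(
    designs: Dict[str, Tuple[float, ...]], weights: Sequence[float]
) -> List[str]:
    """Rank designs by a weighted sum of min-max scaled objectives, best first."""
    n = len(weights)
    lows = [min(d[i] for d in designs.values()) for i in range(n)]
    highs = [max(d[i] for d in designs.values()) for i in range(n)]

    def score(values: Tuple[float, ...]) -> float:
        total = 0.0
        for v, lo, hi, w in zip(values, lows, highs, weights):
            # Scale each objective to [0, 1]; a constant objective contributes 0.
            scaled = (v - lo) / (hi - lo) if hi > lo else 0.0
            total += w * scaled
        return total

    return sorted(designs, key=lambda name: score(designs[name]), reverse=True)


# Hypothetical Pareto optimal designs: (growth, synthesis, coupling strength).
pareto_designs = {
    "A": (0.80, 2.0, 0.10),
    "B": (0.60, 5.0, 0.40),
    "D": (0.30, 8.0, 0.90),
}
robust_first = rank_designs(pareto_designs, weights=(0.0, 0.5, 0.5))
growth_first = rank_designs(pareto_designs, weights=(1.0, 0.0, 0.0))
print(robust_first, growth_first)
```

Weighting coupling and synthesis puts the slow-growing but strongly coupled design D first, while weighting growth alone reverses the order, which mirrors the trade-offs described above.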

Author contributions

L.L., A.A.M.: developed theory; L.L.: developed code, ran procedures, plotted results; A.A.M., D.G.B.: designed and supervised the research; all authors: wrote the paper.

Data availability

The data underlying this article are available in the article and its online Supplementary Information and Supplementary Data files.

Funding

This work was supported by funds from the Engineering and Physical Sciences Research Council [EP/L016494/1] and Biotechnology and Biological Sciences Research Council [BB/M017982/1].
Conflict of Interest: none declared.
References (17 in total; first 10 listed)

1.  Multicriteria optimization of biochemical systems by linear programming: application to production of ethanol by Saccharomyces cerevisiae.

Authors:  Julio Vera; Pedro de Atauri; Marta Cascante; Néstor V Torres
Journal:  Biotechnol Bioeng       Date:  2003-08-05       Impact factor: 4.530

2.  Application of adaptive laboratory evolution to overcome a flux limitation in an Escherichia coli production strain.

Authors:  Kento Tokuyama; Yoshihiro Toya; Takaaki Horinouchi; Chikara Furusawa; Fumio Matsuda; Hiroshi Shimizu
Journal:  Biotechnol Bioeng       Date:  2018-03-08       Impact factor: 4.530

3.  Miniaturized and automated adaptive laboratory evolution: Evolving Corynebacterium glutamicum towards an improved d-xylose utilization.

Authors:  Andreas Radek; Niklas Tenhaef; Moritz Fabian Müller; Christian Brüsseler; Wolfgang Wiechert; Jan Marienhagen; Tino Polen; Stephan Noack
Journal:  Bioresour Technol       Date:  2017-05-12       Impact factor: 9.642

4.  Model-driven evaluation of the production potential for growth-coupled products of Escherichia coli.

Authors:  Adam M Feist; Daniel C Zielinski; Jeffrey D Orth; Jan Schellenberger; Markus J Herrgard; Bernhard Ø Palsson
Journal:  Metab Eng       Date:  2009-10-17       Impact factor: 9.783

5.  iML1515, a knowledgebase that computes Escherichia coli traits.

Authors:  Jonathan M Monk; Colton J Lloyd; Elizabeth Brunk; Nathan Mih; Anand Sastry; Zachary King; Rikiya Takeuchi; Wataru Nomura; Zhen Zhang; Hirotada Mori; Adam M Feist; Bernhard O Palsson
Journal:  Nat Biotechnol       Date:  2017-10-11       Impact factor: 54.908

6.  RetroPath2.0: A retrosynthesis workflow for metabolic engineers.

Authors:  Baudoin Delépine; Thomas Duigou; Pablo Carbonell; Jean-Loup Faulon
Journal:  Metab Eng       Date:  2017-12-09       Impact factor: 9.783

7.  FastPros: screening of reaction knockout strategies for metabolic engineering.

Authors:  Satoshi Ohno; Hiroshi Shimizu; Chikara Furusawa
Journal:  Bioinformatics       Date:  2013-11-19       Impact factor: 6.937

8.  Growth-coupled overproduction is feasible for almost all metabolites in five major production organisms.

Authors:  Axel von Kamp; Steffen Klamt
Journal:  Nat Commun       Date:  2017-06-22       Impact factor: 14.919

9.  The Essential Genome of Escherichia coli K-12.

Authors:  Emily C A Goodall; Ashley Robinson; Iain G Johnston; Sara Jabbari; Keith A Turner; Adam F Cunningham; Peter A Lund; Jeffrey A Cole; Ian R Henderson
Journal:  mBio       Date:  2018-02-20       Impact factor: 7.867

10.  An automated Design-Build-Test-Learn pipeline for enhanced microbial production of fine chemicals.

Authors:  Pablo Carbonell; Adrian J Jervis; Christopher J Robinson; Cunyu Yan; Mark Dunstan; Neil Swainston; Maria Vinaixa; Katherine A Hollywood; Andrew Currin; Nicholas J W Rattray; Sandra Taylor; Reynard Spiess; Rehana Sung; Alan R Williams; Donal Fellows; Natalie J Stanford; Paul Mulherin; Rosalind Le Feuvre; Perdita Barran; Royston Goodacre; Nicholas J Turner; Carole Goble; George Guoqiang Chen; Douglas B Kell; Jason Micklefield; Rainer Breitling; Eriko Takano; Jean-Loup Faulon; Nigel S Scrutton
Journal:  Commun Biol       Date:  2018-06-08
