James D Winkler, Andrea L Halweg-Edwards, Ryan T Gill.
Abstract
We previously introduced the LASER database (Learning Assisted Strain EngineeRing, https://bitbucket.org/jdwinkler/laser_release) (Winkler et al. 2015) to serve as a platform for understanding past and present metabolic engineering practices. Over the past year, LASER has been expanded by 50% to include over 600 engineered strains from 450 papers, including their growth conditions, genetic modifications, and other information in an easily searchable format. Here, we present the results of our efforts to use LASER as a means for defining the complexity of a metabolic engineering "design". We evaluate two complexity metrics based on the concepts of construction difficulty and novelty. No correlation is observed between expected product yield and complexity, allowing minimization of complexity without a performance trade-off. We envision the use of such complexity metrics to filter and prioritize designs prior to implementation of metabolic engineering efforts, thereby potentially reducing the time, labor, and expenses of large-scale projects. Possible future developments based on an expanding LASER database are then discussed.
Keywords: Design tools; Metabolic engineering; Standardization; Synthetic biology
Year: 2016 PMID: 29468127 PMCID: PMC5779719 DOI: 10.1016/j.meteno.2016.07.002
Source DB: PubMed Journal: Metab Eng Commun ISSN: 2214-0301
Fig. 1. (A) Trends in the number of metabolic engineering papers published per year, along with the average number of mutations per strain (red line) in their designs. Years 1983–1997 and 2001 contain only a single datapoint. (B) Calculation of , along with the WGC score, for a single mutant. Node clusters are denoted by color. In this case, there is one mutation type (deletion, X; ), two mutated genes , one edge between a cluster containing a modified gene and a non-modified gene , and one intended effect of increasing the flux through part of the metabolic network . The resulting WGC score is . For study duration, these properties are calculated from all mutants described in the paper's LASER record. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
Fig. 2. The complexity distributions for the (A) WGC and (B) frequency metrics over the entire LASER dataset. Values greater than five times the median complexity are placed in the final bin of each histogram. Median WGC complexity is 1.79, while median frequency complexity is 137.4; both distributions indicate that LASER is highly skewed towards low-complexity designs.
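The binning rule in the Fig. 2 caption (values above five times the median are collected in the final histogram bin) can be sketched as follows. This is only an illustrative sketch, not the authors' analysis code: the function name `clipped_histogram`, the number of bins, the equal-width bins starting at zero, and the assumption that complexities are positive are all hypothetical choices made here.

```python
from statistics import median


def clipped_histogram(values, n_bins=10, cap_multiple=5.0):
    """Count values in equal-width bins over [0, cap], where
    cap = cap_multiple * median(values). Anything above the cap is
    clipped to it and therefore counted in the final bin.

    Assumes positive complexity scores (hypothetical, for illustration).
    """
    if not values:
        raise ValueError("values must be non-empty")
    cap = cap_multiple * median(values)
    if cap <= 0:
        raise ValueError("median must be positive to define the bins")
    width = cap / n_bins
    counts = [0] * n_bins
    for v in values:
        clipped = min(v, cap)  # outliers collapse onto the cap
        # A value exactly at the cap would index one past the end, so clamp it.
        idx = min(int(clipped / width), n_bins - 1)
        counts[idx] += 1
    return counts, cap


# Made-up scores: the median is 2, so the cap is 10 and the outlier (100)
# is counted in the last of the five bins.
counts, cap = clipped_histogram([1, 1, 2, 2, 3, 100], n_bins=5)
print(counts, cap)
```

Clipping to a multiple of the median keeps a few extreme designs from stretching the x-axis, which matters for a distribution as skewed towards low complexity as the one the caption describes.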
Fig. 3. (A) The correlation between LASER-extracted topological properties and experimenter-reported study lengths (dashed lines denote the 90% confidence interval for estimates), and (B) the distribution of LASER study time estimates generated using the correlation. Multilinear regression was performed with the SciPy optimize package using Eq. (2). (C) The Winkler-Gill design complexities of proposed E. coli succinic acid production strains, along with their predicted theoretical yields from glucose (right-hand y-axis). Each proposed design was converted into a LASER design as discussed in Methods and Materials and analyzed using the same analysis pipeline.