Douglas McCloskey, Bernhard Ø Palsson, Adam M Feist.
Abstract
The genome-scale model (GEM) of metabolism in the bacterium Escherichia coli K-12 has been in development for over a decade and is now in wide use. GEM-enabled studies of E. coli have been primarily focused on six applications: (1) metabolic engineering, (2) model-driven discovery, (3) prediction of cellular phenotypes, (4) analysis of biological network properties, (5) studies of evolutionary processes, and (6) models of interspecies interactions. In this review, we provide an overview of these applications along with a critical assessment of their successes and limitations, and a perspective on likely future developments in the field. Taken together, the studies performed over the past decade have established a genome-scale mechanistic understanding of genotype-phenotype relationships in E. coli metabolism that forms the basis for similar efforts for other microbial species. Future challenges include the expansion of GEMs by integrating additional cellular processes beyond metabolism, the identification of key constraints based on emerging data types, and the development of computational methods able to handle such large-scale network models with sufficient accuracy.
Year: 2013 PMID: 23632383 PMCID: PMC3658273 DOI: 10.1038/msb.2013.18
Source DB: PubMed Journal: Mol Syst Biol ISSN: 1744-4292 Impact factor: 11.429
Figure 1. History of the E. coli expression and metabolic reconstructions. The upper portion of the graph shows two milestone efforts contributing to the reconstruction of the E. coli transcription and translation network, and the bottom portion shows seven milestone efforts contributing to the reconstruction of the E. coli metabolic network. For each of the two reconstructions shown in the upper graph (Allen and Palsson, 2003; Thiele et al, 2009), the numbers of included transcription units (blue diamonds), genes (green triangles) and components (purple squares) are displayed. For each of the seven reconstructions shown in the bottom graph (Majewski and Domach, 1990; Varma and Palsson, 1993; Pramanik and Keasling, 1997; Edwards and Palsson, 2000; Reed et al, 2003; Feist et al, 2007; Orth et al, 2011), the numbers of included reactions (blue diamonds), genes (green triangles) and metabolites (purple squares) are displayed. Also listed is the noteworthy content expansion that each successive reconstruction provided over previous efforts. For example, Varma et al (1993) and Varma and Palsson (1993) included amino acid and nucleotide biosynthesis pathways in addition to the content that Majewski and Domach (1990) characterized. The start of the genomic era (Blattner et al, 1997) marked a significant increase in included components for successive iterations of the network reconstruction. The significant increase in the number of reactions in 2007 (Feist et al, 2007) was, in large part, due to the removal of many lumped reactions, which were often included for lipid and cell wall biosynthesis in earlier metabolic reconstructions. Thiele et al (2009) expanded the initial work of Allen and Palsson (2003) by increasing the scope of the transcription and translation network from a few example pathways to all known genes involved in protein synthesis (i.e., expression).
Not included on the timeline is a metabolic reconstruction based upon Reed et al (2003), which was modified to include additional reactions from the KEGG (Kanehisa et al, 2008) database and incorporated into the MetaFluxNet software package (Lee et al, 2005).
Figure 2. The detailed usage of the E. coli metabolic GEM over time. The cumulative and new numbers of studies published per year are shown, separated according to (A) the metabolic reconstruction used (Edwards and Palsson, 2000; Reed et al, 2003; Feist et al, 2007; Orth et al, 2011), (B) whether the study was in silico (i.e., strictly computational prediction) or combined in silico and in vivo (i.e., computational usage of the model and experimental validation or data generation guided by the model) and (C) the application category of the study. BD, model-driven discovery; BE, studies of evolutionary processes; II, interspecies interaction; ME, metabolic engineering; NA, analysis of network properties; PB, prediction of cellular phenotypes.
Figure 3. Six categories of uses and number of studies for each use of the E. coli metabolic GEM. The original five categories defined in 2008 (Feist and Palsson, 2008) include (A) metabolic engineering, (B) model-driven discovery, (C) prediction of cellular phenotypes, (D) analysis of biological network properties and (E) studies of evolutionary processes. A new category has been added, (F) interspecies interaction. The addition of this category signifies a growing trend in the field to explore the interaction of the E. coli metabolic network with other organisms and across different environmental conditions. Specifically, studies have explored host/pathogen interactions (Jain and Srivastava, 2009), cocultures (Wintermute and Silver, 2010; Hanly and Henson, 2011; Tzamali et al, 2011), ecology (Klitgord and Segrè, 2010) and chemotaxis (Kugler et al, 2010). The number of studies in this category is expected to increase, as the interest in understanding the complexities of microbial interactions and ecosystems continues to grow. The complete lists of the studies for each category are included in Supplementary Table S1.
Strengths and limitations of the metabolic GEM applications
| Application | What the model can do (strengths) | What the model cannot do (areas for future progress) |
|---|---|---|
| Metabolic engineering | Gene deletion (combinatorial) | Limited coverage of molecular biology |
| | Gene addition | Predicting the effects of perturbations to regulatory elements |
| | Gene over- and under-expression | Predicting allosteric inhibition |
| | Rapidly test the systemic effects of heterologous pathway additions | There is no explicit representation of metabolite concentrations |
| | Design biomarkers/biosensors for characteristic function | Account for enzyme kinetics |
| | Determine media supplementation strategies | Cannot accurately predict the performance of nonnative genes/proteins in |
| | Map high-throughput data to identify bottlenecks | |
| | Design strains through evolution | |
| Biological discovery | Predict growth on different carbon sources/media conditions | Predict the regulation of isozymes/parallel pathways |
| | Guide the functional assignment of network gaps | Predict enzyme promiscuity |
| | Guide the discovery of previously uncharacterized gene product functions (graph theory analysis) | Predictive power is inherently limited, because the model is not complete in scope |
| | Guide the reannotations of incorrectly annotated genes | Predict the expression of genes |
| | Connect orphan metabolites to known reactions | Predict the functional state of proteins (e.g., posttranslational modification) |
| Phenotypic behavior | Predict optimal cellular behavior | Differentiate between computed alternate optimal flux distributions of the cell |
| | Understand energetics and occurrence of suboptimal behavior | Explain the reasons for suboptimal performance |
| | Infer impact of regulation | Provide a framework for incorporating additional regulatory interactions that are currently under development |
| | Provide a context for which experimental data can be interpreted | |
| | Predict and understand absolute and conditional gene essentiality | |
| | Predict and understand shifts in growth conditions | |
| Network analysis | Evaluate metabolic networks from a systems view through node and link dependencies, essentialities, overall network robustness | Does not always include the biological mechanisms behind the network connections |
| | Describe the complex interactions of the components of the metabolic network | Few predictions can be experimentally validated |
| | Evaluate modularity of function | |
| | Evaluate regulation based on network structure | |
| Bacterial evolution | Predict essential genes | Account for changes in regulatory elements |
| | Predict the endpoint of evolution | Predict the time-course of evolution |
| | Understand the basis for epistatic interactions and mutational effects | Predict location of mutations in the genome |
| | Provide insights into evolutionary trajectories | Predict the effects of mutations in the genome |
| | | Account for strain-specific genomic differences |
| Interspecies interaction | Model the exchange of metabolites | Model interactions that affect metabolic regulation |
| | Analyze high-throughput data from different strains | Inability to measure flux exchange in multi cell-type systems |
| | Determine the cost/benefit ratio for different types of commensalism | There are still too many unknowns to accurately build an interactions network |
| | | Limited ability to define individual genetic content in large communities |
| | | Limited spatial knowledge in large communities |
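
Several strengths listed in the table (combinatorial gene deletion, absolute and conditional gene essentiality, growth on different media) can be illustrated on a toy network. The sketch below is not the E. coli GEM and does not use the constraint-based optimization that genome-scale studies typically rely on; it swaps in a simple reachability check, and every reaction, metabolite and gene name in it is hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, Set, Tuple

# (substrates, products, gene needed to catalyse the reaction).
# All names are made up for illustration; this is not real E. coli content.
Reaction = Tuple[FrozenSet[str], FrozenSet[str], str]

REACTIONS: Dict[str, Reaction] = {
    "R1": (frozenset({"glc"}), frozenset({"g6p"}), "geneA"),
    "R2a": (frozenset({"g6p"}), frozenset({"pyr"}), "geneB"),  # isozyme route 1
    "R2b": (frozenset({"g6p"}), frozenset({"pyr"}), "geneC"),  # isozyme route 2
    "R3": (frozenset({"ac"}), frozenset({"pyr"}), "geneE"),    # acetate route
    "R4": (frozenset({"pyr"}), frozenset({"biomass"}), "geneD"),
}


def can_produce(target: str, medium: Iterable[str], deleted: Iterable[str] = ()) -> bool:
    """Return True if `target` is reachable from `medium` without the deleted genes."""
    removed = set(deleted)
    available: Set[str] = set(medium)
    changed = True
    while changed:  # keep firing reactions until no new metabolite appears
        changed = False
        for substrates, products, gene in REACTIONS.values():
            if gene in removed:
                continue
            if substrates <= available and not products <= available:
                available |= products
                changed = True
    return target in available


def essential_genes(medium: Iterable[str], target: str = "biomass") -> Set[str]:
    """Genes whose single deletion blocks production of `target` on `medium`."""
    medium = set(medium)
    genes = {gene for _, _, gene in REACTIONS.values()}
    return {g for g in genes if not can_produce(target, medium, {g})}


if __name__ == "__main__":
    on_glucose = essential_genes({"glc"})
    on_acetate = essential_genes({"ac"})
    print("essential on glucose:", sorted(on_glucose))
    print("essential on acetate:", sorted(on_acetate))
    print("essential on both media:", sorted(on_glucose & on_acetate))
    # Combinatorial deletion: removing both isozymes blocks growth on glucose.
    print("growth without geneB and geneC:", can_produce("biomass", {"glc"}, {"geneB", "geneC"}))
```

In this toy setting, a gene that is essential on both media plays the role of an absolutely essential gene, while one that is essential on only one medium is conditionally essential. Reachability ignores stoichiometry, energetics and flux bounds, so it says nothing about growth rates or optimal behavior.
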
Figure 4. Iterative workflows. (A) A generic network reconstruction and model-driven systems biology workflow is a cyclic path that iterates between in silico predictions and in vivo observations. This general process has been followed in some of the more influential studies presented in this review. DNA sequencing and bibliomic data can be used to reconstruct and translate a biological system into a mathematical structure. Other omics data types that have been generated can be interpreted in the context of a reconstruction and computational model to analyze organism functions under specific conditions. This information becomes a de facto knowledge base that can be queried through a consortium of analytical methods. The aim of these methods is to hypothesize answers to complex biological questions that can often be nonintuitive or not readily apparent. Experiments can then be designed to test these predictions in order to either confirm GEM-derived explanations or move researchers one iteration closer to the answer. Example studies that have successfully iterated through the E. coli GEM workflow include (B) Reed et al (2006), (C) Shen et al (2010), (D) Yim et al (2011) and (E) Nakahigashi et al (2009).
Figure 5. The future of the E. coli GEM. The most widely used E. coli reconstruction accounts only for metabolism (the 'M' matrix) (Feist et al, 2009). However, efforts are currently underway to integrate the operon structure that determines cellular regulation (the 'O' matrix), the transcriptional and translational machinery allowing for the expression of proteins (the 'E' matrix; Thiele et al, 2009) and other cellular processes (e.g., DNA replication, posttranslational modifications, etc.) with metabolism. The integration of these cellular processes, supported by high-throughput data types, into a single mathematical model, will allow researchers to more accurately compute complex phenotypes, and will guide the discovery of unknown aspects of cellular functions beyond that of just metabolism.