Maxime Durot, Pierre-Yves Bourguignon, Vincent Schachter.
Abstract
Genome-scale metabolic models bridge the gap between genome-derived biochemical information and metabolic phenotypes in a principled manner, providing a solid interpretative framework for experimental data related to metabolic states, and enabling simple in silico experiments with whole-cell metabolism. Models have been reconstructed for almost 20 bacterial species, so far mainly through expert curation efforts integrating information from the literature with genome annotation. A wide variety of computational methods exploiting metabolic models have been developed and applied to bacteria, yielding valuable insights into bacterial metabolism and evolution, and providing a sound basis for computer-assisted design in metabolic engineering. Recent advances in computational systems biology and high-throughput experimental technologies pave the way for the systematic reconstruction of metabolic models from genomes of new species, and a corresponding expansion of the scope of their applications. In this review, we provide an introduction to the key ideas of metabolic modeling, survey the methods and resources that enable model reconstruction and refinement, and chart applications to the investigation of global properties of metabolic systems, the interpretation of experimental results, and the re-engineering of their biochemical capabilities.
Year: 2008 PMID: 19067749 PMCID: PMC2704943 DOI: 10.1111/j.1574-6976.2008.00146.x
Source DB: PubMed Journal: FEMS Microbiol Rev ISSN: 0168-6445 Impact factor: 16.408
Fig. 1. Genome-scale modeling of metabolism. A metabolic network (top left) is transformed into a model by defining the boundaries of the system, a biomass assembly reaction, and exchange fluxes with the environment (top right). Using the corresponding stoichiometric matrix (bottom right), the achievable flux distributions compatible with enforced constraints can be found (a particular one is depicted in the bottom left figure).
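The relationship between a stoichiometric matrix and an achievable flux distribution can be made concrete with a very small example. The sketch below is illustrative only: the three-reaction network, the reaction names, and the flux bounds are hypothetical and are not taken from the review. It checks whether a candidate flux vector v satisfies the steady-state condition S·v = 0 and the enforced bounds.

```python
# Toy network (hypothetical, for illustration only):
#   R_uptake:  -> A      exchange flux with the environment
#   R_conv:    A -> B    internal conversion
#   R_biomass: B ->      biomass assembly reaction draining B
METABOLITES = ["A", "B"]
REACTIONS = ["R_uptake", "R_conv", "R_biomass"]

# Stoichiometric matrix S: one row per metabolite, one column per reaction.
S = [
    [1, -1, 0],   # A: produced by uptake, consumed by conversion
    [0, 1, -1],   # B: produced by conversion, consumed by biomass
]

# Assumed lower and upper flux bounds for each reaction (arbitrary units).
BOUNDS = [(0, 10), (0, 1000), (0, 1000)]


def net_production(S, v):
    """Return S.v, the net production rate of each metabolite."""
    return [sum(s_ij * v_j for s_ij, v_j in zip(row, v)) for row in S]


def is_steady_state(S, v, tol=1e-9):
    """True if no metabolite accumulates or is depleted under fluxes v."""
    return all(abs(rate) <= tol for rate in net_production(S, v))


def within_bounds(v, bounds):
    """True if every flux respects its lower and upper bound."""
    return all(lo <= flux <= hi for flux, (lo, hi) in zip(v, bounds))


balanced = [5, 5, 5]     # every metabolite is produced as fast as it is consumed
unbalanced = [5, 3, 3]   # A is taken up faster than it is converted

print(is_steady_state(S, balanced), within_bounds(balanced, BOUNDS))
print(is_steady_state(S, unbalanced), net_production(S, unbalanced))
```

Any vector that passes both checks is one achievable flux distribution for this toy model; the set of all such vectors is what constraint-based methods explore.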
Data sources for metabolic model reconstruction and refinement
| DDBJ | General nucleotide sequence database | |
| EMBL | General nucleotide sequence database | |
| GenBank | General nucleotide sequence database | |
| Integr8 | Integrated information on complete genomes | |
| CMR | Integrated information on complete prokaryotic genomes | |
| IMG | Integrated system for analysis and annotation of microbial genomes | |
| SEED | Integrated system for analysis and annotation of genomes using functional subsystems | |
| BRENDA | Comprehensive enzyme information system gathering data collected from the literature by curators | |
| ENZYME | Enzyme nomenclature database providing extensive information on all enzymes with an associated EC number | |
| UniProt | Universal Protein Resource gathering protein sequences and annotations from Swiss-Prot (manually reviewed), TrEMBL (computer annotated), and PIR | |
| TransportDB | Predictions of membrane transport proteins for fully sequenced genomes | |
| PSORTdb | Repository of experimentally determined and predicted protein localizations | |
| Prolinks | Database of predicted functional links between proteins | |
| STRING | Database of known and predicted protein–protein interactions | |
| ChEBI | Database on small molecules of biological interest | |
| PubChem | Database on small molecules | |
| LIPID MAPS | Database on lipid metabolites | |
| Reactome | Curated database of biological pathways | |
| KEGG | Suite of databases comprising information on compounds, reactions, pathways, genes/proteins | |
| BioCyc | Collection of organism-specific pathway/genome databases, including a curated multiorganism pathway database: MetaCyc | |
| UniPathway | Curated resource of metabolic pathways linked to UniProt enzyme database | |
| UM-BBD | Database on microbial biocatalytic reactions and biodegradation pathways | |
| IntAct | Repository of reported protein interactions | |
| DIP | Database of experimentally determined interactions between proteins | |
| ArrayExpress | Public repository of microarray data | |
| GEO | Public repository of microarray data | |
| ASAP | Repository of results of functional genomics experiments for selected bacterial species | |
| Comprehensive dataset of transcriptomic, proteomic, metabolomic, and fluxomic experiments for | ||
| Systomonas | Repository of ‘omics’ datasets and molecular networks for pseudomonad species | |
| PubMed | Database on biomedical literature | |
| BiGG | Repository of reconstructed genome-scale metabolic models | |
| BioModels | Database of mathematical models of biological systems | |
Type of information provided by each data source
| Type of information | |
|---|---|
| Biochemical activities | |
| Enzyme specificity | |
| Enzyme localization | |
| Reaction equation | |
| Reaction direction | |
| Metabolite formula | |
| GPR association | |
| Biomass composition | |
| Experimental observations |
Methods for model reconstruction
| Identification of metabolic reactions from textual gene annotations | |
| Direct inference of metabolic reactions from genome sequence | |
| Use of metabolic context to complete pathways | |
| Flux variability analysis: identification of reactions that are predicted to never carry any flux | |
| Identification of dead-end metabolites, which can never be produced or consumed | |
| Assessment of thermodynamic consistency and assignment of reaction directions | |
| Graph-based metabolic network expansion using shortest metabolic paths | |
| GapFill: optimization-based network expansion and reaction reversibility changes to solve dead-end metabolite inconsistencies | |
| Optimization-based metabolic network expansion to resolve inconsistent growth phenotypes | |
| Network-based identification of candidate genes for orphan metabolic activities | |
Fig. 2. Pipeline for model reconstruction and refinement. An initial model is reconstructed from genome annotations and from preexisting knowledge on the species' biochemistry and physiology. Besides collecting the biochemical activities, this task includes several additional key steps. The resulting model is then iteratively corrected and refined, according to internal consistency criteria and by comparing its predictions to experimental data.
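One of the internal consistency criteria listed among the reconstruction methods, the identification of dead-end metabolites, lends itself to a short illustration. The following is a minimal structural sketch under stated assumptions: the reaction dictionary format, the function name `find_dead_ends`, and the toy reactions are all hypothetical. It only inspects the reaction list and is not the optimization-based GapFill procedure.

```python
def find_dead_ends(reactions):
    """Return metabolites that no reaction can produce, or none can consume.

    `reactions` maps a reaction name to (stoichiometry, reversible), where
    stoichiometry maps metabolite -> coefficient (negative = substrate,
    positive = product). This input format is an assumption of the sketch.
    """
    produced, consumed, seen = set(), set(), set()
    for stoichiometry, reversible in reactions.values():
        for metabolite, coefficient in stoichiometry.items():
            if coefficient == 0:
                continue
            seen.add(metabolite)
            # A reversible reaction can both produce and consume its metabolites.
            if coefficient > 0 or reversible:
                produced.add(metabolite)
            if coefficient < 0 or reversible:
                consumed.add(metabolite)
    return sorted(m for m in seen if m not in produced or m not in consumed)


# Hypothetical toy network with two deliberate gaps.
toy_reactions = {
    "EX_A": ({"A": 1}, False),            # uptake of A from the environment
    "R1": ({"A": -1, "B": 1}, False),     # A -> B
    "R2": ({"B": -1, "C": 1}, False),     # B -> C, but nothing consumes C
    "R3": ({"D": -1, "B": 1}, False),     # D -> B, but nothing produces D
}

print(find_dead_ends(toy_reactions))
```

In a refinement loop, each reported metabolite would point to a candidate gap: a missing reaction, a missing transporter, or an incorrectly assigned reaction direction.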
Existing genome-scale metabolic models for bacterial organisms
| Organism | Reference | Genes | Reactions | Metabolites | Experimental assessment: wild-type growth phenotypes | Experimental assessment: knockout mutant growth phenotypes | Experimental assessment: quantitative growth measures |
|---|---|---|---|---|---|---|---|
| | | 774 | 875 | 701 | 173/190 (91%) | 1138/1208 (94%) | – |
| | | 844 | 1020 | 988 | 200/271 (74%) | 720/766 (94%) | – |
| | | 432 | 502 | 479 | 10/11 (91%) | – | X |
| | | 474 | 552 | 422 | – | – | – |
| | | 1260 | 2077 | 1039 | 129/170 (74%) | 1152/1260 (92%) | X |
| | | 588 | 523 | 541 | – | – | X |
| | | 412 | 461 | 367 | – | – | – |
| | | 341 | 476 | 485 | – | 54/72 (75%) | – |
| | | 721 | 643 | 531 | – | – | X |
| | | 358 | 621 | 422 | – | – | X |
| | | 335 | 373 | 332 | – | – | – |
| | | 726 | 849 | 739 | – | 547/705 (78%) | X |
| | | 661 | 939 | 828 | – | 132/237 (56%) | X |
| | | 555 | 496 | 471 | – | – | X |
| | | 1056 | 883 | 760 | 78/95 (82%) | 893/1056 (85%) | – |
| | | 746 | 950 | 710 | 84/90 (93%) | 665/746 (89%) | X |
| | | 363 | 387 | 371 | – | – | – |
| | | 619 | 641 | 571 | – | – | – |
| | | 551 | 774 | 712 | – | 8/14 (57%) | – |
| | | 700 | 700 | 500 | 54/58 (93%) | 11/12 (92%) | X |
The first two columns of experimental assessment show the number of correct predictions among all experimentally determined qualitative growth phenotypes. The last column specifies whether the model has been assessed against quantitative growth rate measurements.
Number of distinct reactions including transport processes.
Number of biochemically distinct metabolites.
This model is an update of two earlier models for E. coli (Edwards & Palsson, 2000; Reed ).
This model is an update of an earlier model for H. pylori (Schilling ).
Using gene essentiality data for Pseudomonas aeruginosa.
Main analytical methods for genome-scale models sorted by type of application
| Flux sampling: random sampling of flux distribution among the set of possible metabolic states | |
| Flux variability analysis: examination of flux variability for each reaction | |
| Metabolic pathway analysis, elementary modes/extreme pathways: comprehensive description of all independent metabolic modes achievable in the metabolic network | |
| Flux coupling: identification of reaction pairs whose fluxes are coupled | |
| Metabolite coupling/evaluation of conserved metabolite pools | |
| Producibility analysis of biomass precursors | |
| FBA: quantitative prediction of growth yield by maximization of growth rate given bounded nutrient input rates | |
| MOMA: prediction of gene deletion mutant flux distribution by minimizing overall flux changes with wild type | |
| ROOM: prediction of gene deletion mutant growth by minimizing regulatory changes with wild type | |
| Identification of multiple gene deletion essentialities | |
| Metabolic Flux Analysis using labeled metabolites: prediction of attainable reaction fluxes given observed metabolite isotopic patterns | |
| Global prediction of reaction activities using metabolic flux measurements on subsets of reactions | |
| Identification of metabolic objectives best describing observed fluxes | |
| Comparison of model coverage with experimentally detected metabolites | |
| NET analysis and TMFA: application of thermodynamic constraints to reaction directions using metabolite concentrations | |
| Identification of metabolic pathways correlated with gene expression levels | |
| Refinement of flux distribution predictions by blocking reactions corresponding to unexpressed genes | |
| Evaluation of consistency of gene expression levels with metabolic objectives | |
| rFBA and SR-FBA: prediction of gene expression states using Boolean regulatory rules | |
| Systematic identification of gene deletions enhancing metabolite production yield | |
| OptStrain: systematic identification of reaction additions enabling the production of novel metabolites | |
| Prediction of adjustments of enzyme expression levels enhancing metabolite production yield | |
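The FBA entry in the table above (maximization of growth rate given bounded nutrient input rates) can be sketched as a small linear program. The example below is a hedged illustration, not the formulation used by any particular model in the review: the four-reaction network, its bounds, and the helper name `fba` are hypothetical, and it assumes NumPy and SciPy (with the HiGHS solver, SciPy 1.6 or later) are installed. The reaction deletion at the end is simulated by plain FBA with the reaction blocked, which is simpler than MOMA or ROOM.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network:
#   R_uptake:  -> A      nutrient input, bounded at 10 units
#   R_conv:    A -> B
#   R_byprod:  A ->      by-product secretion
#   R_biomass: B ->      biomass assembly (the growth objective)
REACTIONS = ["R_uptake", "R_conv", "R_byprod", "R_biomass"]
S = np.array([
    [1, -1, -1, 0],   # A
    [0, 1, 0, -1],    # B
], dtype=float)
BOUNDS = [(0, 10), (0, 1000), (0, 1000), (0, 1000)]


def fba(S, bounds, objective_index):
    """Maximize one flux subject to S.v = 0 and the given flux bounds."""
    c = np.zeros(S.shape[1])
    c[objective_index] = -1.0   # linprog minimizes, so negate the objective
    result = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]),
                     bounds=bounds, method="highs")
    if not result.success:
        raise RuntimeError(result.message)
    return -result.fun, result.x


biomass_index = REACTIONS.index("R_biomass")

# Wild type: all reactions available.
wild_growth, wild_fluxes = fba(S, BOUNDS, biomass_index)

# Deletion of R_conv, simulated by forcing its flux to zero.
knockout_bounds = list(BOUNDS)
knockout_bounds[REACTIONS.index("R_conv")] = (0, 0)
knockout_growth, _ = fba(S, knockout_bounds, biomass_index)

print("wild-type growth flux:", wild_growth)
print("knockout growth flux:", knockout_growth)
```

In this toy case the uptake bound limits the wild-type growth flux, and blocking the only route from A to B abolishes growth, which is how a qualitative knockout growth phenotype would be read off such a model.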