Daniel Hartleb1, Florian Jarre2, Martin J Lercher1.
Abstract
Constraint-based metabolic modeling methods such as Flux Balance Analysis (FBA) are routinely used to predict the effects of genetic changes and to design strains with desired metabolic properties. The major bottleneck in modeling genome-scale metabolic systems is the establishment and manual curation of reliable stoichiometric models. Initial reconstructions are typically refined through comparisons to experimental growth data from gene knockouts or nutrient environments. Existing methods iteratively correct one erroneous model prediction at a time, resulting in accumulating network changes that are often not globally optimal. We present GlobalFit, a bi-level optimization method that finds a globally optimal network, by identifying the minimal set of network changes needed to correctly predict all experimentally observed growth and non-growth cases simultaneously. When applied to the genome-scale metabolic model of Mycoplasma genitalium, GlobalFit decreases unexplained gene knockout phenotypes by 79%, increasing accuracy from 87.3% (according to the current state-of-the-art) to 97.3%. While currently available computers do not allow a global optimization of the much larger metabolic network of E. coli, the main strengths of GlobalFit are already played out when considering only one growth and one non-growth case simultaneously. Application of a corresponding strategy halves the number of unexplained cases for the already highly curated E. coli model, increasing accuracy from 90.8% to 95.4%.
Year: 2016 PMID: 27482704 PMCID: PMC4970803 DOI: 10.1371/journal.pcbi.1005036
Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.475
Comparison of experimental and predicted viability for 187 M. genitalium gene knockouts.
| Predictions | Experiment: growth | Experiment: non-growth | Accuracy | MCC |
|---|---|---|---|---|
| growth | 16 | 22 | 87.3% | 0.56 |
| no growth | 2 | 149 | | |
| growth | 12 | 24 | 85.0% | 0.44 |
| no growth | 4 | 147 | | |
| G | | | | |
| growth | 14 | 10 | 93.6% | 0.68 |
| no growth | 2 | 161 | | |
| G | | | | |
| growth | 14 | 2 | 97.9% | 0.86 |
| no growth | 2 | 169 | | |
1 These numbers include the two genes wrongly associated with the FBA model (MG260, MG124), which were removed in our calculations.
2 The initial network obtained from [21] was not able to produce biomass in any environment; to rectify this problem, we converted three irreversible reactions (ZN2t4, INSK, LYSt3) to reversible reactions. We further allowed uptake of all metabolites for which transport reactions are included (see Methods).
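The accuracy and MCC columns can be recomputed from the four counts in each block of the table. The sketch below assumes that the first row of each block holds the knockouts predicted to grow and the second row those predicted not to grow, with "growth" treated as the positive class; the function names are illustrative and not taken from the paper.

```python
from math import sqrt

def accuracy(tp, fp, fn, tn):
    """Fraction of knockouts whose predicted phenotype matches the experiment."""
    return (tp + tn) / (tp + fp + fn + tn)

def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient of a 2x2 confusion matrix."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: return 0 when a row or column of the matrix is empty.
    return (tp * tn - fp * fn) / denom if denom else 0.0

# First block of the table (assumed layout):
# predicted growth:     16 (experiment growth), 22 (experiment non-growth)
# predicted non-growth:  2 (experiment growth), 149 (experiment non-growth)
tp, fp, fn, tn = 16, 22, 2, 149
print(f"accuracy = {accuracy(tp, fp, fn, tn):.1%}, MCC = {mcc(tp, fp, fn, tn):.2f}")
# prints: accuracy = 87.3%, MCC = 0.56
```

Under this reading, the off-diagonal counts (22 and 2 in the first block) are the unexplained knockout phenotypes that the network modifications aim to reduce.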
Modifications of the M. genitalium network suggested by GlobalFit based on 187 gene knockout experiments (bold font indicates conservative changes).
| Type | Gene | Associated reactions | Removed reactions | Added reactions | Added biomass metabolite |
|---|---|---|---|---|---|
| FPp | UPPRT | NDPK1for, NDPK9for, URIK1for | |||
| CYTD, DCYTD | URAt2for or EX_ura(e) | ||||
| PMANM | PGAMTback or G1PACTfor | ACGAMPMfor | |||
| DGK1, GK1, GK2 | NDPK8for | ||||
| G6PI,PGI | FRUptsfor or EX_fru(e)back | ||||
| GLYC3Pabc | GLYCtback or EX_glyc(e)back | ||||
| GLYC3Pabc | GLYCtback or EX_glyc(e)back | ||||
| GLYC3Pabc | GLYCtback or EX_glyc(e)back | ||||
| PFK | FRUptsfor or EX_fru(e)back | ||||
| PDH | DATPtfor or EX_datp(e)back | ||||
| PDH | DATPtfor or EX_datp(e)back | ||||
| NADH5 | G3PD4for | ||||
| PBUTT, PTA2r, PTAr | PGAMTback or G1PACTfor | ACGAMPMfor | |||
| ACKr, PPAK | PGAMTback or G1PACTfor | ACGAMPMfor | |||
| GLYK | Glycerol | ||||
| DRPAr | 2-Deoxy-D-ribose 5-phosphate | ||||
| UDPGALM | UDP-D-galacto-1,4-furanose | ||||
| GLNMT | S-Adenosyl-L-homocysteine | ||||
| CHOLK | EX_chol(e), CHLabcfor | Choline phosphate | |||
| THZPSN | 4-Hydroxy-benzyl alcohol and 4-Methyl-5-(2-phosphoethyl)-thiazole and 1-deoxy-D-xylulose 5-phosphate | ||||
| RPI | D-Ribulose 5-phosphate | ||||
| METSR-R1, METSR-R2 | L methionine R oxide | ||||
| PIabc | GLYKback | ||||
| PIabc | GLYKback |
Fig 1. An example of the utility of simultaneously adding and removing reactions.
Ellipses indicate metabolites, rectangles indicate reactions; abbreviations are taken from iPS189 [21]. (A) N-Acetyl-D-glucosamine 1-phosphate (acgam1p) is produced by G1PACT; MG053, MG299, and MG357 are falsely predicted to be non-essential (FPp). (B) The simultaneous removal of PGAMT (or, alternatively, G1PACT) and addition of ACGAMPM makes the genes MG053, MG299, and MG357 essential. Blue arrows mark essential pathways, while red arrows indicate blocked pathways. Note that removing either PGAMT or G1PACT blocks the other reaction, and that neither reaction is associated with any gene.
Comparison of experimental and predicted viability for 1366 E. coli gene knockouts on two different minimal media.
| Predictions | Experiment: growth | Experiment: non-growth | Accuracy | MCC |
|---|---|---|---|---|
| Unoptimized model (iJO1366) grown on glucose | | | | |
| growth | 1079 | 80 | 91.3% | 0.69 |
| no growth | 39 | 168 | | |
| Unoptimized model (iJO1366) grown on glycerol | | | | |
| growth | 1073 | 87 | 90.3% | 0.66 |
| no growth | 45 | 161 | | |
| Optimized model grown on glucose | | | | |
| growth | 1104 | 45 | 95.7% | 0.85 |
| no growth | 14 | 203 | | |
| Optimized model grown on glycerol | | | | |
| growth | 1096 | 44 | 95.2% | 0.83 |
| no growth | 22 | 204 | | |
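As a quick arithmetic check, the counts in this table can be pooled over both media. The sketch below assumes that "unexplained cases" means the off-diagonal cells (prediction and experiment disagree) and that the overall accuracies quoted in the abstract are pooled over glucose and glycerol; both are readings of the table, not statements from the paper.

```python
# (TP, FP, FN, TN) with "growth" as the positive class, copied from the table.
before = {"glucose": (1079, 80, 39, 168), "glycerol": (1073, 87, 45, 161)}
after = {"glucose": (1104, 45, 14, 203), "glycerol": (1096, 44, 22, 204)}

def unexplained(cm):
    """Number of knockouts where prediction and experiment disagree."""
    tp, fp, fn, tn = cm
    return fp + fn

def pooled_accuracy(tables):
    """Accuracy over all media combined."""
    correct = sum(tp + tn for tp, fp, fn, tn in tables.values())
    total = sum(sum(cm) for cm in tables.values())
    return correct / total

for medium in before:
    b, a = unexplained(before[medium]), unexplained(after[medium])
    print(f"{medium}: {b} -> {a} unexplained cases")
print(f"pooled accuracy: {pooled_accuracy(before):.1%} -> {pooled_accuracy(after):.1%}")
# prints:
# glucose: 119 -> 59 unexplained cases
# glycerol: 132 -> 66 unexplained cases
# pooled accuracy: 90.8% -> 95.4%
```

Under these assumptions the pooled values agree with the 90.8% and 95.4% given in the abstract, and the number of unexplained cases is roughly halved on each medium.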
Isoenzymes that resolved FNp.
| Gene | Associated reactions | Isoenzyme | e-value → | e-value ← |
|---|---|---|---|---|
| | TRDR | b0606 | 2×10⁻³⁵ | 8×10⁻³⁷ |
| | ASPTA | b4054 | 2×10⁻¹¹³ | 2×10⁻¹¹³ |
| | GCALDD, LCADi | b1385 | 7×10⁻⁸⁰ | 1×10⁻⁷⁷ |
| | PPS | b2383 | 2×10⁻²² | 2×10⁻²² |
| | PGAMT | b2048 | 3×10⁻¹⁶ | 1×10⁻¹⁸ |
| | SDPTA | b1748 | 1×10⁻¹⁸⁰ | 1×10⁻¹⁸⁰ |
Removal of biomass components from the E. coli model suggested by GlobalFit to remove FNp.
| Gene | Associated reactions | Removed biomass metabolite |
|---|---|---|
| | MPTAT | Bis-molybdopterin guanine dinucleotide |
| | THZPSN3 | Thiamine diphosphate |
| | CPMPS | Bis-molybdopterin guanine dinucleotide |
| | CPMPS | Bis-molybdopterin guanine dinucleotide |
| | MOADSUx, MPTS | Bis-molybdopterin guanine dinucleotide |
| | MPTS | Bis-molybdopterin guanine dinucleotide |
| | MPTSS | Bis-molybdopterin guanine dinucleotide |
| | BMOCOS, BWCOS, MOCOS, WCOS | Bis-molybdopterin guanine dinucleotide |
| | PMPK | Thiamine diphosphate |
| | CD2tpp, CU2tpp, FE2tpp, MN2tpp, ZN2tpp | Copper |
| | CAt6pp | Calcium |
| | I2FE2SS, I2FE2SS2, S2FE2SS, S2FE2SS2 | [4Fe-4S] iron-sulfur cluster and [2Fe-2S] iron-sulfur cluster |
| | BMOGDS1, BMOGDS2, BWCOGDS1, BWCOGDS2, MOGDS | Bis-molybdopterin guanine dinucleotide |
| | THZPSN3 | Thiamine diphosphate |
| | TYRL | Thiamine diphosphate |
| | THZPSN3 | Thiamine diphosphate |
| | TMPPP | Thiamine diphosphate |
| | AMPMS2 | Thiamine diphosphate |
| | THZPSN3 | Thiamine diphosphate |
Reversal of reactions of the E. coli network suggested by GlobalFit to remove FNp.
| Gene | Associated reactions | Reversed reactions |
|---|---|---|
| | 5DOAN, AHCYSNS, MTAN | HCYSMT, CPPPGO2 |
| | PMPK | 2MAHMP |
| | RHCCE | HCYSMT |
| | CD2tpp, CU2tpp, FE2tpp, MN2tpp, ZN2tpp | CU2abcpp |
| | CAt6pp | CA2t3pp |
Metabolite additions to the E. coli biomass reaction suggested by GlobalFit to resolve FPp.
| Gene | Associated reactions | Added as biomass substrate | Added as biomass product |
|---|---|---|---|
| | PROTRS | L-Prolyl-tRNA(Pro) | TRNA(Pro) |
| | GLU5K | L-Glutamate 5-phosphate | |
| | CYSTRS | L-Cysteinyl-tRNA(Cys) | TRNA(Cys) |
| | MTHFC, MTHFD | 5-Formyltetrahydrofolate | |
| | LEUTRS | L-Leucyl-tRNA(Leu) | TRNA(Leu) |
| | GLNTRS | L-Glutaminyl-tRNA(Gln) | TRNA(Gln) |
| | SERTRS, SERTRS2 | L-Seryl-tRNA(Ser) | TRNA(Ser) |
| | ASNTRS | L-Asparaginyl-tRNA(Asn) | TRNA(Asn) |
| | TYRTRS | L-Tyrosyl-tRNA(Tyr) | TRNA(Tyr) |
| | PHETRS | L-Phenylalanyl-tRNA(Phe) | TRNA(Phe) |
| | PHETRS | L-Phenylalanyl-tRNA(Phe) | TRNA(Phe) |
| | THRTRS | L-Threonyl-tRNA(Thr) | TRNA(Thr) |
| | ASPTRS | L-Aspartyl-tRNA(Asp) | TRNA(ASP) |
| | ARGTRS | L-Arginyl-tRNA(Arg) | TRNA(ARG) |
| | PGSA120, PGSA140, PGSA141, PGSA160, PGSA161, PGSA180, PGSA181 | Phosphatidylglycerophosphate (didodecanoyl, n-C12:0) or Phosphatidylglycerophosphate (ditetradecanoyl, n-C14:0) or Phosphatidylglycerophosphate (ditetradec-7-enoyl, n-C14:1) or Phosphatidylglycerophosphate (dihexadecanoyl, n-C16:0) or Phosphatidylglycerophosphate (dihexadec-9-enoyl, n-C16:1) or Phosphatidylglycerophosphate (dioctadecanoyl, n-C18:0) or Phosphatidylglycerophosphate (dioctadec-11-enoyl, n-C18:1) | |
| | METTRS | | TRNA(Met) |
| | HISTRS | L-Histidyl-tRNA(His) | TRNA(His) |
| | GHMT2r, THFAT | 5-Formyltetrahydrofolate | |
| | PGCD | 3-Phosphohydroxypyruvate | |
| b3288 | FMETTRS | N-Formylmethionyl-tRNA | |
| b3384 | TRPTRS | L-Tryptophanyl-tRNA(Trp) | TRNA(Trp) |
| b4258 | VALTRS | L-Valyl-tRNA(Val) | TRNA(Val) |
Removal of reactions of the E. coli network suggested by GlobalFit to correct FPp.
| Gene | Associated reactions | Removed reactions |
|---|---|---|
| | CBPS | (CBMKrfor and ALLTAMHfor) or (CBMKrfor and ALLTNfor) or (CBMKrfor and OXAMTCfor) or (CBMKrfor and URDGLYCDfor) or (CBMKrfor and URICfor) |
| | CBPS | (CBMKrfor and ALLTAMHfor) or (CBMKrfor and ALLTNfor) or (CBMKrfor and OXAMTCfor) or (CBMKrfor and URDGLYCDfor) or (CBMKrfor and URICfor) |
| | GLU5K | NACODAfor |
| | G5SD | NACODAfor |
| | ADK1, ADK3, ADK4, ADNK1, DADK | NDPK1for or PRPPSback or R1PKfor or PPMback or R15BPKfor |
| | DHORD2, DHORD5 | DHORDfumfor |
| | T2DECAI | (CTECOAI6back and CTRCOAI7back) or (CTECOAI6back and AACPS4for) |
| | PRPPS | R1PKfor or PPMback or R15BPKfor |
| | PDX5POi, PYAM5PO | PDX5PO2for |
| | GAPD | TPIfor |
| | RNDR1, RNDR2, RNDR3, RNDR4 | (GRXRfor and RNTR3c2for) or (GTHOrfor and RNTR3c2for) or (GRXRfor and RNTR1c2for) or (GTHOrfor and RNTR1c2for) |
| | RNDR1, RNDR2, RNDR3, RNDR4 | (GRXRfor and RNTR3c2for) or (GTHOrfor and RNTR3c2for) or (GRXRfor and RNTR1c2for) or (GTHOrfor and RNTR1c2for) |
| | IMPD | HXAND or XPPT |
| | PGCD | GHMT2rback |
| | PGK | TPIfor |
| | ATPS4rpp | (F6PAback and PGKback) or (G6PDH2rfor and PGKback) |
| | ATPS4rpp | (F6PAback and PGKback) or (G6PDH2rfor and PGKback) |
| | ATPS4rpp | (F6PAback and PGKback) or (G6PDH2rfor and PGKback) |
| | ATPS4rpp | (F6PAback and PGKback) or (G6PDH2rfor and PGKback) |
| | ATPS4rpp | (F6PAback and PGKback) or (G6PDH2rfor and PGKback) |
| | ATPS4rpp | (F6PAback and PGKback) or (G6PDH2rfor and PGKback) |
| | OPHHX | OPHHX3for |
| | PPC | FUMfor or MALSfor |
| | G3PAT120, G3PAT140, G3PAT141, G3PAT160, G3PAT161, G3PAT180, G3PAT181 | ACPPAT160for or AG3PAT161for or AG3PAT160for |
| | PSP_L | GHMT2rback |
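The "Removed reactions" cells above list alternative solutions as boolean expressions. A minimal sketch for expanding such a cell into explicit alternative removal sets follows. It assumes every cell is an "or" of terms, each term being either a single reaction or a parenthesized "and" group, as in the rows of this table; the function name is hypothetical.

```python
def expand_alternatives(cell):
    """Split an 'A or (B and C)' style cell into a list of reaction sets.

    Each returned set is one alternative: all reactions in the set are
    removed together. Assumes no nesting beyond one level of parentheses.
    """
    alternatives = []
    for term in cell.split(" or "):
        term = term.strip()
        # Strip the grouping parentheses of an 'and' term; names such as
        # 'EX_ura(e)' are left alone because they do not start with '('.
        if term.startswith("(") and term.endswith(")"):
            term = term[1:-1]
        alternatives.append(frozenset(name.strip() for name in term.split(" and ")))
    return alternatives

cell = "(CTECOAI6back and CTRCOAI7back) or (CTECOAI6back and AACPS4for)"
for alternative in expand_alternatives(cell):
    print(sorted(alternative))
```

Each printed list is one self-sufficient set of removals; here both alternatives share CTECOAI6back and differ in the second reaction.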
Definitions of the sets used in the system of equations that describes the GlobalFit algorithm.
| The reactions included in the original (input) network reconstruction | |
| All irreversible reactions that can be reversed | |
| All reactions that can be added to the network; bidirectional reactions are treated here as two separate reactions, one for the forward and one for the backward direction, each with flux ≥ 0. | |
| BS | All substrates that can be removed from the biomass reaction |
| The stoichiometric coefficients of all biomass substrates | |
| BP | All products that can be removed from the biomass reaction |
| The stoichiometric coefficients of all biomass products | |
| All substrates that can be added to the biomass reaction | |
| The stoichiometric coefficients of all additional biomass substrates | |
| All products that can be added to the biomass reaction | |
| The stoichiometric coefficients of all additional biomass products | |
| All experiments with observed growth | |
| All experiments with observed non-growth |
The parameters of the system of equations describing the GlobalFit algorithm.
| Binary variables that indicate the removal of forward and backward reaction | |
| Penalty for the removal of forward or backward reaction (which can be set to a different value for each reaction) | |
| Binary variables that indicate the addition of a backward reaction for reaction | |
| Corresponding penalties | |
| Binary variables that indicate the addition of reaction | |
| Corresponding penalties | |
| Binary variables that indicate the addition of substrate | |
| Corresponding penalties | |
| Binary variables that indicate the addition of product | |
| Corresponding penalties | |
| Binary variables that indicate the removal of substrate | |
| Corresponding penalties | |
| Binary variables that indicate the removal of product | |
| Corresponding penalties | |
| Binary variables that indicate the exclusion of growth experiment | |
| Corresponding penalties | |
| Binary variables that indicate the exclusion of non-growth experiment | |
| Corresponding penalties | |
| Flux through the (potentially modified) biomass reaction (see | |
| Optimal value for | |
| Minimal flux allowed through reaction | |
| Maximal flux allowed through reaction | |
| Viability threshold of growth experiment | |
| Viability threshold of non-growth experiment | |
| The vector of all | |
| The vector of all fluxes |
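The sets and variables listed above describe a search for the cheapest combination of network changes that makes every growth and non-growth case come out right at once. The toy sketch below illustrates that idea only; it is not the GlobalFit implementation. It swaps in two simplifications: growth is decided by simple metabolite reachability rather than FBA, and the optimum is found by brute-force enumeration rather than bi-level mixed-integer optimization. All reaction and metabolite names are hypothetical.

```python
from itertools import combinations

def producible(reactions, medium):
    """Metabolites reachable from the medium (network expansion, not FBA)."""
    have = set(medium)
    changed = True
    while changed:
        changed = False
        for substrates, products in reactions.values():
            if substrates <= have and not (products <= have):
                have |= products
                changed = True
    return have

def grows(reactions, medium, biomass, knocked_out=()):
    """Toy viability test: every biomass metabolite must be producible."""
    active = {r: v for r, v in reactions.items() if r not in knocked_out}
    return biomass <= producible(active, medium)

def global_fit(network, candidates, removable, penalties, medium, biomass,
               growth_cases, nongrowth_cases):
    """Cheapest set of additions/removals that satisfies all cases at once."""
    changes = [("add", r) for r in candidates] + [("remove", r) for r in removable]
    best = None
    for k in range(len(changes) + 1):
        for subset in combinations(changes, k):
            cost = sum(penalties[c] for c in subset)
            if best is not None and cost >= best[0]:
                continue  # cannot improve on the best solution found so far
            model = dict(network)
            for kind, r in subset:
                if kind == "add":
                    model[r] = candidates[r]
                else:
                    model.pop(r)
            if (all(grows(model, medium, biomass, ko) for ko in growth_cases)
                    and not any(grows(model, medium, biomass, ko)
                                for ko in nongrowth_cases)):
                best = (cost, subset)
    return best

# Hypothetical network: "alt" bypasses the gene-associated reaction "g1", so the
# g1 knockout is wrongly predicted to grow (a false growth prediction).
network = {"alt": ({"A"}, {"X"}), "g1": ({"A"}, {"Y"})}
candidates = {"new": ({"Y"}, {"X"})}          # reaction that may be added
penalties = {("add", "new"): 1, ("remove", "alt"): 1}
best = global_fit(network, candidates, ["alt"], penalties,
                  medium={"A"}, biomass={"X"},
                  growth_cases=[()],           # wild type must grow
                  nongrowth_cases=[("g1",)])   # g1 knockout must not grow
print(best)
# prints: (2, (('add', 'new'), ('remove', 'alt')))
```

Neither change works alone in this toy case: removing "alt" by itself stops the wild type from growing, and adding "new" by itself leaves the bypass in place. Only the simultaneous removal and addition satisfies both cases, which is the kind of situation that correcting one prediction at a time can miss.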