Andreas Kuehne, Urs Mayr, Daniel C Sévin, Manfred Claassen, Nicola Zamboni.
Abstract
In recent years, the number of large-scale metabolomics studies on various cellular processes in different organisms has increased drastically. However, it remains a major challenge to perform a systematic identification of mechanistic regulatory events that mediate the observed changes in metabolite levels, due to complex interdependencies within metabolic networks. We present the metabolic network segmentation (MNS) algorithm, a probabilistic graphical modeling approach that enables genome-scale, automated prediction of regulated metabolic reactions from differential or serial metabolomics data. The algorithm sections the metabolic network into modules of metabolites with consistent changes. Metabolic reactions that connect different modules are the most likely sites of metabolic regulation. In contrast to most state-of-the-art methods, the MNS algorithm is independent of arbitrary pathway definitions, and its probabilistic nature facilitates assessments of noisy and incomplete measurements. With serial (i.e., time-resolved) data, the MNS algorithm also indicates the sequential order of metabolic regulation. We demonstrated the power and flexibility of the MNS algorithm with three realistic case studies with bacterial and human cells. Thus, this approach enables the identification of mechanistic regulatory events from large-scale metabolomics data, and contributes to the understanding of metabolic processes and their interplay with cellular signaling and regulation processes.
Year: 2017 PMID: 28598965 PMCID: PMC5482507 DOI: 10.1371/journal.pcbi.1005577
Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.475
Fig 1. Design and implementation of the metabolic network segmentation algorithm, a probabilistic graphical modelling approach to identify sites of metabolic regulation.
(a) Principle of the algorithm to identify sites of metabolic regulation using metabolomics data and metabolic network reconstructions. The algorithm segments the metabolic network into modules of highly similar changes. To deal with sparse and noisy metabolite measurements, it considers dependencies between neighboring metabolites to extend the modules. Fractures between two modules are potential sites of metabolic regulation. (b) Structure of the Markov random field model. The discrete hidden variables y represent the modules to which a metabolite belongs, the observed variables x integrate the metabolite measurements. The factor potential functions introduce probabilistic dependencies between neighboring hidden states (ψ) and between measurements and the hidden states (ψ). (c) Example of neighborhood and observation factor potential functions. The neighborhood factor potentials are module label dependent exponential decay functions. The influence of the metabolic neighborhood is controlled by λ1. Observation potential functions link hidden state/module labels to metabolic observations by hidden state dependent Gaussian distributions. (d) Algorithm workflow of the scan mode implementation to automatically identify sites of metabolic regulation. Details in S1 Fig.
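As a rough illustration of panels (b) and (c), the following Python sketch segments a toy pathway with a small Markov random field. The network, the fold-change values, the state means, the standard deviation, and the exact form of both potentials are assumptions made for this example and are not taken from the paper. Brute-force enumeration stands in for the actual inference procedure only because the toy network is tiny.

```python
import itertools
import math

# Hypothetical toy network: a linear pathway A - B - C - D - E.
EDGES = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]
# Hypothetical log2 fold-changes; None marks an unmeasured metabolite.
DATA = {"A": 1.1, "B": None, "C": 0.9, "D": -1.0, "E": -1.2}
# Assumed module means for three hidden states (decrease, unchanged, increase).
STATE_MEANS = {0: -1.0, 1: 0.0, 2: 1.0}
SIGMA = 0.5  # assumed standard deviation shared by all states


def log_observation(x, state):
    """Log of a Gaussian observation potential; unmeasured nodes contribute 0."""
    if x is None:
        return 0.0
    z = (x - STATE_MEANS[state]) / SIGMA
    return -0.5 * z * z


def log_neighborhood(state_i, state_j, lam):
    """Log of an assumed exponential-decay potential exp(-lam * |yi - yj|)."""
    return -lam * abs(state_i - state_j)


def segment(lam):
    """Return the highest-scoring labelling by enumerating all labellings."""
    nodes = sorted(DATA)
    best, best_score = None, -math.inf
    for labels in itertools.product(STATE_MEANS, repeat=len(nodes)):
        y = dict(zip(nodes, labels))
        score = sum(log_observation(DATA[n], y[n]) for n in nodes)
        score += sum(log_neighborhood(y[a], y[b], lam) for a, b in EDGES)
        if score > best_score:
            best, best_score = y, score
    return best


def fractures(labels):
    """Edges whose two metabolites were assigned to different modules."""
    return [(a, b) for a, b in EDGES if labels[a] != labels[b]]


if __name__ == "__main__":
    for lam in (1.0, 10.0):
        labels = segment(lam)
        print(lam, labels, fractures(labels))
```

In this toy setting, a moderate weight (`lam = 1.0`) pulls the unmeasured metabolite B into the module of its neighbors and leaves a single fracture between C and D, whereas a very large weight (`lam = 10.0`) merges all metabolites into one module and no fracture remains.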
Fig 2. Principle of the algorithm to identify sites and sequential order of metabolic regulations from sequential metabolomics data.
(a) Principle of the algorithm to identify sites and sequential order of regulations from metabolomics data with sequential structure. The metabolomics data are split into individual frames, for which the regulated reactions are identified through metabolic network segmentation. (b) Structure of the Markov random field model for sequential data. For each frame of a sequential dataset, a Markov random field model for univariate comparison is introduced (Fig 1b). To enforce a probabilistic dependency between hidden variables of neighboring sequence frames, the model includes an additional sequence factor potential (ψ) connecting adjacent hidden variables.
Fig 3. Parameter optimization for the accurate identification of experimentally perturbed reactions from metabolomics data.
(a) Number of significantly identified reactions (p < 0.05) for each parameter combination for the reaction ranking based on maximum λ1 or total number of identified fractures (for details, see S1 and S2 Text, S1 and S2 Figs). Significantly identified reactions were classified as “exact” if the experimentally perturbed reaction was inferred by the algorithm, or as “first neighbor” if one of the first neighbor reactions of the perturbed reaction was inferred. P-values are calculated by a permutation test of the reaction labels with 1000 permutations. (b) Influence of the model's parameter settings on the significance of the inference of individually perturbed enzymes. The black box marks the best single parameter combination (3 hidden states, mean type: k-means, std type: all data). Certain perturbed enzyme knock-outs (e.g. shdB, aldA) cannot be identified with the best parameter combination but can be with others (e.g. 3 hidden states, mean type: quantile, std type: all data). (c) Number of significantly identified reactions (p < 0.05) for rank product combinations of different parameter settings. Three combinations of two predictors with different model parameterizations (black boxes) increase the number of significantly inferred enzymes to 14. P-values are calculated by a permutation test of the reaction labels with 1000 permutations.
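The permutation test of the reaction labels mentioned in this caption can be sketched as follows. This is a minimal illustration, assuming that each reaction carries one score (for example its maximum λ1) and that significance is the fraction of label shuffles in which the perturbed reaction scores at least as high as observed. The function name, the scores, and the add-one correction are assumptions, not details from the paper.

```python
import random


def permutation_p_value(scores, target, n_permutations=1000, seed=0):
    """Empirical p-value for `target` under random shuffles of reaction labels.

    scores: dict mapping a reaction label to its (hypothetical) score.
    target: label of the experimentally perturbed reaction.
    """
    rng = random.Random(seed)
    reactions = list(scores)
    values = [scores[r] for r in reactions]
    idx = reactions.index(target)
    observed = values[idx]
    hits = 0
    for _ in range(n_permutations):
        shuffled = values[:]
        rng.shuffle(shuffled)  # reassign scores to labels at random
        if shuffled[idx] >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_permutations + 1)


if __name__ == "__main__":
    toy_scores = {f"R{i}": float(i) for i in range(100)}  # hypothetical scores
    print(permutation_p_value(toy_scores, "R99"))  # top-scoring reaction
    print(permutation_p_value(toy_scores, "R0"))   # lowest-scoring reaction
```

A reaction would then count as significantly identified when its p-value falls below the 0.05 cutoff used in the caption.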
Comparison of algorithms in the identification of the experimentally perturbed reactions in 647 E. coli enzyme knockout mutants.
Significantly identified reactions were determined by a permutation test of the reaction labels with 1000 permutations and a p-value cutoff of 0.05.
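| Algorithm | Genes found in top 10 ranks [%], exact | Genes found in top 10 ranks [%], total | Significantly identified reactions [#], exact | Significantly identified reactions [#], total | Significantly identified reactions [%], exact | Significantly identified reactions [%], total |
|---|---|---|---|---|---|---|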
| MNS—P3—max(λ1) | 4.0 | 38.6 | 48 | 55 | 7.4 | 8.5 |
| MNS—P3—#fractures | 4.2 | 38.8 | 48 | 55 | 7.4 | 8.5 |
| MNS—P2 & P11—max(λ1) | 4.3 | 38.9 | 66 | 71 | 10.2 | 11.0 |
| MNS—P5 & P8—max(λ1) | 3.6 | 28.4 | 62 | 67 | 9.6 | 10.4 |
| MNS—P2 & P3—#fractures | 5.6 | 42.2 | 69 | 74 | 10.7 | 11.4 |
| reporter reactions | 2.8 | 43.6 | 37 | 48 | 5.7 | 7.4 |
| mass action ratio | 1.5 | 38.5 | 26 | 31 | 4.0 | 4.8 |
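Several rows above pair two parameter settings (e.g. P2 & P3), and Fig 3c describes such pairings as rank product combinations. The sketch below shows one common way to compute a rank product, as the geometric mean of per-predictor ranks. The function names and the scores are hypothetical, and the paper's exact combination rule may differ.

```python
import math


def rank_descending(scores):
    """Map each reaction to its rank (1 = highest score); ties are not handled."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {reaction: pos for pos, reaction in enumerate(ordered, start=1)}


def rank_product(*predictors):
    """Geometric mean of the ranks a reaction receives from each predictor."""
    tables = [rank_descending(p) for p in predictors]
    k = len(tables)
    return {r: math.prod(t[r] for t in tables) ** (1.0 / k) for r in tables[0]}


if __name__ == "__main__":
    # Hypothetical scores from two parameter settings over the same reactions.
    pred_a = {"R1": 0.9, "R2": 0.8, "R3": 0.5, "R4": 0.3, "R5": 0.1}
    pred_b = {"R1": 0.1, "R2": 0.8, "R3": 0.3, "R4": 0.5, "R5": 0.9}
    combined = rank_product(pred_a, pred_b)
    print(sorted(combined.items(), key=lambda item: item[1]))
```

In this toy example R2 is ranked second by both predictors and obtains the lowest rank product, ahead of R1 and R5, which are each ranked first by only one predictor.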
Fig 4. Identification of novel regulatory events in nucleotide metabolism mediated by MetR.
(a) Change in nucleotide triphosphate (NTP), cyclic nucleotide monophosphates (cNMP), nucleotide monophosphates (NMP) and nucleoside metabolite levels comparing ΔmetR knockout and wildtype E. coli. (b) Results of CyaA enzyme assays with 10 mM ATP as substrate in crude extracts of ΔmetR knockout, metR overexpression and wildtype E. coli. (c) Known and potentially new interactions involved in the regulation of nucleotide metabolism. Our study suggests that MetR inhibits CyaA. This could be mediated through direct inhibition or indirect feedback for example to CRP, the known regulator of CyaA expression.
Fig 5. Identification of sequential order of oxidative stress induced metabolic regulations in fibroblasts.
Application of the MNS algorithm for sequential data to metabolomics data from fibroblasts treated with increasing concentrations of H2O2. The figure shows segmentation results with increasing influence of the neighborhood and sequential dependency: (a) sequence weight ws = 0, neighborhood weight wn = 0, (b) ws = 0.15, wn = 0.15, (c) ws = 0.24, wn = 0.24. In each time profile inset, the black line reports the log2 of the fold-change relative to the first time point (frame). With increasing dependency on the neighborhood and sequence, only the major known regulators involved in the metabolic response to oxidative stress remain as inferred regulatory sites. Furthermore, the algorithm correctly infers the sequential order of an initial activation of G6PD and inhibition of glycolytic flux (GAPDH), which is followed by a rerouting of flux into P5P, S7P, and E4P via PGD and back to upper glycolysis via TK and TA [39]. Abbreviations: Hexose P: hexose phosphates, GL6P: gluconolactone 6-phosphate, 6PG: 6-phospho gluconic acid, FBP: fructose bisphosphate, E4P: erythrose 4-phosphate, S7P: sedoheptulose 7-phosphate, P5P: pentose 5-phosphates, GAP/DHAP: glyceraldehyde 3-phosphate/dihydroxyacetone phosphate, 3-PGP: 3-phosphoglyceroyl phosphate, xPG: 2/3-phosphoglyceric acid, PEP: phosphoenolpyruvate, LAC: lactic acid, CIT: citric acid, cAco: cis-aconitic acid, ICIT: isocitric acid, aKG: α-ketoglutaric acid, SUC: succinic acid, FUM: fumaric acid, MAL: malic acid, OXA: oxaloacetic acid, G6PD: glucose-6-phosphate dehydrogenase, PGLS: 6-phosphogluconolactonase, PGD: phosphogluconate dehydrogenase, TK: transketolase, TA: transaldolase, PFK: phosphofructokinase, ALDO: aldolase, GAPDH: glyceraldehyde 3-phosphate dehydrogenase, PGK: phosphoglycerate kinase, ENO: enolase, PK: pyruvate kinase, LDH: lactate dehydrogenase, CS: citrate synthetase, ACO: aconitase, IDH: isocitrate dehydrogenase, αKGDH: α-ketoglutarate dehydrogenase, SDH: succinate dehydrogenase, FH: fumarate hydratase, MDH: malate dehydrogenase.
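The quantity plotted in the time profile insets, the log2 fold-change relative to the first frame, can be computed as in this small sketch. The function name and the intensity values are hypothetical, and strictly positive intensities are assumed.

```python
import math


def log2_fold_changes(series):
    """Log2 fold-change of every frame relative to the first frame."""
    baseline = series[0]
    return [math.log2(value / baseline) for value in series]


if __name__ == "__main__":
    # Hypothetical intensities of one metabolite across three frames.
    print(log2_fold_changes([100.0, 200.0, 50.0]))
```

A doubling relative to the first frame maps to +1 and a halving to -1, so increases and decreases of equal magnitude are symmetric around zero.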
Fig 6. Overview of the MNS toolbox.
The MNS algorithm enables the prediction of sites and the sequential order of metabolic regulation from metabolomics data.