Min Kyung Kim, Desmond S Lun.
Abstract
Several computational methods have been developed that integrate transcriptomic data with genome-scale metabolic reconstructions to infer condition-specific, system-wide intracellular metabolic flux distributions. In this mini-review, we describe each of the methods published to date and categorize them according to four grouping criteria: (1) whether multiple gene expression datasets are required as input, (2) whether a threshold is required to define a gene's high and low expression, (3) whether an appropriate objective function must be assumed a priori, and (4) whether the predicted fluxes were validated directly against measured intracellular fluxes. We then recommend which group of methods is more suitable from a practical perspective.
Keywords: Constraint-based model; Flux balance analysis; Omics
Year: 2014 PMID: 25379144 PMCID: PMC4212280 DOI: 10.1016/j.csbj.2014.08.009
Source DB: PubMed Journal: Comput Struct Biotechnol J ISSN: 2001-0370 Impact factor: 7.271
Fig. 1. Flux balance analysis (FBA). This figure illustrates how FBA works, using the example of a simple network consisting of two metabolites, A and B, and three metabolic reactions. (a) To use FBA, the network is converted into a stoichiometric matrix, S, where the rows of S correspond to the metabolites of the metabolic network and the columns represent the reactions. Each matrix element s_ij is a stoichiometric coefficient, that is, the number of molecules of the ith metabolite participating in the jth reaction. s_ij = 0 means that the ith metabolite is not involved in the jth reaction, and a positive or negative s_ij indicates that the ith metabolite is a product or a reactant of the jth reaction, respectively. (b) Under the steady-state assumption, the metabolic flux distribution can be represented mathematically by S·v = 0, where v is a column vector whose elements are the unknown reaction rates (fluxes) through each of the reactions of S. (c) Since the resulting system, S·v = 0, is usually underdetermined, physiologically meaningful flux solutions need to be narrowed down from all the possible flux distributions by imposing additional constraints on the system (e.g. 0 ≤ v ≤ 2 in the figure) and by optimizing a chosen objective function (e.g. Max v3 in the figure).
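The FBA problem described in this caption can be written as a linear program. The sketch below states the general form and then works a toy instance. Because the network drawing is not reproduced here, the reaction layout (v_1 imports A, v_2 converts A to B, v_3 exports B) is an assumption chosen only to match "two metabolites, three reactions"; the bounds and the objective are the ones quoted in the caption.

```latex
% General FBA linear program (c selects the objective reaction)
\begin{aligned}
\max_{v}\quad & c^{\mathsf{T}} v \\
\text{s.t.}\quad & S\,v = 0, \\
& v^{\min} \le v \le v^{\max}.
\end{aligned}

% Assumed toy network: v_1: \to A,\quad v_2: A \to B,\quad v_3: B \to
% Rows of S are metabolites (A, B); columns are reactions (v_1, v_2, v_3).
S = \begin{pmatrix} 1 & -1 & 0 \\ 0 & 1 & -1 \end{pmatrix},
\qquad
S\,v = 0 \;\Rightarrow\; v_1 = v_2 = v_3 .

% With 0 \le v_j \le 2 for every reaction and objective \max v_3:
v^{*} = (2,\, 2,\, 2)^{\mathsf{T}}.
```

In this assumed chain the steady-state condition forces all three fluxes to be equal, so the upper bound of 2 alone determines the optimum. In genome-scale models the null space of S is much larger, which is why both the bounds and the objective matter.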
Fig. 2. Representative methods currently available for integration of transcriptomic data in genome-scale metabolic models. (a)–(g) show how each method integrates gene expression data into the models. (a) PROM binarizes the gene expression data according to a user-supplied threshold. Then, using a large dataset of gene expression data, it calculates the probability of a metabolic target gene being expressed relative to the activity of its regulating transcription factor. The maximum flux of the metabolic reaction associated with the metabolic target gene is constrained by a factor of this probability. (b) MADE creates a sequence of binary expression states from several differential gene expression datasets so as to find the model that most closely reproduces the observed expression changes. (c) Åkesson's method is one of the earliest methods to integrate genome-wide expression data into genome-scale metabolic models. In this method, the fluxes of reactions whose corresponding genes are not expressed are constrained to zero. (d) GIMME consists of a two-step procedure. First, the method uses FBA to find a flux distribution that optimizes a given biological objective such as growth and/or ATP production. Then, the method minimizes the utilization of ‘inactive’ reactions whose corresponding mRNA transcript levels are below a given threshold. (e) iMAT discretizes gene expression data into tri-valued expression states, representing low, moderate, or high expression in the condition studied, according to a user-specified threshold. Then, the method finds an optimal metabolic flux distribution that is the most consistent with the discrete gene expression data by maximizing the number of flux-carrying reactions associated with highly expressed enzymes and minimizing the number of flux-carrying reactions that correspond to lowly expressed enzymes.
(f) E-Flux maps continuous gene expression levels into flux bound constraints according to gene–protein–reaction (GPR) associations. It uses transcriptomic data to set upper and lower bounds on metabolic fluxes so that reactions associated with more highly expressed genes are allowed to have higher absolute flux values. (g) Dave Lee's method uses transcriptomic data in the objective function. This method predicts intracellular metabolic fluxes by minimizing the deviation between the flux distribution and the transcriptomic data. The deviation is calculated as the sum of absolute differences between fluxes and the corresponding gene expression data.
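Three of the data-handling steps named in this caption can be illustrated with a few lines of Python: the tri-valued discretization used by iMAT, the expression-to-bound mapping used by E-Flux, and the absolute-deviation score used in Lee's method. This is a minimal sketch under stated assumptions, not the published implementations. The function names, the normalization by the maximum expression level, and the use of a single expression value per reaction (that is, GPR rules already collapsed) are all hypothetical choices made for illustration, and the optimization steps that each method runs on top of these quantities are omitted.

```python
from typing import Dict, Tuple


def discretize_expression(expr: Dict[str, float],
                          low: float, high: float) -> Dict[str, int]:
    """iMAT-style tri-valued states: -1 (low), 0 (moderate), +1 (high).

    `low` and `high` are user-chosen thresholds (an assumption here).
    """
    states = {}
    for name, level in expr.items():
        if level <= low:
            states[name] = -1
        elif level >= high:
            states[name] = 1
        else:
            states[name] = 0
    return states


def eflux_bounds(rxn_expr: Dict[str, float],
                 reversible: Dict[str, bool]) -> Dict[str, Tuple[float, float]]:
    """E-Flux-style bounds: higher expression permits larger absolute flux.

    Assumes one expression value per reaction and scales bounds by the
    maximum level so that the largest upper bound is 1.
    """
    max_level = max(rxn_expr.values())
    if max_level <= 0:
        raise ValueError("expression levels must include a positive value")
    bounds = {}
    for rxn, level in rxn_expr.items():
        ub = level / max_level
        # Irreversible reactions cannot carry negative flux.
        lb = -ub if reversible.get(rxn, False) else 0.0
        bounds[rxn] = (lb, ub)
    return bounds


def lee_deviation(fluxes: Dict[str, float],
                  rxn_expr: Dict[str, float]) -> float:
    """Sum of absolute differences between fluxes and expression data."""
    return sum(abs(fluxes[rxn] - rxn_expr[rxn]) for rxn in rxn_expr)


# Hypothetical per-reaction expression levels for a three-reaction model.
rxn_expr = {"R1": 8.0, "R2": 2.0, "R3": 4.0}

states = discretize_expression(rxn_expr, low=3.0, high=6.0)
bounds = eflux_bounds(rxn_expr, reversible={"R2": True})
score = lee_deviation({"R1": 7.0, "R2": 2.5, "R3": 4.0}, rxn_expr)
```

With these made-up inputs, R1 is classed as high, R2 as low, and R3 as moderate; R2, the only reversible reaction, receives a symmetric bound of ±0.25. In the actual methods these quantities feed a constrained optimization: iMAT and Lee's method place them in the objective, whereas E-Flux passes the bounds to an FBA solve.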
Summary of the features of previous methods according to four grouping criteria described in this paper. Desirable features from a practical perspective are shaded in green.