Joshua E Goldford1,2,3, Ashish B George4,5,6, Avi I Flamholz7, Daniel Segrè3,4,8,9. 1. Physics of Living Systems, Massachusetts Institute of Technology, Cambridge, MA 02139. 2. Blue Marble Space Institute of Science, Seattle, WA, 98154. 3. Bioinformatics Program, Biological Design Center, Boston University, Boston, MA, 02215. 4. Department of Physics, Boston University, Boston, MA, 02215. 5. Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801. 6. Department of Plant Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801. 7. Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA. 8. Department of Biomedical Engineering, Boston University, Boston, MA, 02215. 9. Department of Biology, Boston University, Boston, MA, 02215.
Abstract
SignificanceMetabolism relies on a small class of molecules (coenzymes) that serve as universal donors and acceptors of key chemical groups and electrons. Although metabolic networks crucially depend on structurally redundant coenzymes [e.g., NAD(H) and NADP(H)] associated with different enzymes, the criteria that led to the emergence of this redundancy remain poorly understood. Our combination of modeling and structural and sequence analysis indicates that coenzyme redundancy may not be essential for metabolism but could rather constitute an evolved strategy promoting efficient usage of enzymes when biochemical reactions are near equilibrium. Our work suggests that early metabolism may have operated with fewer coenzymes and that adaptation for metabolic efficiency may have driven the rise of coenzyme diversity in living systems.
SignificanceMetabolism relies on a small class of molecules (coenzymes) that serve as universal donors and acceptors of key chemical groups and electrons. Although metabolic networks crucially depend on structurally redundant coenzymes [e.g., NAD(H) and NADP(H)] associated with different enzymes, the criteria that led to the emergence of this redundancy remain poorly understood. Our combination of modeling and structural and sequence analysis indicates that coenzyme redundancy may not be essential for metabolism but could rather constitute an evolved strategy promoting efficient usage of enzymes when biochemical reactions are near equilibrium. Our work suggests that early metabolism may have operated with fewer coenzymes and that adaptation for metabolic efficiency may have driven the rise of coenzyme diversity in living systems.
Group and electron transfer reactions form the foundation for more complex metabolic pathways in biological systems. For specific types of biochemical transformations, molecular components are shuttled within the cell through the action of small-molecule coenzymes. For the majority of these transformations, there often exist multiple chemically distinct coenzymes capable of similar chemistry (Fig. 1). For example, electron transfer between common functional groups widely distributed throughout cellular metabolism (e.g., alcohols, aldehydes, and activated carboxylic acids) are carried out by a pair of redundant coenzymes capable of electron transfer at identical standard thermodynamic potential: nicotinamide adenine dinucleotide (NAD+) and the phosphorylated derivative NADP+ (or the reduced forms, NADH and NADPH) (Fig. 1). NAD(H) and NADP(H) are coupled to distinct sets of reactions, where NAD(H) is typically used in the breakdown of nutrients to generate precursors and ATP (catabolism), while NADP(H) is used in the reductive synthesis of macromolecules (anabolism) (1, 2). These functional roles have been attributed to differences in the in vivo thermodynamic potentials of NAD(H) and NADP(H), where NAD(H) and NADP(H) are typically poised in the oxidized (NAD+) and reduced state (NADPH), respectively (3–7). For example, in Escherichia coli grown aerobically with glucose as the sole source of carbon, the NADH/NAD+ ratio is ∼0.03, while the NADPH/NADP+ ratio exceeds 57 (3, 8). This thermodynamic difference is commonly assumed to be a universal way for cells to simultaneously oxidize and reduce metabolites that might be impossible with just a single nicotinamide coenzyme (2, 9). Although the difference between the NADH/NAD+ and NADPH/NADP+ ratios in vivo enables tuning of thermodynamic potentials, the quantitative mechanism(s) that relate cellular fitness to these thermodynamic differences have not been addressed.
Fig. 1.
Coenzyme redundancy is a widespread biological phenomenon. (A) A table of examples of redundant coenzymes for various classes of coenzymes. Note that for electron transfer reactions, different classes of coenzymes have varying characteristic ranges of standard redox potentials, enabling electron transfer between distinct classes of functional groups (27). (B) Structure of the nicotinamide adenine diphosphate (NAD+) and the phosphorylated derivative (NADP+). The phosphoryl transfer enzyme NAD kinase phosphorylates NAD+ on the 2′ hydroxyl to make NADP+.
Coenzyme redundancy is a widespread biological phenomenon. (A) A table of examples of redundant coenzymes for various classes of coenzymes. Note that for electron transfer reactions, different classes of coenzymes have varying characteristic ranges of standard redox potentials, enabling electron transfer between distinct classes of functional groups (27). (B) Structure of the nicotinamide adenine diphosphate (NAD+) and the phosphorylated derivative (NADP+). The phosphoryl transfer enzyme NAD kinase phosphorylates NAD+ on the 2′ hydroxyl to make NADP+.The assumption that coenzyme redundancy is essential in living systems leaves unanswered a set of fundamental questions about the nature and early evolution of metabolism: Is the coupling of each reaction to a specific coenzyme a strong constraint for the proper functioning of metabolism, or can some coenzymes be switched with limited consequences? More generally, is the presence of two coenzymes poised in different redox states an absolute necessity for the proper functioning of a metabolic network, or could metabolism in principle operate with only one coenzyme? If one coenzyme is sufficient, is it possible to quantitatively explain how evolution could gradually lead to the selective advantage of two coenzymes? Is the rise of multiple coenzymes an idiosyncratic step along a complex historical process, or a predictable universal feature of enzyme-driven biochemical networks?While prior experimental work has shown that a range of cellular phenotypes can be sensitive to the coenzyme preferences for various oxidoreductases, they do not rule out the possibility of a simpler precursor to the NAD(H)/NADP(H) system. NAD(P)-dependent oxidoreductases can display selectivity for either NAD(H) or NADP(H), and perturbations to this specificity can strongly affect cellular phenotypes and fitness (10, 11). For example, mutagenesis experiments have shown that altering coenzyme specificity of isopropylmalate dehydrogenase and isocitrate dehydrogenase can severely reduce growth rate in E. coli (11–15). More generally, specificity for NAD(H) or NADP(H) for some enzymes can strongly influence intracellular reaction flux distributions (10), serving as the basis for metabolic engineering strategies to dramatically improve yields of valuable metabolic by-products (10, 16–18). Importantly, recent work has demonstrated that specificity for NAD(P)H can emerge in laboratory evolution studies, suggesting that coenzyme preference might be a highly adaptable feature of metabolic networks (19). However, it is unclear if most enzymes are typically constrained to use only NAD(H) or NADP(H), and whether a single coenzyme could support the functioning of metabolism. It is also unclear how the thermodynamic potentials controlled by NADH/NAD+ and NADPH/NADP+ ratios in vivo translate into quantitative mechanism(s) that affect cellular fitness.While an experimental assessment of the landscape of possible variants of metabolism with different coenzyme systems would be extremely challenging, one can leverage theoretical and computational approaches to help address these questions. The recent development and application of metabolic modeling has brought a quantitative and model-driven approach to the study of metabolic evolution (20–25) and has led to uncovering optimality criteria that may have shaped universal features of core metabolism, such as the emergence of both NAD(H) and NADP(H) (26). Although recent quantitative work has explored the logic for why NAD(P)H emerged as a prominent coenzyme in biochemistry (27), a quantitative theory predicting the emergence of both NAD(H) and NADP(H), rather than just a single coenzyme, remains unexplored.Here we use computational systems biology and various network-level models of metabolism to systematically probe, in a way that could not be easily addressed experimentally, the possible advantages of maintaining multiple pools of biochemically redundant redox coenzymes. By taking advantage of large-scale computational experiments using both core and genome-scale stoichiometric models, we show that the E. coli metabolic network is predicted to be insensitive to coenzyme preference for the majority of enzymes and that a metabolism with just a single coenzyme [rather than both NAD(H) and NADP(H)] is capable of producing biomass under a variety of environments. We found that switching coenzyme preferences may lead to significant metabolic impairment for only a small fraction of oxidoreductases. The widespread sequence conservation of NAD(H)/NADP(H) selective residues, however, seems to contradict this low sensitivity, suggesting that factors beyond flux stoichiometry may drive coenzyme redundancy. By developing a model of enzyme cost as a function of enzyme–coenzyme specificity we found that this conundrum can be resolved by viewing coenzyme redundancy as a graded improvement that enables efficient usage of protein resources when reactions are near thermodynamic equilibrium. These results suggest that widespread coenzyme redundancy in metabolic networks is the outcome of evolution for increased metabolic efficiency and parsimonious use of protein at the whole-cell level. Our findings may cast new light on the early stages and emergence of coenzyme couplings and could serve as design principles for the construction of modified or artificial metabolic circuits for synthetic biology and metabolic engineering applications.
Results
Thermodynamics Constrain Only a Small Number of NAD-Coupled Oxidoreductases.
We first explored the robustness of metabolism to altering enzyme specificity of individual oxidoreductases to coenzymes as a way of investigating whether coenzyme coupling constrains metabolic networks (15). To investigate the consequences of altering the NAD(P)-specificity of individual oxidoreductases, we used flux balance analysis (FBA) (28) to simulate the growth rates of in silico mutants with altered coenzyme specificity in 117,180 different growth media. Specifically, we altered the iJO1366 (29) E. coli genome-scale metabolic model (GEMM) as follows. First, we changed stoichiometric coefficients to replace, one by one, the original coenzyme stoichiometric coefficients with the alternative coenzyme. In other words, if a reaction chosen for a mutation was originally coupled with NAD(H), it will now become coupled with NADP(H) instead. Since the in vivo NADH/NAD+ ratio is drastically different from the NADPH/NADP+ ratio (3, 8), some reactions that are reversible with the original coenzyme coupling may become irreversible upon switching coenzymes (), even when taking into consideration the possible range of concentrations of other metabolites in the cell (Materials and Methods). Thus, upon switching coenzymes, we reevaluated the thermodynamic feasibility of each reaction in either direction and translated this information into constraints included into our altered metabolic model ( and Materials and Methods). We compiled a list of 76 genes encoding NAD(P)-coupled oxidoreductases () and generated 76 single-mutant metabolic models by switching the coenzyme preference of each enzyme, one at a time. For each model mutant, we then computed maximal growth rates in 117,180 media conditions spanning fermentation, aerobic respiration, and anaerobic respiration using nitrate, nitrite, dimethyl sulfoxide (DMSO), trimethylamine N-oxide (TMAO), or fumarate as electron acceptors, on combinations of 180 carbon sources and 93 nitrogen sources, using sulfate and orthophosphate as sulfur and phosphorus sources, respectively (Materials and Methods and ).We first counted the frequency of lethal mutations across all in silico media conditions (Fig. 2 and Materials and Methods), expecting coenzyme switching to cause major fitness defects for each gene under at least some conditions. To our surprise, the vast majority (97.3%, 8,103,217 of 8,323,596) of environment and coenzyme mutation pairs were nonlethal, suggesting that metabolism is overall robust to individual swaps of coenzyme specificity. However, of the 219,092 lethal mutant–environment pairs, all came from forcing one of six NAD(H)-coupled oxidoreductases to use NADP(H). These genes are leuB, pdxB, paaH, gatD, lgoD, and fucO (Fig. 2). By further constraining the range of allowed metabolite concentrations, we also found environments where sugar alcohol oxidoreductases encoded by srlD and mltD require coupling to NAD(H) rather than NADP(H) (). Only two oxidoreductases, encoded by the two genes leuB and pdxB, were predicted to require NAD(H) as a coenzyme rather than NADP(H) in all conditions tested (Fig. 2). This is in agreement with extensive work by Dean and coworkers, who demonstrated that the gene product of leuB, isopropylmalate dehydrogenase, is strongly specific for NAD(H) and that introducing mutations to alter specificity toward NADP(H) always reduced fitness (13). Notably, all the observed lethal gene–environment pairs are alleviated when we do not take into consideration the thermodynamic differences between NAD(H) and NADP(H) (), demonstrating that the thermodynamic driving force is a dominant factor in constraining the coenzyme preference in our model.
Fig. 2.
Coenzyme “rewiring” in genome-scale metabolic models. (A) We determined the consequences of “rewiring” the E. coli metabolic network iJO1366 by changing the coenzyme specificity of 76 genes encoding NAD(P)-dependent oxidoreductases and stimulated the growth across 117,180 growth media spanning different combinations of 180 carbon sources, 93 nitrogen sources, and 6 electron acceptors (oxygen, nitrite, nitrate, TMAO, DMSO, and fumarate) as well as fermentative growth using FBA. (B) Only 6 of 76 genes were predicted to require a specific nicotinamide cofactor in one or more media conditions (x axis, Inset). The percentage of the media conditions that led to no growth is plotted as a bar graph (y axis), showing that only two oxidoreductases are constrained to the endogenous coupling in all conditions (purple text): the NAD(H)-coupled isopropylmalate dehydrogenase leuB and erythronate 4-phosphate dehydrogenase pdxB. (C) We next altered the E. coli metabolic model by replacing all NADP(H)-coupled reactions with NAD(H)-coupled reactions (Materials and Methods) and simulated growth at various concentration ratios of NADH/NAD+. (D) We computed the maximum growth rate of the NAD(H)-only metabolic model (y axis) at various concentration ratios for NADH/NAD+ (x axis, black line) and found that a wide range of ratios enabled growth at comparable levels to the unmodified model. The red and blue dashed line indicates the measured concentration ratios in E. coli for NADH/NAD+ (0.032) and NADPH/NADP+ (57.14), respectively. Note that a single coenzyme poised at the measured NADH/NAD+ ratio could enable cellular growth.
Coenzyme “rewiring” in genome-scale metabolic models. (A) We determined the consequences of “rewiring” the E. coli metabolic network iJO1366 by changing the coenzyme specificity of 76 genes encoding NAD(P)-dependent oxidoreductases and stimulated the growth across 117,180 growth media spanning different combinations of 180 carbon sources, 93 nitrogen sources, and 6 electron acceptors (oxygen, nitrite, nitrate, TMAO, DMSO, and fumarate) as well as fermentative growth using FBA. (B) Only 6 of 76 genes were predicted to require a specific nicotinamide cofactor in one or more media conditions (x axis, Inset). The percentage of the media conditions that led to no growth is plotted as a bar graph (y axis), showing that only two oxidoreductases are constrained to the endogenous coupling in all conditions (purple text): the NAD(H)-coupled isopropylmalate dehydrogenase leuB and erythronate 4-phosphate dehydrogenase pdxB. (C) We next altered the E. coli metabolic model by replacing all NADP(H)-coupled reactions with NAD(H)-coupled reactions (Materials and Methods) and simulated growth at various concentration ratios of NADH/NAD+. (D) We computed the maximum growth rate of the NAD(H)-only metabolic model (y axis) at various concentration ratios for NADH/NAD+ (x axis, black line) and found that a wide range of ratios enabled growth at comparable levels to the unmodified model. The red and blue dashed line indicates the measured concentration ratios in E. coli for NADH/NAD+ (0.032) and NADPH/NADP+ (57.14), respectively. Note that a single coenzyme poised at the measured NADH/NAD+ ratio could enable cellular growth.Our flux predictions suggest that enzymes that are sensitive to switching coenzymes are limited to NAD(H)-dependent, and not NADP(H)-dependent, oxidoreductases, suggesting that a metabolism using only NAD(H) might be feasible. To test this scenario, we constructed an NAD(H)-only model of E. coli metabolism by swapping all NADP(H)-coupled reactions with NAD(H) and removing NADP+ synthesis and degradation reactions (Materials and Methods). We systematically computed maximum growth rates across a broad gradient of NADH/NAD+ ratios (Fig. 2) and found that a wide range of NADH/NAD+ ratios enabled growths similar to that of the unmodified model (Fig. 2). The feasibility of an NAD(H)-only E. coli metabolism is also corroborated by an alternative approach, i.e., thermodynamic flux balance analysis (TFBA), applied to a core and genome-scale E. coli model (). Interestingly, TFBA modeling of a NAD-only genome-scale model suggests a decreased growth rate relative to the unmodified model, where flux through lower glycolysis was blocked (). We also computed relative growth rates of each of the 76 mutant models described previously using TFBA and found that, unlike results described in Fig. 2, no single coenzyme swap reduced growth rate (). This discrepancy likely results from the fact that TFBA allows for variation in NAD(P)+/NAD(P)H ratios while our previous modeling approach (Fig. 2 ) fixed these ratios to measured values.Altogether, these results challenge the commonly accepted notion that two redox coenzymes are essential for running metabolism at the whole-cell level and suggest that the reasons for coenzyme redundancy and NAD(H) or NADP(H) specificity might have to be sought elsewhere.
Amino Acid Residues Conferring Selectivity to NAD(H) or NADP(H) Are Well Conserved.
Flux modeling predictions suggest that while the maximum organismal growth rate is insensitive to the coenzyme specificity for the majority of enzymes, a few genes are expected to be highly constrained to a single coenzyme specificity, limited to NAD(H)-coupled oxidoreductases. To determine if these model predictions are consistent with measures of enzyme specificity, we used a combination of protein structure and bioinformatic analysis to assess the coenzyme binding preferences of oxidoreductases both in E. coli and across phylogenetically divergent species.Since protein structural features that shape coenzyme selectivity are well known, we reasoned that the analysis of conservation of residues that occlude or coordinate the 2′ phosphate on the ribosyl moiety of NADP(H) (10) would enable us to access the degree of coenzyme preference conservation independent of cosubstrate specificity. We compiled a list of 57 NAD(P)-dependent oxidoreductases from E. coli with NAD(P)-bound crystal structures of the enzyme (or a homolog with greater than 30% homology) in the Protein Data Bank (PDB) and annotated residues that confer selectivity to NAD(H) or NADP(H) (). We next performed multiple sequence alignment for orthogroups for each oxidoreductase, and quantified the conservation of the NAD(P)-selective residues across orthologs (Materials and Methods). We found that residues that confer selectivity for NAD(H) or NADP(H) were significantly less variable than the remaining portion of the protein sequences but more variable than active-site residues (Fig. 3 ). Interestingly, residues that confer selectivity to NAD(H) were more conserved than NADP(H), which is consistent with our predictions that metabolic systems appear more sensitive to NAD(H) coupling compared to NADP(H).
Fig. 3.
Structural analysis of NAD(P)-dependent enzymes and connection to FBA predictions. (A) We computed the entropy for each residue that conferred selectivity for NAD(H) in 35 KEGG orthogroups encoding NAD-coupled oxidoreductases and plotted the empirical cumulative distribution (eCDF) for residues that are responsible for NAD(H) selectivity (red) vs. all other residues in the protein sequences (gray). (B) The same as A, but for 25 NADP(H)-dependent oxidoreductases, where the entropy distribution for NADP(H)-selective residues are plotted in blue. NAD(H) and NADP(H) selective residues displayed significantly lower entropy than nonselective residues [Mann–Whitney U test: P < 10−7 for NAD(H)-dependent oxidoreductases and P < 10−2 for NADP(H)-dependent oxidoreductases]. For both A and B, we plotted the eCDFs for residue entropy for active site residues (purple lines), obtained from M-CSA (44). (C) Orthologs of pdxB were retrieved from the KEGG database (K03473). The Rossman fold was identified using HMMR 3.1 (45) to generate a sequence logo for the NAD(H) binding domain of these enzymes (Materials and Methods). The arrow denotes the 100% conservation of the aspartic acid residue (Asp146) responsible for discriminating between NAD(H) and NADP(H). (D) For each enzyme with an Asp or Glu residue that specified selectivity for NAD(H) (n = 25), we constructed flux balance models with single oxidoreductases coupled to an alternative coenzyme and computed maximum growth rate across 117,180 different media. We computed then plotted the minimum growth rate (relative to wild-type growth) across all media conditions (y axis) versus the conservation of the ribosyl-coordinating Asp or Glu (x axis).
Structural analysis of NAD(P)-dependent enzymes and connection to FBA predictions. (A) We computed the entropy for each residue that conferred selectivity for NAD(H) in 35 KEGG orthogroups encoding NAD-coupled oxidoreductases and plotted the empirical cumulative distribution (eCDF) for residues that are responsible for NAD(H) selectivity (red) vs. all other residues in the protein sequences (gray). (B) The same as A, but for 25 NADP(H)-dependent oxidoreductases, where the entropy distribution for NADP(H)-selective residues are plotted in blue. NAD(H) and NADP(H) selective residues displayed significantly lower entropy than nonselective residues [Mann–Whitney U test: P < 10−7 for NAD(H)-dependent oxidoreductases and P < 10−2 for NADP(H)-dependent oxidoreductases]. For both A and B, we plotted the eCDFs for residue entropy for active site residues (purple lines), obtained from M-CSA (44). (C) Orthologs of pdxB were retrieved from the KEGG database (K03473). The Rossman fold was identified using HMMR 3.1 (45) to generate a sequence logo for the NAD(H) binding domain of these enzymes (Materials and Methods). The arrow denotes the 100% conservation of the aspartic acid residue (Asp146) responsible for discriminating between NAD(H) and NADP(H). (D) For each enzyme with an Asp or Glu residue that specified selectivity for NAD(H) (n = 25), we constructed flux balance models with single oxidoreductases coupled to an alternative coenzyme and computed maximum growth rate across 117,180 different media. We computed then plotted the minimum growth rate (relative to wild-type growth) across all media conditions (y axis) versus the conservation of the ribosyl-coordinating Asp or Glu (x axis).Despite the overall high conservation of NAD(H)-binding residues relative to NADP(H)-binding ones, the flux balance prediction suggest that NAD(H)-dependent enzymes have a broad range of dependency on the coenzyme preference, with some being clearly much more sensitive than others. In exploring the relationship between NAD(H) selectivity and flux balance predictions, we first investigated in detail leuB and pdxB, which both require NADH coupling to permit biomass production and growth in silico. While leuB has been previously shown to use only NAD(H) (13), pdxB, which encodes the enzyme erythronate-4-phosphate dehydrogenase involved in de novo pyridoxal 5′-phosphate biosynthesis, has received less attention. We computed the degree of conservation of the coenzyme-binding selective residue (Asp146; ) across all pdxB orthologs from the KEGG (Kyoto Encyclopedia of Genes and Genomes) database (orthogroup K03473, n = 1087) and performed multiple sequence alignment on the coenzyme-binding Rossman fold (Materials and Methods). We found that Asp146 was conserved in all pdxB orthologs, consistent with predictions that pdxB is universally required to use NAD(H) like leuB (Fig. 3).The above analysis of the relationship between flux balance prediction and structural conservation can be extended to all NAD(H)-dependent oxidoreductases. We found 25 oxidoreductases with NAD(H)-bound structures in the PDB with clear evidence of hydrogen bonding between an aspartic acid or glutamic acid residue and the 2′ and 3′ hydroxyl groups in the ribosyl portion of NAD(H) (). We performed multiple sequence alignment for orthogroups for each oxidoreductase and quantified the conservation of the aspartic acid or glutamic acid residue across orthogroups. We next compared this degree of conservation to the minimum relative growth of coenzyme mutant models across all 117,180 media conditions. We found that all enzymes predicted to have a severe fitness defect upon cofactor switching (<10% of wild type) have a high conservation (>80%) of the coordinating Asp or Glu residues (Fig. 3). Conversely, oxidoreductases with low conservation of the coordinating Asp or Glu residue were not predicted to have a fitness defect after forcing coupling to NADP(H) with flux modeling (e.g., rfbD). However, several oxidoreductases with high conservation of the coordinating Asp or Glu residue either were not predicted to have a fitness defect after forcing coupling to NADP(H) in any media condition (e.g., hisD). Thus, flux balance predictions only partially predict the observed cofactor preference, as measured by structural conservation.Altogether, the results from genome-scale simulations (previous section) suggest that metabolism, in principle, can operate with a single nicotinamide coenzyme system, strikingly contradicting the textbook assumption that having two coenzymes poised at two distinct in vivo thermodynamic potentials is essential. Models also predict that NAD(H)-dependent reactions are more sensitive to altering the coenzyme coupling than NADP(H)-dependent reactions. While this hypothesis is supported by the overall pattern of conservation at the specificity-determining residues in NAD(P)-dependent oxidoreductases, a detailed analysis indicates that several enzymes with high conservation are predicted to have no detectable sensitivity based on flux balance models, suggesting that additional selective pressures not captured by our flux modeling might drive coenzyme redundancy. In particular, while metabolism with an individual cofactor seems potentially viable, it may be highly inefficient, due to hidden costs that cells may need to pay in order to maintain metabolic fluxes. In the following section, we explore a model of metabolic fluxes that explores the potential relationship between these crucial hidden costs (i.e., enzyme abundance) and coenzyme thermodynamic potentials to test the hypothesis that coenzyme redundancy might lead to a quantitative fitness advantage in extant metabolism.
Enzyme Cost Minimization Drives Separation of Coenzyme Preference Globally.
Thus far, the fitness consequences of modifying oxidoreductase specificity for different coenzymes has been assessed by computing maximum growth rates using stoichiometric models of metabolism with altered coenzyme couplings for individual enzymes. These models suggest that thermodynamics may drive specificity for a few individual enzymes, giving rise to “local” constraints for enzyme specificity that are also recapitulated in protein sequence-level variation. However, these models also suggest that a metabolism with a single coenzyme (poised close to the in vivo NADH/NAD+ ratio) can satisfy these constraints, which is in stark contrast with the observed widespread specificity for NAD(H) or NADP(H) by oxidoreductases (Fig. 3).Beyond inducing local constraints in single reactions, thermodynamics may also induce “global” fitness consequences through the control of the total protein demand to carry biosynthetic flux. This is due to the fact that strong thermodynamic driving forces in enzyme-catalyzed reactions reduce the minimum protein abundance necessary to maintain a fixed reaction flux (30–33). This effect comes directly from the following flux–force relation (34):where is the free energy at physiological metabolite concentrations and and are the forward and reverse reaction flux, respectively. Note that as the free energy decreases (i.e., it becomes more negative), the proportion of flux that is in the forward direction increases, and thus decreases the enzyme abundance required for catalysis (30, 31). We hypothesized that coenzyme redundancy might allow the cell to lower the total protein required to maintain intracellular reaction fluxes. As in previous work, we used the flux–force relation to model the interrelationship between thermodynamics and enzyme cost (30–33). Specifically, we constructed a model of enzyme cost as a function of coenzyme specificity and aimed to study how the number of coenzymes affects the amount of protein enzymes needed to run metabolism (Fig. 4 and Materials and Methods).
Fig. 4.
A minimal model of coenzyme-utilizing enzyme cost predicts the emergence of coenzyme redundancy. (A) Model of a reaction network where multiple reactions (dashed arrows) produce and consume a charged coenzyme, where reaction flux is feasible with a single coenzyme (Top), compared to a model where reaction flux is partitioned between two coenzyme pools connected by a transhydrogenase reaction (Bottom). (B) A heat map of the minimum enzyme cost (scaled by the enzyme cost for a single-coenzyme model) as a function of various coenzyme ratios (x and y axis) for randomly sampled reaction fluxes and thermodynamic and kinetic parameters (Materials and Methods). The color scale denotes the minimum relative enzyme cost at each coenzyme ratio (log-scale), and the gray region denotes the region of space that is thermodynamically infeasible. Note that the dark blue regions denote the space of coenzyme ratios that enables enzyme cost to be less than the single coenzyme, and the red star denotes the point where the minimum enzyme abundance is achieved. (C) We simulated 100 random realizations and plotted the distribution of the minimal enzyme cost relative to the cost using a single coenzyme system (x axis) and found that all simulations were lower than unity. (D) The coenzyme ratio for coenzyme α (x axis) vs. coenzyme β (y axis) that minimizes total protein abundance were plotted for each realization as a two-dimensional histogram. (E) We varied the mean effective equilibrium constant (, x axis), and plotted the fractional reduction in minimal protein required (y axis) relative to a single coenzyme model (see for definitions and derivations).
A minimal model of coenzyme-utilizing enzyme cost predicts the emergence of coenzyme redundancy. (A) Model of a reaction network where multiple reactions (dashed arrows) produce and consume a charged coenzyme, where reaction flux is feasible with a single coenzyme (Top), compared to a model where reaction flux is partitioned between two coenzyme pools connected by a transhydrogenase reaction (Bottom). (B) A heat map of the minimum enzyme cost (scaled by the enzyme cost for a single-coenzyme model) as a function of various coenzyme ratios (x and y axis) for randomly sampled reaction fluxes and thermodynamic and kinetic parameters (Materials and Methods). The color scale denotes the minimum relative enzyme cost at each coenzyme ratio (log-scale), and the gray region denotes the region of space that is thermodynamically infeasible. Note that the dark blue regions denote the space of coenzyme ratios that enables enzyme cost to be less than the single coenzyme, and the red star denotes the point where the minimum enzyme abundance is achieved. (C) We simulated 100 random realizations and plotted the distribution of the minimal enzyme cost relative to the cost using a single coenzyme system (x axis) and found that all simulations were lower than unity. (D) The coenzyme ratio for coenzyme α (x axis) vs. coenzyme β (y axis) that minimizes total protein abundance were plotted for each realization as a two-dimensional histogram. (E) We varied the mean effective equilibrium constant (, x axis), and plotted the fractional reduction in minimal protein required (y axis) relative to a single coenzyme model (see for definitions and derivations).Given the uncertainty about many model parameters, we generated predictions based on extensive random sampling of key parameters, including the reaction network topology, the effective equilibrium constant for each reaction r (), the maximal turnover rates (), and the flux through each reaction (). In this formulation, the effective equilibrium constant aggregates the effects of the free energy change at standard molar conditions and the concentration differences between products and substrates that are not coenzymes, capturing the coenzyme-independent forces driving reaction r (). Motivated by the fact that specific concentration profiles in the flux modeling enable a viable single-coenzyme metabolism, we investigated how coupling reactions to multiple coenzymes influences enzyme cost at these fixed concentrations. Our model also relies on quantities related to the thermodynamic drive of each coenzyme as well as the flux partitioning between different coenzyme pools. These features are captured by the ratio of charged to uncharged species for various coenzyme pairs ( and the fraction of flux for reaction r through the NAD(H) coenzyme pool ( and NADP(H) pool (. We next used nonlinear optimization techniques to find the coenzyme concentration ratios (,) and the reaction fluxes through each coenzyme pool that minimized the total protein abundance in the network (Fig. 4 and ). We found that partitioning flux through two coenzyme pools decreased the minimum protein abundance required relative to the single coenzyme scenario (Fig. 4). Notably, at the optimum, each reaction flux was driven through primarily one coenzyme, indicating that enzyme specificity for a single coenzyme could emerge from the cellular-level objective of optimizing proteome allocation (). The emergence of enzyme specificity for one of the two coenzymes was driven by the combination of both kinetic and thermodynamic factors (), where reactions that required less protein to maintain the given flux demand were used to rebalance coenzyme pools, operating against the coenzyme concentration gradient ().To see if these results were generic for different realizations of parameters, we computed optimal flux distributions and coenzyme ratios for 100 random instances. The two-coenzyme model always achieved lower total protein cost than the one-coenzyme model (Fig. 4). For each simulation, we plotted the coenzyme ratios found to minimize the total protein abundance, and we found that all simulations resulted in optimal ratio sets with opposing thermodynamic drive, where one coenzyme pool was primarily in the charged state, and the other was primarily in the uncharged state (Fig. 4). This is strikingly consistent with the observation that in vivo measured ratios of oxidized to reduced NAD(H) and NADP(H) coenzymes are greater than and less than one, respectively (3, 8). Interestingly, our model also predicts that the largest reduction in enzyme cost comes when the mean effective equilibrium constants are closer to unity (Fig. 4), suggesting that two-coenzyme systems may be particularly beneficial if the coupled reactions are held closer to equilibrium.Motivated by the expectation that enzymes catalyzing reactions near equilibrium should display stronger coenzyme preference, we revisited results presented in Fig. 3 to see if protein-cost minimization could explain any of the enzymes that were predicted to have low sensitivity to coenzyme preference using FBA but displayed a high degree of conservation of key residues. We found that the NAD(H)-dependent histidinol dehydrogenase (encoded by hisD) catalyzes a reaction operating far from equilibrium (−26.6 kJ/mol) in physiological conditions. After swapping for NAD(P), the free energy rises to slightly below equilibrium (−1.4 kJ/mol), which translates to an approximately threefold increase in estimated protein cost to run this reaction (). Even after accounting for variable metabolite concentrations, the minimum protein cost increase would be ∼6%, indicating the protein cost considerations for coenzyme-catalyzed reactions could be a factor shaping coenzyme selectivity.
NAD(P)-Dependent Oxidoreductase Coenzyme Specificity Is Associated with Reaction Thermodynamics.
From the models presented above, we found that enzyme preference for specific coenzymes is strongly associated with thermodynamic drive, where enzymes catalyzing reactions near equilibrium are predicted to be more specific for one coenzyme over another (Fig. 5). To test this prediction, we computed free energies for all NAD(P)-coupled oxidoreductases in the KEGG database using eQuilibrator (35, 36) and computed binding preferences of protein sequences for NAD(H) and NADP(H) using an artificial neural network (37), which estimates the likelihood of FAD(H2), NAD(H), and NADP(H) binding in protein sequences containing Rossman folds. We noticed that the distribution of free energies for NAD(P)-coupled oxidoreductases was bimodal, where oxidoreductases using oxygen as a cosubstrate were driven far from equilibrium () compared to oxygen-independent oxidoreductases (; ), providing a simple way to categorize oxidoreductases by thermodynamic potential. We computed binding preferences for NAD(H) and NADP(H) (Fig. 5) and computed the proportion of Rossman folds that were single-coenzyme binders [e.g., NAD(H) or NADP(H)] or multicoenzyme binders [e.g., NAD(H) and NADP(H)]. Consistent with predictions from the theory, we found that 61% (2,139/3,497) of folds from oxygen-coupled NAD(P)-dependent oxidoreductases are expected to be multicoenzyme binders, compared to just 23% (28992/126718) from nonoxygen-coupled NAD(P)-dependent oxidoreductases (Fisher’s exact test: P < 10−5) and that the estimated binding preferences are more promiscuous for either NAD(H) or NADP(H) in nonoxygen-coupled NAD(P)-dependent oxidoreductases (Mann–Whitney U test: P < 10−5; ). Since many confounding factors may shape coenzyme dependencies for oxygen-dependent oxidoreductases, we corroborated these results by analyzing binding specificities for protein folds from oxygen-independent reactions only by comparing folds that catalyze reactions near equilibrium versus reactions far from equilibrium (Materials and Methods and ). We found that folds are more promiscuous for either NAD(H) or NADP(H) if the folds are used to catalyze far-from-equilibrium reactions versus protein folds used to catalyze near-equilibrium reactions (Mann–Whitney U test: P < 10−5; ).
Fig. 5.
NAD(P)-oxidoreductases catalyzing reactions far from equilibrium are predicted to be more coenzyme-promiscuous. (A) The relationship between the degree of flux partitioning between two coenzyme pools (x axis) and enzyme cost (y axis, rescaled by the maximum enzyme cost per model) at both low (red) and high (blue) coenzyme-independent driving forces (). Note that at high driving forces, the sensitivity of enzyme cost to flux partitioning is significantly less than at low driving forces. (B) The proportion of Rossman folds that were predicted using an artificial neural network model (37) to bind a single coenzyme [light gray bars, e.g., NAD(H) or NADP(H)] or multiple coenzymes [dark gray bars, e.g., NAD(H) and NADP(H)] are plotted for oxygen-dependent or independent NAD(H)/NADP(H) oxidoreductases (x axis). We found 61% of the folds from oxygen-dependent NAD(H)/NADP(H) oxidoreductases (2,139/3,497) were multicoenzyme binders, compared to just 23% among oxygen-dependent NAD(H)/NADP(H) oxidoreductases (28,992/126,718) (Fisher's exact test: P < 10−5). (C) The empirical cumulative distribution (eCDF) of binding specificities for NAD(H) or NADP(H) for protein folds catalyzing NAD(P)-coupled redox reactions (Materials and Methods). We partitioned folds based on whether the folds catalyzed reactions that either used (blue line) or did not use (red line) oxygen as a substrate or product. Specificity is lower for folds involved in O2-dependent reactions than in folds involved in O2-independent reactions (Mann–Whitney U test: P < 10−5). (D) Same as C, but just limited to folds involved in O2-independent reactions. We partitioned folds as far from equilibrium (blue line) if they catalyzed reactions with absolute standard molar free energies above 62.7 kJ/mol (top 10% of all free energies for all O2-independent reactions) or near equilibrium (red) if they catalyzed reactions with absolute standard molar free energies below 6.6 kJ/mol (bottom 10% off free energies for all O2-independent reactions). Specificity is lower for folds involved in far-from-equilibrium than in folds involved in near-equilibrium reactions (Mann–Whitney U test: P < 10−5).
NAD(P)-oxidoreductases catalyzing reactions far from equilibrium are predicted to be more coenzyme-promiscuous. (A) The relationship between the degree of flux partitioning between two coenzyme pools (x axis) and enzyme cost (y axis, rescaled by the maximum enzyme cost per model) at both low (red) and high (blue) coenzyme-independent driving forces (). Note that at high driving forces, the sensitivity of enzyme cost to flux partitioning is significantly less than at low driving forces. (B) The proportion of Rossman folds that were predicted using an artificial neural network model (37) to bind a single coenzyme [light gray bars, e.g., NAD(H) or NADP(H)] or multiple coenzymes [dark gray bars, e.g., NAD(H) and NADP(H)] are plotted for oxygen-dependent or independent NAD(H)/NADP(H) oxidoreductases (x axis). We found 61% of the folds from oxygen-dependent NAD(H)/NADP(H) oxidoreductases (2,139/3,497) were multicoenzyme binders, compared to just 23% among oxygen-dependent NAD(H)/NADP(H) oxidoreductases (28,992/126,718) (Fisher's exact test: P < 10−5). (C) The empirical cumulative distribution (eCDF) of binding specificities for NAD(H) or NADP(H) for protein folds catalyzing NAD(P)-coupled redox reactions (Materials and Methods). We partitioned folds based on whether the folds catalyzed reactions that either used (blue line) or did not use (red line) oxygen as a substrate or product. Specificity is lower for folds involved in O2-dependent reactions than in folds involved in O2-independent reactions (Mann–Whitney U test: P < 10−5). (D) Same as C, but just limited to folds involved in O2-independent reactions. We partitioned folds as far from equilibrium (blue line) if they catalyzed reactions with absolute standard molar free energies above 62.7 kJ/mol (top 10% of all free energies for all O2-independent reactions) or near equilibrium (red) if they catalyzed reactions with absolute standard molar free energies below 6.6 kJ/mol (bottom 10% off free energies for all O2-independent reactions). Specificity is lower for folds involved in far-from-equilibrium than in folds involved in near-equilibrium reactions (Mann–Whitney U test: P < 10−5).
Dephospho-Coenzyme A Might Be a Redundant Coenzyme for Acyl Transfer Reactions.
The simple phenomenological model presented above (Fig. 4) demonstrates that coenzyme redundancy (i.e., multiple coenzymes for the same group transfer reaction) may be a generic mechanism to reduce intracellular protein cost, potentially explaining the maintenance of both NAD(H) and NADP(H) as intracellular redox coenzymes. However, this simple model is agnostic to the chemical details of the group or electron transfer (apart from thermodynamic potentials), suggesting that protein cost can be minimized by simply adding additional coenzymes to the intracellular repertoire. Indeed, according to the model presented in Fig. 4, additional coenzyme redundancy decreases protein cost monotonically (), which potentially explains the vast degree of redundancy for protein-coded coenzymes like ferredoxins, glutaredoxins, and thioredoxins in microbial genomes (). This begs the question of why several coenzymes appear to exist as only one variant in the cell.One interesting example is coenzyme A (CoA), a molecule involved in the redistribution of intracellular acyl groups. We computed free energies for various group transfer reactions (Fig. 6) and found that the distribution of free energies for acyl transfer reactions were similar to NAD(P)-coupled reactions, suggesting multiple coenzyme systems may be also advantageous for CoA-coupled acetyl-transfer reactions. Interestingly, the last step of CoA biosynthesis resembles the last step of NADP biosynthesis, where a kinase acts on the ribosyl moiety of the coenzyme structure (Fig. 6). For the case of NADP+ biosynthesis, NAD kinase phosphorylates NAD+ at the 2′ hydroxyl of the ribosyl moiety to create NADP+. For the case of CoA, dephospho-CoA kinase phosphorylates the 3′ hydroxyl of the ribosyl moiety of dephospho-CoA to produce CoA. The strikingly similar reaction motif may indicate that the dephospho-CoA is an active and alternative coenzyme capable of performing similar functions as CoA, analogous to NAD+ and NADP+. Interestingly, mass spectrometry experiments have shown that dephospho-CoA can be acylated in rat liver (38). We thus propose that dephospho-CoA may be a redundant coenzyme for acyl-transfer reactions rather than simply serving as an intermediate during CoA biosynthesis.
Fig. 6.
Comparison between NAD(P) and CoA biosynthesis reveals potential redundancy for acyl-CoAs. (A) Free energies for group transfer reactions of phosphate groups, acetyl groups, and electron transfer reactions in KEGG were computed using eQuilibrator (35, 36), and the empirical cumulative distributions across reactions were plotted. Free energies for all group and electron transfer reactions are similarly distributed near equilibrium, suggesting that CoA-coupled reactions may also depend on redundant coenzyme systems similar to NAD(H)/NADP(H). Free energies for redox and group transfer reactions were computed for the following coenzyme pairs. Phosphate-coupled reactions: ADP/ATP, CDP/CTP, GDP/GTP; acetyl-coupled reactions: CoA/Acetyl-CoA; electrons (220 mV): FAD/FADH2, FMN/FMNH2, glutathione disulfide/glutathione; electrons (320 mV): NAD/NADH, NADP/NADPH. (B) The last biosynthetic step of NADP+ synthesis is the phosphorylation of the 2′ hydroxyl group on the ribose moiety of NAD+, compared to the biosynthesis of CoA, where the 3′ hydroxyl of the ribose moiety is phosphorylated by dephospho-CoA kinase (E.C. 2.7.1.24).
Comparison between NAD(P) and CoA biosynthesis reveals potential redundancy for acyl-CoAs. (A) Free energies for group transfer reactions of phosphate groups, acetyl groups, and electron transfer reactions in KEGG were computed using eQuilibrator (35, 36), and the empirical cumulative distributions across reactions were plotted. Free energies for all group and electron transfer reactions are similarly distributed near equilibrium, suggesting that CoA-coupled reactions may also depend on redundant coenzyme systems similar to NAD(H)/NADP(H). Free energies for redox and group transfer reactions were computed for the following coenzyme pairs. Phosphate-coupled reactions: ADP/ATP, CDP/CTP, GDP/GTP; acetyl-coupled reactions: CoA/Acetyl-CoA; electrons (220 mV): FAD/FADH2, FMN/FMNH2, glutathione disulfide/glutathione; electrons (320 mV): NAD/NADH, NADP/NADPH. (B) The last biosynthetic step of NADP+ synthesis is the phosphorylation of the 2′ hydroxyl group on the ribose moiety of NAD+, compared to the biosynthesis of CoA, where the 3′ hydroxyl of the ribose moiety is phosphorylated by dephospho-CoA kinase (E.C. 2.7.1.24).
Discussion
The evolution of coenzyme coupling in redox biochemistry constitutes a fascinating puzzle central to the emergence and complexification of life itself (24, 39). In order to understand the constraints and the selection pressure that led to the enzyme–coenzyme couplings observed today, one would ideally shuffle all of these couplings (e.g., by reassigning coenzymes to different enzymes) and determine fitness changes under many different environments. While such an endeavor would be very challenging experimentally, it can be pursued efficiently using both core and genome-scale stoichiometric models of metabolism. In particular, we used flux balance modeling to survey dozens of in silico mutants of oxidoreductase enzymes in thousands of environmental conditions, explicitly modeling the consequences of changing an enzyme’s coenzyme preference on growth rate. Our analysis suggests that few enzymes are universally constrained to use NAD(H) or NADP(H) and that most dependencies are only observable in specific environments. Future experimental work could test these predictions by altering coenzyme specificity for some of these oxidoreductases and measuring growth rate in various media conditions (13).Our flux modeling results suggest that thermodynamic constraints play a role in shaping the preferences of just a small number of oxidoreductases for specific coenzymes. Structural and bioinformatics analyses of pdxB suggest that these thermodynamic constraints can create strong selective pressures shaping protein sequence evolution, as evidenced by our observation that all pdxB orthologs contain a conserved residue that specifically coordinates NAD(H), and potentially sterically prevents the binding of NADP(H) (Fig. 3). However, our results also suggest that residues conferring selectivity for NAD(H) or NADP(H) in oxidoreductases are well conserved (Fig. 3 ), indicating that constraint-based modeling may not adequately capture the principal factors driving enzyme specificity for individual coenzymes.The model presented in Fig. 4 suggests that the emergence of widespread enzyme specificity toward one of many coenzymes could be induced by a selective pressure to minimize the total abundance of enzymes in the cell. Importantly, our model does not rely on chemical details specific to nicotinamide coenzymes or electron transfer and could potentially explain the ubiquity of multicoenzyme systems (e.g., ) in cellular metabolism. Notably, this model suggests that enzyme specificity toward one of the two coenzymes may be a generic strategy to reduce the total protein cost of oxidoreductase enzymes and that coenzyme choice is shaped by both kinetic and thermodynamic factors. The quantitative mechanistic link that our model establishes between cofactor redundancy and cellular fitness could be directly tested through physiological measurements of kinetic constants, metabolite abundances, and thermodynamic potentials of all NAD(P)-dependent oxidoreductases in an organism. Additionally, feasibility of an NAD(H)-only metabolism supports the hypothesis that ancient metabolic networks may have originally relied on NAD(H) only and gradually evolved NADP(H) dependencies. Future computational work could further assess the feasibility of an NAD(H)-only metabolic network by analyzing metabolite concentration bounds and comparing them to physiological constraints or by analyzing the feasibility of NAD(H)-only metabolic pathways beyond those found in E. coli, and thought to be important for ancient metabolism. While it has been proposed that an NADP(H)-dependent isocitrate dehydrogenase evolved from an ancestral NAD(H)-dependent enzyme in prokaryotes (14), future efforts to reconstruct ancestral sequences from various NAD(P)-dependent oxidoreductases can explore whether this is a general feature of NADP-coupled enzymes (40).The flux analysis presented in this study only considers stoichiometric and thermodynamic factors shaping cell fitness. However, other constraints or metabolic demands exist that may constrain oxidoreductase coenzyme preference. For example, our flux models may not accurately capture metabolic demands induced by oxidative stress, as NADPH is the primary electron donor for enzymatic detoxification of reactive oxygen species and other damaging oxidants (2). Specificity may be shaped through mechanisms not directly linked to stoichiometric balancing or thermodynamics, including epistatic mechanisms, where the sequence variation in the coenzyme binding site indirectly influences cosubstrate specificity, kinetics (41), or thermostability (42). Nonetheless, our results suggest that thermodynamics and enzyme cost minimization are important factors that can constrain the oxidoreductase specificity for NAD(H) or NADP(H).In addition to offering an explanation for the occurrence of multicoenzyme systems, our model can be used for the targeted search for unexplored redundant coenzymes. Our results point to the intriguing possibility that dephospho-CoA might serve as an acyl-transfer coenzyme rather than simply an intermediate during CoA and could in principle explain the occurrence of acyl-dephospho-CoAs (38). More broadly, the principle that enzyme cost minimization universally promotes coenzyme redundancy may help guide the interpretation of molecular diversity in the biosphere (43), as well as serve as a design principle for metabolic engineering and synthetic biology applications.
Materials and Methods
Software Availability.
Code is available in the following GitHub repository: https://github.com/jgoldford/coenzymes.
Metabolic Modeling of Oxidoreductase Specificity.
We computationally accessed the fitness consequences of rewiring the NAD(P)H specificity of individual metabolic genes in E. coli by altering stoichiometric and thermodynamic constraints in a GEMM and by simulating the growth rate under various media conditions. We used the GEMM iJO1366 (29) downloaded from the BiGG database (46), with Gibbs free energies derived using the component contribution method (47), using the eQuilibraitor application programming interface (API) at pH 7 and ionic strength 0.1 M (35, 36). For each oxidoreductase gene that mapped to a reaction in iJO1366, we altered the coenzyme preference by first changing the stoichiometric coefficient in that reaction to the alternative coenzyme. For genes encoding reactions that utilized both coenzymes, we altered specificity by forcing all reactions to use only NAD(H) or NADP(H). To model the thermodynamic effect of changing coenzymes specificities, we used the computed reaction free energies at standard molar conditions and bounds on intracellular metabolite concentrations to estimate the maximum and minimum driving force obtainable for each reaction using the following linear program:where is the vector of stoichiometric coefficients for reaction , is the change in free energy at standard molar conditions for reaction is the gas constant, is temperature in Kelvin, and denotes the vector of natural log-transformed metabolite concentrations. Note that we fixed the concentration of coenzymes based on previously measurements, where M, M, M, and M. We then allowed all other metabolite concentrations to vary when predicting the maximum and minimum driving forces for each reaction. After each in silico mutation that changed coenzyme preferences, we recomputed the minimum and maximum driving force and used these values to set the upper and lower bounds of reactions, such thatandwhere and are the updated lower and upper bounds for reaction r, respectively. We simulated the consequences of changing oxidoreductase enzyme specificity for NAD(P)H in E. coli by then computing the maximum growth rate using flux balance analysis (28) under 117,180 conditions, consisting of various carbon sources, nitrogen sources, and electron acceptors (oxygen, nitrite, nitrate, TMAO, DMSO, and fumarate) as well as fermentative growth, using the following linear program:where is the altered stoichiometric matrix, flux lower bound, and flux upper bound after updating stoichiometric and thermodynamic constraints. Note that for the NAD(H)-only metabolism presented in Fig. 2 , we performed the same procedure described above but for all NADPH-coupled reactions, and under a range of different coenzyme ratios (Fig. 3 ). For these simulations, we removed NAD kinase (NADK), NAD phosphatase (NADPPPs), NAD diphosphatase (NADDP), and NAD transhydrogenase (NADTRHD) from the model and performed FBA in aerobic conditions with glucose as the sole carbon source.For results presented in Fig. 2, we restricted the analysis to the media sets where growth of the unperturbed network was feasible (109,521 conditions). We modeled mutations for 76 genes encoding oxidoreductases in E. coli. This set of genes was determined using the following criteria: 1) the gene encoded an enzyme that could be classified using the Enzyme Commission code E.C. 1.X.1.Z, where X was any number except 6 and Z = any number and 2) there were no other genes coding for enzymes that catalyze the same reaction. All simulations were performed using CobraPy and Gurobi (version 9.0.0) optimizer (48).Thermodynamic flux balance analysis was performed using pyTFA (49), with the reduced core E. coli metabolic model irJO1366 () (50) and with the genome-scale model iJO1366 (). We modified both the core model, irJO1366, and the genome-scale model, iJO1366, by substituting all NADP(H)-coupled reactions with NAD(H)-coupled reactions and removed the NAD(P)H transhydrogenase reaction and performed TMFA as previously described (49). To run TFBA on the genome-scale model iJO1366, we used the relax_dgo subroutine from pyTFA to find minimal free energy relaxations to achieve feasible growth, which are provided in .
Oxidoreductase Specificity from Compiled Structural Data and Sequences.
We compiled a list of NAD(P)-dependent oxidoreductases in the E. coli genome and manually searched for structural features that confer selectivity of NAD(H) vs. NADP(H). To this end, we identified NAD(P) binding orthogroups (KO) using the KEGG REST API and developed a custom Python script to identify sequence of NAD(P)-bound structures from the PDB within each orthogroup. For each orthogroup that contained at least one NAD(P)-bound structure for the E. coli ortholog (or an ortholog with at least 30% homology), we manually identified residues that clearly showed hydrogen bonding with the 2′ and 3′ hydroxyl group of the ribosyl moiety in NAD(H), or interactions with the 2′ phosphate in NADP(H).We first analyzed the conservation of the residues conferring specificity in orthologs of pdxB (K03473), the gene that codes for erythronate-4-phosphate dehydrogenase. We downloaded all orthologs in the KEGG database for pdxB (orthogroup K03473, n = 1087), and performed multiple sequence alignment on the coenzyme-binding Rossman fold. We identified a 42-amino-acid subsequence in the E. coli genome using HMMR 3.1 as the Rossman fold in the pdxB gene (b2320) from E. coli MG1655 and used the biopython “pairwise2” module to perform local alignment between the reference Rossman fold and the ortholog sequence. Parameters were chosen such that gap opening and extension were penalized much more than nonmatching characters (matched character: 1; unmatched character: 0; gap opening penalty: −10; gap extension penalty: −1). For each ortholog, we identified the subsequence with the largest score and labeled this subsequence as the Rossman fold for each ortholog. The sequences were then aligned using MUSCLE v3.8.31 using default parameters.To compare other genes to pdxB, we identified 25 genes encoding enzymes in the iJO1366 model with structural evidence supporting the coordination of a single aspartic acid or glutamic acid residue and the 2′ and 3′ hydroxyl group of the ribosyl moiety in NAD(H) (). For orthogroups with >1,000 orthologs, we downsampled the sequences to 1,000 before identifying running multiple sequence alignment.We annotated residues that confer coenzyme selectivity for 57 NAD(P)-coupled oxidoreductases in E. coli based on structures available in the PDB (). We performed multiple sequence alignment using MUSCLE and computed the Shannon entropy, , at each residue position and classified a residue as conferring selectivity to NAD(H) vs. NADP(H), such thatand is the proportion of sequences with amino acid residue i in position j. Prior to computing distributions of entropy across residues that confer selectivity to either NADH or NADPH, we removed all residues with >50% gaps, as these represented poorly aligned regions of the protein sequence.In Fig. 5 , we computed coenzyme binding likelihoods using an artificial neural network model called Cofactory (37). Briefly, Cofactory first identifies the Rossman folds in a protein sequence using HMMR 3.1 and estimates the likelihood each fold binds FAD(H2), NADH, and NADPH. We identified 779 orthogroups that either use oxygen as a cosubstrate and downloaded all sequences within each orthogroup from the KEGG REST API (51). Like in the previous case, we down sampled the sequences to 1,000 if the orthogroup contained more than 1,000 sequences before running Cofactory. After obtaining predictions of binding likelihoods to NAD(H), NADP(H), and FAD(H2), we filtered out Rossman folds that either 1) unambiguously bound FAD(H2) or 2) were not predicted to bind any of the coenzymes. We then categorized the remaining Rossman folds as “multicoenzyme binders” if the binding likelihood exceeded 0.5 for more than one coenzyme and “single coenzyme binders” if the binding likelihood exceeded 0.5 for either NAD(H) or NADP(H), but not both. We estimated coenzyme specificities for each NAD(P)-oxidoreductase with a Rossman fold by computing binding likelihoods for NAD(H) and NADP(H) using the following formula:
Phenomenological Modeling of Proteome Cost for Multicoenzyme Systems.
A detailed derivation of the phenomenological model used to analyze protein cost for multicoenzyme systems in Fig. 4 is provided in . Here we present the final model used in the main text. The following nonlinear optimization problem was constructed to find the minimal protein cost by varying coenzyme concentration ratios and flux partitioning, such thatwhere sampled parameters are the following: is the effective equilibrium constant for reaction r, is the maximum turnover rate for reaction r, is the net flux for reaction r, and is the stoichiometric coefficient for the charged coenzyme (negative if the charged coenzyme is being consumed and is positive if being produced). Variables in our optimization approach are the following: is the ratio of charged to uncharged species for coenzyme pair , and is the fraction of flux from reaction r through the coenzyme pool . In this model, we also include an additional set of reactions e, which are generalized exchange reactions that shuttle groups between coenzymes, much like NAD(P)H transhydrogenase enzymes. The objective function is the total enzyme cost of the coenzyme-coupled subnetwork plus the cost of all generalized exchange reactions (see for derivation). All enzyme cost simulations were performed in MATLAB R2021a, using the COBRA toolbox (48) and GLPK solver for linear programs.
Authors: Jan Schellenberger; Richard Que; Ronan M T Fleming; Ines Thiele; Jeffrey D Orth; Adam M Feist; Daniel C Zielinski; Aarash Bordbar; Nathan E Lewis; Sorena Rahmanian; Joseph Kang; Daniel R Hyduke; Bernhard Ø Palsson Journal: Nat Protoc Date: 2011-08-04 Impact factor: 13.491
Authors: Avi Flamholz; Elad Noor; Arren Bar-Even; Wolfram Liebermeister; Ron Milo Journal: Proc Natl Acad Sci U S A Date: 2013-04-29 Impact factor: 11.205