Szabolcs Cselgő Kovács1,2, Balázs Szappanos1,2,3, Roland Tengölics1,2, Richard A Notebaart4, Balázs Papp1,2. 1. HCEMM-BRC Metabolic Systems Biology Lab, Szeged, Hungary. 2. Biological Research Centre, Institute of Biochemistry, Synthetic and Systems Biology Unit, Eötvös Loránd Research Network (ELKH), Szeged, Hungary. 3. Department of Biotechnology, University of Szeged, Szeged, Hungary. 4. Food Microbiology, Wageningen University & Research, Wageningen, The Netherlands.
Abstract
MOTIVATION: Bioproduction of value-added compounds is frequently achieved by utilizing enzymes from other species. However, expression of such heterologous enzymes can be detrimental due to unexpected interactions within the host cell. Recently, an alternative strategy emerged, which relies on recruiting side activities of host enzymes to establish new biosynthetic pathways. Although such low-level 'underground' enzyme activities are prevalent, it remains poorly explored whether they may serve as an important reservoir for pathway engineering. RESULTS: Here we use genome-scale modelling to estimate the theoretical potential of underground reactions in engineering novel biosynthetic pathways in Escherichia coli. We found that biochemical reactions contributed by underground enzyme activities often enhance the in silico production of compounds with industrial importance, including several cases where underground activities are indispensable for production. Most of these new capabilities can be achieved by the addition of one or two underground reactions to the native network, suggesting that only a few side activities need to be enhanced during implementation. Remarkably, we find that the contribution of underground reactions to the production of value-added compounds is comparable to that of heterologous reactions, underscoring their biotechnological potential. Taken together, our genome-wide study demonstrates that exploiting underground enzyme activities could be a promising addition to the toolbox of industrial strain development. AVAILABILITY: All scripts and metabolic network reconstructions used in this work are available on GitHub (https://github.com/pappb/Kovacs-et-al-Underground-metabolism). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Given our strong dependence on fossil materials, there is growing interest in developing sustainable replacements and alternative ways to produce chemical compounds with applications in fields such as medicine (pharmaceuticals), cosmetics, materials (e.g. bioplastics) and food. One promising approach is the use of microbial cell factories to produce such value-added compounds. More specifically, the metabolic network of these species can be redesigned and engineered to optimize the production of the desired compounds (Hadadi and Hatzimanikatis, 2015; Ko ; Nielsen and Keasling, 2016; Wang ; Yim ).
Several metabolic engineering approaches have been developed to improve the production of value-added compounds (Okano ; Pontrelli ). To date, the most widely applied approach is the introduction and expression of heterologous metabolic reactions (i.e. reactions from a different organism), which generate heterologous biosynthetic pathways that enable the host organism to produce non-native value-added compounds (Lechner ; Pickens ). Several computational approaches have been proposed to facilitate pathway engineering through heterologous reactions using genome-scale metabolic modeling (Ko ; Wang ). Genome-scale metabolic models are available for the microorganisms most commonly used in metabolic engineering, and they can accurately predict how the addition of heterologous enzymes affects metabolite production (Pharkya ). Although the vast repertoire of available heterologous reactions provides a huge potential for value-added compound production, this strategy has several limitations. First, heterologous enzymes often require specific cofactors that the host organism cannot provide, which prevents the biosynthetic pathway from working properly (Boynton ). Second, heterologous expression can trigger a stress response due to protein overproduction or the accumulation of toxic intermediates (Gill ; Martin ). Third, microorganisms containing heterologous reactions are considered genetically modified organisms (GMOs), whose commercial application, especially in the food industry, depends on the legal regulations of individual countries.
One possible approach to overcome these limitations is to exploit what is known as the underground metabolism, i.e. the collection of enzyme side activities in a cell (D’Ari and Casadesús, 1998; Fig. 1). In addition to their native activities, most enzymes display weak side activities by which substrates are turned into products, albeit at low rates, due to the limited substrate specificity of the enzyme. Although the physiological effect of such underground reactions is mostly neglected given their inefficient kinetics, they can often be enhanced by only a few mutations (Aharoni ; Khersonsky and Tawfik, 2010; Notebaart ). Such genetic changes allow underground metabolism to contribute to microbial growth and adaptation to novel nutrient conditions (Cam ; King ; Notebaart ; 2018). Although underground activities have been utilized for pathway development in case studies, no comprehensive work has yet explored the biotechnological potential of the underground metabolic network (Notebaart ; Rosenberg and Commichau, 2019). In particular, it remains unexplored whether biochemical reactions catalyzed by enzyme side activities are as likely to contribute to new biosynthetic pathways toward value-added compounds as those catalyzed by heterologous enzymes.
Fig. 1.
Utilization of underground activities as an alternative to incorporating heterologous enzymes. The schematic figure depicts two strategies to enhance or establish the production of industrially relevant chemicals in microorganisms. The conventional strategy is to introduce one or more heterologous enzymes from a different organism (cell with long flagella, blue) to create a heterologous pathway capable of producing the value-added compound (lower part). A less well explored alternative is to amplify existing low-activity underground reactions (upper part, orange arrow) catalyzed by the endogenous enzymes of the organism (A color version of this figure appears in the online version of this article)
Here, we aim to systematically assess the extent to which underground metabolism can increase the production of industrially important compounds. As genome-scale computational modeling of underground metabolism successfully predicts the potential to utilize new nutrient sources in Escherichia coli (Notebaart ), we reasoned that a similar approach could be employed to characterize the theoretical potential of underground reactions to produce industrially relevant chemical compounds. To this end, we integrated experimentally reported and predicted underground reactions into a genome-scale metabolic reconstruction of E. coli (Orth ) and characterized its biosynthetic properties across 64 different industrially important compounds spanning diverse applications, such as health, material sciences, chemical industry and food. Our computational analyses demonstrate that underground enzyme activities frequently enhance the production of value-added compounds with industrial importance, to an extent comparable to that of heterologous enzyme activities. Notably, we find that activation of only a single underground reaction can already substantially increase the production of important industrial compounds, such as a precursor of bioplastics (3-hydroxypropanoate) and a potential biofuel (1-butanol). Overall, our work reveals that the underground metabolic network provides a promising metabolic engineering tool to improve the production of both native and non-native value-added compounds.
2 Materials and methods
2.1 Reconstruction of the E. coli underground metabolism version 2.0
To study the industrial potential of the underground metabolic network, we extended a previously reconstructed underground metabolic network of E. coli K-12 MG1655 (Notebaart ) with additional underground reactions. These additional underground reactions were predicted in silico by the PROPER algorithm, and 20% of them were validated experimentally (Oberhardt ). In brief, each of these underground activities was assigned to an enzyme based on its sequence similarity to orthologous enzymes (Oberhardt ). We next integrated the set of underground reactions into the native E. coli genome-scale metabolic model iJO1366 (Orth ). We removed duplicate reactions and ‘perpetuum mobile’ cycles, that is, flux distributions that are able to produce energy in the absence of available nutrients (Fritzemeier ). Note that underground reactions u0291 and u0227 were adjusted to contain 1-propanol instead of isopropanol, in accordance with the literature references in the BRENDA database. The resulting metabolic model (hereafter termed the underground model) contains 3146 reactions and 2172 metabolites, of which 563 reactions and 367 metabolites are not part of the native model. The reconstruction is available as a computational SBML model (Supplementary File S1, also downloadable from https://github.com/pappb/Kovacs-et-al-Underground-metabolism).
2.2 Compiling a list of value-added compounds
To get a broad view of the potential of underground metabolism in industrial bioproduction, we compiled a comprehensive list of value-added compounds (Supplementary Dataset S1). A compound was considered valuable if it fulfills a role in industrial processes, such as precursor, building block, reagent, solvent or final product. Most compounds were adapted from the Standard Industrial Classification Manual of the United States Department of Labor (Major Group 28: Chemicals and Allied Products; Industry Group 286: Industrial Organic Chemicals; Occupational Safety and Health Administration). The other major source is an exhaustive report by Werpy and Petersen presenting the most wanted value-added compounds whose sustainable production is a crucial issue (Werpy and Petersen, 2004). In addition, the list was extended with compounds from individual studies that attempted bioproduction of value-added compounds (Supplementary Dataset S1). Because we are interested in compounds that are known as metabolites, and because compounds are mapped to their representation in the model through their KEGG database IDs (Kanehisa ), we excluded compounds without a KEGG ID. This resulted in a list of 160 value-added compounds that qualify as metabolites and were used for subsequent analyses (see Supplementary Dataset S1).
2.3 Defining in silico media
We simulated the growth of E. coli in M9 minimal medium adapted from a glucose minimal medium of a previous model (Feist ). To estimate the influence of nutrients, we ran our yield calculations on seven different carbon sources: d-glucose (Bhatia ), glycerol (Clomburg and Gonzalez, 2013), d-xylose (Bhatia ), d-fructose (Aristidou ), l-fucose (Kim ), l-arabinose (Bhatia ) and acetate (Wu ). These carbon sources are either already used in industrial compound production or have been suggested for this purpose by prior works. In our simulations, we enabled the uptake of only a single carbon source at a time.
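The one-carbon-source-at-a-time setup can be sketched in plain Python as follows. This is only an illustration of the bookkeeping: the exchange reaction identifiers are assumed BiGG-style names that should be checked against the SBML model, and the uptake bound is an arbitrary placeholder, not a value reported in this work.

```python
# Assumed BiGG-style exchange reaction IDs for the seven carbon sources;
# verify them against the identifiers used in the actual SBML model.
CARBON_SOURCES = {
    "D-glucose": "EX_glc__D_e",
    "glycerol": "EX_glyc_e",
    "D-xylose": "EX_xyl__D_e",
    "D-fructose": "EX_fru_e",
    "L-fucose": "EX_fuc__L_e",
    "L-arabinose": "EX_arab__L_e",
    "acetate": "EX_ac_e",
}

# Placeholder lower bound (negative flux = uptake); not taken from the paper.
ASSUMED_UPTAKE = -10.0


def single_carbon_media(sources=CARBON_SOURCES, uptake=ASSUMED_UPTAKE):
    """Build one medium per carbon source.

    Returns {source name: {exchange ID: lower bound}} where exactly one
    carbon exchange is open for uptake and all others are closed (0.0).
    """
    media = {}
    for name, open_exchange in sources.items():
        media[name] = {
            exchange: (uptake if exchange == open_exchange else 0.0)
            for exchange in sources.values()
        }
    return media


media = single_carbon_media()
```

Each entry of `media` could then be applied as lower bounds on the carbon exchange reactions before a yield calculation, leaving the remaining M9 components unchanged.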
2.4 Calculating the production yield of value-added compounds
To calculate the production yield of value-added compounds, we applied Flux Balance Analysis (FBA) (Orth ). First, we constrained biomass production to 10% of its maximal value to ensure that cellular growth is not completely halted by compound production. Second, we extended the model with a virtual ‘demand’ reaction that has the value-added compound as its substrate and no product, to mimic the industrial extraction of the compound. Third, we ran an FBA simulation to maximize the flux through the demand reaction, i.e. to maximize the production of the value-added compound. The flux of the demand reaction equals the production rate of the target compound. Finally, we calculated the Maximum Theoretical Yield (yield for short) as a metric to assess the efficiency of predicted biosynthetic pathways according to the following equation (Campodonico ):

$$\mathrm{Yield} = \frac{v_{\mathrm{product}} \cdot C_{\mathrm{product}}}{v_{\mathrm{uptake}} \cdot C_{\mathrm{source}}}$$

where $v_{\mathrm{product}}$ is the flux of the demand reaction, $v_{\mathrm{uptake}}$ is the uptake rate of the carbon source, and $C_{\mathrm{product}}$ and $C_{\mathrm{source}}$ are the numbers of carbon atoms in the target compound and in the carbon source, respectively. The yield is thus the fraction of carbon atoms taken up from the carbon source that is converted into the target compound.
We defined biotechnologically relevant (i.e. significant) yield increments in the underground metabolic model as those that enhanced the theoretical yield of the native model by at least 5%. The same criterion was applied when determining the impact of adding heterologous reaction sets to the native model. Although this threshold is arbitrary, our results were rather insensitive to its actual value (see Section 3).
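As a worked illustration of the carbon-based yield described above, the following minimal sketch computes the fraction of carbon taken up that ends in the target compound. The function name and the flux values are hypothetical and chosen only to make the arithmetic easy to follow; in practice the fluxes would come from the FBA solution.

```python
def carbon_yield(product_flux, product_carbons, uptake_flux, source_carbons):
    """Fraction of carbon atoms taken up that ends in the target compound.

    product_flux    -- flux through the demand reaction
    product_carbons -- number of carbon atoms in the target compound
    uptake_flux     -- uptake rate of the carbon source (sign is ignored,
                       because uptake is often stored as a negative flux)
    source_carbons  -- number of carbon atoms in the carbon source
    """
    carbon_in = abs(uptake_flux) * source_carbons
    if carbon_in == 0:
        raise ValueError("carbon uptake must be non-zero")
    return (product_flux * product_carbons) / carbon_in


# Illustrative numbers only: glucose (6 carbons) taken up at 10 flux units,
# a 4-carbon product leaving through the demand reaction at 7.5 flux units.
example_yield = carbon_yield(
    product_flux=7.5, product_carbons=4, uptake_flux=-10.0, source_carbons=6
)
```

With these made-up fluxes, 30 of the 60 carbon units taken up reach the product, so `example_yield` is 0.5.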
2.5 Determining the minimum number of underground reactions needed for increasing the production yield
We used a mixed integer linear programming (MILP) algorithm, inspired by our previous study and the OptCouple algorithm, to identify minimal underground reaction sets that are directly involved in the production of value-added compounds (Jensen ; Szappanos ). For each target compound, the algorithm searched for a minimal set of underground reactions sufficient for the yield increment. Although multiple such minimal reaction sets may exist, we collected only one per target compound, both as an example reaction set and to determine the minimal number of underground reactions necessary for the yield increment. We excluded from the MILP analysis those value-added compounds whose production yield is enhanced by less than 5% by underground reactions. The basis of the MILP problem was the steady-state assumption:

$$S \cdot v = 0$$

where $S$ is the stoichiometric matrix and $v$ is the flux vector of all reactions. The reactions of the model were distinguished according to their origin: native or underground. The flux constraints on the E. coli reactions were the same as in FBA:

$$LB_j \le v_j \le UB_j$$

where $v_j$ is the flux of reaction $j$, and $LB_j$ and $UB_j$ are its lower and upper flux bounds, respectively.
We framed our goal as a two-step optimization problem. The first objective was to maximize the production flux of a value-added compound, and the second objective was to minimize the number of active underground reactions. The activation of underground reactions was controlled with binary variables:

$$b_i \in \{0, 1\}, \quad b_i \in B, \quad i = 1, \dots, N'$$

where $b_i$ is a binary variable, $i$ is the index of an underground reaction, $B$ is the set of binary variables and $N'$ is the number of underground reactions. The binary variable $b_i$ indicates whether the underground reaction $r'_i$ ($i = 1, \dots, N'$) is active ($b_i = 1$) or not ($b_i = 0$). The following constraints enforce these rules:

$$\varepsilon \, b_i \le v'_i \le UB'_i \, b_i$$

where $v'_i$ is the flux and $UB'_i$ is the maximal possible flux of underground reaction $r'_i$, while $\varepsilon$ is the minimal non-zero flux value (in our calculations ). Reversible underground reactions were decomposed into two opposing irreversible reactions. In this way the fluxes of the underground reactions can only take non-negative values, which the constraints above require. In addition, to prevent two opposing reactions derived from the same reversible reaction from being active simultaneously, we introduced the following constraint:

$$b_i^{\mathrm{fwd}} + b_i^{\mathrm{rev}} \le 1$$

where $b_i^{\mathrm{fwd}}$ and $b_i^{\mathrm{rev}}$ are the binary variables of the forward and reverse reactions derived from the same reversible underground reaction.
To solve the first optimization problem, we calculated the maximal production rate for each value-added compound (see Section 2.4). Next, for each value-added compound, we constrained the flux of its demand reaction to be equal to this maximum value to ensure the maximal production of the compound.
Finally, the second objective of the MILP problem was to minimize the number of active underground reactions:

$$\min \sum_{i=1}^{N'} b_i$$

The result of this optimization is the minimum number of underground reactions whose collective presence is required for the enhanced production of the value-added compound.
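The idea of searching for a smallest set of underground reactions can be illustrated without a solver. The sketch below is not the MILP described above: it replaces flux optimization with a qualitative reachability test (network expansion from seed metabolites) and replaces the integer program with exhaustive enumeration of subsets in order of increasing size, which is only feasible for tiny toy networks. All reaction and metabolite names are hypothetical.

```python
from itertools import combinations


def producible(seeds, reactions):
    """Network expansion: metabolites reachable from the seed metabolites.

    Each reaction is a (substrates, products) pair of sets; a reaction can
    fire once all of its substrates are already reachable.
    """
    known = set(seeds)
    changed = True
    while changed:
        changed = False
        for substrates, products in reactions:
            if substrates <= known and not products <= known:
                known |= products
                changed = True
    return known


def minimal_underground_set(seeds, native, underground, target):
    """Smallest set of underground reaction IDs that makes `target` reachable.

    Returns () if the native reactions suffice and None if even the full
    underground set does not help. Subsets are tried by increasing size,
    so the first hit is minimal (one of possibly several minimal sets).
    """
    ids = sorted(underground)
    for size in range(len(ids) + 1):
        for subset in combinations(ids, size):
            active = list(native) + [underground[i] for i in subset]
            if target in producible(seeds, active):
                return subset
    return None


# Hypothetical toy network.
seeds = {"glc"}
native = [({"glc"}, {"pyr"}), ({"pyr"}, {"accoa"})]
underground = {
    "u1": ({"accoa"}, {"x"}),
    "u2": ({"x"}, {"target"}),
    "u3": ({"pyr"}, {"target"}),
}

best = minimal_underground_set(seeds, native, underground, "target")
```

In this toy case the target is reachable either through `u1` plus `u2` or through `u3` alone, and the search returns the single-reaction solution. Unlike the MILP, this sketch says nothing about yields or flux magnitudes.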
2.6 Statistical comparison of underground and heterologous reactions
To compare the industrial potential of underground and heterologous enzyme activities, we extended the native E. coli metabolic model iJO1366 (Orth ) with a large set of heterologous reactions (hereafter termed the heterologous model). Data on heterologous reactions were obtained from the MetaCyc database (version 22.5), a comprehensive database of metabolism from all domains of life (Caspi ). All MetaCyc reactions not present in the native E. coli network were considered heterologous. Next, we discarded heterologous reactions that were not mass-balanced or contained metabolites without a KEGG ID association (Kanehisa ). We used the associated KEGG IDs to identify compounds that are already present in the native E. coli network. We then added the remaining non-native compounds to the metabolic model and integrated the heterologous reactions into the model. After that, we removed duplicate reactions and ‘perpetuum mobile’ cycles, that is, flux distributions that are able to produce energy in the absence of available nutrients (Fritzemeier ). Finally, to make the heterologous model comparable to the underground model, we added further exchange and native reactions that were also incorporated into the underground model based on literature evidence (Notebaart ). The resulting heterologous metabolic model contains 8050 reactions and 5538 metabolites, of which 5686 reactions and 3495 metabolites are not part of the native model. The heterologous model is available as a computational Systems Biology Markup Language (SBML) model (Supplementary File S2, https://github.com/pappb/Kovacs-et-al-Underground-metabolism).
Since the set of collected heterologous reactions is much larger than the set of underground reactions, we generated 1000 heterologous subsets. Each subset contained a random set of heterologous reactions chosen from the heterologous model. 
The number of heterologous reactions in each random set matched the number of underground reactions in the underground model. For each of the 1000 heterologous subsets, we then calculated the maximum theoretical yield and yield increments of the value-added compounds in the same way as for the underground model (see Section 2.4). Out of the 160 value-added compounds with KEGG IDs, 110 were present in the heterologous model either as native or heterologous metabolites, and therefore the heterologous production of these 110 compounds was tested. Calculations with the heterologous subsets were done only on glucose minimal medium to reduce computational time.
The combined effect of underground and heterologous reactions was assessed using a model reconstruction that contains both the heterologous and the underground reactions. This model is based on the heterologous model described above and extended with the underground reaction sets (the model is available as an SBML file in Supplementary File S3 and under the following GitHub link: https://github.com/pappb/Kovacs-et-al-Underground-metabolism). Yield increments were calculated for this combined model under glucose minimal medium as described in Section 2.4.
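The size-matched random sampling of heterologous reactions can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the reaction identifiers, the seed and the toy sizes are all hypothetical, and sampling is done without replacement within each subset, which is one reasonable reading of the procedure.

```python
import random


def draw_random_subsets(reaction_ids, subset_size, n_subsets, seed=0):
    """Draw `n_subsets` random reaction sets of equal size.

    Sampling is without replacement within a subset; different subsets
    are drawn independently and may overlap. A fixed seed makes the
    draws reproducible.
    """
    rng = random.Random(seed)
    pool = list(reaction_ids)
    return [rng.sample(pool, subset_size) for _ in range(n_subsets)]


# Toy sizes for illustration; in the analysis described above, the subset
# size would equal the number of underground reactions and n_subsets = 1000.
heterologous_ids = [f"het_{i:03d}" for i in range(50)]
subsets = draw_random_subsets(heterologous_ids, subset_size=5, n_subsets=10, seed=42)
```

Each subset would then be added to the native model and evaluated with the same yield calculation as the underground model, giving a null distribution against which the underground result can be compared.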
2.7 Software and computation used in metabolic simulations
All simulations were implemented in Python 3.6 (Python Software Foundation. Python Language Reference, version 3.6. Available at http://www.python.org) using the cameo (Cardoso ) and cobrapy (Ebrahim ) Python packages for constraint-based modeling. As an optimizer for linear programming and MILP we used GUROBI 8.1 (Gurobi Optimization, LLC, 2021). Linear programs were solved on a 64-bit Ubuntu Linux system with an Intel Core i7 quad-core processor. MILP problems were solved on a Red Hat Enterprise Linux Server release 6.2 with 96 Intel Xeon central processing units.
3 Results
3.1 Reconstructing an expanded model of E. coli underground metabolism
To achieve high coverage of potential underground reactions of E. coli, we extended our previously published list of experimentally reported E. coli underground activities (Notebaart ) with predicted ones and incorporated them into the iJO1366 genome-scale metabolic reconstruction of this species (Orth ). Specifically, we utilized the predictions of the PROPER algorithm, which assigns native activities of homologous enzymes from other bacteria as underground activities to E. coli enzymes (Oberhardt ). Importantly, this algorithm achieved good overlap with multicopy suppression studies in E. coli where the over-expression of a ‘replacer’ gene rescues an otherwise lethal loss-of-function mutation.
Our updated underground network contains 543 underground reactions, which is approximately 20% of the number of native reactions in the genome-scale model. By incorporating the underground reactions into the E. coli metabolic network, we also introduced 311 novel compounds that are not present in the native network. Hence, the extended metabolic network model contains 22% more metabolites than the native model (Fig. 2, Supplementary Dataset S1). Overall, the updated underground metabolic network reconstruction contains 107% and 12% more reactions and metabolites, respectively, compared to the previous version, indicating a substantial increase in coverage.
Fig. 2.
Reconstructing an expanded underground metabolic network of E. coli. First, the native E. coli metabolic model was extended with both experimentally validated (Notebaart ) and predicted (Oberhardt ) underground reactions. Next, value-added compounds were collected from the literature and those that are present in the model were kept
Underground reactions do not necessarily have the potential to carry flux, because they may lack connections to the rest of the metabolic network. Nevertheless, we found that 79% of the compiled underground reactions are either fully or partially connected to the native network (i.e. either the substrates or the products, or both, are present in the native network). This result suggests that many underground reactions may have the potential to participate in novel biosynthetic pathways.
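The connectivity criterion above can be sketched as a simple set comparison. The example assumes that a side of a reaction counts as "present in the native network" only when all metabolites on that side are native; this is an interpretation, not a detail stated in the text, and the reaction and metabolite names are hypothetical.

```python
def classify_connectivity(substrates, products, native_metabolites):
    """Label a reaction as 'full', 'partial' or 'disconnected'.

    Assumption: a side is considered connected only if every metabolite
    on that side occurs in the native network.
    """
    native = set(native_metabolites)
    substrates_native = set(substrates) <= native
    products_native = set(products) <= native
    if substrates_native and products_native:
        return "full"
    if substrates_native or products_native:
        return "partial"
    return "disconnected"


# Hypothetical toy data.
native_metabolites = {"pyr", "accoa", "glyc"}
underground = {
    "u_a": ({"pyr"}, {"accoa"}),   # both sides native
    "u_b": ({"glyc"}, {"new1"}),   # only substrates native
    "u_c": ({"new1"}, {"new2"}),   # neither side native
}

labels = {
    rid: classify_connectivity(subs, prods, native_metabolites)
    for rid, (subs, prods) in underground.items()
}
connected_fraction = sum(lab != "disconnected" for lab in labels.values()) / len(labels)
```

Applied to the full reaction list, `connected_fraction` would correspond to the share of fully or partially connected underground reactions discussed above.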
3.2 Underground reactions can often increase the yield of value-added compounds
To explore the potential utility of underground reactions for bioproduction of desired chemicals, we compiled a list of value-added compounds from the literature. As we are interested in bioproduction, we focused on compounds that have been described as metabolites according to the KEGG database (Kanehisa ). Our list includes not only the desired end products (e.g. biofuels) but also precursors, building blocks, and relevant reagents and solvents (see Section 2). Out of the 160 value-added metabolites collected, 64 (40%) are present in the extended metabolic network reconstruction and thus have the potential to be produced by E. coli. We excluded 20 compounds because they cannot be produced in any of the simulated media even when all underground reactions are available for biosynthesis. The remaining set of 44 value-added compounds consists of 42 native metabolites that are already present in the native E. coli network and 2 novel metabolites that participate in underground reactions only. Importantly, 16 out of the 44 target compounds are among the top 30 most sought-after value-added compounds to be produced from renewable carbon sources (Werpy and Petersen, 2004). For example, ethylene glycol is used as antifreeze and also as a building block for plastics (Curme Jr and Young, 1925; Liu ).
Underground reactions may contribute to the production of value-added compounds in two ways. First, they may open up more efficient biosynthetic pathways for a compound that can already be produced by the native network. Second, they may enable the production of entirely new compounds that cannot be produced by the native network. To investigate the feasibility of these two scenarios, we systematically tested whether the presence of underground reactions increases the maximal theoretical yield of each value-added compound using flux balance analysis (Campodonico ) (see Section 2). Maximum theoretical yield (yield for short) measures production efficiency as the fraction of carbon atoms coming from the carbon source that is converted into the target compound (Campodonico ).
We allowed the utilization of all underground reactions simultaneously to identify all potential cases where the underground network facilitates the production of a target compound, including those where multiple underground reactions are required. These simulations were run on glucose as the sole carbon source. We found that underground reactions enhanced the maximum theoretical yield by more than 5% for 9 out of the 44 target compounds (Fig. 3A). These 9 compounds have various industrial applications, including plastic manufacturing, flavoring and antifreeze production (Fig. 3B). Increasing the threshold of yield increment to 10% would eliminate only a single hit (glycerol), showing that our results are robust to this parameter. In more than half of the cases (five out of nine), the native metabolic network is already capable of producing the compound, but the underground reactions opened a new pathway with a higher yield. Out of the four compounds not produced by the native model (see Fig. 3B), two (D-tartrate and butanol) were already present in the native network, indicating that the underground reactions created new routes between existing parts of the native network to enable their production. In the remaining two cases (ethylene glycol and 1-propanol), the underground reactions extended the native network to reach the target compound (Fig. 4C).
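The grouping of target compounds by the effect of underground reactions can be sketched as a small classifier over pairs of native and underground-model yields. The category labels follow the text, but the function name, the compound names and the yield values are made up for illustration and are not results from this study.

```python
from collections import Counter


def classify_compound(native_yield, underground_yield, threshold=0.05):
    """Assign a compound to a category based on its relative yield increment.

    `threshold` is the minimal relative gain over the native yield that
    counts as an increase (0.05 = 5%).
    """
    if underground_yield <= 0:
        return "not producible"
    if native_yield <= 0:
        # Relative increment is undefined when the native model cannot
        # produce the compound at all.
        return "no native production"
    gain = (underground_yield - native_yield) / native_yield
    return "increased yield" if gain >= threshold else "no increment"


# Hypothetical (native yield, underground-model yield) pairs.
toy_yields = {
    "compound_a": (0.50, 0.50),
    "compound_b": (0.40, 0.60),
    "compound_c": (0.00, 0.30),
    "compound_d": (0.00, 0.00),
}

categories = {
    name: classify_compound(native, extended)
    for name, (native, extended) in toy_yields.items()
}
counts = Counter(categories.values())
```

Re-running the classification with `threshold=0.10` is a direct way to check how sensitive the number of hits is to the chosen cut-off.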
Fig. 3.
Impact of underground metabolism on the production potential of value-added compounds. The impact of underground reactions is inferred by simultaneously enabling all underground reactions to allow for potential pathways that require multiple underground reactions. (A) Number of value-added compounds showing no yield increment (35, green), showing increased yield in the underground model compared to the native model (5, red) and showing production only in the presence of underground reactions (4, blue). (B) Table of compounds with increased yield (native production available, red) and exclusive production (no native production, blue) in the presence of underground reactions. Yield increment percentage cannot be calculated for the latter as the native model cannot produce those compounds (A color version of this figure appears in the online version of this article)
Fig. 4.
The presence of one or two underground reactions can be sufficient to increase production yield. (A) Distribution of the number of underground reactions needed to increase production yields of target compounds. No more than four underground reactions have to be added to the native model to increase the yield in all cases. Furthermore, in most cases a single underground reaction is sufficient. (B) Matrix of underground reactions that contribute to the yield increments of each target compound. The columns represent the underground reactions (e.g. u0008, etc.) and their standard gene associations in parentheses, while the rows represent the target compounds. Cells in the upper part (marked as ‘Increased yield', red) belong to metabolites that can also be produced by the native metabolic network while the ones in the lower part (marked as ‘No native production', blue) require the presence of underground reactions. Multiple cells in one row account for multiple underground reactions whose joint presence is needed for production of the target compound. (C) Schematic view of central carbon metabolism (gray arrows) with underground reactions (red arrows) that enable the production of new compounds (yellow highlight) that cannot be produced by the native metabolic network. For details, see Supplementary Dataset S1 (A color version of this figure appears in the online version of this article)
To assess the extent to which these results depend on the applied nutrient conditions, we repeated the simulations on the remaining carbon sources deemed relevant in metabolic engineering (see Section 2.3). Generally, the yield increments showed little variation between carbon sources, indicating that our results are robust to the choice of carbon source (Supplementary Dataset S1). This general lack of condition-dependency can be explained by the fact that most target compounds are biosynthesized from intermediates of central carbon metabolism, with the sole exception of acrolein. Notably, acrolein can be produced only on d-xylose as the sole carbon source (see Supplementary Dataset S1).
Collectively, these results show that underground reactions often have the potential to increase the production efficiency of value-added compounds and can even confer the ability to synthesize new chemicals.
3.3 Yield increase can be achieved with a handful of underground reactions
Biosynthetic pathways with fewer underground reactions are expected to be easier to implement in biotechnological applications. Therefore, we next examined the number of underground reactions that are directly involved in the biosynthesis of a given target compound. This was achieved by applying a mixed integer linear programming (MILP) algorithm to find the minimum number of underground reactions necessary for the yield increment (Fig. 4, see Materials and Methods). In two-thirds of the cases (six out of nine), a single underground reaction is sufficient to increase the production yield, and at most four underground reactions are needed in any case (Fig. 4A and B, Supplementary Dataset S1). The sets of underground reactions involved in the production of specific target compounds are generally unique, with only a single underground reaction, catalyzed by glycerol dehydrogenase (gldA), being advantageous for the production of more than one compound.
Notably, for three of the four target compounds that cannot be produced by the native model, the involved underground reactions are located at the end of the biosynthetic pathway (Supplementary Dataset S1, Fig. 4C). In contrast, target compounds that can already be produced by the native model are never direct products of the underground reactions (Supplementary Dataset S1). These results suggest that underground reactions directly involved in the production of the target compound have a higher impact on the yield. Future studies are needed to test the generality of this hypothesis.
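For orientation, the search for the fewest underground reactions needed to reach a given production level can be written as a generic MILP. The formulation below is only an illustrative sketch and not necessarily the exact one described in Materials and Methods. All symbols are assumptions introduced here: S is the stoichiometric matrix of the native network extended with underground reactions, v is the flux vector with lower and upper bounds l and u, U is the set of underground reactions, y_j is a binary indicator that switches underground reaction j on or off, and θ is a required production flux of the target compound (for example, a value set above the native optimum).

```latex
\begin{aligned}
\min_{v,\,y}\quad & \sum_{j \in U} y_j \\
\text{s.t.}\quad  & S\,v = 0, \\
                  & l_i \le v_i \le u_i, && i \notin U, \\
                  & l_j\,y_j \le v_j \le u_j\,y_j, && j \in U, \\
                  & v_{\mathrm{target}} \ge \theta, \\
                  & y_j \in \{0, 1\}, && j \in U.
\end{aligned}
```

Setting y_j = 0 forces the flux through underground reaction j to zero, so the objective value equals the number of underground reactions that must be active to sustain the required production.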
3.4 Empirical support for the biotechnological application of underground reactions
A literature survey provides empirical support for several of our computational predictions. First, consistent with the predictions, overexpressing gldA is a crucial part of the experimental strain design for (S)-propane-1,2-diol production (Clomburg and Gonzalez, 2011). The gene gldA encodes an endogenous aldehyde reductase with a primary role in the removal of dihydroxyacetone by converting it to glycerol (Subedi ). However, the enzyme has also been shown to catalyze the conversion of methylglyoxal to lactaldehyde, a precursor of (S)-propane-1,2-diol (Clomburg and Gonzalez, 2011). Our method successfully identified this reaction as the underground reaction necessary to enhance (S)-propane-1,2-diol production. Furthermore, in line with our simulations, experiments show that production of (S)-propane-1,2-diol is more efficient on glycerol than on glucose as the carbon source (Clomburg and Gonzalez, 2011) (see Supplementary Dataset S1). Another example is the production of ethylene glycol, where a previous experimental implementation utilized the same underground activity and gene (fucO) as predicted here. Specifically, a key step in engineering a novel pathway for ethylene glycol production relied on the underground activity of FucO, which converts glycolaldehyde to ethylene glycol (Pereira ).
1-Propanol emerged as one of the compounds that cannot be produced by native E. coli metabolism, but can be synthesized by simultaneously adding two underground reactions to the network (Fig. 4B). 1-Propanol is used as a solvent during production of cosmetics, and it is also used to manufacture cellulose-based plastics (Gonzalez-Garcia ). Moreover, 1-propanol is a promising alternative biofuel to bioethanol considering its higher energy density (Jun Choi ). Multiple pathways have been engineered to produce 1-propanol, and one of them includes the two underground reactions predicted by our method (Jun Choi ) (Fig. 4B). In particular, the study of Jun Choi demonstrates the experimental feasibility of the predicted underground pathway converting propionyl-CoA into propionyl-aldehyde and further to 1-propanol. Note, however, that the associated gene (adhE) in this experimental study differs from those predicted by our analysis (mhpF and yqhD), indicating that the underground network reconstruction is far from complete.
Overall, despite the sparse usage of enzyme side activities in metabolic engineering (Pontrelli ), these examples provide support for our simulations.
3.5 Underground reactions show a similar potential to produce value-added compounds as heterologous reactions
To systematically assess the utility of underground pathways in the production of value-added compounds, we next compared their ability to increase production yield to that of heterologous pathways from other species, which are commonly employed in strain development. To this end, we first compiled a dataset of heterologous enzymatic reactions that are absent from E. coli, but have been described in other organisms according to the MetaCyc database (see Section 2). This resulted in 5686 distinct heterologous reactions that were added to the native E. coli model.
To compare the production potentials of underground and heterologous reactions while controlling for the different sizes of these two sets, we generated random sets of heterologous reactions having the same size as the total number of underground reactions. Next, we calculated how frequently these subsets of ‘heterologous’ reactions facilitate the production of value-added compounds compared to the model containing underground reactions. Note that out of the 160 compiled value-added metabolites, here we evaluated all 110 that participate in either a heterologous or a native reaction (Section 2). We found that, on average, random heterologous reaction sets increase the production of 6.1 target compounds. Statistical comparison to the nine cases where the underground reactions increased the production yield of a target compound revealed no significant difference (randomization test; P = 0.168; Fig. 5A). Furthermore, underground reactions can produce a similar number of novel compounds that cannot be produced by the native network (4), compared to heterologous reaction sets (2.4 on average; randomization test; P = 0.238; Fig. 5B). Together, these results show that underground reactions have a comparable potential to contribute to the production of value-added compounds as heterologous reactions.
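The size-matched randomization test described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' script: the function names are hypothetical, the `score` callable stands in for the yield simulation (which would require a metabolic model and is not implemented here), and the P-value is assumed to be the one-sided share of random sets that score at least as high as the underground set.

```python
import random
from typing import Callable, List, Sequence, Tuple


def sample_reaction_sets(pool: Sequence, set_size: int, n_sets: int,
                         seed: int = 0) -> List[list]:
    """Draw n_sets random subsets of set_size reactions, each without replacement."""
    rng = random.Random(seed)
    return [rng.sample(pool, set_size) for _ in range(n_sets)]


def randomization_p_value(observed: float, null_values: Sequence[float]) -> float:
    """Assumed one-sided empirical P: share of random sets scoring >= observed."""
    if not null_values:
        raise ValueError("null_values must not be empty")
    return sum(1 for x in null_values if x >= observed) / len(null_values)


def compare_to_random_sets(observed: float, pool: Sequence, set_size: int,
                           n_sets: int, score: Callable[[list], float],
                           seed: int = 0) -> Tuple[float, List[float]]:
    """Score every random set with the (hypothetical) score function and
    return the empirical P-value together with the null distribution."""
    null_values = [score(s) for s in sample_reaction_sets(pool, set_size, n_sets, seed)]
    return randomization_p_value(observed, null_values), null_values
```

In the setting of this section, `pool` would hold the 5686 heterologous reaction identifiers, `set_size` would be 543, `n_sets` would be 1000, `observed` would be the count obtained with the underground reactions (nine or four), and `score` would wrap a simulation that counts the target compounds improved by a given reaction set.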
Fig. 5.
Underground reactions show similar potential to produce value-added compounds as heterologous reactions. The figure compares the impact of adding the set of 543 underground reactions to the native network (vertical line) with that of adding the same number of randomly selected heterologous reactions (histograms). (A) Distribution of the number of target compounds where sufficient yield increment was attained across 1000 instances of randomly selected heterologous reaction sets. Underground reactions show comparable yield increment (nine cases, P = 0.168, randomization test). (B) Distribution of the number of non-native target compounds which cannot be produced by the native metabolic network. There is no significant difference compared to underground reactions (four cases, P = 0.238, randomization test). Vertical line represents the production properties of the underground metabolic network on both plots
Last, we asked whether a combined strategy using both underground and heterologous activities would improve the production of specific compounds to extents that cannot be achieved by either reaction repertoire alone (i.e. synergistic effects). To examine this, we calculated the yield increment using a model containing all underground and heterologous reactions simultaneously. Notably, we identified four value-added compounds that showed an enhanced production yield only when both underground and heterologous reactions were made available for the calculations (see Supplementary Dataset S1). This result shows that the combined use of underground and heterologous enzyme activities can further extend the biotechnological potential of an organism.
4 Discussion
By extending the native metabolic network of E. coli with underground reactions, we present a framework to further broaden metabolic engineering applications for the production of value-added compounds (Pontrelli ). Our genome-scale computational analysis gave several new insights. First, we show that underground metabolism can enhance the production yield of numerous industrially relevant compounds and even enable the production of new compounds that cannot be produced by the native metabolic network.
Second, we also demonstrate that production of a given value-added compound through underground metabolism often hinges on only one or a few underground reactions. This implies that it would be sufficient to engineer only a few enzyme side activities for any given target compound. Engineering a small number of enzymatic steps would likely benefit a successful metabolic engineering strategy, especially since it concerns weak side activities of the enzymes. There are several ways to translate our predictions toward the production of engineered strains in vivo, which is clearly the next major step. First, mutations that increase the underground activity could be engineered via various genome editing techniques, such as CRISPR-Cas and MAGE-based methods (Csörgő ; Jakočiūnas ; Wang ). Second, adaptive laboratory evolution has been shown to be successful in increasing underground activities (Guzmán ; Pontrelli ), and third, a combination of editing and evolution could be applied (Pontrelli ; Wannier ).
Perhaps the most important new insight provided by our study is that underground and heterologous reactions have similar theoretical potentials to contribute to the production of value-added compounds. Thus, underground enzyme activities may provide a complementary source of biochemical reactions for overproduction purposes. 
This is a notable result, because the use of heterologous genes might be unfavorable for applications that demand a GMO-free status, such as food fermentation products. In contrast, the use of underground reactions coupled with adaptive evolution does not involve the introduction of specific DNA from other organisms. As such, the latter approach may contribute to applications that go beyond GMO. Moreover, in a preliminary analysis we found several cases where specific underground and heterologous activities are jointly required to improve the production yield of value-added compounds. This result suggests that combining the two reaction repertoires can be advantageous for industrial applications. Clearly, future work is needed to fully explore this possibility. Last, we expect that underground activities might have a limited potential compared to heterologous enzyme activities in one particular area of application: many industrially relevant metabolites are produced through secondary metabolism in plants, and these are unlikely to be producible by side activities of microbial host enzymes.
We note that our estimate of the industrial potential of underground reactions might be distorted by at least two phenomena. On the one hand, the applied modeling framework might overestimate the contribution of underground reactions to new pathways for several reasons. First, we used a simple FBA framework that ignores thermodynamic realizability and therefore some of the predicted pathways might be unrealistic under physiologically relevant metabolite concentrations. Genome-scale modeling methods that incorporate thermodynamics as well as enzyme constraints through kinetics could address this shortcoming (Hoppe ; Salvy ; Sánchez ). Recent frameworks allow the incorporation of enzyme kinetics data as well as quantitative omics data into models, which could result in the prediction of biologically relevant phenotypes through FBA (Filippo ; Sánchez ). 
Second, underground enzyme activities should not interfere with the native metabolic network structure to become biochemically and physiologically functional. As such, it might be challenging to enhance underground activities in vivo, especially if extensive protein engineering is needed (Porokhin ). Third, the FBA-based modeling framework might be over-optimistic, and in reality multiple genetic modifications might be needed to achieve the desired production. For example, a single underground activity associated with the fucO gene is sufficient for efficient ethylene glycol production in silico. However, in addition to fucO overexpression, three other native and two heterologous enzymes had to be overexpressed and a gene deleted to achieve a high-flux pathway from glucose to ethylene glycol in vivo (Pereira ). Similarly, to produce (S)-propane-1,2-diol, a previous work overexpressed three genes, replaced an enzyme with a more efficient heterologous enzyme and disrupted pathways that divert flux from the synthesis (Clomburg and Gonzalez, 2011). In contrast, the addition of a single underground activity is sufficient to increase the predicted yield in our simulations. Therefore, hits from our simulations may need to be complemented with further pathway improvements, including deletion of genes that divert the flux from biosynthesis and modifications that alter the redox balance. Nevertheless, our method gives suggestions that, after further refinement, open new ways to produce important chemicals. On the other hand, our knowledge of underground activities is still rudimentary and there might be orders of magnitude more side activities than currently known that could potentially be recruited for new pathways. Notably, there are various promising recent reports on predicting metabolic (side) reactions from cheminformatics, enzyme structures and machine learning (Amin ; Carbonell ; Carbonell and Faulon, 2010; Koch ; Mou ; Robinson ). 
We therefore anticipate that future advances in machine learning and cheminformatics will further expand the known ‘metabolic reaction space’ of species, i.e. the total number of metabolic reactions that could potentially be active in a species (Hafner ; Tyzack ). This ‘space’ could be exploited for the production of value-added compounds (Campodonico ; Carbonell ).
The present study focused on underground reactions as raw materials for building new pathways; however, promiscuous activities of enzymes in the existing metabolic network may also negatively affect the production of target compounds (Kim and Copley, 2012). For example, it has been shown that promiscuous phosphatase activities redirect flux from a heterologous terpenoid biosynthetic pathway, hence decreasing its efficiency (Wang ). A more complete knowledge of the repertoire of underground reactions, including those catalyzed by native and heterologous enzymes alike, would therefore be instrumental to avoid network disruptions arising from enzyme promiscuity (Porokhin ).
Our research focuses on public information on value-added compounds used in industry, but this is likely an underestimate of the total complement of compounds of interest. Therefore, our approach could easily be adapted to the needs of industry to incorporate further compounds of interest. Moreover, we report that changes in the exact nutrient environment do not alter the predictions substantially, since the value-added compounds are produced from central metabolism. This may, however, change once the underground metabolic network is further extended in the future, as well as when additional value-added compounds are added to the network. Hence, our approach may also predict new industrially relevant, cost-reducing environments that give similar or even increased product yields. 
Our results pave the way for exploiting underground metabolism to produce novel strains and we anticipate that a growing interest in underground metabolism will go together with a rise of biotechnological applications.
Authors: Yvan Cam; Ceren Alkim; Debora Trichez; Vincent Trebosc; Amélie Vax; François Bartolo; Philippe Besse; Jean Marie François; Thomas Walther Journal: ACS Synth Biol Date: 2015-07-24 Impact factor: 5.110
Authors: Miguel A Campodonico; Barbara A Andrews; Juan A Asenjo; Bernhard O Palsson; Adam M Feist Journal: Metab Eng Date: 2014-07-28 Impact factor: 9.783
Authors: Jeffrey D Orth; Tom M Conrad; Jessica Na; Joshua A Lerman; Hojung Nam; Adam M Feist; Bernhard Ø Palsson Journal: Mol Syst Biol Date: 2011-10-11 Impact factor: 11.429
Authors: Gabriela I Guzmán; Troy E Sandberg; Ryan A LaCroix; Ákos Nyerges; Henrietta Papp; Markus de Raad; Zachary A King; Ying Hefner; Trent R Northen; Richard A Notebaart; Csaba Pál; Bernhard O Palsson; Balázs Papp; Adam M Feist Journal: Mol Syst Biol Date: 2019-04-08 Impact factor: 11.429