Genome-Scale Metabolic Network Models of Bacillus Species Suggest that Model Improvement is Necessary for Biotechnological Applications.

Tahereh Ghasemi-Kahrizsangi¹, Sayed-Amir Marashi¹, Zhaleh Hosseini¹.

Abstract

BACKGROUND: A genome-scale metabolic network model (GEM) is a mathematical representation of an organism's metabolism. Today, GEMs are popular tools for computationally simulating the biotechnological processes and for predicting biochemical properties of (engineered) strains.
OBJECTIVES: In the present study, we have evaluated the predictive power of two GEMs, namely iBsu1103 (for Bacillus subtilis 168) and iMZ1055 (for Bacillus megaterium WSH002).
MATERIALS AND METHODS: To compare the predictive power of the Bacillus subtilis and Bacillus megaterium GEMs, experimental data were obtained from previous wet-lab studies indexed in PubMed. Using these data, we set the environmental, stoichiometric and thermodynamic constraints on the models and performed flux balance analysis (FBA) to predict the biomass production rate and the values of other fluxes. The COBRA toolbox was used to simulate the experimental conditions.
RESULTS: By using the wealth of data in the literature, we evaluated the accuracy of in silico simulations of these GEMs. Our results suggest that there are some errors in these two models which make them unreliable for predicting the biochemical capabilities of these species. The inconsistencies between experimental and computational data are even greater where B. subtilis and B. megaterium do not have similar phenotypes.
CONCLUSIONS: Our analysis suggests that literature-based improvement of genome-scale metabolic network models of the two Bacillus species is essential if these models are to be successfully applied in biotechnology and metabolic engineering.

Keywords: Bacillus Species; Biochemical capability; Computational biotechnology; Model validation; Systems biology

Year: 2018 PMID: 31457023 PMCID: PMC6697824 DOI: 10.15171/ijb.1684

Source DB: PubMed Journal: Iran J Biotechnol ISSN: 1728-3043 Impact factor: 1.671

1. Background

Advances in genome sequencing in recent decades have made it possible to reconstruct genome-scale metabolic network models (GEMs) for many organisms (1–6). Over recent years, the number of available metabolic networks has increased significantly across different taxa (7–13). Nowadays, analysis of GEMs plays an indispensable role in metabolic engineering (14, 15). The process of GEM reconstruction comprises four fundamental steps (6, 16): automated omics-based reconstruction of a draft network (mainly genomics-based, i.e., using the sequenced genome of the organism) with toolboxes such as the COBRA toolbox (17) or the RAVEN toolbox (18); curation of the draft reconstruction; conversion of the network to a computational model; and, finally, evaluation of the correctness of the model using experimental data. In the process of GEM reconstruction, metabolic data of an organism are collected and converted into a machine-readable format, which in turn is converted to a mathematical constraint-based model. In fact, GEMs can be seen as the mathematical representation of metabolic processes. In the Materials and Methods section, we briefly explain the mathematical framework used in this study. In spite of the great advances in the reconstruction of GEMs (19), current models may not be completely successful in reproducing experimental data. Such inconsistencies may occur due to incorrect or incomplete annotations, missing reactions and pathways, incorrect or missing regulatory constraints, and inaccurate formulation of the biomass reaction (14, 20). Because of these potential deficiencies, it is always necessary to validate a GEM to ensure that it can predict the behavior of the organism (21). In a recent comparative study, we assessed the modeling capabilities of GEMs of three Pseudomonas species, namely P. aeruginosa, P. putida and P. fluorescens (22). Using previously published biochemical data, we showed that the GEMs of P. aeruginosa and P. putida are much more accurate than the P. fluorescens GEM.

2. Objectives

In the present study, we follow a similar idea. Two Bacillus species, namely B. subtilis and B. megaterium, are chosen for this analysis. The goal of this work is to compare the computationally modeled biochemical capabilities of these species with their in vivo biochemical capabilities.

3. Materials and Methods

3.1. Comprehensive Literature Searching for Experimental Data

To compare the predictive power of the GEMs, experimental data were obtained from previous wet-lab studies. For this purpose, we considered all articles in the PubMed database that contain the names of the two species of interest. By June 2014, searching for “Bacillus subtilis” AND “Bacillus megaterium” in PubMed returned 610 articles. These articles were carefully examined to check whether they were appropriate for evaluating the in silico experiments. Two criteria were used for selecting articles. Firstly, only those articles which include data on metabolic activities of the two species were chosen. Secondly, we considered only those in vivo experiments which are related to metabolic activities present in both GEMs. For example, different growth rates in the same growth medium, or the ability of the bacteria to consume or produce a specific substance, are appropriate results for evaluating simulations. Metabolic engineering studies of B. subtilis and B. megaterium would have been ideal for evaluating simulations. However, we could not find cases in which both species were engineered in the same study.

3.2. Genome-Scale Metabolic Network Models

In the present study, two genome-scale metabolic network models were used: (i) the GEM of B. subtilis 168 (called iBsu1103) (23); and (ii) the GEM of B. megaterium WSH002 (called iMZ1055) (24).

3.3. Flux Balance Analysis of Metabolic Networks

For mathematical representation of metabolism, an m × n stoichiometric matrix (S) is used. In this matrix, rows and columns represent the system’s metabolites and reactions, respectively. Element $S_{ij}$ is the stoichiometric coefficient of metabolite i in reaction j. The fluxes of all reactions are represented in vector v of length n. Now, consider c as an m-dimensional vector of metabolite concentrations. Then, one can show that $S \cdot v = dc/dt$ holds for the metabolic network (25). For a system at steady state, no net production or consumption of metabolites is possible, which means that $dc/dt = 0$. Consequently, at steady state, the fluxes must satisfy the stoichiometric constraint $S \cdot v = 0$. In addition to the stoichiometric constraint, vector v is also limited by thermodynamic and environmental constraints. These constraints limit each flux $v_j$ between a lower bound and an upper bound, in the form $a_j \le v_j \le b_j$. In the special case of an irreversible reaction j, the flux through the reaction is limited by the thermodynamic constraint $0 \le v_j$. Flux balance analysis (FBA) (25) is a computational technique based on linear programming. The aim of FBA is to find the optimal value of an objective function (typically the biomass production rate) subject to stoichiometric, thermodynamic and environmental constraints. For this purpose, the stoichiometric and thermodynamic constraints are extracted from a GEM, while the environmental constraints are defined according to the growth medium (see Section 3.4).
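
The FBA problem described in this section can be written compactly as a linear program. The following is a sketch rather than the authors' exact formulation: the objective weight vector $w$ and the bound symbols $a_j$ and $b_j$ (lower and upper bound of reaction j) are notation assumed here for illustration, and the letter $c$ is avoided for the objective because it already denotes the concentration vector.

```latex
\begin{aligned}
\max_{v \in \mathbb{R}^{n}} \quad & z = w^{\mathsf{T}} v \\
\text{subject to} \quad & S\,v = 0, \\
& a_j \le v_j \le b_j, \qquad j = 1, \dots, n .
\end{aligned}
```

For an irreversible reaction, $a_j = 0$. When the biomass production rate is the objective, $w$ has a single nonzero entry at the position of the biomass reaction, so $z$ is simply the flux through that reaction.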

3.4. COBRA Toolbox for in Silico Analyses

For simulating the experimental conditions in this study, the COBRA toolbox was used (17). This toolbox contains functions for performing a variety of in silico metabolic network analyses, including FBA. For simulating a specific experiment, the in vivo growth medium conditions were applied to the models. For example, the uptake rates of all metabolites that were absent from the medium were set to zero, while the uptake rates of the constituents of the medium were constrained between zero and an upper bound value. After setting the environmental, stoichiometric and thermodynamic constraints, FBA was performed to predict the biomass production rate and the values of other fluxes. Similar to some previous studies (26, 27), the most frequently used functions in this study are explained below:
• “changeRxnBounds” modifies the lower or upper bound of a reaction. Using this function, environmental constraints can be simulated.
• “addReaction” adds a reaction to the metabolic model. It was used in cases where a required reaction was not included in the models. The input of this function is the chemical equation of the reaction.
• “optimizeCbModel” performs FBA. It was used, after applying the desired conditions to the models, to predict the growth phenotypes in a given growth medium.
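
The way a growth medium is translated into environmental constraints can be sketched in a few lines of Python. This is a toy illustration, not the COBRA toolbox itself: a plain dictionary of exchange-reaction bounds stands in for a GEM, the reaction identifiers are hypothetical, and the helper `change_bounds` only mimics the role that “changeRxnBounds” plays in the text. The sketch assumes the common convention that uptake is a negative flux through an exchange reaction, so allowing uptake means lowering the lower bound below zero.

```python
# Toy stand-in for a model: exchange reaction id -> (lower bound, upper bound).
# Reaction ids are hypothetical and chosen only for illustration.

def close_all_uptakes(bounds):
    """Return a copy in which no exchange reaction can import anything (lb = 0)."""
    return {rxn: (0.0, ub) for rxn, (lb, ub) in bounds.items()}


def change_bounds(bounds, rxn, value, bound_type):
    """Set the lower ('l'), upper ('u') or both ('b') bounds of one reaction."""
    if bound_type not in ("l", "u", "b"):
        raise ValueError("bound_type must be 'l', 'u' or 'b'")
    lb, ub = bounds[rxn]
    if bound_type in ("l", "b"):
        lb = value
    if bound_type in ("u", "b"):
        ub = value
    if lb > ub:
        raise ValueError(f"lower bound exceeds upper bound for {rxn}")
    updated = dict(bounds)  # leave the caller's dictionary untouched
    updated[rxn] = (lb, ub)
    return updated


def define_medium(bounds, medium, max_uptake=10.0):
    """Close every uptake, then allow uptake of the medium constituents only."""
    constrained = close_all_uptakes(bounds)
    for rxn in medium:
        constrained = change_bounds(constrained, rxn, -max_uptake, "l")
    return constrained


exchange_bounds = {
    "EX_glucose": (-10.0, 1000.0),
    "EX_nitrate": (-10.0, 1000.0),
    "EX_phosphate": (-10.0, 1000.0),
    "EX_alanine": (0.0, 1000.0),
}

# Alanine as the sole carbon and nitrogen source, plus one mineral salt.
medium_bounds = define_medium(exchange_bounds, ["EX_phosphate", "EX_alanine"])
for rxn, (lb, ub) in sorted(medium_bounds.items()):
    print(rxn, lb, ub)
```

In an actual analysis, the constrained model would then be passed to an FBA solver (“optimizeCbModel” in the COBRA toolbox) to obtain the predicted biomass production rate.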

4. Results

4.1. Utilization of Amino Acids as the Sole Source of Carbon and Nitrogen

Consuming single L-amino acids as the only source of carbon and nitrogen is a common phenomenon amongst prokaryotes and occurs in most genera of bacteria (28–30). Metabolism of amino acids by bacteria has been widely studied. For example, in a comprehensive study, twenty taxonomically known bacteria (including B. subtilis and B. megaterium) that can utilize amino acids were examined for their growth capabilities on amino acids (31). In that study, utilization of twenty amino acids was examined (among which the utilization pattern of ten amino acids was found to differ between B. subtilis and B. megaterium). In each experiment, Lochhead-Chase basal medium (32) was used, in which glucose and nitrate were replaced by a single amino acid. This medium also contains mineral salts. Since the exact constituents of this medium could not be determined, we performed the in silico simulations in mineral salt medium (MSM), which contains the following salts: K2HPO4, KH2PO4, (NH4)2SO4, MgCl2, CaCl2, H3BO3, ZnSO4, NiSO4, (NH4)6Mo7O24.4H2O, CuSO4.5H2O, MnSO4, CoCl2, and FeCl3. As our goal is to simulate the amino acid as the single source of carbon and nitrogen, we did not include NH4 in our simulated media. To simulate this experiment, we defined this medium in the models by setting the lower bound of the uptake reaction of each required ion to −10 millimoles per gram dry weight per hour (mmol.gDW⁻¹.hr⁻¹). Additionally, the lower bounds of the glucose and nitrate uptake reactions were set to zero. In each simulation, we set the lower bound of the uptake reaction of one amino acid to −10 mmol.gDW⁻¹.hr⁻¹. In this way, we defined each amino acid as the single source of carbon and nitrogen. Uptake reactions of hydroxyproline and cystine were not included in the two GEMs, and therefore FBA was not carried out for these amino acids.
In Table 1, in silico growth results are compared to the experimental in vivo data for the amino acids that are consumed differently by the two species. In this table and also in Table 2, the terms “good growth” and “poor growth” refer to the experimental data. We have also differentiated between these two groups in the calculation of the Kendall rank coefficients reported below. These data show differences in in vivo biomass production between B. megaterium and B. subtilis, while, in many cases, the in silico simulations fail to predict such differences. In other words, there is no significant correlation between experimental and computational results (Kendall rank coefficient of 0.56 (p-value = 0.07) and −0.38 (p-value = 0.24) for B. subtilis and B. megaterium, respectively). In the same study (31), utilization patterns of ten amino acids were found to be similar in B. subtilis and B. megaterium. In silico and in vivo growth rates for these amino acids are compared in Table 2. Here, unlike Table 1, the two GEMs successfully predicted the in vivo results in most cases, including the utilization of alanine, leucine, cysteine, glutamic acid, phenylalanine, tyrosine and proline (Kendall rank coefficient of 0.62 for both B. subtilis and B. megaterium, p-value < 0.05). These results suggest that, in the case of similar phenotypes, the responsible reactions/pathways are also included in both GEMs.

Table 2.

In silico and in vivo data of amino acid utilization with similar consumption patterns in B. megaterium and B. subtilis. Relative cell growth estimated visually from the amount of growth on the amino acid. ++: Good growth; +: Poor growth; −: No growth.

Amino acid	B. subtilis, in vivo	B. subtilis, in silico (mmol.gDW⁻¹.hr⁻¹)	B. megaterium, in vivo	B. megaterium, in silico (mmol.gDW⁻¹.hr⁻¹)
Glycine	−	0.1055	−	0.2180
Alanine	++	0.3670	++	0.4405
Leucine	−	0	−	0
Cysteine	−	0	−	0
Methionine	+	0	+	0
Glutamic acid	++	0.6007	++	0.7175
Lysine	+	0	+	0
Phenylalanine	−	0	−	0
Tyrosine	−	0	−	0
Proline	++	0.7206	++	0.8303
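
The Kendall rank coefficients quoted in the text can be approximated with a short pure-Python sketch. The paper does not state which variant of the coefficient or which numeric encoding of the growth scores was used, so two assumptions are made here: the tie-corrected tau-b variant, and the encoding − = 0, + = 1, ++ = 2. The data are the B. subtilis columns of Table 2; no p-value is computed.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall rank correlation, tau-b variant (corrects for ties in x and y)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    concordant = discordant = tied_x = tied_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0:
            tied_x += 1
        if dy == 0:
            tied_y += 1
        if dx * dy > 0:
            concordant += 1
        elif dx * dy < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    denominator = sqrt((n_pairs - tied_x) * (n_pairs - tied_y))
    if denominator == 0:
        return float("nan")  # undefined when one variable is constant
    return (concordant - discordant) / denominator


# Assumed encoding of the visual growth scores (ASCII "-" stands for "−").
SCORE = {"-": 0, "+": 1, "++": 2}

# B. subtilis columns of Table 2: (amino acid, in vivo score, in silico value).
table2_b_subtilis = [
    ("Glycine", "-", 0.1055),
    ("Alanine", "++", 0.3670),
    ("Leucine", "-", 0.0),
    ("Cysteine", "-", 0.0),
    ("Methionine", "+", 0.0),
    ("Glutamic acid", "++", 0.6007),
    ("Lysine", "+", 0.0),
    ("Phenylalanine", "-", 0.0),
    ("Tyrosine", "-", 0.0),
    ("Proline", "++", 0.7206),
]

in_vivo = [SCORE[score] for _, score, _ in table2_b_subtilis]
in_silico = [value for _, _, value in table2_b_subtilis]

tau = kendall_tau_b(in_vivo, in_silico)
print(round(tau, 2))  # 0.62
```

Under these assumptions the sketch gives a coefficient of about 0.62 for B. subtilis, in line with the value reported in the text for the amino acids with similar utilization patterns.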

Table 1.

In silico and in vivo data of amino acid utilization with differential consumption patterns in B. megaterium and B. subtilis. Relative cell growth estimated visually from the amount of growth on the amino acid. ++: Good growth; +: Poor growth; −: No growth.

Amino acid	B. subtilis, in vivo	B. subtilis, in silico (mmol.gDW⁻¹.hr⁻¹)	B. megaterium, in vivo	B. megaterium, in silico (mmol.gDW⁻¹.hr⁻¹)
Valine	++	0.6608	−	0.8423
Isoleucine	++	0.8820	−	1.0602
Serine	+	0.3056	−	0.3807
Threonine	+	0.4520	−	0.4774
Aspartic acid	−	0.3862	++	0.4676
Arginine	++	0.7423	−	0.8841
Histidine	+	0.6772	++	0
Tryptophan	+	0	−	0

4.2. Carbohydrate Fermentation Capability

A recent analysis studied the diversity of a chlorine-resistant Bacillus population isolated from a wastewater treatment station (33). This study investigated the phenotypic and genotypic diversity of this bacterial population. Based on 16S rRNA gene sequencing and biochemical tests, 12 strains were identified. Similarity searches on GenBank showed that five of these strains were Bacillus subtilis, while one strain was identified as Bacillus megaterium (33). One of the biochemical tests done in this experiment was the carbohydrate fermentation test, which investigates acid and gas production during carbohydrate utilization in bacteria. The growth medium used in this test includes NH4H2PO4, KCl, MgSO4, bromothymol blue and the carbohydrate of interest. For simulating this experiment, we assumed that a strain capable of growing on a carbohydrate source may or may not be able to produce acid (i.e., to ferment the carbohydrate). This means that a positive in silico result can correspond to either a negative or a positive result in the in vivo experiment. However, if the strain cannot use the carbohydrate as a source of carbon and energy, the result of its fermentation test must necessarily be negative. For modeling fermentation, in each simulation we set one carbohydrate as the sole source of carbon/energy, and then FBA was performed. The results are shown in Table 3. Interestingly, for the first three carbohydrates (D-glucose, D-mannitol, and L-arabinose), the results of simulation are not inconsistent with the experimental data. In the same study (33), the ability of these bacteria to utilize citrate as the single source of carbon was examined. According to the experimental data, neither of the two bacteria was able to grow on citrate. However, the in silico growth phenotypes of B. megaterium and B. subtilis were inconsistent with the in vivo data, i.e., both strains could grow on citrate in silico.

4.3. Effect of Formaldehyde on the Growth of Bacillus Species

Formaldehyde is an antimicrobial compound used as a disinfectant against microbial vegetative cells and spores. This compound can inactivate bacteria, fungi, yeasts and molds. In one study, the minimal inhibitory effect of formaldehyde against several bacteria, including Bacillus subtilis and Bacillus megaterium, was measured (34). According to this article, Bacillus megaterium was more resistant to formaldehyde than Bacillus subtilis. Formaldehyde resistance in these two closely related species is presumably due to the detoxification mechanisms embedded in their metabolism (35, 36). For simulating formaldehyde resistance, a spontaneous uptake reaction for formaldehyde was added to the two GEMs. Then, in each model, we increased the formaldehyde uptake flux in a stepwise manner until the biomass production rate became zero or the formaldehyde uptake rate reached its upper limit. The results for Bacillus subtilis and Bacillus megaterium are shown in separate graphs. As shown in the graphs, the biomass production rates do not decrease with increasing formaldehyde uptake flux; on the contrary, they increase in both models. In the case of B. subtilis, the biomass production rate reaches its maximum value (i.e., 0.295 mmol.gDW⁻¹.hr⁻¹) when formaldehyde uptake becomes 15.8 mmol.gDW⁻¹.hr⁻¹ or higher. In Bacillus megaterium, however, the biomass production rate keeps increasing as the formaldehyde uptake flux is raised up to 1000 mmol.gDW⁻¹.hr⁻¹.

5. Discussion

In this work, we showed that in silico simulations fail to predict the differences in the amino acid metabolism of B. subtilis and B. megaterium. In other words, the two GEMs cannot accurately show the differences in amino acid utilization phenotypes. This might be due to differences between the strains used for the in vivo experiments and those used for GEM reconstruction. However, such discrepancies might also reflect the poor quality of the two GEMs. The cases of five amino acids, namely valine, isoleucine, serine, threonine and arginine, should be studied in more detail. For these amino acids, the B. megaterium GEM incorrectly predicts growth, while the in silico and in vivo data of B. subtilis are consistent. One possible explanation for this inconsistency is that the B. megaterium strain used for GEM reconstruction has many more metabolic capabilities than the B. megaterium strain used in the in vivo studies. However, a stronger possibility is that the B. megaterium GEM includes additional reactions which are not, in reality, part of its metabolism. Note that the B. subtilis GEM had been used as a reference during the reconstruction of the B. megaterium GEM (24). Therefore, some reactions/pathways which are present in B. subtilis but are actually absent from the metabolism of B. megaterium might have been wrongly included in its GEM. Such reactions were either needed for filling the gaps of the model and making it functional, or are present in the model because of a mistake in the reconstruction method. Such pathways can be responsible for the observed false positive growth phenotypes. Based on Table 3, we observe that in the case of glucose, mannitol and arabinose, the two GEMs show the capability to use these compounds as carbon sources. Consequently, one cannot draw any conclusion on the correctness of the predictions based on the fermentation studies. Furthermore, in the case of growth on citrate, the in silico growth phenotypes of B. megaterium and B. subtilis were inconsistent with the in vivo data, i.e., both strains could grow on citrate in silico. Again, these inconsistencies might be due either to differences between the strains used for the in vivo experiments and the in silico simulations or to the poor quality of the GEMs.

Table 3.

In silico growth phenotypes vs. in vivo fermentation phenotypes. +: Positive result; −: Negative result; −/+: Both results have been observed. It should be noted that in case of d-glucose, d-mannitol and l-arabinose, in silico data show biomass production from the carbon source, while in vivo data suggest acid production as a result of fermentation. Therefore, inconsistency is guaranteed only when growth is not reported in silico but acid production is reported in vivo. In case of using citrate as the carbon source, in silico results of both GEMs showed inconsistency with in vivo results.

Carbohydrate	B. subtilis, in silico	B. subtilis, in vivo	B. subtilis, consistent?	B. megaterium, in silico	B. megaterium, in vivo	B. megaterium, consistent?
D-Glucose	+	−/+	yes	+	+	yes
D-Mannitol	+	−/+	yes	+	−	yes
L-Arabinose	+	−	yes	+	+	yes
Citrate as carbon source	+	−	no	+	−	no
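
The rule behind the "consistent?" entries of Table 3 can be written down explicitly. The following Python sketch encodes the reasoning given in the caption and reproduces the table. The function name, the string encoding of the results and the handling of a mixed "-/+" observation when no growth is predicted (treated here as inconsistent) are assumptions made for illustration; the paper does not discuss that last case.

```python
def is_consistent(in_silico_growth, in_vivo_result, test_type):
    """Decide whether an in silico growth prediction agrees with an in vivo test.

    in_silico_growth: True if FBA predicts biomass production on the substrate.
    in_vivo_result:   "+", "-" or "-/+" (ASCII "-" stands for the table's "−").
    test_type:        "fermentation" (acid production) or "growth".
    """
    if in_vivo_result not in ("+", "-", "-/+"):
        raise ValueError(f"unknown in vivo result: {in_vivo_result!r}")
    if test_type == "fermentation":
        # Predicted growth is compatible with any fermentation outcome.
        # Predicted no-growth is contradicted by any observed acid production.
        acid_observed = "+" in in_vivo_result
        return in_silico_growth or not acid_observed
    if test_type == "growth":
        if in_vivo_result == "-/+":
            raise ValueError("a growth test needs an unambiguous in vivo result")
        return in_silico_growth == (in_vivo_result == "+")
    raise ValueError(f"unknown test type: {test_type!r}")


# Rows of Table 3: (substrate, species, in silico growth, in vivo result, test).
table3 = [
    ("D-Glucose", "B. subtilis", True, "-/+", "fermentation"),
    ("D-Glucose", "B. megaterium", True, "+", "fermentation"),
    ("D-Mannitol", "B. subtilis", True, "-/+", "fermentation"),
    ("D-Mannitol", "B. megaterium", True, "-", "fermentation"),
    ("L-Arabinose", "B. subtilis", True, "-", "fermentation"),
    ("L-Arabinose", "B. megaterium", True, "+", "fermentation"),
    ("Citrate", "B. subtilis", True, "-", "growth"),
    ("Citrate", "B. megaterium", True, "-", "growth"),
]

results = [is_consistent(growth, vivo, test) for _, _, growth, vivo, test in table3]
for (substrate, species, *_), ok in zip(table3, results):
    print(f"{substrate:12s} {species:14s} {'yes' if ok else 'no'}")
```

Because every in silico entry for the three fermentation substrates is positive, the rule can never flag them as inconsistent, which is why the fermentation data alone cannot confirm or refute the models.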

In the formaldehyde simulations, both models failed to reflect the sensitivity of the bacteria to formaldehyde. One reason for this failure is that metabolic models cannot predict inhibitory effects. Nevertheless, the observed increase in biomass indicates that some reactions in these models must also be wrong. In other words, according to their metabolic models, these bacteria have evolved pathways for consuming formaldehyde as a source of carbon and converting it to biomass precursors. Clearly, this is not reasonable and is probably caused by extra reactions, or reactions with wrong directionalities, in these models. Consider the part of formaldehyde metabolism included in the GEM of B. subtilis. According to this pathway, formaldehyde is eventually converted to CO2, bicarbonate, and then to oxaloacetate, which is in turn used for glutamate production. Note that conversion of CO2 to oxaloacetate is not thermodynamically favorable in a cell. Glutamate is a constituent of biomass. Therefore, in the simulation, it is logical that increasing the formaldehyde flux leads to an increase in biomass. A similar problem appears in the part of formaldehyde metabolism included in the GEM of B. megaterium. In this model, formaldehyde is converted to formate, and then to CO2. The model suggests that CO2 then reacts with ammonia to produce glycine and tetrahydrofolate. Note that this reaction cannot readily occur in the cell, as it is not thermodynamically favorable. The latter compound can be converted to serine. Note that both glycine and serine are constituents of biomass. Therefore, it is logical that this model does not show the inhibitory effect of formaldehyde.

6. Conclusion

In the present paper, we compared the predictive power of two genome-scale metabolic networks. The results of such a comparison can be used to improve the accuracy of GEMs. The striking result is that, in many cases, the B. subtilis and B. megaterium GEMs fail to accurately predict the experimental data, especially where the phenotypes of the two species are different. Consequently, one can conclude that these metabolic network models contain errors that should be addressed by a gap filling process. In other words, further modifications should be made based on literature reports and experimental results before these models can be used in metabolic engineering and biotechnological applications. In the present paper, and also in the sister paper (22), we showed how the wealth of data in the literature can be used for evaluating the accuracy of a genome-scale metabolic network model. We previously showed that, as expected, consolidated reconstruction of the metabolic networks of P. aeruginosa and P. putida reduces the inconsistencies between in silico and in vivo data. On the other hand, independently reconstructed GEMs should be used with care because of their (potentially extensive) errors in predicting experimental phenotypes. Altogether, we recommend that available literature data be used for GEM evaluation. Although using such data may not be ideal (e.g., due to strain dissimilarities within the same species), it can reduce the massive time, cost and effort required for experimental evaluations (for example, see (37)). Here, we emphasize that the P. fluorescens GEM (i.e., iSB1139) (38) and the B. megaterium GEM (i.e., iMZ1055) (24) failed to accurately predict the experimental phenotypes in many cases. This is a serious concern, as these two models are among the most recently reconstructed GEMs (both were published in 2013). This observation suggests that GEM reconstruction cannot yet be considered a mature field, in spite of the use of advanced bioinformatics tools.
Still, quality and completeness of GEMs highly depend on more reliable procedures, such as extensive manual curation (39). Another important point should be noted here. Classification of bacterial species highly depends on the differences in the biochemical capabilities. It was previously suggested that the GEM reconstruction will revolutionize prokaryotic systematics (40) by characterizing the metabolic differences. One of our goals in the present study was to check whether the differences in the biochemical capabilities are predictable by the existing GEMs. However, the results are generally disappointing. With such poor results, one should not expect the existing GEMs to correctly predict the phenotype of an organism, let alone the between-species differences in the biochemical capabilities. Additionally, it was recently shown that a high degree of similarity exists among many GEMs, regardless of their phylogenetic relationships (39). Finally, it should be noted that even for a well-defined and reliable genome-scale model, different alternative optimal solutions are expected to occur (41). For all these reasons, exploiting GEMs for phylogenetic reconstruction is far from reality at the moment, and it requires enormous improvements in GEM reconstruction in the first place.

38 in total

Review 1. Metabolic modeling of microbial strains in silico.

Authors: M W Covert; C H Schilling; I Famili; J S Edwards; I I Goryanin; E Selkov; B O Palsson
Journal: Trends Biochem Sci Date: 2001-03 Impact factor: 13.807

Review 2. Metabolic modelling of microbes: the flux-balance approach.

Authors: Jeremy S Edwards; Markus Covert; Bernhard Palsson
Journal: Environ Microbiol Date: 2002-03 Impact factor: 5.491

Review 3. Thirteen years of building constraint-based in silico models of Escherichia coli.

Authors: Jennifer L Reed; Bernhard Ø Palsson
Journal: J Bacteriol Date: 2003-05 Impact factor: 3.490

4. Genome-scale metabolic model of Helicobacter pylori 26695.

Authors: Christophe H Schilling; Markus W Covert; Iman Famili; George M Church; Jeremy S Edwards; Bernhard O Palsson
Journal: J Bacteriol Date: 2002-08 Impact factor: 3.490

5. In silico genome-scale reconstruction and validation of the Staphylococcus aureus metabolic network.

Authors: Matthias Heinemann; Anne Kümmel; Reto Ruinatscha; Sven Panke
Journal: Biotechnol Bioeng Date: 2005-12-30 Impact factor: 4.530

Review 6. Genome-scale models of microbial cells: evaluating the consequences of constraints.

Authors: Nathan D Price; Jennifer L Reed; Bernhard Ø Palsson
Journal: Nat Rev Microbiol Date: 2004-11 Impact factor: 60.633

7. Global reconstruction of the human metabolic network based on genomic and bibliomic data.

Authors: Natalie C Duarte; Scott A Becker; Neema Jamshidi; Ines Thiele; Monica L Mo; Thuy D Vo; Rohith Srivas; Bernhard Ø Palsson
Journal: Proc Natl Acad Sci U S A Date: 2007-01-31 Impact factor: 11.205

8. Genome-scale reconstruction of metabolic network in Bacillus subtilis based on high-throughput phenotyping and gene essentiality data.

Authors: You-Kwan Oh; Bernhard O Palsson; Sung M Park; Christophe H Schilling; Radhakrishnan Mahadevan
Journal: J Biol Chem Date: 2007-06-15 Impact factor: 5.157

9. Bacillus subtilis yckG and yckF encode two key enzymes of the ribulose monophosphate pathway used by methylotrophs, and yckH is required for their expression.

Authors: H Yasueda; Y Kawahara; S Sugimoto
Journal: J Bacteriol Date: 1999-12 Impact factor: 3.490

10. Modeling methanogenesis with a genome-scale metabolic reconstruction of Methanosarcina barkeri.

Authors: Adam M Feist; Johannes C M Scholten; Bernhard Ø Palsson; Fred J Brockman; Trey Ideker
Journal: Mol Syst Biol Date: 2006-01-31 Impact factor: 11.429
