Wai Kit Ong1,2, Dylan K Courtney1,2, Shu Pan1,2, Ramon Bonela Andrade1, Patricia J Kiley2,3, Brian F Pfleger1,2, Jennifer L Reed1,2.
Abstract
Genome-scale metabolic models have been utilized extensively in the study and engineering of the organisms they describe. Here we present the analysis of a published dataset from pooled transposon mutant fitness experiments as an approach for improving the accuracy and gene-reaction associations of a metabolic model for Zymomonas mobilis ZM4, an industrially relevant ethanologenic organism with extremely high glycolytic flux and low biomass yield. Gene essentiality predictions made by the draft model were compared to data from individual pooled mutant experiments to identify areas of the model requiring deeper validation. Subsequent experiments showed that some of the discrepancies between the model and dataset were caused by polar effects, mis-mapped barcodes, or mutants carrying both wild-type and transposon-disrupted gene copies, highlighting potential limitations inherent to data from individual mutants in these high-throughput datasets. Therefore, we analyzed correlations in fitness scores across all 492 experiments in the dataset in the context of functionally related metabolic reaction modules identified within the model via flux coupling analysis. These correlations were used to identify candidate genes for a reaction in histidine biosynthesis lacking an annotated gene and to highlight metabolic modules with poorly correlated gene fitness scores. Additional genes for reactions involved in biotin, ubiquinone, and pyridoxine biosynthesis in Z. mobilis were identified and confirmed using mutant complementation experiments. These newly identified genes were incorporated into the final model, iZM4_478, which contains 747 metabolic and transport reactions (of which 612 have gene-protein-reaction associations), 478 genes, and 616 unique metabolites, making it one of the most complete models of Z. mobilis ZM4 to date. The methods of analysis that we applied here to the Z. mobilis transposon mutant dataset could easily be used to improve future genome-scale metabolic reconstructions for organisms where these, or similar, high-throughput datasets are available.
Year: 2020 PMID: 32804944 PMCID: PMC7451989 DOI: 10.1371/journal.pcbi.1008137
Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.475
Comparison of Z. mobilis genome-scale metabolic models.
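| Model: | iZM4_478 | ZmoMBEL601 | iZM363 | iZM411 | iEM439 |
|---|---|---|---|---|---|
| Reference: | (This Study) | [ | [ | [ | [ |
| Year of publication: | 2020 | 2010 | 2011 | 2018 | 2016 |
| Metabolic and transport reactions | 747 | 591 | 739 | 648 | 755 |
| Reactions with gene-protein-reaction associations | 612 | 498 | 414 | 507 | 593 |
| Reactions without gene-protein-reaction associations | 135 | 93 | 325 | 141 | 162 |
| | 19 | 64 | 182 | 89 | 106 |
| Genes | 478 | 353 | 363 | 360 | 439 |
| Unique metabolites | 616 | 579 | 600 | 602 | 658 |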
Presented values are based on analysis performed with the model files included with the original publications to allow for a meaningful comparison between models and may not match values reported in these publications.
Reaction counts exclude biomass and exchange reactions. Metabolic reactions further exclude transport reactions across the cellular membranes.
iEM439 is a genome-scale metabolic model for ZM1.
Fig 1. Model predictions for growth, ethanol and central metabolic fluxes.
(A) Comparison between model-predicted (solid lines) and reported experimental (data points) specific growth rates (top) and ethanol production rates (bottom) against glucose uptake rates for published anaerobic glucose minimal media experiments. Simulation-ready models for iZM363, iZM411, and iZmobMBEL601 were not available with their respective publications. (B) Plot of the fluxes predicted by FBA versus the fluxes found via metabolic flux analysis in Jacobson et al. FBA was run for anaerobic growth in minimal media with a constraint preventing lactate production, forcing flux to ethanol as the primary fermentation product. Flux variability as determined by FVA at the optimal FBA growth rate is less than the size of the markers for each reaction. Grey dashed lines represent a 2-fold change in flux in either direction; reactions falling outside of these boundaries are labeled with their reaction IDs from iZM4_478 (where ACOTA is acetylornithine transaminase, EX_ac_e is acetate exchange, FBA is fructose-bisphosphate aldolase, ORNTAC is ornithine transacetylase, P5CR is pyrroline-5-carboxylate reductase, PDH is pyruvate dehydrogenase, PFL is pyruvate formate lyase, PSERT is phosphoserine transaminase, THRAi is threonine aldolase, TKT1 is the transketolase reaction involving sedoheptulose-7-phosphate, and TPI is triose-phosphate isomerase). Reactions with zero flux in the FBA solution are plotted at 10^-4 in log space.
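The 2-fold criterion used in panel B can be illustrated with a short sketch. This is only an illustration under stated assumptions: the function name `outside_twofold`, the reaction IDs, and all flux values are hypothetical, and comparing flux magnitudes is an assumption rather than something the caption specifies. The floor of 1e-4 mirrors the caption's convention for plotting zero fluxes on a log axis.

```python
import math

FLOOR = 1e-4  # zero fluxes are placed at 1e-4 so they can be shown on a log axis


def outside_twofold(fba_flux, mfa_flux, floor=FLOOR):
    """Return True when the FBA flux differs from the MFA flux by more than 2-fold.

    Assumption: magnitudes are compared, and values below `floor` are raised to it.
    """
    predicted = max(abs(fba_flux), floor)
    measured = max(abs(mfa_flux), floor)
    return abs(math.log2(predicted / measured)) > 1.0


# Hypothetical (reaction ID, FBA flux, MFA flux) triples; the numbers are invented.
example_fluxes = [
    ("RXN_A", 10.0, 9.0),  # within 2-fold
    ("RXN_B", 0.0, 0.5),   # zero FBA flux, far below the measured value
    ("RXN_C", 3.0, 1.0),   # 3-fold higher than measured
]

labeled = [rxn for rxn, fba, mfa in example_fluxes if outside_twofold(fba, mfa)]
print(labeled)
```

Working in log2 space makes the criterion symmetric: a flux that is half the measured value is treated the same as one that is double.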
Fig 2. Comparison of model predictions to pooled fitness data.
(A) Analysis of the effect of fitness score cutoff for growth phenotype classification of the pooled growth experiments. The fraction of genes with inconsistent growth phenotypes (i.e., above the cutoff in one experiment and below in the other) is shown in blue, the fraction of false positives (model predicts growth, but categorized as no growth experimentally, i.e., GNG mutants) in orange, the fraction of false negatives (model predicts no growth, but categorized as growing experimentally, i.e., NGG mutants) in green, and the total model prediction error (the combination of false positives and false negatives) in red. The vertical dashed grey line represents the selected fitness score cutoff of -0.6, which minimizes the total error. (B) The fitness scores for genes included in iZM4_478 for the two anaerobic glucose minimal media experiments (Exp. 633 and Exp. 638) are shown as a scatter plot and histograms. The fitness cutoff used for growth classification is shown as dashed grey lines. Genes in the scatter plot are colored based on model growth predictions, with cyan being genes predicted to be non-essential and red being genes predicted to be essential. Genes in the upper-left and lower-right quadrants of the scatter plot are genes where the growth phenotypes are inconsistent between the two experiments.
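The cutoff scan described for panel A can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the gene tuples are invented, the function names are hypothetical, and dividing by the total number of genes is an assumption about how the fractions are normalized.

```python
def classify(score, cutoff):
    """Call a mutant 'growth' when its fitness score is above the cutoff."""
    return score > cutoff


def error_fractions(genes, cutoff):
    """Fractions of inconsistent, false-positive, and false-negative genes.

    genes: list of (model_predicts_growth, score_exp1, score_exp2) tuples.
    """
    inconsistent = false_pos = false_neg = 0
    for predicts_growth, s1, s2 in genes:
        g1, g2 = classify(s1, cutoff), classify(s2, cutoff)
        if g1 != g2:
            inconsistent += 1  # above the cutoff in one experiment only
        elif predicts_growth and not g1:
            false_pos += 1     # model: growth, experiment: no growth (GNG)
        elif not predicts_growth and g1:
            false_neg += 1     # model: no growth, experiment: growth (NGG)
    n = len(genes)
    return {
        "inconsistent": inconsistent / n,
        "false_pos": false_pos / n,
        "false_neg": false_neg / n,
        "total_error": (false_pos + false_neg) / n,
    }


# Hypothetical genes: (model predicts growth, fitness in exp. 1, fitness in exp. 2)
genes = [
    (True, 0.1, 0.0),
    (True, -0.2, -0.3),
    (True, -0.5, -0.4),
    (True, -3.0, -3.0),
    (False, -3.0, -2.5),
    (False, -1.0, -1.2),
    (False, -0.8, -0.9),
]
cutoffs = [-2.0, -1.5, -1.0, -0.6, -0.2, 0.0]

# Pick the cutoff with the smallest total error (first one wins on ties).
best_cutoff = min(cutoffs, key=lambda c: error_fractions(genes, c)["total_error"])
print(best_cutoff, error_fractions(genes, best_cutoff))
```

With real data, `genes` would hold one tuple per model gene using the fitness scores from the two anaerobic glucose minimal media experiments.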
Comparison of in silico predictions of single knockout mutants vs. pooled experimental results for anaerobic growth in minimal media.
| Model prediction | Experimental: Growth | Experimental: No growth | Experimental: Inconsistent | Experimental: Unavailable |
|---|---|---|---|---|
| Growth | 142 (GG) | 50 (GNG) | 25 (GI) | 16 |
| No growth | 20 (NGG) | 167 (NGNG) | 24 (NGI) | 33 |
Abbreviations included in the table correspond to the abbreviations used in the text and are defined as the intersection of the categories (e.g., GG stands for model predicts Growth, experimental results indicate Growth).
Based on experimental results from Exp. 633 and Exp. 638 in Deutschbauer et al. [26].
The growth/no growth phenotypes for these mutants are different in the two experimental datasets.
The mutant was not available in the dataset.
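As a rough reading aid, the agreement between model and experiment can be computed directly from the four classified cells of the table. The sketch below uses only the counts shown above and leaves out the inconsistent and unavailable genes; that exclusion is an assumption made for this illustration, so the result is not necessarily the figure reported in the paper.

```python
# Counts copied from the table; keys are (model prediction, experimental call).
counts = {
    ("growth", "growth"): 142,        # GG
    ("growth", "no growth"): 50,      # GNG (false positives)
    ("no growth", "growth"): 20,      # NGG (false negatives)
    ("no growth", "no growth"): 167,  # NGNG
}

classified = sum(counts.values())
agreeing = sum(n for (model, experiment), n in counts.items() if model == experiment)
accuracy = agreeing / classified
false_positive_fraction = counts[("growth", "no growth")] / classified
false_negative_fraction = counts[("no growth", "growth")] / classified

print(f"classified genes: {classified}")
print(f"agreement: {accuracy:.3f}")
print(f"false positives: {false_positive_fraction:.3f}")
print(f"false negatives: {false_negative_fraction:.3f}")
```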
Cofitness scores of select metabolic modules.
| Module No. | Average Cofitness | No. of Rxn | Coupled Reactions | No. of Genes | Relevant Mutants | Associated Pathways |
|---|---|---|---|---|---|---|
| M1 | 0.880 | 4 | ALAS_f, HMBS_f, PPBNGS_f, UPP3S_f | 4 | ZMO1198, ZMO1879, ZMO1903 | Porphyrinogen Biosynthesis |
| M2 | 0.859 | 3 | CHORS_f, PSCVT_f, SHKK_f | 3 | ZMO0594, ZMO1693, ZMO1796 | Chorismate biosynthesis |
| M3 | 0.846 | 4 | ASPCT_f, | 3 | ZMO0587, ZMO0791, ZMO1707 | Uridine biosynthesis |
| M10 | 0.746 | 10 | ATPPRT_f, HISTD_f, | 9 | ZMO0421, ZMO1178, ZMO1499, ZMO1500, ZMO1501, ZMO1502, ZMO1503, ZMO1550, ZMO1551 | Histidine biosynthesis |
| M45 | 0.497 | 5 | | 3 | ZMO1313, ZMO1684, ZMO1708 | Pyridoxine biosynthesis |
| M46 | 0.470 | 19 | AMAOTr_f, AOXSr2_f, BTS5_f, DBTS_f, | 13 | ZMO0094, ZMO0423, ZMO0425, ZMO0426, ZMO0427, ZMO1067, ZMO1146, ZMO1222, ZMO1278, ZMO1692, ZMO1915, ZMO1917, ZMO1918 | Biotin biosynthesis |
| M51 | 0.346 | 8 | | 3 | ZMO1823, ZMO1824, ZMO1825 | Nitrogen fixation |
| M52 | 0.332 | 5 | CDPMEK_f, | 4 | ZMO0180, ZMO1128, ZMO1182, ZMO1851 | Isoprenoid Precursor biosynthesis |
| M56 | 0.293 | 3 | PGCD_f, PSERT_f, PSP_L_f | 3 | ZMO1137, ZMO1684, ZMO1685 | Serine biosynthesis |
| M57 | 0.247 | 6 | ADEt2rpp_r, | 3 | ZMO0873, ZMO0874, ZMO0969 | Hopanoid biosynthesis |
| M60 | 0.119 | 11 | ADCL_f, ADCS_f, AKP1_f, DHFS_f, | 7 | ZMO0113, ZMO0114, ZMO0582, ZMO0938, ZMO1006, ZMO1229, ZMO1277 | Folate biosynthesis |
| M63 | -0.140 | 7 | AMPMS2_f, | 3 | ZMO0172, ZMO0738, ZMO1834 | Thiamine biosynthesis |
Module number shows relative position of module based on sorted average cofitness values.
Suffix "_f" represents forward component and "_r" represents reverse component of the reaction. Reactions italicized either had no available mutant in the collection, or had isozymes confounding analysis. Underlined reactions represent exchange or sink reactions missing GPRs. Bolded reactions represent reactions missing GPRs.
Mutants whose genes are associated with the reactions in the module excluding known isozymes and mutants that are absent from the Tn5 mutant collection.
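One way to read the "Average Cofitness" column is as the mean pairwise correlation of gene fitness profiles within a module. The sketch below assumes cofitness is the Pearson correlation of two genes' fitness scores across experiments. The gene names and scores are invented, and only four experiments are used for brevity where the dataset has 492.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation between two equal-length fitness profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def average_cofitness(profiles):
    """Mean pairwise correlation over all gene pairs in one module."""
    pairs = list(combinations(sorted(profiles), 2))
    return sum(pearson(profiles[a], profiles[b]) for a, b in pairs) / len(pairs)


# Hypothetical fitness scores for three genes across four experiments.
module = {
    "gene_a": [0.1, -1.0, -2.0, 0.3],
    "gene_b": [0.2, -2.0, -4.0, 0.6],  # tracks gene_a (scaled by 2)
    "gene_c": [-0.1, 1.0, 2.0, -0.3],  # mirrors gene_a
}

print(round(average_cofitness(module), 3))
```

A module whose genes respond together across conditions scores near 1, while a module containing a gene with an unrelated or opposite profile is pulled toward zero or below, which is the pattern that flags modules for closer inspection.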
Fig 3. Identification of the histidinol-phosphatase gene.
(A) An overview of the histidine biosynthesis pathway, converting 5-phosphoribosyl diphosphate (PRPP) to L-histidine. Note that ZMO1500 and ZMO1502 are subunits associated with the same reaction. Highlighted in red is the gap-filled histidinol-phosphatase reaction lacking an annotated gene. (B) Boxplot of the cofitness values of genes with the known histidine biosynthesis genes. Cofitness of each gene in the histidine pathway with the eight genes of the histidine pathway is shown on the left. Candidate genes for the histidinol-phosphatase reaction (i.e., those with the highest average cofitness to the known genes) are shown on the right. The low cofitness outlier in all cases (except ZMO1551) corresponds to the cofitness with the ZMO0421 gene. (C) Growth experiments showed that the ZMO1518::Tn5 mutant was a histidine auxotroph that could be rescued by complementation with ZMO1518 on a plasmid. Phenotypes are categorized as growth (++), weak growth (+), and no growth (-). (D) Growth experiments in E. coli ΔhisB demonstrate that ZMO1518 encodes a histidinol-phosphatase. Note that in E. coli, hisB is bifunctional, catalyzing both the sixth and eighth steps in histidine biosynthesis, and therefore the E. coli ΔhisB knockout requires complementation with both ZMO1503 and ZMO1518.
Summary of identified genes.
| Gene | Experimental Activity | Identification Method | MEGS host | Previous KEGG Annotation |
|---|---|---|---|---|
| ZMO0201 | Glutamine amidotransferase of 4-amino-4-deoxychorismate synthase (isozyme) | MEGS | Δ | Glutamine amidotransferase of anthranilate synthase |
| ZMO0563 | Chorismate-pyruvate lyase | MEGS | Δ | Chorismate mutase |
| ZMO1008 | Erythronate-4-phosphate dehydrogenase | MEGS | Δ | FAD linked oxidase domain protein |
| ZMO1518 | Histidinol phosphatase | Bar-Seq Correlation | N/A | Inositol-monophosphatase |
| ZMO1916 | Pimeloyl-ACP methyl ester esterase | MEGS | Δ | Conserved Hypothetical Protein |
Fig 4. Pairwise cofitness in poorly correlating modules.
The seven poorly correlating modules are represented as node and edge plots. Each node (black square) is a gene within the module, while each edge between nodes represents the cofitness for that pair of genes. Edges in blue represent positive cofitness and edges in red represent negative cofitness. Edge thickness corresponds to the magnitude of the cofitness. These pairwise cofitness values are shown for smaller modules. In Module 45, the gene ZMO1008 was identified via MEGS during model development and is shown here with dashed lines.