Evaluating accessibility, usability and interoperability of genome-scale metabolic models for diverse yeasts species.

Iván Domenzain^1,2, Feiran Li^1,2, Eduard J Kerkhoven^1,2, Verena Siewers^1,2.

Abstract

Metabolic network reconstructions have become an important tool for probing cellular metabolism in the field of systems biology. They are used as tools for quantitative prediction but also as scaffolds for further knowledge contextualization. The yeast Saccharomyces cerevisiae was one of the first organisms for which a genome-scale metabolic model (GEM) was reconstructed, in 2003, and since then 45 metabolic models have been developed for a wide variety of relevant yeast species. A systematic evaluation of these models revealed that, despite this long modeling history, the sequential process of tracing model files, setting them up for basic simulation purposes and comparing them across species and even different versions is still not a generalizable task. These findings call on the yeast modeling community to comply with standard practices on model development and sharing in order to make GEMs accessible and useful for a wider public.

Keywords: accessibility; genome-scale metabolic models; interoperability; systems biology; usability; yeast species

Year: 2021 PMID: 33428734 PMCID: PMC7943257 DOI: 10.1093/femsyr/foab002

Source DB: PubMed Journal: FEMS Yeast Res ISSN: 1567-1356 Impact factor: 2.796

INTRODUCTION

Genome-scale metabolic model reconstruction has been established as one of the major modeling approaches for systems-level metabolic studies (Gu et al. 2019). These models are mainly built in a bottom-up approach, in which genome information is combined with the accumulated knowledge about the metabolic capabilities of a living organism to reconstruct a complete metabolic map (Nielsen 2017). Another widely used approach for model reconstruction consists of the use of one or multiple well-curated networks as scaffolds, due to the high degree of conservation of metabolism for phylogenetically close species. Metabolic models have been proven to be useful as knowledge databases (Herrgård et al. 2008), tools for contextualization of omics data (Kerkhoven et al. 2016) and for guiding metabolic engineering projects (Meadows et al. 2016), enabling systematic explorations of the relationship between genotypes and phenotypes. The metabolic model iFF708 (Förster et al. 2003) of Saccharomyces cerevisiae, the genome of which was the first eukaryotic one to be sequenced (Goffeau et al. 1996), was the first published GEM for its entire domain in 2003. This model has been used as a scaffold for further network refinements (Duarte, Herrgård and Palsson 2004; Kuepfer, Sauer and Blank 2005; Herrgård et al. 2006; Nookaew et al. 2008), which has facilitated the development of metabolic models for several other budding yeast species over the years, due to their well evolutionarily-conserved metabolic capabilities (Shen et al. 2018). Multiple model reconstructions exist, not just for S. cerevisiae, but for several other yeast species. These reconstructions have usually been carried out by different research groups, resulting in specific network improvements according to their scientific interests, but at the same time yielding incompatible identifiers for reactions and metabolites hampering any systematic comparison and evaluation across models (Herrgård et al. 2008). 
As GEMs are valuable tools for a wide variety of applications, their end users vary from academic researchers with different backgrounds and levels of computational skills, to professionals in the biotechnology and pharmaceutical industries. Therefore, there is a strong need for computational metabolic models to be accessible and published in a ready-to-use format, which facilitates their utilization by non-expert users. Additionally, the use of consistent and standardized identifiers for their components enables comparisons across models, thus simplifying the process of finding the best model for a given application.

Latest developments on yeasts GEMs

The development, interconnections and applications of metabolic models for different yeast species have been reviewed extensively (Sánchez and Nielsen 2015; Lopes and Rocha 2017; Castillo, Patil and Jouhten 2019; Chen, Li and Nielsen 2019). However, the list of yeast GEMs is continuously growing, both in the number of models and in the species covered. Here we briefly summarize the development history of all models for diverse yeast species that are currently available in the scientific literature. The validation strategies and main applications of these models, as described in their original publications, are provided in Table S3 (Supporting Information), indicating the type of biological data and computational methods used for each case.
S. cerevisiae is one of the most studied organisms in the Eukarya domain, which has resulted in a long modeling history with 18 networks currently available. The models iND750 (Duarte, Herrgård and Palsson 2004), iLL672 (Kuepfer, Sauer and Blank 2005) and iIN800 (Nookaew et al. 2008) were directly derived from iFF708 (Förster et al. 2003) and subsequently used as templates for the iMH805/775 (Herrgård et al. 2006), iMM904 (Mo, Palsson and Herrgård 2009), iAZ900 (Zomorrodi and Maranas 2010) and iTO977 (Österlund et al. 2013) reconstructions. As these multiple reconstructions added new knowledge and gap-fills to the network, a first attempt at unification was carried out with the knowledge base Yeast1, published in 2008 (Herrgård et al. 2008). The concept of standardized identifiers for reactions and metabolites was first implemented in this reconstruction, but simulation capabilities were not achieved. Sequential curation iterations were performed (Yeast2 and Yeast3) until the publication of Yeast4, which notably increased the network connectivity and the number of included metabolites, making it a suitable model for simulation purposes (Dobson et al. 2010).
Further updates to the consensus metabolic network have been shown to improve predictions of gene essentiality, induced auxotroph phenotypes and cellular growth in diverse environments (Yeast5 (Heavner et al. 2012), Yeast6 (Heavner et al. 2013) and Yeast7 (Aung, Henry and Walker 2013)). In 2019, a new version of the consensus metabolic network, Yeast8, was published (Lu et al. 2019). Its reconstruction process combined information from previous GEMs, different curated databases such as KEGG (Kanehisa et al. 2016), SGD (Hellerstedt et al. 2017), BioCyc (Karp et al. 2019), Reactome (Fabregat et al. 2018) and UniProt (The UniProt Consortium 2017) and experimental data on substrate usage. Furthermore, Yeast8 provides an ecosystem of multilayer models suited for different kinds of phenotype predictions, ranging from 1011 strain-specific models to incorporation of enzyme constraints (ecYeast8) and protein 3D structures (proYeast; Lu et al. 2019). In parallel with the development of the consensus network, iSce926 (Chowdhury, Chowdhury and Maranas 2015) was derived from Yeast7 (Aung, Henry and Walker 2013) in 2015, incorporating gene essentiality and synthetic lethality information to curate gene-reaction rules. The model iSc-AMRS-1 (Wichmann et al. 2016) was developed from iLL672 (Kuepfer, Sauer and Blank 2005) in 2016, mainly by curation of proton balancing for mitochondrial ATP production and reaction reversibility, aiming to improve flux distribution predictions in order to investigate production of isoprenoids.
The model SpoMBEL1693 for Schizosaccharomyces pombe, a model organism for eukaryotic cell cycle studies, was developed in 2012 using annotated genes and reactions from the KEGG database as a draft network (Sohn et al. 2012).
iNX804, a metabolic model for Candida glabrata, known as a platform organism for pyruvate production, was reconstructed in 2013 and used for identification of gene targets for enhanced production of pyruvate-derived fine chemicals (Xu et al. 2013).
The metabolism of Candida tropicalis, known as a promising host for α,ω-dicarboxylic acid production, has been studied with the model iCT646, reconstructed through the collection of multiple database information in 2016 (Mishra et al. 2016).
The model iOD907, a metabolic network for Kluyveromyces lactis, a yeast commonly used in the dairy industry, was published in 2014 (Dias et al. 2014). Its reconstruction process used iMM904, for S. cerevisiae, as a scaffold and merged it with annotation for metabolic genes and transporters from KEGG (Kanehisa et al. 2017) and TCDB (Saier et al. 2016), respectively. This model was validated with data for growth on diverse carbon sources and used to investigate phenotypic differences for single gene knockout strains between K. lactis and S. cerevisiae (Dias et al. 2014).
Pichia pastoris is an established workhorse in biotechnology for heterologous protein production, as it shows superior protein secretion efficiency compared with other yeasts (Schmidt 2004). Additionally, humanized N-glycosylation patterns for recombinant protein production can be obtained by engineering its metabolism. The first two GEMs for P. pastoris, PpaMBEL1254 (Sohn et al. 2010) and iPP668 (Chung et al. 2010), were both developed in 2010 using genome annotation information from databases and literature. In 2015, ihGlycopastoris (Irani et al. 2016) was developed specifically to simulate recombinant protein production, by combining the previously established iLC915 (Caspeta et al. 2012) model with humanized N-glycosylation pathways. This allowed the investigation of the influence of N-glycosylation processes on protein production and the model was used for the prediction of gene overexpression targets for improving protein yields.
The model Kp.1.0 was published in 2017, in which 12 different biomass compositions were tested under different growth conditions, showing minor effects on growth and gene essentiality predictions, but drastic changes in flux distributions (Cankorur-Cetinkaya, Dikicioglu and Oliver 2017). A total of three previous P. pastoris reconstructions (Chung et al. 2010; Sohn et al. 2010; Caspeta et al. 2012) were merged into iMT1026 (Tomàs-Gamisans, Ferrer and Albiol 2016), expanding the representation of fatty acid and sphingolipid metabolism, intact N-glycosylation, O-glycosylation and glycosylphosphatidylinositol (GPI)-anchor pathways. iMT1026 was then curated to iMT1026.v3 in 2018, leading to a refinement of predictions for cellular growth on glycerol and methanol as carbon sources (Tomàs-Gamisans, Ferrer and Albiol 2018). Additionally, the model iRY1243 was created in 2017 by merging iPP668, PpaMBEL1254, iLC915 and iMT1026, also incorporating curation of biosynthesis of vitamins and cofactors, which added more than 200 metabolic genes to the network. This model was validated with the use of RNAseq data for different conditions, utilization of carbon and nitrogen sources and fluxomics data derived from 13C labeling, yielding an overall high consistency of predictions for essential genes, flux distributions and different mutant phenotypes (Ye et al. 2017).
The yeast Scheffersomyces stipitis (formerly known as Pichia stipitis) has raised interest due to its great native potential for xylose utilization. In 2012, three models were published for this species: iTL885 (Liu et al. 2012) and iSS884 (Caspeta et al. 2012) were derived from previous S. cerevisiae models, whilst iBB814 (Balagurunathan et al. 2012) was reconstructed from genome annotation extracted from various databases. A modified version of iBB814, the model iDH814, was published in 2018 and used to elucidate the redox balance shift response to reduced oxygen supply conditions (Hilliard et al. 2018).
As these four reconstructions account only for the cytoplasm, mitochondria and peroxisome as cellular compartments, a fully compartmentalized model for this relevant organism is still missing.
The oleaginous yeast Yarrowia lipolytica is another organism for which multiple GEMs already exist. Its first model, iNL895, developed in 2012 (Loira et al. 2012), and two subsequent models, iMK735 (Kavšcek et al. 2015) and iYali4 (Kerkhoven et al. 2016), were derived from previous networks of the phylogenetically distant yeast S. cerevisiae, in contrast to iYL619_PCP (Pan and Hua 2012), which was reconstructed directly from Y. lipolytica specific information available in public databases and literature. In 2018, iYLI647 (Mishra et al. 2018) was developed using a previous reconstruction for the same species, iMK735 (Kavšcek et al. 2015), as a scaffold and expanded to include the ω-oxidation pathway that converts fatty acids to long-chain dicarboxylic acids (DCAs), the subsequent fatty-acid degrading β-oxidation pathway and branched-chain amino acid degradation pathways, in order to guide simulation of metabolic engineering strategies for enhanced DCA production.
In recent years, other non-conventional yeasts have gained more attention due to their diverse phenotypes, and several GEMs have been constructed in an attempt to understand their particular traits. Rhodotorula toruloides is an oleaginous yeast, which can accumulate lipids up to 70% of its dry mass (Ratledge and Wynn 2002). Previous modeling approaches have explored the use of constraint-based methods together with a reduced metabolic network for this organism to assess lipid accumulation on different substrates (Bommareddy et al. 2015; Castañeda et al. 2018), but its first genome-scale model, rthoGEM (Tiukova et al. 2019), was published in 2019.
Cell growth data using glucose, xylose and glycerol as substrates were used to validate the model, while gene targets for triacylglycerol and carotenoid production were predicted with the use of the FSEOF algorithm (Choi et al. 2010). That same year, iRhto1108 (Dinh et al. 2019) was developed using Yeast7 and the KBase fungal metabolic network (Arkin et al. 2018) as model templates. This model increased the metabolic gene coverage in comparison to rthoGEM (from 926 to 1108) and enabled growth simulations using arabinose and cellobiose as carbon sources.
Zygosaccharomyces bailii has been described as highly tolerant to acetic acid (Palma et al. 2017; Palma, Guerreiro and Sá-Correia 2018). It has been suggested that the Zygosaccharomyces clade diverged from Saccharomyces ancestors just before the whole genome duplication event (WGD; Kurtzman 2003), which took place approximately 100 million years ago, making the Zygosaccharomyces genus the closest pre-WGD ancestral group of relatives to study the genome evolution of S. cerevisiae (Hagman et al. 2013; Solieri et al. 2013). The model ZyPa1 (Filippo et al. 2018) was reconstructed using homology information from 20 different yeasts belonging to the Saccharomycetaceae family, and was then connected to the KEGG database to obtain a draft network. Stoichiometry and localization information for the reactions were extracted from the models Yeast7 (Aung, Henry and Walker 2013) and iOD907 (Dias et al. 2014). ZyPa1 contains 2413 genes, more than twice the number of genes in Yeast8 (Lu et al. 2019), making it the yeast metabolic model with the highest number of genes. This GEM has been applied to the study of cellular growth under co-consumption of lactate and glucose.
Kluyveromyces marxianus is a thermotolerant yeast that can tolerate temperatures as extreme as 52°C (Nonklang et al. 2008), making it an especially interesting host organism for industrial bioproduction. The first GEM for K. marxianus, iSM996, was built in 2019 (Marcišauskas, Ji and Nielsen 2019) by using a draft model generated with the RAVEN Toolbox (Wang et al. 2018), aided by the KEGG database and the models iOD907 (Dias et al. 2014) and Yeast7 (Aung, Henry and Walker 2013) as sources for the network gap-filling process. iSM996 was validated using data on carbon and nitrogen source usage, and transcriptome datasets were integrated in order to simulate growth under different temperatures (Marcišauskas, Ji and Nielsen 2019).
Lachancea kluyveri is a weakly Crabtree-positive yeast of industrial relevance due to its capability for ethyl acetate secretion, when cultivated in aerobic batch conditions, and its usage of urea and uracil as sole nitrogen sources for growth. In 2020, the model iPN730 (Ghosh et al. 2020) was built on a KBase workspace (Arkin et al. 2018) using iMM904 (Mo, Palsson and Herrgård 2009) for S. cerevisiae as a template network and 13 other fungal models as references for homologous reaction searches. The model was validated by simulating cellular growth in diverse environments (Ghosh et al. 2020).

A repository for yeast species metabolic models

To locate all of the aforementioned yeast GEMs, together with any other previously published models, the literature was queried using the keyword ‘yeast’ together with ‘metabolic model’, ‘GSM’, ‘GEM’ or ‘GENRE’ (genome-scale network reconstruction). In total, 45 model files for 12 different organisms were found either as part of publications in peer-reviewed journals, supplementary files for preprint articles in bioRxiv, or in the yeastnet model database (https://sourceforge.net/projects/yeast) when no specific publication about their reconstruction was found (as in the case of Yeast2, Yeast3 and Yeast4). Most of these yeast species belong to the Saccharomycetales order in the Ascomycota phylum, but some of them have been classified as part of other classes, such as Schizosaccharomyces pombe (Schizosaccharomycetes), or even other phyla, such as the Basidiomycota fungus Rhodotorula toruloides (Table S1, Supporting Information). As expected, S. cerevisiae is the yeast species for which the most GEMs have been reconstructed; however, multiple models are also available for P. pastoris, Y. lipolytica and S. stipitis (Fig. 1A). This collection of model files has been stored in a publicly available GitHub repository at https://github.com/SysBioChalmers/YeastsModels, together with the necessary scripts for their further analysis. The search and exploration processes for these models pointed out several aspects that can be classified into three main categories: accessibility, usability and interoperability.

Figure 1.

Accessibility of metabolic models for diverse yeast species. (A) Number of published models per species. (B) Number of published models per file format. Models available in several formats are counted multiple times. *NA indicates model files that were not available in either their original publications or external model repositories. (C) Proportion of models provided as an SBML file in their original source or publication. (D) Proportion of yeast models stored in different public databases. Models stored in several databases are counted only under the database that uploaded them first. (E) Proportion of models with continuous development tracked on public repositories.

Model accessibility

The analyzed models in this review span more than 17 years of active research, in which standards for file formats and sharing practices in the field of systems biology have changed, making the retrieval of their original files a time-consuming and not automatable task. Even though the Systems Biology Markup Language (SBML) was released in 2002 (Hucka et al. 2003), and since then has evolved to become the standard file format for metabolic modeling, 27% of the analyzed models were shared in a different format in their original publications, such as .txt, .XLS and .pdf (Fig. 1B and C), which limits scientific exchange and reproducibility of results on different setups due to their dependence on specific software applications (Ravikrishnan and Raman 2015). As not all models could be successfully obtained from their original sources, models were also sought in other public repositories such as Biomodels (Chelliah et al. 2015), Biomet (Garcia-Albornoz et al. 2014) and openCOBRA models (Ebrahim et al. 2015; Fig. 1D), which contain curated metabolic reconstructions not just for yeast species but for all key phylogenetic groups (Monk, Nogales and Palsson 2014). The models from the last decade present in this catalogue reflect the trend of referring to unambiguous entries in such databases instead of uploading model files as supplementary material to their respective journal websites. Notably, a novel methodology for model sharing and development has been proposed by the Yeast8 project (Lu et al. 2019) and the Memote model test suite (Lieven et al. 2020), which with the aid of version control tools, such as Git and GitHub, provides not just the final snapshot of a GEM but its whole development history, offering also a web platform for open and continuous development. These version control tools have also been implemented for Y. lipolytica, K. marxianus and R. toruloides GEMs (iYali4 (Kerkhoven et al. 2016), iSM996 (Marcišauskas, Ji and Nielsen 2019), rthoGEM (Tiukova et al. 2019) and iRhto1108 (Dinh et al. 2019)), which represent 11% of the collected models (Fig. 1E). More community-driven modeling efforts are expected to emerge in the coming years as a way to circumvent the drawback of having multiple independent reconstructions available for some of these yeast species.

Model usability

In order to evaluate the complexity of getting started with a GEM, a testing pipeline was developed using the RAVEN (Wang et al. 2018), COBRA (Heirendt et al. 2019) and COBRApy (Ebrahim et al. 2013) toolboxes. In a series of sequential steps, the pipeline aims to obtain feasible flux balance analysis simulations (Orth, Thiele and Palsson 2010), with cellular growth maximization as an objective function, assuming that no prior knowledge about the model's specific structure and identifiers is available. In total, SBML files for 37 models were found available in this study, and were therefore analyzed by this pipeline. The first tested functionality was the importability of each SBML model into a non-empty MATLAB structure (Table S2, Supporting Information). This was achieved for 97% of these models. The only non-loadable SBML file was also tested with the COBRApy toolbox, but its import could not be accomplished due to parsing errors. Secondly, a default objective function was sought in the model structure by retrieving any non-zero coefficient in the objective function field, the so-called ‘c vector’. Of the analyzed models, 76% showed a predefined objective function. Further exploration found that all of these objectives are maximization of the growth rate, ‘biomass exchange’ or ‘biomass formation’. Taking this into account, traceability of a biomass pseudoreaction was also evaluated. To do so, the presence of the substrings ‘growth’, ‘biomass’ and ‘vgro’ was explored in the model.rxns and model.rxnNames fields. In total, 84% of the tested models contain a biomass pseudoreaction identifiable with the used patterns. This does not imply that a biomass reaction is absent from the remaining 16% of models, but that the search for it would require a customized manual procedure for each of them.
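As an illustration only, the objective and biomass checks described above can be sketched in plain Python on a toy model. The dictionary keys (rxns, rxnNames, c) mirror the field names mentioned in the text, but the toy data, the function names and the case-insensitive matching are assumptions of this sketch and not the code used in the review.

```python
# Toy stand-in for a loaded model structure. All reaction entries are
# hypothetical and only serve to exercise the two checks below.
TOY_MODEL = {
    "rxns": ["r_0001", "r_2111", "EX_glc"],
    "rxnNames": ["hexokinase", "growth", "glucose exchange"],
    "c": [0.0, 1.0, 0.0],  # objective coefficients (the 'c vector')
}

# Substrings named in the text for locating a biomass pseudoreaction.
BIOMASS_PATTERNS = ("growth", "biomass", "vgro")


def default_objective(model):
    """Return the IDs of reactions with a non-zero objective coefficient."""
    return [rxn for rxn, coeff in zip(model["rxns"], model["c"]) if coeff != 0]


def biomass_candidates(model, patterns=BIOMASS_PATTERNS):
    """Return the IDs of reactions whose ID or name contains any pattern.

    Matching is case-insensitive, which is a choice made for this sketch.
    """
    hits = []
    for rxn_id, name in zip(model["rxns"], model["rxnNames"]):
        text = f"{rxn_id} {name}".lower()
        if any(pattern in text for pattern in patterns):
            hits.append(rxn_id)
    return hits


print(default_objective(TOY_MODEL))
print(biomass_candidates(TOY_MODEL))
```

An empty list from either function corresponds to the cases in which a model has no predefined objective, or in which the biomass reaction would have to be located manually.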
For all of these models, maximization of the found biomass reaction was set as an objective function and all of their exchange reactions were opened in both directions (lower and upper bounds of −1000 and 1000 mmol/gDW h, respectively) to check in silico cellular growth capabilities. In total, 76% of the tested subset (28 models) showed a non-zero growth rate when subject to these constraints. We consider these models as available in a ready-to-use setup, as no further steps or manual inspection was needed to simulate growth. Detailed information for the evaluated metrics and features can be found in Table S2 (Supporting Information). In order to assess the utilization of these models by the scientific community, the total and average annual citations were used as proxy metrics. Figure 2E shows that a larger proportion of the cited models that were recently published (<5 years ago) have been made available in a ready-to-use format (77%) in comparison to those that were published a longer time ago (62%). For the S. cerevisiae network reconstructions, it is clear that older models are on average more used or referred to in the scientific literature. However, as time has passed more models have become available, and declines in citations for older models usually coincide with the publication and rise of newer ones (Fig. 2F). This might suggest that scientific interest shifts towards more recent models as they accumulate the knowledge gathered by previous reconstruction iterations.
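The exchange-opening step of the pipeline described in this section can be illustrated with a minimal sketch. The Reaction record, the rule used to recognize exchange reactions (a single participating metabolite) and all identifiers are hypothetical. An actual run would rely on the RAVEN, COBRA or COBRApy toolboxes and would then solve the FBA problem, which is omitted here.

```python
from dataclasses import dataclass


@dataclass
class Reaction:
    """Minimal, hypothetical reaction record (not a COBRApy or RAVEN type)."""
    rxn_id: str
    stoichiometry: dict  # metabolite ID -> stoichiometric coefficient
    lower_bound: float = 0.0
    upper_bound: float = 1000.0


def is_exchange(rxn):
    """Assumption of this sketch: an exchange reaction has one metabolite."""
    return len(rxn.stoichiometry) == 1


def open_exchanges(reactions, limit=1000.0):
    """Set the bounds of every exchange reaction to [-limit, limit].

    Returns the IDs of the reactions that were opened.
    """
    opened = []
    for rxn in reactions:
        if is_exchange(rxn):
            rxn.lower_bound, rxn.upper_bound = -limit, limit
            opened.append(rxn.rxn_id)
    return opened


# Hypothetical three-reaction network: two exchanges and one internal reaction.
reactions = [
    Reaction("EX_glc", {"glc_e": -1.0}, lower_bound=-1.0),
    Reaction("EX_o2", {"o2_e": -1.0}),
    Reaction("HEX1", {"glc_c": -1.0, "atp_c": -1.0, "g6p_c": 1.0, "adp_c": 1.0}),
]
opened = open_exchanges(reactions)
print(opened)
```

Opening every exchange in both directions removes any medium definition, so a non-zero growth rate under these bounds only indicates that the network can produce biomass at all, in line with the caveat that the test does not measure model quality.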

Figure 2.

Model usability. (A) Proportion of tested SBML models successfully imported with the RAVEN, COBRA or COBRApy toolboxes (total = 37 models). (B) Proportion of tested models with a default objective function. (C) Proportion of tested models with a biomass pseudoreaction identifiable with the substrings ‘biomass’, ‘growth’ or ‘vgro’. (D) Proportion of models yielding a non-zero growth rate according to the developed testing pipeline. (E) Citation landscape of models of yeast metabolism. Annual average citations vs elapsed time since publication per species. The proportion of ‘operative models’ (available in a ready-to-use format, according to the developed testing pipeline) is indicated in the upper part for models that were published more or less than 5 years ago. (F) Evolution of the annual citations for models of S. cerevisiae metabolism. Citations were queried from Google Scholar, accessed on September 4th, 2020.

Interoperability

As described above and repeatedly concluded (Dräger and Palsson 2014; Ebrahim et al. 2015; Heavner and Price 2015; Sánchez and Nielsen 2015; Mendoza et al. 2019), the lack of identifier consistency and connection to external databases for all of the relevant components of GEMs (metabolites, reactions, genes and cellular compartments), together with the use of non-standardized file formats, are the main obstacles for direct model comparison and assessment, even across reconstructions for a single species. In order to aid systematic model development according to community-agreed practices, a standardized set of metabolic model tests (Memote) has recently been developed as an open-source software suite (Lieven et al. 2020). Memote tests are divided into organism- and model-specific ones, which are not applicable to all reconstructions, and a section of independent tests, which check for model consistency (in terms of mass and charge balance, metabolite connectivity and stoichiometric consistency) and for annotation, or connection to external databases, of metabolites, reactions, genes and SBO terms (systems biology ontology terms; Courtot et al. 2011). This pipeline assigns a numerical score, based on the specific model characteristics, to each of the independent tests, which is relevant for comparing the evolution of particular model features across versions. The 37 SBML model files analyzed above were furthermore tested by the Memote suite. As this software relies on the latest version of the SBML Level 3 Flux Balance Constraints package (Olivier and Bergmann 2018), not all of the models could be tested due to parsing errors for those available in previous or conflicting SBML versions (36%), as shown in Fig. 3A. Notably, this is not an indicator of model quality or predictive performance, but rather one of compliance with model format standards.
Further details for all of the individual tests and computed scores are available as HTML reports and also as part of Table S2 (Supporting Information), both stored in the aforementioned GitHub repository.
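To make the idea of a mass-balance consistency test concrete, the following is a minimal sketch of such a check for a single reaction. It is not Memote's implementation: the formula parser handles only simple formulas without parentheses or charges, and the metabolite identifiers and the hexokinase example are assumptions chosen for illustration.

```python
import re
from collections import Counter


def parse_formula(formula):
    """Parse a simple formula such as 'C6H12O6' into element counts.

    Parentheses and charges are not handled in this sketch.
    """
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def mass_imbalance(stoichiometry, formulas):
    """Return net element counts for a reaction; an empty dict means balanced.

    Substrates carry negative coefficients and products positive ones.
    """
    net = Counter()
    for metabolite, coeff in stoichiometry.items():
        for element, n in parse_formula(formulas[metabolite]).items():
            net[element] += coeff * n
    return {element: n for element, n in net.items() if n != 0}


# Hypothetical identifiers; formulas are for the charged forms of each species.
FORMULAS = {
    "glc": "C6H12O6",
    "atp": "C10H12N5O13P3",
    "g6p": "C6H11O9P",
    "adp": "C10H12N5O10P2",
    "h": "H",
}
# Hexokinase: glc + atp -> g6p + adp + h
HEXOKINASE = {"glc": -1, "atp": -1, "g6p": 1, "adp": 1, "h": 1}
# Same reaction with the proton omitted, to show an unbalanced case.
HEXOKINASE_NO_PROTON = {"glc": -1, "atp": -1, "g6p": 1, "adp": 1}

print(mass_imbalance(HEXOKINASE, FORMULAS))
print(mass_imbalance(HEXOKINASE_NO_PROTON, FORMULAS))
```

The second call reports a deficit of one hydrogen atom on the product side, the kind of proton-balancing issue that a consistency test flags.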

Figure 3.

Memote test results. (A) Proportion of models for which the automated Memote test was accomplished. (B) Memote test scores for the consensus reconstructions of the S. cerevisiae metabolic network. Scores for metabolites, reactions and SBO terms evaluate the degree of annotation for such components with external database identifiers that can facilitate the traceability of a component across different model versions. The Memote global score takes into account the structure, consistency, annotation and functionality of metabolic models.

The community-driven series of consensus metabolic network reconstructions for S. cerevisiae has tried to overcome some of the obstacles mentioned above by keeping consistency of identifiers across the subsequent model refinement iterations. However, this approach has not yet been applied to any of the other yeast species models analyzed in this review. Such consistency allows the Memote standardized test results to be interpreted as an evolution of the network in different regards, offering systematic guidance for further development. Annotation of metabolites, reactions and SBO terms has been improved throughout the different versions of the S. cerevisiae model (Fig. 3B). As a result, Yeast8 shows the most complete degree of annotation for all of these features, even though standardized gene identifiers that are traceable to an external database are still missing.

CONCLUSIONS

Here we reviewed, collected and evaluated the usability of the available GEMs for different yeast species, offering a valuable concentrated resource for the community. The model collection process showed that not all of them are easily accessible and that multiple sources needed to be queried. Even though specialized databases for curated GEMs exist, connections between them are still missing, which might hamper large-scale multi-species studies. We also found that GEM files have been shared in a wide variety of file formats, making the utilization of some of them dependent on specific software tools. Storing and sharing models using the latest version of the standard SBML format will facilitate scientific exchange and enable reproducibility of results, avoiding platform-dependent parsing issues. As part of this review, a simplified model test pipeline was developed and run for all of the yeast GEMs with an available SBML file. With the aim of obtaining feasible FBA simulations with the minimal number of steps, we simulated the initial familiarization process of a non-expert user with a new model. It was found that 28 of the tested models (representing 62% of the models in this catalogue) were available in a ready-to-use format, as in silico growth was obtained without any further knowledge of, or experience with, them. This result must not be interpreted as a measurement of model quality, as biological meaningfulness or consistency of predictions were not evaluated. More robust tests were performed with the aid of the Memote suite. Nonetheless, this was not possible for all of the analyzed models due to outdated file formats. For such cases, an update of their respective SBML files is recommended in order to ensure compatibility with the latest modeling and analysis tools and to facilitate further development.
The results of the Memote standardized tests illustrated a progressive evolution concerning the annotation of model components for the different versions of the S. cerevisiae metabolic network, highlighting the advantages of community-driven model development. The total or partial lack of cross-references of model components to widely used external databases is still a common trait of the models in this catalogue. GEMs are usually described as valuable scientific resources not just for quantitative predictions but as genome-scale knowledgebases of living organisms. However, as their usability and exploration are still hindered by the lack of format consistency, cross-references and continuous community development, the full exploitation of their potential remains restricted to expert users.

81 in total

1. Unravelling genomic diversity of Zygosaccharomyces rouxii complex with a link to its life cycle.

Authors: Lisa Solieri; Tikam Chand Dakal; Maria Antonietta Croce; Paolo Giudici
Journal: FEMS Yeast Res Date: 2013-01-14 Impact factor: 2.796

2. Revising the Representation of Fatty Acid, Glycerolipid, and Glycerophospholipid Metabolism in the Consensus Model of Yeast Metabolism.

Authors: Hnin W Aung; Susan A Henry; Larry P Walker
Journal: Ind Biotechnol (New Rochelle N Y) Date: 2013-08

3. Reconstruction and analysis of the genome-scale metabolic network of Candida glabrata.

Authors: Nan Xu; Liming Liu; Wei Zou; Jie Liu; Qiang Hua; Jian Chen
Journal: Mol Biosyst Date: 2012-11-22

4. A constraint-based model of Scheffersomyces stipitis for improved ethanol production.

Authors: Ting Liu; Wei Zou; Liming Liu; Jian Chen
Journal: Biotechnol Biofuels Date: 2012-09-21 Impact factor: 6.040

5. BioMet Toolbox 2.0: genome-wide analysis of metabolism and omics data.

Authors: Manuel Garcia-Albornoz; Subazini Thankaswamy-Kosalai; Avlant Nilsson; Leif Väremo; Intawat Nookaew; Jens Nielsen
Journal: Nucleic Acids Res Date: 2014-05-03 Impact factor: 16.971

6. Metabolic modeling to identify engineering targets for Komagataella phaffii: The effect of biomass composition on gene target identification.

Authors: Ayca Cankorur-Cetinkaya; Duygu Dikicioglu; Stephen G Oliver
Journal: Biotechnol Bioeng Date: 2017-08-15 Impact factor: 4.530

7. SBML Level 3 Package: Flux Balance Constraints version 2.

Authors: Brett G Olivier; Frank T Bergmann
Journal: J Integr Bioinform Date: 2018-03-09

8. Connecting extracellular metabolomic measurements to intracellular flux states in yeast.

Authors: Monica L Mo; Bernhard O Palsson; Markus J Herrgård
Journal: BMC Syst Biol Date: 2009-03-25

9. Integration and Validation of the Genome-Scale Metabolic Models of Pichia pastoris: A Comprehensive Update of Protein Glycosylation Pathways, Lipid and Energy Metabolism.

Authors: Màrius Tomàs-Gamisans; Pau Ferrer; Joan Albiol
Journal: PLoS One Date: 2016-01-26 Impact factor: 3.240