Microbial bioinformatics for food safety and production.

Wynand Alkema, Jos Boekhorst, Michiel Wels, Sacha A F T van Hijum.

Abstract

In the production of fermented foods, microbes play an important role. Optimization of fermentation processes or starter culture production traditionally was a trial-and-error approach inspired by expert knowledge of the fermentation process. Current developments in high-throughput 'omics' technologies allow developing more rational approaches to improve fermentation processes both from the food functionality as well as from the food safety perspective. Here, the authors thematically review typical bioinformatics techniques and approaches to improve various aspects of the microbial production of fermented food products and food safety.

Keywords: bioinformatics; food, genomics; microorganisms; predictive models

Year: 2015 PMID: 26082168 PMCID: PMC4793891 DOI: 10.1093/bib/bbv034

Source DB: PubMed Journal: Brief Bioinform ISSN: 1467-5463 Impact factor: 11.622

Background

Food is an indispensable part of our daily life. Many food products undergo some form of processing before they reach the consumer, ranging from fermentation to packaging. In many of these processes, microorganisms play important roles, either in transforming the food into the desired end product (e.g. fermentation of olives, rice, bread, alcoholic beverages such as beer and wine, fermented meat, kimchi and various fermented dairy products such as cheese and yogurt) or in spoiling or contaminating the food. The type of microorganisms used in a fermentation process greatly influences the properties of the fermented product [1]. For example, yeasts produce ethanol as the main fermentation product, whereas the main fermentation product of lactic acid bacteria is lactic acid. The food industry is very active in optimizing strain performance with respect to diversification of product properties such as flavour and texture and with respect to controlling fermentation, by using defined starter cultures to initiate the fermentation process [1]. Strain optimization is an expert-knowledge-guided process involving trial-and-error approaches that are nowadays increasingly backed up by recent high-throughput ‘omics’ developments to improve fermentation processes [2] and to assess safety of food products [3]. Bioinformatics plays an increasing role in predicting and assessing the desired and undesired effects of microorganisms on food [4]. A combination of bioinformatics with laboratory verification of selected findings is particularly powerful. In this review, we focus on bioinformatics methods that can be used to improve the microbial production of fermented food products. These include genomics-based functional predictions, the creation of genome-scale metabolic models and prediction of complex food properties, such as taste and texture, and properties of complex fermentations. 
All application areas (outlined in the paragraphs below) and their relation to data streams and bioinformatics are described in Figure 1. A glossary of the bioinformatics concepts, methods and tools is provided in Table 1.

Figure 1.

Data and bioinformatics applied in food application areas. Central in this figure are the food application areas (right panel). From organisms, different data sets can be obtained (data reservoir); their abbreviation is given within parentheses. Middle panel: one (of many important) methods and other methods/data sources (see Table 1 for an explanation) relevant for a main application area shown. Interpretation example: for safety assessment, genomes (G), literature (L) and phenotypes (H) are used with the gene function annotation (2.3), orthology (2.4), comparative genomics (2.5) and predicting phenotypes (4) techniques (see Table 1).

Table 1.

Glossary of food bioinformatics concepts and techniques, their explanation and their application

Term	Description and examples of tools
1. Big data/grid/cloud/	With the increasing volume and heterogeneity of data sets (often referred to as “Big Data”), high performance computing is needed for analysis of the data. Many bioinformatics methods have been adapted to run on clusters of multiple computers (grid computing) and on large remotely located servers (cloud computing) [5]. Galaxy [6] and KNIME [7] are two popular software solutions to integrate and distribute larger data analysis tasks to the grid/cloud. Examples of cloud-based bioinformatics applications: HBLAST (BLAST, the most used bioinformatics sequence alignment tool) [8]; TPP, a proteomic analysis tool [9]; HIPPIE: promoter analysis provided as Amazon Machine Image [10]; BG7: bacterial genome annotation based on Amazon Web Services [11].
1.1 Data mining	Statistical and machine learning techniques to determine trends in typically large data sets. Unsupervised techniques (sample grouping is not explicitly used in the analysis) include: principal component analysis (PCA) and clustering algorithms (e.g. K-means, hierarchical). Supervised techniques (sample grouping is taken into account) include: ANOVA, Mann–Whitney U test, partial least squares analysis (PLS), machine learning (e.g. by support vector machines (SVM) [12] and random forest [13]): a computational model is trained to use properties derived from samples to predict the status of samples. See term 4 for examples.
1.2 Virtual machines (VM)	A large computer file (disk image) that consists of an operating system (e.g. Linux), software tools and data. The image can be run on an actual computer using virtual machine software that emulates an actual computer. In other words, a computer in a computer. The advantage of VMs is that they are portable (can be run on many different types of computer hardware), easy to backup and more straight-forward to maintain. Examples of the use of VMs are the generic bioinformatics tools in the NEBC Bio-Linux distribution [14] and the 16s analysis suite Qiime (see term 3.1).
1.3 Databases	Databases are organized collections of biological data. Bioinformatics is only successful if databases with high-quality data are available, together with structured vocabularies that describe the content of the data sets. An updated overview of relevant biological databases can be found here: http://www.oxfordjournals.org/our_journals/nar/database/cap/.
2. Genome sequencing	Determining the complete genome sequence of a microbial strain of interest. Next-generation sequencing (NGS) techniques allow for high-throughput and high-quality sequencing results. In particular, the combination of different techniques (e.g. Illumina and Pacific Biosciences, or PacBio) results in high-quality (circular) genomes [15].
2.1 Sequencing data (FASTQ)	Sequencing data are represented in FASTQ format. These files provide, next to the raw sequence data, additional information regarding the quality of the reads. In this manner, quality control and trimming can be applied.
2.2 Assembly	Raw sequence reads of different NGS technologies can be assembled into contigs, long stretches of DNA sequence representing part of the genome. Most of the assembly methods are based on alignment of sequence reads with each other (de novo assembly) or against a reference genome (mapping assembly), thereby generating long DNA sequences from the fragments generated by the sequencing. Some examples of assembly tools are Ray [16], MIRA [17] and IDBA [18].
2.2.1 Scaffolding	Organizing the contigs from the assembly (2.2) into larger, gapped DNA sequences. Some NGS techniques (e.g. Illumina) allow the synthesis of paired-end (PE) or mate-pair (MP) libraries: libraries with a fixed insert size that are sequenced at both ends. As the reads span a larger DNA fragment, the matched read pairs can be used to order contigs, even if the sequence in between the contigs has not been assembled. In general, most assembly tools allow for scaffolding, but dedicated tools also exist, such as SSPACE [19].
2.2.2 Gap closure strategies	After scaffolding, genome sequences will most often contain gaps. Common strategies to fill these gaps are generating new sequencing data using, for example, PacBio’s long reads [15], or predicting the most likely order and orientation of the contigs using bioinformatics tools like Projector2 [20] or Mauve [21]. These tools infer contig order by comparing them to one or more reference sequences.
2.3 Gene function annotation	Gene function is typically inferred from similarity in amino acid sequence. Gene functions can be predicted by comparing sequences to databases containing genes with known functions with tools like RAST [22] and Prokka [23].
2.4 Orthology	Genes in different organisms are orthologous when they were the same gene in the last common ancestor. Reconstructing the evolutionary history of genes allows the prediction of functional equivalence (i.e. orthologous genes are likely to have similar functions). Tools are OrthoMCL [24] and Orthagogue [25].
2.5 Comparative genomics	All analyses in which genome sequences or genome content of multiple organisms are compared.
2.6 Metabolic modelling	Prediction of growth, and recruitment of metabolic pathways, of microbes by using the genome sequence as an inventory of all possible metabolic reactions. Genome-scale metabolic models can be constructed using automated [26] or comparative genomics analyses [27]. Once constructed, the models can be used to simulate growth by, for example, flux balance analysis (FBA) and to determine the boundaries of fluxes by flux variability analysis (FVA). Tools for modelling are Pysces [28], the SEED [26] and VANTED [29].
3 Microbiome analysis	All microbes present in a particular niche are termed a microbiome. Analysis of microbiomes can be done using different next-generation sequencing-based techniques (see below).
3.1 16s rRNA sequencing	16s amplicon sequencing is the generation of sequence reads from conserved regions of the 16s gene. Amplicon sequencing (e.g. by Illumina) is used to identify the bacterial (and sometimes archaeal) component of microbial communities. Examples of software to infer community composition from sequencing data are Qiime [30] and Mothur [31]. 16s sequencing is a relatively cheap and well-established technique and as such an ideal starting point for characterization of complex cultures for which limited prior knowledge is available.
3.1.1 Functional prediction	16s sequences derived from a particular ecological niche indicate the taxa present and their relative abundance. From these data, the presence of gene functions in those taxa can be predicted using, e.g., PICRUSt [32], which infers the presence of gene functions in given taxa from already sequenced genomes that are part of the same taxa.
3.2 Shotgun metagenomics and metatranscriptomics	Random fragments of the DNA or (enriched) mRNA of a given microbiome are sequenced with next-generation sequencing [33, 34]. Metagenomics and metatranscriptomics techniques are powerful, as they circumvent the need to grow microbes while still determining their gene content or gene expression. This provides insight into the molecular functions encoded by the DNA and the taxonomic assignment of those DNA fragments, and allows similar information to be inferred for expressed mRNAs. This method can be used in addition to sequencing of individual isolates from complex cultures. Sequencing of individual isolates, however, has the advantage that comparative genomics can be done and metabolic models can be built more straightforwardly, provided that the isolates under study are representative of the biodiversity present in the complex culture.
3.2.1 Assembly	Using the sequence overlap, the DNA/RNA-derived sequences can be assembled into larger contigs, see [35] for a recent comparison of tools. Functional annotation of these larger fragments is more straight-forward, but the fraction of reads that can be assembled into contigs depends on both the complexity of the microbiome (many different microbes with varying abundances) as well as the presence of microdiversity (many different microbes with similar genome sequences).
3.2.2 Annotation	Similar to the genome of a single bacterium, the sequences of a metagenome can be functionally and taxonomically annotated by comparing (assembled) sequences or predicted gene products against one or more reference databases containing sequences with known functions and known taxonomic origin. Gene context, such as operon structure, is, however, largely missing in shotgun metagenomics reads/contigs. A few tools are: PhymmBL [36] (taxonomic classification using sequence-based models), MG-RAST [37] (functional and taxonomic classification using alignment to reference databases) and MetaPhlAn [38] (taxonomy prediction using taxon-specific marker genes).
3.3 Strain typing and tracking	Pinpointing the presence of a particular microbe (strain) in a biological sample. Using MLST markers [39] (multi-locus sequence typing), PCR based on unique DNA fragments [40] or strain-specific markers [41], the abundance of particular strains can be followed during the course of a fermentation. A potential downside of these techniques is that only known biodiversity can be traced; their performance must therefore be evaluated on new strains. New biodiversity can be uncovered, provided that the genomic target is well designed (e.g. targeting a single-copy gene with sufficient resolution to distinguish between strains).
4 Predicting phenotypes	Gene–trait matching: machine learning or statistics methods are used to predict the phenotype of a bacterial strain based on the presence/absence of particular genes [42], (parts of) metabolic pathways [43, 44] or classifications from experts [45]. Transcriptome–trait matching: gene expression data (based on microarray or RNAseq) instead of gene presence are used. Transcriptome data from multiple strains grown under the same condition [46] or the same strain grown under different conditions can be used [47–49].
5 Metabolomics	The simultaneous measurement of multiple metabolites in biological samples [50]. Metabolomics is a technique that can be applied to describe reaction products of microorganisms in defined media and in food samples. Its data are very suitable to be associated to results from sensory measurements [51, 52].
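
The quality control and trimming step mentioned under term 2.1 of Table 1 can be illustrated with a minimal Python sketch. It is not taken from any of the tools cited above; the function names, the Phred+33 quality encoding and the threshold of 20 are assumptions made for illustration, and real pipelines would normally rely on a dedicated trimming tool.

```python
def parse_fastq(lines):
    """Yield (read_id, sequence, quality) tuples from FASTQ lines.

    Assumes the simple four-lines-per-record layout (no wrapped sequences).
    """
    it = iter(lines)
    for header in it:
        seq = next(it)
        next(it)  # skip the '+' separator line
        qual = next(it)
        yield header[1:].strip(), seq.strip(), qual.strip()


def trim_3prime(seq, qual, min_q=20, offset=33):
    """Remove bases from the 3' end while their Phred score is below min_q.

    offset=33 assumes Phred+33 encoded quality characters.
    """
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]


# Hypothetical single-read example: 'I' encodes Q40, '#' encodes Q2.
record = ["@read1", "ACGTACGT", "+", "IIIIII##"]
for read_id, seq, qual in parse_fastq(record):
    trimmed_seq, trimmed_qual = trim_3prime(seq, qual)
    print(read_id, trimmed_seq)
```

The two low-quality bases at the end of the read are removed, while the read identifier and the remaining quality string stay aligned with the trimmed sequence.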

Translating genome information into functional predictions

The prediction of function from sequence information is one of the fundamental roles of bioinformatics. The large variety of sequencing techniques generates a large amount of genomics data. Harnessing the power of these data requires careful identification of functional elements in these data and associating the sequence information with function, for example by comparing predicted protein sequences to sequences with known functions. This type of analysis can identify functions for genes (crucial information for metabolic modelling; see below), e.g. prediction of laccases [53]; predict functions for most genes in a bacterial genome [23, 54]; and suggest properties for specific strains of bacteria by projecting the predicted functions of all its genes on pathway databases [55, 56], predicting properties of, e.g., Bifidobacteria in the gut environment [57] or even predict functionalities of complex microbial communities [22, 32]. For genes where a sequence similarity search does not yield a good prediction, their function may be deduced by correlating the presence and absence of the gene in organisms with the presence and absence of a certain phenotypic trait in the same set of organisms (also referred to as gene–trait matching; GTM) [42, 58]. For example, a set of proteins was predicted to be involved in the degradation of plant (oligo-) saccharides by linking isolation source of bacteria to gene presence/absence [59]. Comparative analysis of the genome sequences of a species where some strains have a positive impact (e.g. flavour enhancement) while others are detrimental (e.g. spoilage) can be used to identify genetic elements potentially underlying these differences, as was done for the yeast Brettanomyces bruxellensis [60]. Tools that can be used to link -omics data to phenotypes are PhenoLink [58] and DuctApe [43]. These approaches require a genome sequence, which might be relatively difficult to obtain for microbes that are difficult to grow in culture. 
Techniques like multiple displacement amplification [61] can be used to amplify DNA from a single cell, and a range of genome assembly tools can be used to assemble the reads obtained from single-cell sequencing [62]. Mobile elements such as transposons, plasmids or phages can carry functionality from one bacterial strain to another. An example is the galactose utilization operon transfer between Lactococcus lactis strains studied by next-generation sequencing and bioinformatics techniques [63]. Identifying potential transposon insertion sites is crucial to this end and can be facilitated by bioinformatics tools such as transposon insertion finder [64].
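
The idea behind gene–trait matching described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses a one-sided Fisher exact test per gene as a simple association measure, which is not necessarily the method implemented in PhenoLink or DuctApe, and the strain names, gene names and phenotype labels are hypothetical toy data.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    a: gene present, trait present   b: gene present, trait absent
    c: gene absent,  trait present   d: gene absent,  trait absent
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    denom = comb(n, col1)
    p = 0.0
    # Sum hypergeometric probabilities of tables at least as extreme as observed.
    for k in range(a, min(row1, col1) + 1):
        p += comb(row1, k) * comb(n - row1, col1 - k) / denom
    return p


def gene_trait_match(gene_content, trait):
    """Return {gene: p-value} for enrichment of each gene in trait-positive strains."""
    genes = set().union(*gene_content.values())
    results = {}
    for gene in sorted(genes):
        a = b = c = d = 0
        for strain, has_trait in trait.items():
            present = gene in gene_content[strain]
            if present and has_trait:
                a += 1
            elif present:
                b += 1
            elif has_trait:
                c += 1
            else:
                d += 1
        results[gene] = fisher_exact_greater(a, b, c, d)
    return results


# Hypothetical toy data: geneA occurs only in strains that show the trait.
gene_content = {
    "s1": {"geneA", "geneB"}, "s2": {"geneA", "geneB"}, "s3": {"geneA", "geneB"},
    "s4": {"geneB"}, "s5": {"geneB"}, "s6": {"geneB"},
}
trait = {"s1": True, "s2": True, "s3": True, "s4": False, "s5": False, "s6": False}
results = gene_trait_match(gene_content, trait)
print(results)
```

With only six strains the smallest attainable p-value is 0.05, which illustrates why such analyses need a sufficiently large and phenotypically diverse strain collection, and why candidate genes still require laboratory verification.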

Improving metabolite production and biomass

Improvement of the food production process by optimizing biomass yield is a topic of continuous attention. A technique to rationally improve fermentation yield is genome-scale metabolic modelling [65]. In this process, the genome sequence of the organism is used as an inventory of the metabolic potential of the strain of interest. Metabolic models have been made for many microbes, including several food-relevant microorganisms [66-69]. Although the quality of a genome sequence can be a limiting factor (e.g. a gene missed due to low sequencing coverage), the metabolic model can be completed by identifying metabolic reactions that are missing in the model but are likely present because they are part of metabolic reaction cascades or ‘pathways’ [70]. Complete genome-scale metabolic models together with algorithms such as flux balance analysis allow the in silico simulation of growth of the organism under the (metabolic) restrictions provided by the substrate availability in the medium. These growth simulations can then be used to optimize medium composition to better fit the organism’s requirements [71]. In addition, the models can suggest alternative or cheaper substrates for fermentation [69], and improve the production of compounds such as amino acids [72] or succinic acid [73], taking into account possible changes in the flavour- or texture-related activity of the strain. These models have also been implemented in complex (multistrain) fermentation processes, providing insight into the interactions between different species/strains in a complex fermentation [74]. A second factor that improves the overall yield is the robustness of strains after harvesting. This factor, too, can be significantly influenced by changing the fermentation conditions under which starter cultures are prepared. By correlating gene expression levels to the survival of L. lactis, an application of transcriptome–trait matching (TTM), a number of genes that were potentially causally related to survival were identified. Subsequent knock-out of the genes proved that these genes were indeed important for the strains’ phenotype. This shows that not only gene content but also expression of genes is important for a given phenotype. In other words, preconditioning L. lactis strains, followed by GTM and TTM, allows improving their survival of heat and oxidative stresses, typically encountered during spray drying [46, 47].
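
For readers unfamiliar with flux balance analysis, the optimization problem behind the growth simulations described above can be written compactly. The notation below is the commonly used textbook form and is an assumption of this sketch, not notation taken from the cited studies: S is the stoichiometric matrix (metabolites by reactions), v the vector of reaction fluxes, c a weight vector selecting the objective (typically the biomass reaction), and the bounds on v encode substrate availability in the medium. For flux variability analysis, Z* denotes the optimal FBA objective value and γ (between 0 and 1) the fraction of that optimum that must be retained.

```latex
\begin{aligned}
\text{FBA:}\quad & \max_{v}\; c^{\top} v
  \quad \text{subject to} \quad S v = 0, \qquad v^{\min} \le v \le v^{\max} \\
\text{FVA (reaction } i\text{):}\quad & \min_{v} \text{ and } \max_{v}\; v_i
  \quad \text{subject to} \quad S v = 0, \qquad v^{\min} \le v \le v^{\max}, \qquad c^{\top} v \ge \gamma\, Z^{*}
\end{aligned}
```

Changing the medium composition corresponds to changing the bounds on the uptake reactions, which is how such models can be used to compare alternative substrates in silico.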

Improving texture and flavour

The fermentation process also influences the texture and flavour properties of the food product. These characteristics are microorganism-specific [75] and can be changed by fermentation, e.g. the production of flavours by adding adjunct strains to cheese fermentations [76], or the addition of exopolysaccharide-producing organisms to improve the texture of yoghurt [77, 78]. Also, flavour profiles of wine can be modified by either altering fermentation conditions or changing the wine starter cultures [79]. Whereas improvements can be made by testing a variety of experimental settings, bioinformatics and data analytics may be used to optimize the experimental designs [80-82]. The performance of a microorganism under particular fermentation conditions may be deduced from its gene content. Using a metabolic model, L. lactis MG1363 flavour formation could be predicted and was subsequently experimentally verified [67]. Likewise, the genomic sequence of Lactobacillus delbrueckii subsp. bulgaricus revealed how this organism is adapted to the fermentation of milk and the production of yoghurt [83]. Similar analyses have been carried out for Oenococcus oeni [84] and yeast genomes [85] and their relation to wine fermentation. Due to the larger complexity of yeast genomes, this analysis is more challenging [86]. Using GTM, growth on various sugars can be predicted relatively well from gene content, e.g. for L. lactis, Lactobacillus plantarum, Lactobacillus paracasei and Bifidobacterium breve [58, 87–89]. The same studies showed that more complex phenotypes, such as stress tolerance, are less straightforward to predict from gene content alone [58, 87]. Information on the transcript levels of the genes (see above) might be taken into account to better predict these phenotypes. 
TTM can similarly be used to associate the expression of microorganism genes with texture and flavour characteristics of a product, for example to improve the production of organic acids by knowledge-based alteration of fermentation conditions [48]. The effects on taste and texture are mainly caused by the metabolites that are produced or converted during fermentations. Rather than associating gene content with effects on taste and texture, metabolite patterns may be used directly to predict final sensory characteristics. The gold standard test for sensory characteristics of a fermented product is a quantitative descriptive analysis by a trained sensory panel. These tests are elaborate and require production of substantial amounts of the product. The results depend on the panel’s experience and on the attributes that are used to describe the product properties [51]. With metabolomics profiling techniques, it is now possible to simultaneously measure hundreds of metabolites in food samples [50]. This, together with the development of small-scale product screening methods [90], has led to the development of many new statistical methods to associate instrumental data, for example from gas chromatography–mass spectrometry, with sensory data [51, 52, 91–94].
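
As a rough illustration of associating instrumental data with sensory data, the sketch below ranks metabolites by their Pearson correlation with a panel score across products. This is a deliberately simple stand-in and an assumption of this example; the cited studies use more elaborate multivariate and machine learning methods, and the metabolite names and values here are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical data: metabolite intensities in four products and one panel attribute.
metabolites = {
    "metabolite_1": [1.0, 2.0, 3.0, 4.0],
    "metabolite_2": [4.0, 3.0, 2.0, 1.0],
}
sensory_score = [2.0, 4.0, 6.0, 8.0]

# Rank metabolites from most positively to most negatively correlated.
ranking = sorted(
    ((pearson(values, sensory_score), name) for name, values in metabolites.items()),
    reverse=True,
)
for r, name in ranking:
    print(f"{name}: r = {r:.2f}")
```

With hundreds of metabolites and only a handful of products, many such correlations arise by chance, which is one reason multivariate methods and validation on independent samples are preferred in practice.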

Risk assessment

Rather than predicting functions for all genes in a bacterial genome, selectively screening microbial genome sequences for genes with specific functionalities can be a highly sensitive and computationally efficient way of identifying potential health or safety risks of microbial strains present in a sample. The potential of a specific bacterium for antibiotic resistance or virulence can be investigated by comparing its genome sequence to a reference database containing known resistance genes and virulence factors [95]. Similar approaches have been described for the identification of persistence of bacteria in food products [45], anaerobic spore-forming organisms in food [96] and potential pathogens in metagenomics data [97]. This (meta)genomics-based methodology can be extended to a wide range of functionalities, e.g. production of antimicrobial peptides [98-100] and resistance to cleaning procedures commonly used in food production settings [101, 102]. A requirement for getting useful results out of metagenomics experiments is a dedicated database with gene–function relations and access to domain knowledge on the specific functionality to specify gene functions.

Mixed culture fermentations characterization

Complex fermentations involve an (un)defined (wild) starter culture with different microbes (bacteria, yeasts and fungi) that together ferment a substrate to the product. Examples are cheese, malolactic wine, soy and seafood fermentations [103, 104]. In these fermentations, a strong succession of microbes can occur, for instance of Saccharomyces cerevisiae and Oenococcus oeni in wine fermentation [105, 106]. Similar to the above-described GTM and TTM approaches to associate (transcription of) genes with phenotypes, the presence and absence of (combinations of) microorganisms (or their functionality) can be associated with fermentation product characteristics. The first step in characterizing a fermentation is to determine which microorganisms are present at the different stages of the fermentation and to correlate these with other measurements such as metabolomics [107] or the presence of phages [108]. The properties of microbial consortia are determined by the functional potential encoded in all microbial genomes. Metagenomics has an advantage over conventional sequencing of single isolates from consortia because it also reveals DNA of otherwise unculturable organisms. Based on the sequences found in a consortium, functionalities of the microbes can be predicted. Due to the succession of microbes in a fermentation, it is important to exclude DNA from dead microbes before building predictive models based on sequences. One way to sequester ‘dead’ DNA, so that it is not sequenced, is the use of propidium monoazide [109]. Next-generation sequencing techniques that profile, e.g., the 16S gene present in all bacteria are increasingly used in preference to molecular biology techniques such as gel-based methods [110, 111]. 
The bioinformatics analysis of 16S data from food fermentations is quite well established (Table 1), resulting in descriptions of the taxa present in a particular fermentation at best at the species level; for some taxa, even the genus level is challenging to obtain [112]. There is a large biodiversity beyond the species level that is not taken into account with, e.g., 16S sequencing. Even within a bacterial species, there is considerable biodiversity. For example, all genes present in strains of the Lactobacillus genus (its pan-genome) comprise over 14 000 gene families, with a single genome encoding ∼3 000 proteins [113]. A gene family typically consists of genes that are evolutionarily conserved, but that might have different enzymatic functions depending on the specific protein sequence [114]. Comparative genomics techniques, in combination with molecular strain typing, have been used to uncover strain-level diversity in complex, yet relatively defined, fermentations in general [41] and specifically for L. lactis and Leuconostoc mesenteroides from cheese [108], Lactobacillus sakei from meat fermentations [115], Lactobacillus sanfranciscensis in sourdough fermentations [116] and wine yeasts [86]. With shotgun metagenomics, the DNA in the mixed-culture fermentation is profiled, but strain-level diversity is extremely difficult to deduce from shotgun metagenomics sequence fragments [108]. On the other hand, due to the enormous biodiversity, the actual presence of any strain isolate thought to be of importance in a particular mixed-culture fermentation should be established. The combination of shotgun metagenomics and comparative genomics could prove to be particularly powerful, as the shotgun metagenome DNA sequences can be aligned to the genomes of isolates to establish whether the functionality present in the isolates covers that of the metagenome [41, 108]. Metatranscriptomics approaches allow profiling the mRNA-derived sequences of a complex fermentation. 
An advantage of metatranscriptomics over metagenomics approaches is that the gene expression measurement allows determining which genes are actually expressed in a mixed culture. Application of ‘metatranscriptomics’ using microarrays with the genomes of several species to determine global gene expression across species has been reported for kimchi [117]. Only recently, metagenome and metatranscriptome sequencing of bacterial communities involved in cheese rind fermentations has been reported [118]. The strength of this study is that the metagenomics and metatranscriptomics profiles were traced to their likely sources (genome sequences of isolates from the cheese rind fermentation). Using experimental setups like the latter in combination with metabolomics measurements and appropriate follow-up studies should strengthen the case for using metagenomics/metatranscriptomics techniques to characterize and potentially optimize fermentations. Bacteriophages play an important role in industrial fermentations because phage predation maintains biodiversity [119], but also because phage sweeps disrupt fermentation processes [120, 121]. Currently, however, predicting the specificity of bacteriophages and the interactions between microbes in mixed-culture fermentation are time-consuming tasks [108, 121–123]. Bioinformatics techniques that analyse the interaction of microbes and bacteriophages, and in-depth knowledge of the metabolic requirements of the microbial consortia present during fermentation could in the future lead to knowledge-based improvements of fermentation stability. This could be achieved by performing experiments with synthetic microbial consortia. The design of these consortia is currently being developed [81], and cross-kingdom interactions are being studied [124]. 
In a study where cheese rind bacterial communities were created based on various -omics data, knowledge of the fermentation and dedicated follow-up experiments, the potential of predicting properties of complex fermentations [118] was demonstrated. This study did not explicitly describe whether the selected strains (or close relatives) were actually present in a real fermentation. This has been described for representative L. lactis and L. mesenteroides strains of a complex cheese fermentation [108] and an L. lactis strain from a defined consortium [125].
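
The pan-genome concept used in this section (all gene families found across a set of strains) can be made concrete with a small Python sketch. It assumes that genes have already been grouped into families, for example by an orthology tool such as those listed in Table 1; the strain names and family identifiers are hypothetical. The set of families shared by all strains is computed as well, although that quantity is an addition of this sketch and is not discussed in the text.

```python
def pan_and_shared(genomes):
    """Return (pan_genome, shared_families) for a dict of strain -> set of gene families."""
    families = list(genomes.values())
    pan = set().union(*families)            # families present in at least one strain
    shared = set.intersection(*families)    # families present in every strain
    return pan, shared


# Hypothetical gene family content of three strains.
genomes = {
    "strain_1": {"fam1", "fam2", "fam3"},
    "strain_2": {"fam1", "fam2", "fam4"},
    "strain_3": {"fam1", "fam5"},
}
pan, shared = pan_and_shared(genomes)
print(len(pan), sorted(shared))

# Families found in exactly one strain hint at strain-level diversity.
unique = {
    strain: fams - set().union(*(f for s, f in genomes.items() if s != strain))
    for strain, fams in genomes.items()
}
print(unique)
```

The gap between the size of the pan-genome and the gene content of any single strain is what makes strain-level resolution relevant when selecting isolates that are meant to represent a mixed culture.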

Branding, tracing and detection

Food production and food consumption take place in complex environments in which, next to the microorganisms present in the natural environment, many other sources of proteins, fat and carbohydrates are present. The presence of the endogenous flora as well as the macromolecular structures of the food can greatly complicate the detection and tracing of specific microorganisms, such as potential food pathogens or probiotic strains added to the food product for enhanced functionality. Next to classical DNA-based detection techniques such as (q)PCR [126], new methods based on genomic data have been developed that allow for fast and accurate tracking or detection of specific species or even strains among the natural microflora. By specific amplification and sequencing of a locus that was identified to be discriminatory between different L. plantarum strains, it was shown that one could quantify the relative presence of different strains during passage through the gastrointestinal tract [40]. This same approach can also be followed to design specific primers to discriminate between pathogenic and non-pathogenic populations of specific species [127] and to detect a strain of interest in food products, allowing dedicated branding of a specific product. Next to dedicated tracing of a single strain, metagenome approaches as described for studying complex fermented products, for example cheese [118] and fermented foods of plant origin [128, 129], will also be of benefit in the detection of spoilage bacteria. Because these methods allow for direct profiling of the product and do not require a culture step that could bias the results, they could well prove to be more specific in detecting spoilage bacteria in a product. Culturing steps will nevertheless retain their merit because of their limited costs and the limited amounts of material they require. 
Especially in fermented products, 16s community profiling approaches will allow detecting low abundant microbes that might be overgrown in culture-dependent detection methods.
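
Finding a locus that discriminates one strain from others, as described above, can be approximated in silico by looking for short subsequences (k-mers) that occur in the target genome but in none of the background genomes. The Python sketch below is a simplified illustration and not the procedure used in the cited studies: it ignores reverse complements, sequencing errors and primer design constraints, and the sequences and the value of k are hypothetical.

```python
def kmers(seq, k):
    """Return the set of all k-mers (substrings of length k) in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def strain_specific_kmers(target, others, k=8):
    """Return k-mers present in the target sequence but absent from all other sequences.

    Simplification: only the given strand is considered (no reverse complement).
    """
    background = set()
    for seq in others:
        background |= kmers(seq, k)
    return kmers(target, k) - background


# Hypothetical toy sequences that differ near the 3' end.
target = "ACGTACGTTTGA"
others = ["ACGTACGTAAGA"]
specific = strain_specific_kmers(target, others, k=4)
print(sorted(specific))
```

Candidate regions found this way would still need to be checked against newly sequenced strains, since only the biodiversity represented in the background set is taken into account.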

Perspectives

Bioinformatics is increasingly applied in food fermentation and safety. Below we describe some new and exciting developments in this field. Sequence-based prediction of microbial functionality is still in its infancy. An inventory is needed of which functionalities can reliably be determined from sequence data, and for which bacteria. New publicly available data sets that combine genotype, phenotype and transcriptome data, such as those available for L. lactis and L. plantarum, could help to develop new sequence-based functional prediction strategies, such as more narrowly defined protein domains to screen more specifically for, e.g., carbohydrate-active enzymes [130], and relating promoters or regulatory binding sites to phenotype [42]. By consolidating the above information, a knowledge-based in silico screening of culture collections for desired traits can be established. This would require databases that use controlled vocabularies to integrate data from genomics, systems biology, phenotypes, ingredient information, properties of batches of foods, on-line measurement of parameters during the food-making process and ‘biomarkers’ for functionality in specific taxa (based on, e.g., GTM). Particular emphasis should be placed on promoting the FAIR (findable, accessible, interoperable, re-usable; http://datafairport.org/) principles when storing data. Given that analyses will become more standardized and more demanding of computing resources, the software and databases could be set up in a virtual machine that can subsequently be run on computer clusters or in the cloud. First steps towards data consolidation are being made in the EU-funded project GenoBox (www.genobox.eu), which aims to create a database that consolidates genotype and phenotype data and allows microbial genomes to be screened for functionality and safety risk factors. Similarly, IBM and MARS have established a consortium that aims to sequence the food supply chain (http://www.research.ibm.com/client-programs/foodsafety/).
Their aim is to determine nominal levels of microbial components in many food products across the globe. The resulting database can be used to assess the risk posed by the presence of certain microbes or functionalities in a given food product. Provided that sufficient biodiversity has been recorded in this database, it could also be used for branding products based on unique microbiota signatures present in fermented products or in foods that contain a microbiome. Another important factor to consider in steering the performance of fermentations is the interaction between microbes and their environment. This new layer of complexity has been studied, for instance, in microbe–plant interactions for rice or coconut [131, 132], and through systems biology approaches that go beyond genome-scale metabolic models by using kinetic models to describe interactions between microbes and their matrix [133]. These studies require a substantial knowledge base on both the properties of the microorganisms and the physical properties of the matrix in which the organisms operate. In conclusion, the increasing amount of data on food fermentation and safety encourages consolidating this information in databases that, with the right experimental design, algorithms, expertise and follow-up experiments, should enhance the prediction of fermentation performance and safety.

Exploiting the vast biodiversity to create new food products or to optimize existing ones is gaining momentum. Sequence-based prediction of microbial functionality is a powerful tool, with a clear application in screening biobanks. Increased availability of public data sets of fermentations will allow the development of better predictive models for microbial functionality. Spoilage strains can be detected on the basis of genotype.
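As a rough illustration of the knowledge-based in silico screening of culture collections mentioned above, the following Python sketch filters an annotated collection for strains that carry every protein domain linked to a desired trait and none of a list of safety risk factors. All names are assumptions made for this example: the `Strain` record, the `screen_collection` function, the domain labels and the risk-factor labels are hypothetical and do not correspond to any existing database schema or to the GenoBox project.

```python
from dataclasses import dataclass, field


@dataclass
class Strain:
    """Hypothetical record for one strain in an annotated culture collection."""
    name: str
    domains: set = field(default_factory=set)       # annotated protein-domain labels
    risk_factors: set = field(default_factory=set)  # annotated safety risk factors


def screen_collection(strains, required_domains, excluded_risks):
    """Return names of strains with all required domains and no excluded risk."""
    required = set(required_domains)
    excluded = set(excluded_risks)
    return [
        s.name
        for s in strains
        if required <= s.domains and not (excluded & s.risk_factors)
    ]


# Toy collection; the labels are placeholders, not real database identifiers.
collection = [
    Strain("strain_1", {"glycoside_hydrolase", "sugar_transporter"}),
    Strain("strain_2", {"glycoside_hydrolase"}),
    Strain(
        "strain_3",
        {"glycoside_hydrolase", "sugar_transporter"},
        {"antibiotic_resistance_gene"},
    ),
]

candidates = screen_collection(
    collection,
    required_domains={"glycoside_hydrolase", "sugar_transporter"},
    excluded_risks={"antibiotic_resistance_gene"},
)
```

In this toy collection only `strain_1` passes both filters. A real screen would draw its labels from a controlled vocabulary, which is what makes queries like this comparable across collections.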

Funding

This work was supported by a project from the Top Institute Food and Nutrition, Wageningen, the Netherlands and the Kluyver Center for Genomics of Industrial Fermentation, Delft, the Netherlands.

131 in total

Review 1. Bacteriophages as biocontrol agents of food pathogens.

Authors: Jennifer Mahony; Olivia McAuliffe; R Paul Ross; Douwe van Sinderen
Journal: Curr Opin Biotechnol Date: 2010-11-05 Impact factor: 9.740

Review 2. Workflow based framework for life science informatics.

Authors: Abhishek Tiwari; Arvind K T Sekhar
Journal: Comput Biol Chem Date: 2007-08-19 Impact factor: 2.877

Review 3. Yeast diversity and native vigor for flavor phenotypes.

Authors: Francisco Carrau; Carina Gaggero; Pablo S Aguilar
Journal: Trends Biotechnol Date: 2015-01-24 Impact factor: 19.536

Review 4. Environmental responses and phage susceptibility in foodborne pathogens: implications for improving applications in food safety.

Authors: Thomas Denes; Martin Wiedmann
Journal: Curr Opin Biotechnol Date: 2013-09-25 Impact factor: 9.740

5. Design, construction, and characterization methodologies for synthetic microbial consortia.

Authors: Hans C Bernstein; Ross P Carlson
Journal: Methods Mol Biol Date: 2014

6. Development of a minimal growth medium for Lactobacillus plantarum.

Authors: A Wegkamp; B Teusink; W M de Vos; E J Smid
Journal: Lett Appl Microbiol Date: 2010-01 Impact factor: 2.858

7. Genotypic adaptations associated with prolonged persistence of Lactobacillus plantarum in the murine digestive tract.

Authors: Hermien van Bokhorst-van de Veen; Maaike J Smelt; Michiel Wels; Sacha A F T van Hijum; Paul de Vos; Michiel Kleerebezem; Peter A Bron
Journal: Biotechnol J Date: 2013-08 Impact factor: 4.677

8. Investigation of associations of Yarrowia lipolytica, Staphylococcus xylosus, and Lactococcus lactis in culture as a first step in microbial interaction analysis.

Authors: S Mansour; J Bailly; S Landaud; C Monnet; A S Sarthou; M Cocaign-Bousquet; S Leroy; F Irlinger; P Bonnarme
Journal: Appl Environ Microbiol Date: 2009-08-14 Impact factor: 4.792

9. Genome-scale metabolic model for Lactococcus lactis MG1363 and its application to the analysis of flavor formation.

Authors: Nicolas A L Flahaut; Anne Wiersma; Bert van de Bunt; Dirk E Martens; Peter J Schaap; Lolke Sijtsma; Vitor A Martins Dos Santos; Willem M de Vos
Journal: Appl Microbiol Biotechnol Date: 2013-08-24 Impact factor: 4.813

Review 10. Basic concepts and principles of stoichiometric modeling of metabolic networks.

Authors: Timo R Maarleveld; Ruchir A Khandelwal; Brett G Olivier; Bas Teusink; Frank J Bruggeman
Journal: Biotechnol J Date: 2013-07-29 Impact factor: 4.677
