Balamurugan Jagadeesan, Peter Gerner-Smidt, Marc W Allard, Sébastien Leuillet, Anett Winkler, Yinghua Xiao, Samuel Chaffron, Jos Van Der Vossen, Silin Tang, Mitsuru Katase, Peter McClure, Bon Kimura, Lay Ching Chai, John Chapman, Kathie Grant.
Abstract
Next Generation Sequencing (NGS) combined with powerful bioinformatic approaches is revolutionising food microbiology. Whole genome sequencing (WGS) of single isolates allows the most detailed comparison of individual strains possible to date. The two principal approaches for strain discrimination, single nucleotide polymorphism (SNP) analysis and genomic multi-locus sequence typing (MLST), show concordant results for phylogenetic clustering and are complementary to each other. Metabarcoding and metagenomics, applied to total DNA isolated from either food materials or the production environment, allow the identification of complete microbial populations. Metagenomics identifies the entire gene content and, when coupled to transcriptomics or proteomics, allows the identification of the functional capacity and biochemical activity of microbial populations. The focus of this review is on the recent use and future potential of NGS in food microbiology and on current challenges. Guidance is provided for new users, such as public health departments and the food industry, on the implementation of NGS and on how to critically interpret results and place them in a broader context. The review aims to promote the broader application of NGS technologies within the food industry and to highlight knowledge gaps and novel applications of NGS, with the aim of driving future research and increasing food safety outputs from its wider use.
Keywords: Data sharing; Food safety and quality; Implementation; Metabarcoding; Metagenomics; Microbiology; Next generation sequencing; Whole genome sequencing
Year: 2018 PMID: 30621881 PMCID: PMC6492263 DOI: 10.1016/j.fm.2018.11.005
Source DB: PubMed Journal: Food Microbiol ISSN: 0740-0020 Impact factor: 5.516
Summary of commonly used Whole Genome Sequencing platforms.
| Platform | Sequencing technology | Read length | Output/run | Error rate | Example of use | Type of instrument and run time |
|---|---|---|---|---|---|---|
| Illumina | Sequencing by synthesis | Short reads 1 × 36bp – 2 × 300bp | 0.3–1000Gb | Low | Variant calling | Benchtop |
| Ion Torrent | Sequencing by synthesis | Short reads 200–400bp | 0.6–15Gb | Low | Variant calling | Benchtop |
| PacBio | Single molecule sequencing by synthesis | Long reads Up to 60kb | 0.5–10Gb | High | De novo assembly of small bacterial genomes and large genome finishing | Large scale |
| Oxford Nanopore | Single molecule | Long reads Up to 100kb | 0.1–20Gb | High | Complete genome of isolates and metagenomics | Portable |
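
As a rough guide to reading the Output/run column, mean sequencing depth can be estimated as the total bases produced per run divided by the genome size times the number of isolates multiplexed on that run. The sketch below is an illustration only: the function name, the 5 Mb genome size and the 24-isolate run are assumptions, not values from the review, and usable depth will be lower after quality trimming and with uneven sample balance.

```python
def estimated_coverage(output_gb: float, genome_size_mb: float, n_isolates: int = 1) -> float:
    """Approximate mean depth: total bases / (genome size * isolates per run)."""
    if output_gb < 0 or genome_size_mb <= 0 or n_isolates <= 0:
        raise ValueError("output must be >= 0; genome size and isolate count must be > 0")
    total_bases = output_gb * 1e9                      # Gb -> bases
    bases_to_cover = genome_size_mb * 1e6 * n_isolates  # Mb -> bases, for every isolate
    return total_bases / bases_to_cover


# Hypothetical run: 15 Gb of output shared by 24 isolates with ~5 Mb genomes.
depth = estimated_coverage(output_gb=15, genome_size_mb=5.0, n_isolates=24)
print(f"Estimated mean depth: {depth:.0f}x")
```

The same arithmetic can be inverted to ask how many isolates fit on a run for a target depth.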
cgMLST and Genomic Reference databases for key food pathogens.
| Pathogen | DB location | Hosted by | Validation |
|---|---|---|---|
| | | Institut Pasteur, FR | |
| | | Warwick University, UK | – |
| | | Warwick University, UK | – |
| | | Warwick University, UK | – |
| | | University of Oxford, UK | – |
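
To illustrate how profiles from a cgMLST scheme are typically compared, the sketch below counts allele differences between two isolates over the loci called in both. It is a minimal illustration under stated assumptions: the locus names, allele numbers and the choice to skip missing loci are hypothetical and are not taken from any of the databases listed above.

```python
from typing import Dict, Optional, Tuple

# A cgMLST profile maps locus name -> allele number (None = locus not called).
Profile = Dict[str, Optional[int]]


def allele_distance(a: Profile, b: Profile) -> Tuple[int, int]:
    """Return (number of differing alleles, number of loci compared).

    Loci missing or uncalled in either profile are ignored (an assumption;
    other conventions for missing data exist).
    """
    shared = [
        locus for locus in a
        if locus in b and a[locus] is not None and b[locus] is not None
    ]
    differences = sum(1 for locus in shared if a[locus] != b[locus])
    return differences, len(shared)


# Hypothetical profiles for two isolates.
isolate_a: Profile = {"locus1": 1, "locus2": 4, "locus3": 2, "locus4": None}
isolate_b: Profile = {"locus1": 1, "locus2": 5, "locus3": 2, "locus4": 7}

diff, compared = allele_distance(isolate_a, isolate_b)
print(f"{diff} allele difference(s) over {compared} shared loci")
```

Reporting the number of loci compared alongside the distance makes it easier to spot comparisons that rest on poorly called genomes.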
Bioinformatic tools and pipelines for WGS analysis.
| Functionality | Name | Platform compatibility | Description | Link | Reference |
|---|---|---|---|---|---|
| Pre-processing of raw reads | Trimmomatic | Linux | Variety of useful trimming tasks for Illumina paired-end and single-end data (cut adapter and other Illumina-specific sequences, based on quality, …) | | |
| Quality control | FastQC | Linux | Quality control checks on raw sequence data with a modular set of analyses | None | |
| | CheckM | Linux | Set of tools for assessing the quality of genomes recovered from isolates, single cells, or metagenomes (estimates of genome completeness and contamination, plots depicting key genomic characteristics, …) | | |
| Pre-processing of raw reads/Quality control | FaQCs | Linux | Combines several features including data quality visualization and trimming, filtering the PhiX control sequences, conversion of FASTQ formats, multi-threading. | ||
| De novo assembly | Velvet | Linux | De novo genomic assembler specially designed for short read sequencing technologies | ||
| | SPAdes | Linux/MacOS | Assembly toolkit containing various assembly pipelines that work with Illumina or IonTorrent reads and can provide hybrid assemblies using PacBio, Oxford Nanopore and Sanger reads | | |
| | MIRA | Linux/MacOS | Whole genome shotgun and EST sequence assembler for Sanger, 454, Solexa (Illumina), IonTorrent and PacBio data | | |
| | HGA | Linux | Provides hierarchical genome assembly: de novo bacterial genome assembly using high coverage short sequencing reads | | |
| | Canu | Linux | Fork of the Celera Assembler designed for high-noise single-molecule sequencing (such as the PacBio RSII or Oxford Nanopore MinION) | | |
| Reference Mapping | Burrows-Wheeler Aligner (BWA) | Linux | Aligns sequencing reads against a large reference genome and supports Illumina, SOLiD, 454, Sanger and PacBio reads | | |
| | SMALT | Linux | Aligns DNA sequencing reads with a reference genome and supports reads from Illumina, Roche-454, Ion Torrent, PacBio or ABI-Sanger | | |
| | Bowtie2 | Windows/Linux/MacOS | Ultrafast and memory-efficient tool for aligning sequencing reads to long reference sequences. | | |
| Genome Viewer/Genome annotation | Prokka | Linux/MacOS | Tool to annotate bacterial, archaeal and viral genomes quickly and produce standards-compliant output files | ||
| | NCBI prokaryotic genome annotation pipeline | Web-based | Designed to annotate bacterial and archaeal genomes (chromosomes and plasmids), including prediction of protein-coding genes, as well as other functional genome units such as structural RNAs, tRNAs, small RNAs, pseudogenes, control regions, direct and inverted repeats, insertion sequences, transposons and other mobile elements | | |
| | RAST | Windows/Linux/MacOS | Fully-automated service for annotating complete or nearly complete bacterial and archaeal genomes, providing high quality genome annotations for these genomes across the whole phylogenetic tree | | |
| Variant/SNP calling | SRST2 | Linux | Designed to take Illumina sequence data, an MLST database and/or a database of gene sequences (e.g. resistance genes, virulence genes, etc) and report the presence of STs and/or reference genes. | | |
| | VarScan2 | Windows/Linux/MacOS | Platform-independent mutation caller for targeted, exome, and whole-genome resequencing data generated on Illumina, SOLiD, Life/PGM, Roche/454, and similar instruments. | | |
| | BCFtools/SAMtools | Linux | Set of utilities that manipulate variant calls in the Variant Call Format (VCF) and its binary counterpart BCF | | |
| | kSNP | Linux/MacOS | SNP discovery and SNP annotation from whole genomes | | |
| Mobile element detection | PhiSpy | Linux | Identify prophages in complete bacterial genome sequences | ||
| | PlasmidFinder | Web-based | Identify plasmids in total or partial sequenced isolates of bacteria | | |
| Virulence/Resistome analysis | VirulenceFinder | Web-based | Identify virulence genes in total or partial sequenced isolates of bacteria | ||
| | VFDB | Web-based | Integrated and comprehensive online resource for curating information about virulence factors of bacterial pathogens | | |
| | MYKROBE PREDICTOR | Windows/Linux/MacOS | Analyse the whole genome of a bacterial sample and predict which drugs the infection is resistant to | | |
| | ResFinder | Web-based | Identify acquired antimicrobial resistance genes and/or find chromosomal mutations in total or partial sequenced isolates of bacteria | | |
| Phylogenetic analysis | FastTree | Windows/Linux/MacOS | Infer approximately-maximum-likelihood phylogenetic trees from alignments of nucleotide or protein sequences. | ||
| | RAxML | Windows/Linux/MacOS | Programme for sequential and parallel Maximum Likelihood based inference of large phylogenetic trees. It can also be used for post-analyses of sets of phylogenetic trees, analyses of alignments and evolutionary placement of short reads | | |
| | PhyML | Web-based/Windows/Linux/MacOS | Phylogeny software based on the maximum-likelihood principle | | |
| Visualization | Microreact system | Web-based | Phylogeographic analysis of SNP or MLST data | ||
| | PHYLOViZ | Web-based/Windows/Linux | Epidemiological analysis and visualization of sequence (SNP and MLST) data | | |
| | GenGIS | Windows/MacOS | Analysis of phylogenetic data and associated metadata on digital maps. | | |
| Bioinformatic suite/pipeline | CLC Genomics Workbench | Windows/Linux/MacOS | Analyse and visualize NGS data (resequencing, read mapping, de novo assembly, variant analysis, metagenomics, ...) | None | |
| | BioNumerics | Windows | Quality control, assembly, reference mapping, SNP calling, wgMLST calling, phylogenetic tree, ... | None | |
| | Ridom SeqSphere+ | Windows | Quality control, assembly, reference mapping, SNP calling, cgMLST calling, phylogenetic tree, ... | | |
| | Geneious | Windows/Linux/MacOS | Assembly, genome browser, SNP calling, phylogenetic tree, ... | | |
| | CFSAN SNP pipeline | Linux | Reference mapping, SNP calling | | |
| | Lyve-SET SNP pipeline | Linux | Quality control, reference mapping, hqSNP calling, phylogenetic tree | | |
| | SNVPhyl (Single Nucleotide Variant PHYLogenomics) | Linux | Reference mapping, SNP calling, phylogenetic tree | | |
| | BaseSpace | Cloud-computing platform | Quality control, assembly, reference mapping, SNP calling, cgMLST calling, plasmid, virulence, ... (over 70 bioinformatic tools) | None | |
| | Integrated Rapid Infectious Disease Analysis (IRIDA) platform | Linux | Data storage, management, assembly, reference mapping, SNP calling, phylogenetic tree | | |
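
Several of the pipelines above end in a SNP-based comparison of isolates. As a minimal sketch of that final step only, the code below computes pairwise SNP distances from sequences that are already aligned to the same coordinates. It is not the method of any listed tool: the isolate names and sequences are made up, and ignoring positions with `N` or a gap is an assumption about how ambiguous sites are treated.

```python
from itertools import combinations
from typing import Dict, Tuple

AMBIGUOUS = set("N-")  # positions with these characters are not counted


def snp_distance(seq_a: str, seq_b: str) -> int:
    """Count differing positions between two aligned sequences of equal length."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for x, y in zip(seq_a.upper(), seq_b.upper())
        if x != y and x not in AMBIGUOUS and y not in AMBIGUOUS
    )


def distance_matrix(alignment: Dict[str, str]) -> Dict[Tuple[str, str], int]:
    """Pairwise SNP distances for every pair of isolates in the alignment."""
    return {
        (a, b): snp_distance(alignment[a], alignment[b])
        for a, b in combinations(sorted(alignment), 2)
    }


# Hypothetical 10-bp alignment of three isolates.
alignment = {
    "isolate_1": "ACGTACGTAC",
    "isolate_2": "ACGTACGAAC",
    "isolate_3": "ACNTACGAAT",
}

for pair, dist in distance_matrix(alignment).items():
    print(pair, dist)
```

A matrix like this is what tree-building or clustering steps would take as input; real pipelines also filter low-quality and recombinant sites before counting.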
Bioinformatic pipelines for metabarcoding, meta-omics analyses and ecological network inference.
| Functionality | Name | Description | Link | Reference |
|---|---|---|---|---|
| Metabarcoding pipeline | QIIME2 | Complete metabarcoding workflow: from raw reads to abundance tables | | |
| | MOTHUR | Complete metabarcoding workflow: from raw reads to abundance tables | | |
| | Oligotyping | Computational method to identify subtle variations among 16S ribosomal RNA gene sequences | | |
| | DADA2 | From raw reads to amplicon sequence variant abundance table | | |
| Meta-omics pipeline | MG-RAST | Complete metagenomic workflow: from raw reads to functional annotations | | |
| | MOCAT2 | Complete metagenomic workflow: from raw reads to functional annotations | | |
| | Anvi'o | Omics data analysis and visualization platform | | |
| | IMP | Complete metagenomic and metatranscriptomic integrative workflow | | |
| Network inference | CoNet | Ensemble correlation-based network inference | | |
| | SparCC | Correlation-based network inference | | |
| | SPIEC-EASI | Inference of graphical models of species association from genomics data | | |
| | eLSA | Inference of time-dependent associations in time series datasets | | |
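
To give a feel for what correlation-based network inference means in practice, the sketch below links taxa whose abundances across samples have a Pearson correlation above a threshold. This is a deliberately naive illustration and not the algorithm of any tool in the table: the taxon names, counts and the 0.8 threshold are hypothetical, and plain Pearson correlation does not correct for the compositional nature of sequencing data.

```python
from itertools import combinations
from math import sqrt
from typing import Dict, List, Tuple


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("vectors must have the same length (>= 2)")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def infer_edges(table: Dict[str, List[float]], threshold: float = 0.8) -> Dict[Tuple[str, str], float]:
    """Keep taxon pairs whose absolute correlation reaches the threshold."""
    edges = {}
    for a, b in combinations(sorted(table), 2):
        r = pearson(table[a], table[b])
        if abs(r) >= threshold:
            edges[(a, b)] = r
    return edges


# Hypothetical abundance table: taxon -> counts in four samples.
abundance = {
    "taxon_A": [10, 20, 30, 40],
    "taxon_B": [5, 10, 15, 20],
    "taxon_C": [40, 30, 20, 10],
    "taxon_D": [7, 3, 9, 4],
}

for (a, b), r in infer_edges(abundance).items():
    print(f"{a} -- {b}: r = {r:+.2f}")
```

Positive edges suggest co-occurrence and negative edges mutual exclusion, but with so few samples any such edge is only a hypothesis to test.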
Fig. 1. Summary of potential NGS use by the food industry.
Complete list of members of the Expert Group
| Name | Affiliation | Country |
|---|---|---|
| Dr Kathie Grant – Chair | Public Health England | UK |
| Dr Balamurugan Jagadeesan – Vice-Chair | Nestlé Research Center, Nestec Ltd | CH |
| Prof. Frank Aarestrup | Technical University of Denmark (DTU) | DK |
| Dr Marc Allard | Food and Drug Administration (FDA) | US |
| Dr Samuel Chaffron | CNRS and University of Nantes | FR |
| Dr Lay Ching Chai | University of Malaya | MY |
| Dr John Chapman | Unilever | NL |
| Dr Peter Gerner-Smidt | Centers for Disease Control and Prevention (CDC) | US |
| Prof. Dag Harmsen | Münster University Hospital | DE |
| Dr Mitsuru Katase | Fuji Oil Co., Ltd. | JP |
| Dr Bon Kimura | Tokyo University of Marine Science & Technology | JP |
| Mr Sébastien Leuillet | Institut Mérieux (Mérieux NutriSciences) | FR |
| Dr Peter McClure | Mondelez International | UK |
| Dr Trevor Phister | PepsiCo International | UK |
| Dr Masami Takeuchi | Food and Agriculture Organization of the United Nations (FAO) | IT |
| Dr Silin Tang | Mars Global Food Safety Center | CN |
| Dr Jos van der Vossen | The Netherlands Organisation for Applied Scientific Research (TNO) | NL |
| Dr Anett Winkler | Cargill | BE |
| Dr Yinghua Xiao | Arla Foods | DK |
The participation of these experts was supported by ILSI Japan, ILSI North America or ILSI Southeast Asia Region.
This company is a member of ILSI Japan.