Salvatore Alaimo, Gioacchino P Marceca, Rosalba Giugno, Alfredo Ferro, Alfredo Pulvirenti.
Abstract
Grapevine (Vitis vinifera) cultivation is a key contributor to the economy of many countries. Tools provided by genomics and bioinformatics have helped researchers obtain biological knowledge about the different cultivars, and several genetic markers for common diseases have been identified. Recently, the microbiome has been shown to be of fundamental importance in both humans and plants for its ability to confer protection or induce disease. In this review we report current knowledge about the grapevine microbiome, together with a description of the available computational methodologies for meta-omics analysis.
Keywords: Vitis vinifera; bioinformatic tools and databases; metagenomics; metatranscriptomics; microbiome
Year: 2018 PMID: 29375610 PMCID: PMC5767322 DOI: 10.3389/fpls.2017.02241
Source DB: PubMed Journal: Front Plant Sci ISSN: 1664-462X Impact factor: 5.753
Brief description of metabarcoding tools: advantages and disadvantages.
| Tool | Description | Advantages | Disadvantages |
| --- | --- | --- | --- |
| QIIME | Pipeline for microbiome analysis built on a set of integrated scripts for analyzing raw microbial DNA samples, including taxonomic classification using marker genes. | Allows flexible multi-script pipelines to be constructed. Supports a wide range of statistical analyses with advanced graphical visualizations. Provides compute resources for free. | Command line interface. Installation on a local machine may be difficult for non-experts. Not multi-platform. |
| OBITools | Set of programs specifically designed for analyzing NGS data in a DNA metabarcoding context, targeting microbial communities from various ecological contexts. | Relies mainly on filtering and sorting algorithms, allowing users to set up flexible data analysis pipelines. Takes taxonomic annotations into account, allowing sequence records to be sorted and filtered by taxonomy. | Command line interface. Installation on a local machine may be difficult for non-experts. Not multi-platform. |
| PRINSEQ | Designed to trim adapter sequences and low-quality ends and to remove reads containing ambiguous nucleotides and duplicate reads from the sequencing output, accelerating read data analysis. | User-friendly. Generates complete statistics of the sequence data for parameters such as sequence length, GC content, quality score and replicates. Handles both single and paired-end reads. Also usable for metagenomics and metatranscriptomics data. | Window size needs to be defined by users for the initial trimming step. Limited to pre-processing. |
| MOTHUR | Principally designed for the microbial ecology community, it provides an extensible package with functionality accessible through a domain-specific language. It incorporates algorithms from previous tools plus additional features. | Single program for complete analysis with basic visualizations. | Custom command line interface. Incomplete usage of software engineering techniques. Not multi-platform. |
| DADA2 | R package implementing the full amplicon workflow, from filtering to merging of paired-end reads. | Uses a statistical model of amplicon errors to infer sequence variants instead of constructing OTUs. Very high accuracy. | Command line interface. |
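The PRINSEQ entry above notes that the user must choose a window size for quality trimming. The following is a minimal sketch of what such a sliding-window trim of the 3' end can look like, assuming Phred+33 encoded qualities. It is a generic illustration, not PRINSEQ's actual algorithm; the function names, the example read and the threshold values are all hypothetical.

```python
def phred33(quality_string):
    """Decode a Phred+33 quality string into integer scores."""
    return [ord(char) - 33 for char in quality_string]


def trim_low_quality_end(qualities, window=5, min_mean=20):
    """Return how many bases to keep after trimming the 3' end.

    A window of `window` bases slides from the 3' end toward the 5' end.
    Trimming stops at the first window whose mean quality is at least
    `min_mean`. Returns 0 if no window passes (including reads shorter
    than the window).
    """
    end = len(qualities)
    while end >= window:
        current = qualities[end - window:end]
        if sum(current) / window >= min_mean:
            return end
        end -= 1
    return 0


# Hypothetical read: eight high-quality bases followed by a poor tail.
seq = "ACGTACGTACGT"
qual = "IIIIIIII##!!"
keep = trim_low_quality_end(phred33(qual), window=4, min_mean=20)
trimmed = seq[:keep]
print(trimmed)
```

A larger window smooths over isolated low-quality bases but can leave a few poor bases at the end of the kept read, which is why the window size is left to the user.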
Brief description of metatranscriptomics tools: advantages and disadvantages.
| Tool | Description | Advantages | Disadvantages |
| --- | --- | --- | --- |
| HUMAnN2 | Pipeline for profiling the presence/absence and activity level of microbial pathways in a community. | Easy to install, with extensive documentation and examples. Uses commonly available tools and databases. | Command line interface. |
| MetaTrans | Pipeline for analyzing the structure and functions of active microbial communities using the power of multi-threaded computers. | Its design facilitates the inclusion of third-party tools in each of its stages. Can perform RNA-Seq analyses addressing both 16S rRNA taxonomy and gene expression. | Installation on a local computer may be difficult for non-experts. Requires proper local setup on a powerful computer. |
| COMAN | Web-based tool dedicated to automatically and comprehensively analyzing metatranscriptomic data. | Easy-to-use interface and extensive instructions for non-experts. Processes uploaded raw reads automatically to obtain functional assignments, which are then used for further analysis. | Web-based interface not suitable for large analyses. |
Brief description of genome relative abundance (GRA) estimation tools: advantages and disadvantages.
| Tool | Description | Advantages | Disadvantages |
| --- | --- | --- | --- |
| TETRA | Pioneering classifier that uses tetranucleotide-derived z-score correlations to taxonomically classify genomic fragments. Compositional-based. | Provides statistical analysis of tetranucleotide usage patterns in genomic fragments. Works either via a web service or as a stand-alone program. | Accuracy at genus level is reached only with long reads (>1 kb). Tends to create multiple clusters for reads originating from highly abundant species when the sample contains multiple species with highly varying levels of abundance. |
| CompostBin | DNA compositional-based algorithm which adopts a weighted Principal Component Analysis (PCA)-based strategy. Compositional-based. | Reduces the dimensionality of compositional space. Bins raw sequence reads without need for assembly or training. | Accuracy at genus level is reached only with long reads (>1 kb). Tends to create multiple clusters for reads originating from highly abundant species when the sample contains multiple species with highly varying levels of abundance. |
| TACOA | Multi-class taxonomic classifier combining the idea of the k-nearest neighbor with strategies from kernel-based learning. Compositional-based. | Easily installed and run on a desktop computer. Its reference set can be easily updated with newly sequenced genomes. | Accuracy at genus level is reached only with long reads (>1 kb). |
| AbundanceBin | Binning tool, based on the l-tuple content of reads, developed on the assumption that reads are sampled from genomes following a Poisson distribution. Compositional-based. | Returns accurate results even when reads are very short (~75 bp). | Binning efficiency decreases for samples in which species tend to be uniformly distributed. |
| MEGAN | Standalone computer program for analyzing large metagenomic data sets. It uses BLAST or other comparison tools to assign species to each read, and then employs the NCBI taxonomy. Alignment-based. | Allows large data sets to be dissected without the need for assembly or the targeting of specific phylogenetic markers. Provides statistical and graphical output. Quantifies accuracy and specificity. | Uses the bit-score of individual hits as the sole parameter for judging significance, which affects the specificity and accuracy of taxonomic assignments in different scenarios. |
| GRAMMy | Probabilistic framework developed for GRA estimation, based on mixture model theory. | Usable with mapping, alignment and composition-based tools. Can handle very short reads while obtaining accurate results. | Accuracy of the estimated abundance decreases for closely related microbes whose genomic sequences are highly similar. |
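To make the mixture-model idea behind abundance estimation more concrete, here is a minimal sketch of an expectation-maximization (EM) loop that estimates mixing proportions from per-read likelihoods. It is an illustration under stated assumptions, not the implementation of GRAMMy or any other tool in the table: the likelihood matrix, the genome lengths and both function names are hypothetical, and dividing the mixing proportions by genome length is included as an assumed normalization step.

```python
def em_mixing_proportions(likelihoods, iterations=100):
    """Estimate mixture weights with EM.

    likelihoods[i][j] is the (hypothetical) probability of observing
    read i given reference genome j.
    """
    n_genomes = len(likelihoods[0])
    weights = [1.0 / n_genomes] * n_genomes
    for _ in range(iterations):
        counts = [0.0] * n_genomes
        for row in likelihoods:
            # E-step: responsibility of each genome for this read.
            joint = [w * p for w, p in zip(weights, row)]
            total = sum(joint)
            if total == 0:
                continue  # read is not explained by any genome
            for j, value in enumerate(joint):
                counts[j] += value / total
        assigned = sum(counts)
        if assigned == 0:
            raise ValueError("no read could be assigned to any genome")
        # M-step: new weights are the normalized expected read counts.
        weights = [c / assigned for c in counts]
    return weights


def normalize_by_length(weights, genome_lengths):
    """Convert read-level weights to length-normalized abundances."""
    scaled = [w / length for w, length in zip(weights, genome_lengths)]
    total = sum(scaled)
    return [s / total for s in scaled]


# Hypothetical data: 3 reads unique to genome A, 1 unique to genome B,
# and 2 reads that match both genomes equally well.
likelihoods = [[1.0, 0.0]] * 3 + [[0.0, 1.0]] + [[0.5, 0.5]] * 2
weights = em_mixing_proportions(likelihoods)
abundance = normalize_by_length(weights, [4_000_000, 2_000_000])
print(weights, abundance)
```

The ambiguous reads are split between genomes in proportion to the current weights, which is also why estimates degrade when closely related genomes make most reads ambiguous.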
Figure 1. Results of the metabarcoding analysis of grapevine in Chile. The study analyzes 6 vines from three Chilean geographic areas, and sclerophyllous trees from the adjacent forest area. Vines are sampled from both leaves and fruits. All analyses are replicated 3 times, leading to 54 samples, of which 36 are grapevine and 18 are controls. Two samples are discarded due to quality issues. A principal coordinates analysis plot is reported to visualize relationships between samples (A). We also show the relative abundance of operational taxonomic units, assigned using the QIIME feature classifier, at phylum level (B), together with the median abundance values obtained for the most significant species in the differential analysis between sampling sites (C) and plant species (D).
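The two quantities behind panels A and B can be sketched in a few lines: per-sample relative abundance, and a pairwise dissimilarity between samples of the kind a principal coordinates analysis takes as input. The sketch below uses Bray-Curtis dissimilarity as an assumed choice, since the caption does not state which metric was used. The counts and function names are hypothetical and are not data from the study.

```python
def relative_abundance(counts):
    """Convert raw per-taxon counts into proportions that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no counts")
    return {taxon: n / total for taxon, n in counts.items()}


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance profiles.

    0 means identical profiles, 1 means no shared taxa.
    """
    taxa = set(a) | set(b)
    difference = sum(abs(a.get(t, 0) - b.get(t, 0)) for t in taxa)
    total = sum(a.get(t, 0) + b.get(t, 0) for t in taxa)
    return difference / total


# Hypothetical phylum-level counts for a leaf and a fruit sample.
leaf = relative_abundance(
    {"Proteobacteria": 60, "Actinobacteria": 30, "Firmicutes": 10}
)
fruit = relative_abundance({"Proteobacteria": 20, "Firmicutes": 20})
distance = bray_curtis(leaf, fruit)
print(distance)
```

Computing this value for every pair of samples yields the distance matrix that an ordination method then projects onto a few axes.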