Susan Jones1, Amanda Baizan-Edge2, Stuart MacFarlane3, Lesley Torrance2,3.
Abstract
Viruses cause significant yield and quality losses in a wide variety of cultivated crops. Hence, the detection and identification of viruses is a crucial facet of successful crop production and of great significance in terms of world food security. Whilst the adoption of molecular techniques such as RT-PCR has increased the speed and accuracy of viral diagnostics, such techniques only allow the detection of known viruses, i.e., each test is specific to one or a small number of related viruses. Therefore, unknown viruses can be missed and testing can be slow and expensive if molecular tests are unavailable. Methods for simultaneous detection of multiple viruses have been developed, and next generation sequencing (NGS) is now a principal focus of this area, as it enables unbiased and hypothesis-free testing of plant samples. The development of NGS protocols capable of detecting multiple known and emergent viruses present in infected material is proving to be a major advance for crops, nuclear stocks or imported plants and germplasm, in which disease symptoms are absent, non-specific or only triggered by multiple viruses. Researchers want to answer the question "how many different viruses are present in this crop plant?" without knowing what they are looking for: RNA-sequencing (RNA-seq) of plant material allows this question to be addressed. As well as needing efficient nucleic acid extraction and enrichment protocols, virus detection using RNA-seq requires fast and robust bioinformatics methods to enable host sequence removal and virus classification. In this review, recent studies that use RNA-seq for virus detection in a variety of crop plants are discussed, with specific emphasis on the computational methods implemented. The main features of a number of specific bioinformatics workflows developed for virus detection from NGS data are also outlined, and possible reasons why these have not yet been widely adopted are discussed.
The review concludes by discussing the future directions of this field, including the use of bioinformatics tools for virus detection deployed in analytical environments using cloud computing.
Keywords: bioinformatics & computational biology; crop protection; food security; next generation sequencing (NGS); viral diagnostic
Year: 2017 PMID: 29123534 PMCID: PMC5662881 DOI: 10.3389/fpls.2017.01770
Source DB: PubMed Journal: Front Plant Sci ISSN: 1664-462X Impact factor: 5.753
Figure 1. (A) Outline of outcomes from PCR, microarray, and NGS-sequence based approaches for virus detection in plants. (B) Outline of potential stages in an RNA-seq analysis workflow for virus detection in plants.
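The two computational stages named in the abstract, host sequence removal and virus classification, can be illustrated with a toy sketch. It is not the method of any tool in this review: it swaps read mapping for a simple shared k-mer comparison so that it runs without external software, and the function names, thresholds, and sequences are all hypothetical.

```python
from collections import Counter


def kmers(seq, k):
    """Return the set of all substrings of length k in seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_index(references, k):
    """Map each reference name to the k-mer set of its sequence."""
    return {name: kmers(seq, k) for name, seq in references.items()}


def remove_host_reads(reads, host_kmers, k, max_shared=0.5):
    """Keep reads whose fraction of k-mers found in the host is below max_shared."""
    kept = []
    for read in reads:
        read_kmers = kmers(read, k)
        if not read_kmers:
            continue  # read is shorter than k, nothing to compare
        shared = len(read_kmers & host_kmers) / len(read_kmers)
        if shared < max_shared:
            kept.append(read)
    return kept


def classify_reads(reads, virus_index, k, min_shared=0.5):
    """Count reads per best-matching virus reference; weak matches are 'unclassified'."""
    counts = Counter()
    for read in reads:
        read_kmers = kmers(read, k)
        if not read_kmers:
            continue
        best_name, best_frac = "unclassified", 0.0
        for name, ref_kmers in virus_index.items():
            frac = len(read_kmers & ref_kmers) / len(read_kmers)
            if frac > best_frac:
                best_name, best_frac = name, frac
        counts[best_name if best_frac >= min_shared else "unclassified"] += 1
    return counts


# Hypothetical toy sequences, invented for illustration only.
HOST = "ATGGCGTACGTTAGCCTAGGATCCGATTACGGCATTAGC"
VIRUSES = {
    "virusA": "TTGACCGGTAAACCCGGGTTTAAACGCGCGATATATCGC",
    "virusB": "CCCTTTGGGAAACTGCAGTCGACTAGTCTCGAGAAGCTT",
}
K = 8

reads = [HOST[5:25], VIRUSES["virusA"][3:23], VIRUSES["virusB"][10:30]]
non_host = remove_host_reads(reads, kmers(HOST, K), K)
print(classify_reads(non_host, build_index(VIRUSES, K), K))
```

Reads left in the "unclassified" bin correspond to the sequences that a real workflow would pass on to assembly and similarity searches in order to look for unknown viruses.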
Bioinformatics tools for the identification of viruses in RNA-sequence samples.
| Tool | Reference | Description | Workflow steps | Test data | Sequence type | Availability |
|---|---|---|---|---|---|---|
| VirFind | Ho and Tzanetakis, | Web based tool that maps and removes host reads, gives taxonomic information for virus reads | Quality control; Mapping to host: Bowtie2 (Langmead,; Assembly of non-host reads: Velvet; Contigs compared with GenBank using Blastx; Unmapped reads translated and compared against NCBI conserved domain database (Marchler-Bauer et al., | 38 plant samples from 19 species | sRNA | Web based tool with GUI |
| Taxonomer | Flygare et al., | Fast web-based metagenomics analysis tool based on | Comprised of 4 modules: Binner: compares reads to reference; Classifier: exact; Protonomer: further classification in protein space; AfterBurner: discovery of novel taxa | Human | mRNA | Webserver: |
| VSD toolkit | Barrero et al., | Modules and workflows in the Yabi analytical environment for identification of viral sequences in plants | Quality control; Overlapping contigs merged; Contigs >40 nt aligned to plant, virus and viroid GenBank databases using Blastn and Blastx; Unmapped contigs filtered and analyzed to identify putative circular viroids | 21 plant genomes | sRNA | Source code available to use with Yabi (Hunter et al., |
| Metavisitor | Carissimo et al., | Modular tools and workflows within the Galaxy analytical environment, designed for detection and reconstruction of viral genomes | Quality control; Mapping to host, symbionts and parasites: Bowtie2; Unmapped reads assembled using Velvet + Oases (Schulz et al.,; Contigs compared to GenBank virus database using Blastx and Blastn; Blast guided scaffolding of selected virus sequences | Human, Drosophila, Mosquito | mRNA | Source code available to use within Galaxy (Afgan et al., |
| VIP | Li et al., | An integrated pipeline for metagenomics of virus identification and discovery | Quality control; Mapping to host: Bowtie2; Fast mode: reads mapped to Virus Pathogen Resource (ViPR) (Pickett et al.,; Sense mode: reads mapped to virus RefSeq (O'Leary et al.,; Unmapped reads mapped to RefSeq protein: RAPSearch (Zhao et al.,; Options for | Human | mRNA | Local installation. Code available at |
| ViromeScan | Rampelli et al., | Tool for metagenomics viral community profiling | Mapping to built-in databases (includes plants): Bowtie2; Mapped reads processed for quality; Human Best Match Tagger (BMTagger) (Agarwala and Aleksandr,; Screened reads re-mapped to virus database: Bowtie2 | Human | mRNA | Local installation. Code available at |
| VirusHunter | Zhao et al., | Data analysis pipeline for novel virus identification from Roche 454 sequencers and other long read platforms | Similar reads clustered using CD-Hit (Li and Godzik,; Repeat regions masked with Repeat Masker (Smit et al.,; Mapping to host (default: Human): Blastn; Unmapped sequences aligned to NCBI nucleotide database and classified into taxonomies; Unmapped sequences mapped to NCBI nr databases using Blastx | BHK (hamster) cell culture infected with viruses | mRNA | |
| ezVIR | Petty et al., | Bioinformatics pipeline to evaluate spectrum of known human viruses | Mapping to host (Human genome): Bowtie2; Non-host reads mapped to custom virus database: Bowtie2; Additional analysis on specific mapped classes provides targeted strain classification | Human | mRNA | Local installation. Code available: |
| Virus Detect | Zheng et al., | Bioinformatics pipeline to analyse sRNA datasets for both known and novel virus identification | Maps reads to virus reference sequences: BWA; Mapped reads assembled using references; Mapped reads Reference assemblies and Contigs compared to virus reference nucleotides: Blastn; Unmatched contigs matched to virus reference Blastx | Plants (Potato) | sRNA | Webserver: |
| VirusFinder | Wang et al., | Software for detection of viruses and their host integration sites | Mapping to host (Human genome): Bowtie2; Non-host reads mapped to one of two virus databases (Hirahata et al.,; Mapped reads Contigs mapped to virus database and human genome; For specific virus of interest host integration sites are identified using BWA (Li and Durbin, | Human | RNA-seq | |
References for databases and software with multiple entries in table: Genbank (Benson et al., .
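Several of the workflows in the table share the same backbone: map reads to the host with Bowtie2, assemble the non-host reads with Velvet, then compare the contigs against a database with Blastx. The sketch below only assembles the corresponding command lines. It is a minimal illustration under stated assumptions, not the configuration of any listed tool: the file names, index and database names, hash length, and e-value are hypothetical, and quality control is omitted.

```python
import subprocess
from pathlib import Path


def build_commands(reads_fastq, host_index, protein_db,
                   workdir="virus_run", hash_length=21):
    """Return the command lines for host removal, assembly and Blastx search.

    All paths and parameter values are illustrative assumptions.
    """
    work = Path(workdir)
    non_host = work / "non_host.fastq"
    assembly = work / "assembly"
    return [
        # Map single-end reads to the host index; --un writes reads that fail to align.
        ["bowtie2", "-x", host_index, "-U", reads_fastq,
         "--un", str(non_host), "-S", str(work / "host.sam")],
        # Velvet runs in two stages: velveth hashes the reads, velvetg builds contigs.
        ["velveth", str(assembly), str(hash_length),
         "-fastq", "-short", str(non_host)],
        ["velvetg", str(assembly)],
        # Translated search of the contigs against a protein database, tabular output.
        ["blastx", "-query", str(assembly / "contigs.fa"), "-db", protein_db,
         "-outfmt", "6", "-evalue", "1e-5",
         "-out", str(work / "blastx_hits.tsv")],
    ]


def run_commands(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False.

    The working directory must exist and the tools must be on PATH before
    running with dry_run=False.
    """
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


commands = build_commands("sample.fastq", "host_index", "viral_proteins")
run_commands(commands)  # dry run: prints the four command lines only
```

Keeping command construction separate from execution means the steps can be inspected or logged before anything is run, which is one way to make such a workflow easier to reproduce.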
RNA-seq datasets used to test two automated virus detection tools.
| Plant | Sequencing | Viruses (reference) | SRA accession |
|---|---|---|---|
| Pear ( | RNA-seq, SE | ASGV, AGCAV, ASPV, PrVT (Jo et al., | |
| Pepper ( | RNA-seq, PE | ALPV, BPEV, cgLCuV, CYVMVA, PepLCB, PepLCBV, PeSV, ToLCRnV, ToLCBDB, ToLCJoV, GaILV, TolCGV, TVCV (Jo et al., | |
| Grape Vine ( | sRNA-seq, SE | GRSPaV, GVB, GFkV, GLRaV-3, HSVd (Barrero et al., | |
The specific sequence datasets used are indicated by their SRR number in the SRA (Leinonen et al., .
Pear: Prunus virus T (PrVT), Apple green crinkle associated virus (AGCAV), Apple stem grooving virus (ASGV), Apple stem pitting virus (ASPV).
Pepper: Aphid lethal paralysis virus (ALPV), Pepper leaf curl Bangladesh virus (PepLCBV), Pea streak virus (PeSV), Pepper leaf curl virus betasatellite (PepLCVB), Tobacco vein clearing virus (TVCV), Bell pepper endornavirus (BPEV), Tomato leaf curl Ranchi virus (ToLCRnV), Tomato leaf curl Bangladesh betasatellite (ToLCBDB), Tomato leaf curl virus (ToLCJoV), Tomato leaf curl Gujarat virus (TolCGV), Cotton leaf curl virus (CLCuV), Croton yellow vein mosaic virus (CYVMVA), Gaillardia latent virus (GaILV).
Grape vine: Grapevine rupestris stem pitting-associated virus (GRSPaV), Grapevine virus B (GVB), Grapevine fleck virus (GFkV), Grapevine leafroll-associated virus (GLRaV-3), Hop stunt viroid (HSVd).