viRome: an R package for the visualization and analysis of viral small RNA sequence datasets.

Mick Watson, Esther Schnettler, Alain Kohl.

Abstract

SUMMARY: RNA interference (RNAi) is known to play an important part in defence against viruses in a range of species. Second-generation sequencing technologies allow us to assay these systems and the small RNAs that play a key role with unprecedented depth. However, scientists need access to tools that can condense, analyse and display the resulting data. Here, we present viRome, a package for R that takes aligned sequence data and produces a range of essential plots and reports.
AVAILABILITY AND IMPLEMENTATION: viRome is released under the BSD license as a package for R, available for both Windows and Linux at http://virome.sf.net. Additional information and a tutorial are available on the ARK-Genomics website: http://www.ark-genomics.org/bioinformatics/virome.
CONTACT: mick.watson@roslin.ed.ac.uk.

Year:  2013        PMID: 23709497      PMCID: PMC3712215          DOI: 10.1093/bioinformatics/btt297

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 INTRODUCTION

RNA interference (RNAi) is mediated by small RNAs, such as microRNAs (miRNAs) of 21–22 nt (Lagos-Quintana et al.), small interfering RNAs (siRNAs) of 21–22 nt (Bernstein et al.; Zamore et al.) and PIWI-interacting RNAs (piRNAs) of 24–30 nt (Aravin et al.; Brennecke et al.), and these molecules regulate many biological processes. These pathways are also a major part of the antiviral response in both insects and plants, including the response to a variety of important mosquito-borne viruses of humans and animals, such as West Nile virus, dengue virus and chikungunya virus. In arthropods, this response is characterized by the production of 21–22 nt virus-derived small interfering RNAs (viRNAs) or 24–30 nt viral piRNA-like molecules (Blair, 2011; Donald et al.; Myles et al.).
Second-generation sequencing allows scientists to assay these systems in unprecedented depth, and short reads capture both the 21–22 nt siRNAs and the 24–30 nt piRNAs. However, scientists need to be able to summarize, analyse and visualize the results of such experiments. Here, we present viRome, a package for R, which takes aligned sequencing data in the BAM format (Li et al.) and produces a variety of plots and reports that are essential to the analysis of viral siRNA datasets.
Software packages to analyse viral siRNA data exist. Paparazzi (Vodovar et al.) is designed to reconstruct viral genomes from siRNA data and produces some plots similar to those of viRome. Visitor (Antoniewski, 2011), an informatic pipeline for analysing short-read viRNA data, also produces several similar plots. However, both have limitations: they are implemented in Perl and are limited to Linux/Unix operating systems; they include alignment as part of the analysis, so using an alternative aligner would require programming skills; and they generate plots in batch mode, so there is no interaction between the user and the software.
As a package for R, viRome improves on these software packages in several ways: (i) it allows interaction between the user and the software during report and graph generation; (ii) it is available on any operating system that supports R and has been tested on Microsoft Windows and several Linux distributions; (iii) it separates visualization from alignment, so the user is free to use any alignment software they wish; and (iv) as an R package, it integrates seamlessly with other R packages from the Bioconductor project (Gentleman et al.).

2 ANALYSIS AND VISUALIZATION

As input, viRome takes aligned sequence data in the BAM format. Many tools exist for alignment (Fonseca et al.) and, provided they support the SAM/BAM format, viRome is capable of working with their output. Many of the functions within viRome summarize millions of data points into tables and plots that allow biological interpretation. One of the benefits of viRome is that most functions return the summarized data as well as creating a plot, which allows users to create their own plots if they wish. Figure 1 shows a selection of plots produced by viRome.
Fig. 1. Clockwise from top-left: a plot of read-length distribution; genomic location of 21–22 nt reads; genomic location of 25–29 nt reads; heatmap and sequence logo showing T1 bias; heatmap and sequence logo showing A10 bias; barplot showing T1 bias; 5′ read distance plot for 25–29 nt reads showing enrichment of 10 nt overlap; and a heatmap showing the genomic location of 18–36 nt reads (counts per position: black is low, red is high).

Global analyses: One of the first requirements is to plot a histogram of the lengths of mapped reads: a peak at 21–22 nt implies an siRNA response, and a high frequency of 24–30 nt reads with a peak at 28 nt implies a piRNA response. In viRome, this plot can be created using the barplot.bam function. Users may also create a report using the sequence.report function. This produces a data.frame in R that summarizes and counts the sequences aligned to each base in a given reference sequence. Users can see the exact sequence, its length, the location and strand of the alignment, and a count of how many times that sequence occurs. As a data.frame, this can be easily exported to Excel or other spreadsheet software.
Location-based analyses: Although many viruses are targeted by the siRNA pathway throughout the genome, others are targeted only in limited regions (Sabin et al.). A heatmap representing the occurrence of all mapped read lengths across all genomic locations can be produced using the size.position.heatmap function, and barplots showing counts at each genomic location for each read length can be generated using the stacked.barplot function.
Read-based analyses: Read-based analyses allow users to focus on patterns in particular subsets of reads. Single barplots showing the location, strand and count of reads mapping throughout the genome can be produced using the position.barplot function. The base composition of subsets of reads can be calculated with the make.pwm function.
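To make the global summaries concrete, the following is a minimal, language-neutral sketch in Python of the two computations described above: a read-length distribution and a collapsed sequence report. It is not viRome's R code or API. The function names `length_distribution` and `sequence_report`, the tuple layout and the toy alignments are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical toy alignments: (read sequence, 1-based start, strand).
# In practice these would be extracted from a BAM file.
alignments = [
    ("TGGAACTCAGGTACCTTGACA", 120, "+"),
    ("TGGAACTCAGGTACCTTGACA", 120, "+"),
    ("TACGGTTCAAGCTTGGATCCGTAAGCTA", 300, "-"),
    ("AGCTTGGATCCGTAAGCTAGGC", 415, "-"),
]


def length_distribution(alignments, min_len=18, max_len=36):
    """Count mapped reads of each length between min_len and max_len."""
    counts = Counter(len(seq) for seq, _start, _strand in alignments)
    # Report every length in the range so that empty bins show up as zero.
    return {n: counts.get(n, 0) for n in range(min_len, max_len + 1)}


def sequence_report(alignments):
    """Collapse identical (sequence, start, strand) records and count them."""
    counts = Counter(alignments)
    rows = [
        {"sequence": seq, "length": len(seq), "start": start,
         "strand": strand, "count": n}
        for (seq, start, strand), n in counts.items()
    ]
    # Most abundant sequences first.
    return sorted(rows, key=lambda row: row["count"], reverse=True)


for length, n in length_distribution(alignments, 21, 28).items():
    print(length, n)
for row in sequence_report(alignments):
    print(row)
```

A peak in the first table at 21–22 nt or around 28 nt would correspond to the siRNA and piRNA signatures discussed above; the second table has one row per distinct aligned sequence, which is the shape of output that exports naturally to a spreadsheet.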
In Drosophila, sequence signatures of the piRNA pathway include a strong U1 bias in primary, antisense piRNAs and, following ‘ping-pong’ cycle amplification involving AGO3 and Aub, a strong A10 bias in secondary, sense piRNAs (Brennecke et al.). Because sequencing reads are reported as DNA, the U1 bias appears as a T1 bias in Figure 1. Similar motifs have been found in piRNAs and viral piRNA-like molecules in mosquitoes or derived cell lines (Morazzani et al.; Schnettler et al.; Vodovar et al.). The output of make.pwm can be plotted as a heatmap using the pwm.heatmap function, or used with external packages such as seqLogo and motifStack to produce sequence logos. Finally, the 5′-ends of complementary piRNAs are most frequently separated by 10 nt (Brennecke et al.; Vodovar et al.) because of the ‘ping-pong’ amplification described earlier. The distance between 5′-ends of piRNAs mapping to opposite strands can be summarized and visualized using the read.dist.plot function.
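The two piRNA signatures described above, positional base bias and the 10 nt offset between 5′-ends on opposite strands, can be illustrated with a short Python sketch. This is a sketch under stated assumptions, not the implementation of make.pwm or read.dist.plot: the function names `base_composition` and `five_prime_offsets` are hypothetical, reads are assumed to be given 5′ to 3′ as sequenced, and alignment starts are assumed to be 1-based leftmost coordinates, as in SAM. The offset is defined here as the number of bases from the sense 5′-end to the antisense 5′-end, inclusive, so that a 10 nt overlap gives a value of 10; the sketch does not check that reads are long enough to overlap.

```python
from collections import Counter


def base_composition(reads, length):
    """Per-position base counts for reads of exactly `length` nt.

    Returns a dict mapping each base to a list of counts, one per position.
    Reads of other lengths and non-ACGT characters are ignored.
    """
    matrix = {base: [0] * length for base in "ACGT"}
    for seq in reads:
        if len(seq) != length:
            continue
        for pos, base in enumerate(seq.upper()):
            if base in matrix:
                matrix[base][pos] += 1
    return matrix


def five_prime_offsets(alignments, max_offset=20):
    """Tally offsets between 5'-ends of reads on opposite strands.

    `alignments` holds (start, length, strand) tuples, where start is the
    1-based leftmost aligned position (an assumption of this sketch).
    """
    sense = Counter()
    antisense = Counter()
    for start, length, strand in alignments:
        if strand == "+":
            sense[start] += 1                  # 5'-end is the leftmost base
        else:
            antisense[start + length - 1] += 1  # 5'-end is the rightmost base
    offsets = Counter()
    for p, n_sense in sense.items():
        for q, n_anti in antisense.items():
            offset = q - p + 1
            if 1 <= offset <= max_offset:
                # Every sense/antisense pair at this offset is counted.
                offsets[offset] += n_sense * n_anti
    return dict(offsets)


# Hypothetical toy data.
reads = ["TACGGATCCGTA", "TGCATTAGCCAT", "TCCGATAGGCTA"]
pwm = base_composition(reads, 12)
print(pwm["T"][0], pwm["A"][9])

pairs = [(100, 27, "+"), (83, 27, "-"), (100, 26, "+"), (200, 25, "-")]
print(five_prime_offsets(pairs))
```

Dividing each column of the count matrix by its total would give the frequencies used for a heatmap or sequence logo, and a peak at 10 in the offset tally would be the ‘ping-pong’ signature.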

3 CONCLUSIONS

Deep sequencing experiments have revealed a variety of interesting and unique signatures of the miRNA, siRNA and piRNA pathways, and there is a need for software that allows scientists to process such data. We have developed viRome, a package for R that allows the interactive generation of a range of informative plots and reports. As an R package, viRome is available on a range of operating systems. viRome is released under an open-source license and can be downloaded from http://virome.sf.net, where a tutorial is also available.
Funding: UK Biotechnology and Biological Sciences Research Council (BBSRC) (BB/J004243/1; BB/J004235/1) (to M.W.); UK Medical Research Council (MRC) (to A.K. and E.S.); The Netherlands Organisation for Scientific Research NWO (Rubicon Fellowship number: 825.10.021) (to E.S.).
Conflict of Interest: none declared.

REFERENCES

1.  RNAi: double-stranded RNA directs the ATP-dependent cleavage of mRNA at 21 to 23 nucleotide intervals.

Authors:  P D Zamore; T Tuschl; P A Sharp; D P Bartel
Journal:  Cell       Date:  2000-03-31       Impact factor: 41.582

2.  The small RNA profile during Drosophila melanogaster development.

Authors:  Alexei A Aravin; Mariana Lagos-Quintana; Abdullah Yalcin; Mihaela Zavolan; Debora Marks; Ben Snyder; Terry Gaasterland; Jutta Meyer; Thomas Tuschl
Journal:  Dev Cell       Date:  2003-08       Impact factor: 12.270

3.  In silico reconstruction of viral genomes from small RNAs improves virus-derived small interfering RNA profiling.

Authors:  Nicolas Vodovar; Bertsy Goic; Hervé Blanc; Maria-Carla Saleh
Journal:  J Virol       Date:  2011-08-31       Impact factor: 5.103

4.  Tools for mapping high-throughput sequencing data.

Authors:  Nuno A Fonseca; Johan Rung; Alvis Brazma; John C Marioni
Journal:  Bioinformatics       Date:  2012-10-11       Impact factor: 6.937

5.  Visitor, an informatic pipeline for analysis of viral siRNA sequencing datasets.

Authors:  Christophe Antoniewski
Journal:  Methods Mol Biol       Date:  2011

6.  The Sequence Alignment/Map format and SAMtools.

Authors:  Heng Li; Bob Handsaker; Alec Wysoker; Tim Fennell; Jue Ruan; Nils Homer; Gabor Marth; Goncalo Abecasis; Richard Durbin
Journal:  Bioinformatics       Date:  2009-06-08       Impact factor: 6.937

7.  Bioconductor: open software development for computational biology and bioinformatics.

Authors:  Robert C Gentleman; Vincent J Carey; Douglas M Bates; Ben Bolstad; Marcel Dettling; Sandrine Dudoit; Byron Ellis; Laurent Gautier; Yongchao Ge; Jeff Gentry; Kurt Hornik; Torsten Hothorn; Wolfgang Huber; Stefano Iacus; Rafael Irizarry; Friedrich Leisch; Cheng Li; Martin Maechler; Anthony J Rossini; Gunther Sawitzki; Colin Smith; Gordon Smyth; Luke Tierney; Jean Y H Yang; Jianhua Zhang
Journal:  Genome Biol       Date:  2004-09-15       Impact factor: 13.583

8.  Discrete small RNA-generating loci as master regulators of transposon activity in Drosophila.

Authors:  Julius Brennecke; Alexei A Aravin; Alexander Stark; Monica Dus; Manolis Kellis; Ravi Sachidanandam; Gregory J Hannon
Journal:  Cell       Date:  2007-03-08       Impact factor: 41.582

9.  RNA interference targets arbovirus replication in Culicoides cells.

Authors:  Esther Schnettler; Maxime Ratinier; Mick Watson; Andrew E Shaw; Melanie McFarlane; Mariana Varela; Richard M Elliott; Massimo Palmarini; Alain Kohl
Journal:  J Virol       Date:  2012-12-26       Impact factor: 5.103

10.  Dicer-2 processes diverse viral RNA species.

Authors:  Leah R Sabin; Qi Zheng; Pramod Thekkat; Jamie Yang; Gregory J Hannon; Brian D Gregory; Matthew Tudor; Sara Cherry
Journal:  PLoS One       Date:  2013-02-12       Impact factor: 3.240
