ViewBS: a powerful toolkit for visualization of high-throughput bisulfite sequencing data.

Xiaosan Huang1, Shaoling Zhang1, Kongqing Li2, Jyothi Thimmapuram3, Shaojun Xie3, Jonathan Wren.   

Abstract

Motivation: High-throughput bisulfite sequencing (BS-seq) is an important technology for generating single-base-resolution DNA methylomes in both plants and animals. To accelerate the analysis of BS-seq data, toolkits for visualization are required.
Results: ViewBS is an open-source toolkit that extracts and visualizes DNA methylome data easily and flexibly. By using Tabix, ViewBS can quickly visualize large BS-seq datasets. ViewBS generates publication-quality figures, such as meta-plots, heat maps and violin-boxplots, which help users answer biological questions. We illustrate its application using BS-seq data from Arabidopsis thaliana.
Availability: ViewBS is freely available at https://github.com/xie186/ViewBS.
Contact: xie186@purdue.edu.
Supplementary information: Supplementary data are available at Bioinformatics online.
© The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com

Year:  2018        PMID: 29087450      PMCID: PMC5860610          DOI: 10.1093/bioinformatics/btx633

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 Introduction

The combination of bisulfite treatment of DNA with high-throughput sequencing [bisulfite sequencing (BS-seq)] has been used extensively to investigate DNA methylation at single-base resolution in animals and plants (Krueger et al.). Analysis of BS-seq data includes alignment of reads, identification of differentially methylated regions (DMRs) and visualization of the data. Multiple tools have been developed for aligning BS-seq reads (Guo et al.; Harris et al.; Krueger and Andrews, 2011; Pedersen et al.), and many tools are available for DMR identification as well (Akalin et al.; Hansen et al.; Wang et al.). To visualize DNA methylome data, published papers frequently use heat maps, meta-line plots and boxplots (Krueger et al.). A toolkit that generates such publication-ready figures would therefore be broadly useful in the analysis of BS-seq data. Here we present ViewBS, a standalone program capable of both profiling genome-wide DNA methylation and visualizing DNA methylation patterns of BS-seq data at selected regions. We use a few examples to demonstrate that ViewBS is easy to use and generates publication-quality figures.

2 Implementation

ViewBS has one main command, named ViewBS, under which eight tools are provided as subcommands. The Perl module Getopt::Long::Subcommand is used to process command-line options with subcommands. Another Perl module, Bio::DB::HTS::Tabix, is used to quickly retrieve records from the genome-wide cytosine report used as input. The Perl module Bio::SeqIO is used to retrieve information from sequence data. To generate publication-quality figures, three R packages (reshape2, ggplot2 and pheatmap) are used. The source code is freely available at https://github.com/xie186/ViewBS.

3 Functions and examples

ViewBS provides several tools, each with its own required and optional arguments. These tools fall into two groups: profiling of genome-wide DNA methylation, and visualization of DNA methylation patterns of BS-seq data at selected regions (Fig. 1). Each tool writes its results to tab-delimited files, the corresponding figures to PDF files, and shell scripts that can be used to regenerate the figures.
Fig. 1. Summary of the functions of ViewBS

To use ViewBS, users typically need to prepare two types of input. The first is the genome-wide cytosine methylation report generated by Bismark, which contains the sequence context and has seven columns in the following format: chromosome, position, strand, count methylated, count unmethylated, C-context and trinucleotide context. The second is a file of selected regions of interest, which gives their genomic coordinates. To evaluate the performance of ViewBS, we used BS-seq data (see Supplementary Material) for Arabidopsis thaliana (Stroud et al.). The figures generated by the ViewBS tools are based on these data. Details can be found in the supplementary data.
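The seven-column report layout described above can be illustrated with a minimal Python sketch. This is not part of ViewBS (which is implemented in Perl and R); the names `CytosineRecord` and `parse_cytosine_report` are hypothetical, and the two example rows are made-up values used only to show the column order.

```python
from typing import Iterable, Iterator, NamedTuple


class CytosineRecord(NamedTuple):
    """One row of a seven-column genome-wide cytosine report."""
    chrom: str
    pos: int
    strand: str
    count_meth: int
    count_unmeth: int
    context: str        # C-context: CG, CHG or CHH
    trinucleotide: str  # trinucleotide context


def parse_cytosine_report(lines: Iterable[str]) -> Iterator[CytosineRecord]:
    """Yield one record per non-empty, tab-delimited line."""
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        fields = line.split("\t")
        if len(fields) != 7:
            raise ValueError(f"expected 7 columns, got {len(fields)}: {line!r}")
        chrom, pos, strand, meth, unmeth, context, tri = fields
        yield CytosineRecord(chrom, int(pos), strand,
                             int(meth), int(unmeth), context, tri)


# Illustrative rows only (not real data).
example_lines = [
    "chr1\t101\t+\t8\t2\tCG\tCGA",
    "chr1\t102\t-\t0\t5\tCHH\tCAT",
]
records = list(parse_cytosine_report(example_lines))
```

In practice the same function could be given an open file handle, since it accepts any iterable of lines.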

Profiling of genome-wide DNA methylation

For genome-wide DNA methylation profiling, we offer five tools: BisNonConvRate, MethCoverage, GlobalMethLev, MethGeno and MethLevDist. BisNonConvRate estimates the non-conversion rate of BS-seq data. In plants, reads mapping to the unmethylated chloroplast genome can be used to assess bisulfite conversion efficiency, and BisNonConvRate is especially useful in this case: users provide the chromosome ID of the chloroplast and BisNonConvRate estimates the non-conversion rate. MethCoverage assesses the read coverage distribution of cytosines in each sequence context, which shows users how well their BS-seq data cover the genome. Global DNA methylation levels are commonly reported when describing BS-seq data; GlobalMethLev generates weighted DNA methylation levels for the samples that users provide. The distribution of methylation levels is another feature that researchers use to profile DNA methylation data. MethLevDist generates methylation level distributions not only genome-wide but also for regions of interest, for example genic regions or TSS regions. Finally, MethGeno generates the methylation levels across the chromosomes.
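The two summary quantities mentioned above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the ViewBS implementation: the weighted methylation level is taken to be the common definition (total methylated counts divided by total counts over all sites), the non-conversion rate is taken to be that same ratio restricted to the chloroplast, and the chromosome ID `chrC` and all counts are hypothetical.

```python
from typing import Iterable, Optional, Tuple

# (chromosome, methylated count, unmethylated count) for one cytosine
Site = Tuple[str, int, int]


def weighted_level(sites: Iterable[Site],
                   chrom: Optional[str] = None) -> Optional[float]:
    """Sum of methylated counts over sum of all counts.

    If `chrom` is given, only sites on that chromosome are used.
    Returns None when no reads cover the selected sites.
    """
    meth = 0
    total = 0
    for site_chrom, n_meth, n_unmeth in sites:
        if chrom is not None and site_chrom != chrom:
            continue
        meth += n_meth
        total += n_meth + n_unmeth
    return meth / total if total else None


# Made-up counts; "chrC" stands in for the chloroplast chromosome ID.
sites = [
    ("chr1", 8, 2),
    ("chr1", 0, 10),
    ("chrC", 1, 99),
    ("chrC", 0, 100),
]

# Assumption: the chloroplast is unmethylated, so any methylated call
# there reflects a cytosine that bisulfite treatment failed to convert.
non_conversion_rate = weighted_level(sites, chrom="chrC")

# Global weighted level over the nuclear sites only.
global_level = weighted_level(s for s in sites if s[0] != "chrC")
```

With these illustrative counts the chloroplast contributes 1 methylated call out of 200, and the nuclear sites contribute 8 out of 20.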

Visualization of DNA methylation patterns at selected regions of interest

For visualization of DNA methylation patterns at functionally important regions, we offer three tools: MethOverRegion, MethHeatmap and MethOneRegion. MethOverRegion generates a meta-plot for selected regions, for example a list of genes. By default, each selected region is split into 60 bins and its 2-kb flanking regions are split into 100-bp bins. For each bin, the weighted methylation level is recorded and plotted along the selected regions. Users can set the size of the flanking regions and the number of bins. MethHeatmap generates a heat map and/or a violin-boxplot for selected regions, for example a list of DMRs. A violin plot shows the full distribution of the data (Hintze and Nelson, 1998), whereas a box plot shows summary statistics such as the mean/median and interquartile range; MethHeatmap combines the two into one violin-boxplot. MethOneRegion visualizes DNA methylation information for a single region. This is useful when users do not want to load a large dataset into a genome browser (such as IGV; Thorvaldsdóttir et al., 2013) and just want a snapshot of one region. With this tool, users can define the size of the flanking regions that they want to visualize, and the region provided is highlighted as a shaded transparent area. ViewBS was evaluated using BS-seq data from Arabidopsis. Details of the results, including the maximum memory consumption and the time used by each ViewBS tool, are given in the supplementary files.
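The default MethOverRegion binning described above (60 bins per region, 2-kb flanks in 100-bp bins) can be sketched in Python. This is a simplified illustration, not the ViewBS code: the function names are hypothetical, coordinates are assumed to be 1-based and inclusive, all sites are assumed to lie on one chromosome, and strand orientation of the regions is ignored.

```python
from typing import List, Optional, Sequence, Tuple


def bin_index(pos: int, start: int, end: int, n_body_bins: int = 60,
              flank: int = 2000, flank_bin: int = 100) -> Optional[int]:
    """Map a position to a bin: upstream flank, region body, downstream flank.

    Returns None if the position lies outside the region and its flanks.
    """
    n_flank_bins = flank // flank_bin
    if start - flank <= pos < start:                 # upstream flank
        return (pos - (start - flank)) // flank_bin
    if start <= pos <= end:                          # region body
        length = end - start + 1
        return n_flank_bins + (pos - start) * n_body_bins // length
    if end < pos <= end + flank:                     # downstream flank
        return n_flank_bins + n_body_bins + (pos - end - 1) // flank_bin
    return None


def meta_profile(sites: Sequence[Tuple[int, int, int]],
                 regions: Sequence[Tuple[int, int]],
                 n_body_bins: int = 60, flank: int = 2000,
                 flank_bin: int = 100) -> List[Optional[float]]:
    """Weighted methylation level per bin, pooled over all regions.

    `sites` holds (position, methylated count, unmethylated count).
    """
    n_bins = n_body_bins + 2 * (flank // flank_bin)
    meth = [0] * n_bins
    total = [0] * n_bins
    for start, end in regions:
        for pos, n_meth, n_unmeth in sites:
            idx = bin_index(pos, start, end, n_body_bins, flank, flank_bin)
            if idx is not None:
                meth[idx] += n_meth
                total[idx] += n_meth + n_unmeth
    return [m / t if t else None for m, t in zip(meth, total)]


# Made-up example: one 1200-bp region and four covered cytosines.
regions = [(3001, 4200)]
sites = [(2950, 1, 9), (3010, 8, 2), (3015, 2, 8), (4250, 0, 10)]
profile = meta_profile(sites, regions)
```

With the default settings the profile has 100 bins: 20 for each flank and 60 for the region body. Bins with no covered cytosine are reported as `None` rather than zero, so that missing coverage is not mistaken for an absence of methylation.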

4 Conclusions

We conclude that ViewBS is an efficient and flexible software package for accelerating research that relies on bisulfite sequencing data. It provides a set of tools that enable rapid analysis and visualization of whole-genome bisulfite sequencing data.
References

1.  BRAT-BW: efficient and accurate mapping of bisulfite-treated reads.

Authors:  Elena Y Harris; Nadia Ponts; Karine G Le Roch; Stefano Lonardi
Journal:  Bioinformatics       Date:  2012-05-03       Impact factor: 6.937

2.  DNA methylome analysis using short bisulfite sequencing data.

Authors:  Felix Krueger; Benjamin Kreck; Andre Franke; Simon R Andrews
Journal:  Nat Methods       Date:  2012-01-30       Impact factor: 28.547

3.  Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration.

Authors:  Helga Thorvaldsdóttir; James T Robinson; Jill P Mesirov
Journal:  Brief Bioinform       Date:  2012-04-19       Impact factor: 11.622

4.  Bismark: a flexible aligner and methylation caller for Bisulfite-Seq applications.

Authors:  Felix Krueger; Simon R Andrews
Journal:  Bioinformatics       Date:  2011-04-14       Impact factor: 6.937

5.  swDMR: A Sliding Window Approach to Identify Differentially Methylated Regions Based on Whole Genome Bisulfite Sequencing.

Authors:  Zhen Wang; Xianfeng Li; Yi Jiang; Qianzhi Shao; Qi Liu; BingYu Chen; Dongsheng Huang
Journal:  PLoS One       Date:  2015-07-15       Impact factor: 3.240

6.  BS-Seeker2: a versatile aligning pipeline for bisulfite sequencing data.

Authors:  Weilong Guo; Petko Fiziev; Weihong Yan; Shawn Cokus; Xueguang Sun; Michael Q Zhang; Pao-Yang Chen; Matteo Pellegrini
Journal:  BMC Genomics       Date:  2013-11-10       Impact factor: 3.969

7.  Non-CG methylation patterns shape the epigenetic landscape in Arabidopsis.

Authors:  Hume Stroud; Truman Do; Jiamu Du; Xuehua Zhong; Suhua Feng; Lianna Johnson; Dinshaw J Patel; Steven E Jacobsen
Journal:  Nat Struct Mol Biol       Date:  2013-12-15       Impact factor: 15.369

8.  BSmooth: from whole genome bisulfite sequencing reads to differentially methylated regions.

Authors:  Kasper D Hansen; Benjamin Langmead; Rafael A Irizarry
Journal:  Genome Biol       Date:  2012-10-03       Impact factor: 13.583

9.  methylKit: a comprehensive R package for the analysis of genome-wide DNA methylation profiles.

Authors:  Altuna Akalin; Matthias Kormaksson; Sheng Li; Francine E Garrett-Bakelman; Maria E Figueroa; Ari Melnick; Christopher E Mason
Journal:  Genome Biol       Date:  2012-10-03       Impact factor: 13.583
