
BAT: Bisulfite Analysis Toolkit: BAT is a toolkit to analyze DNA methylation sequencing data accurately and reproducibly. It covers standard processing and analysis steps from raw read mapping up to annotation data integration and calculation of correlating DMRs.

Helene Kretzmer^1,2, Christian Otto^1,2,3, Steve Hoffmann^1,2.

Abstract

Here, we present BAT, a modular bisulfite analysis toolkit that facilitates the analysis of bisulfite sequencing data. It covers the essential analysis steps of read alignment, quality control, extraction of methylation information, and calling of differentially methylated regions, as well as biologically relevant downstream analyses, such as data integration with gene expression, histone modification data, or transcription factor binding site annotation.


Keywords: DMRs; DNA methylation; RRBS; WGBS; bisulfite sequencing; epigenetics; integrative analysis; software

Year: 2017 PMID: 28979767 PMCID: PMC5590080 DOI: 10.12688/f1000research.12302.1

Source DB: PubMed Journal: F1000Res ISSN: 2046-1402

Introduction

High-throughput DNA methylation sequencing protocols, such as whole-genome bisulfite sequencing (WGBS) and targeted bisulfite sequencing (e.g., RRBS), have made it possible to measure this major epigenetic modification precisely and accurately on a genome-wide scale. The impact of DNA methylation on processes such as cell differentiation, gene expression, chromatin structure, and carcinogenesis has raised substantial interest in analyzing DNA methylation in many areas of the life sciences. For example, the methylomes of a large number of samples have been sequenced in the context of cancer projects and developmental studies [1–5]. Researchers investigating obesity, neurodegenerative diseases, Alzheimer’s, or Parkinson’s disease have also begun to focus on DNA methylation [6–9]. Virtually all of these projects require a number of time-consuming data analysis steps, i.e., quality control, read alignment, and methylation rate calculation. However, performing each step by hand is highly error prone, takes time, and impacts reproducibility. To ensure consistent and reproducible processing, we have developed the Bisulfite Analysis Toolkit (BAT). The workflow enables a fast and easy analysis of bisulfite-converted high-throughput sequencing reads. It is specifically designed to facilitate the analysis for biologists and physicians with little bioinformatics knowledge, as well as for bioinformaticians who already work on sequencing data but are not familiar with the characteristics of bisulfite sequencing data.

Methods

BAT is a modular toolkit that makes it easy to build workflows for analyzing bisulfite sequencing data. The toolkit includes modules for read alignment (mapping module), methylation level estimation (calling module), grouping of samples (grouping module), and identification of differentially methylated regions (DMR module) (Figure 1). Further modules allow the integration of gene expression, histone modification data, or transcription factor binding site annotation. These modules facilitate the functional analysis of the effects of differential methylation.

Figure 1. BAT workflow.

The workflow comprises four modules covering (left to right in Figure 1) read alignment, methylation rate calling, basic group analysis, and DMR calling. The modules consist of a collection of scripts that build on one another, but individual steps can easily be covered by alternative tools. Each of the modules can be run on its own, and the minimal system requirements depend on the respective module. The computationally most expensive module is the mapping module. Here, the aligner segemehl [10] is used in its bisulfite mode, which requires about 55 GB of physical RAM for the alignment of reads to the human genome hg19. The toolkit itself is written in Perl and calls software components mainly written in C and R to ensure swift calculations. All software requirements are listed on our website (www.bioinf.uni-leipzig.de/Software/BAT/install/#requirements).

The default parameters for the tools included in the BAT pipeline are optimized to process bisulfite sequencing data for most applications. In order to enhance reproducibility and reduce potential errors, the number of parameters that need to be set by the user has been carefully reduced to a minimum. Due to its modularity, however, the toolkit is flexible and can easily be extended or customized to specific needs. To allow for workflow modifications and extensions, standardized formats are used and interfaces to several other tools are provided. Basic steps, e.g., processing from raw reads to a single alignment file from multiple sequencing runs, are split into pre-, main, and post-processing steps to allow for the customized extension of the workflow. Error handling is eased by parameter and file checks prior to the analysis, and meaningful error messages allow quick troubleshooting.

Detailed documentation of all modules, including parameter descriptions, recommended additional tools, analysis reports, and data visualizations produced by the BAT workflow, is available at www.bioinf.uni-leipzig.de/Software/BAT. Moreover, all automatically created visualizations are shown on the webpage. Data and figures displayed there are derived from a small example data set of two groups with four samples each, adopted from Kretzmer et al. [11]. Our webpage provides raw FASTQ files of one sample as well as the methylation rate files of all eight samples along with expression and annotation data. This example data set and shell scripts covering all modules of BAT can be downloaded and adapted together with the toolkit.

Furthermore, BAT is provided as a Docker [12] image and can be obtained from https://hub.docker.com/r/christianbioinf/bat/. The Docker image ensures platform-independent usage of our toolkit. All programs that are used by BAT are already installed in the Docker image and dependencies are resolved. Existing hard drives are mounted to avoid time-consuming transfer or upload of the frequently huge data sets.

Use cases

Resembling a common study design, we assembled a small case-control example dataset, adopted from recently published data [11]. It is a subset of a paired-end human WGBS dataset, comprising 8 samples (control: S1–S4, case: S5–S8). The dataset provides the raw reads in FASTQ format for one sample and the already called methylation rates of all 8 samples in VCF format. The following modules can be used to process and analyze bisulfite sequencing data, including the detection of methylation differences between case and control samples. The use case starts with the alignment of the raw sequencing data using the mapping module. The individual components of BAT and their functionality are described below.

Mapping

The read alignment step is taken care of by the module BAT_mapping. It includes a bisulfite-sensitive read alignment using segemehl [10], a quality filtering step, and the conversion of the alignments to an indexed and compressed BAM file by samtools [13]. Using BAT_mapping_stat, the quality of the mapping can be assessed by the number and fraction of mapped pairs or reads, the multiplicity of read alignments, and the alignments’ error rates. In case of large experiments where a sample is sequenced multiplexed on multiple lanes or flow cells, the read alignments of each sample can easily be merged using BAT_merging, including the addition of read group information to allow for tracebacks of lane effects if necessary.
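The kind of summary described for the mapping statistics can be illustrated with a minimal Python sketch. This is not the code of BAT_mapping_stat; the function name `mapping_stats`, the tuple layout of the input, and the example values are assumptions made purely for illustration.

```python
from collections import Counter


def mapping_stats(total_reads, alignments):
    """Summarize alignments given as (read_id, edit_distance, aligned_length).

    Hypothetical input layout: a read with several alignments appears once
    per alignment, and unmapped reads do not appear at all.
    """
    hits_per_read = Counter(read_id for read_id, _, _ in alignments)
    mapped_reads = len(hits_per_read)
    # Multiplicity: number of reads with exactly 1, 2, 3, ... alignments.
    multiplicity = Counter(hits_per_read.values())
    total_errors = sum(edits for _, edits, _ in alignments)
    total_length = sum(length for _, _, length in alignments)
    return {
        "mapped_reads": mapped_reads,
        "mapped_fraction": mapped_reads / total_reads if total_reads else 0.0,
        "multiplicity": dict(multiplicity),
        "error_rate": total_errors / total_length if total_length else 0.0,
    }


# Toy example: 5 sequenced reads, read r2 aligns to two locations.
example_alignments = [
    ("r1", 0, 100),
    ("r2", 2, 100),
    ("r2", 3, 100),
    ("r4", 1, 100),
]
stats = mapping_stats(total_reads=5, alignments=example_alignments)
print(stats)
```

In this toy input, three of five reads are mapped, one of them at two locations, which is the sort of multiplicity information that helps to judge how many reads fall into repetitive regions.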

Calling

Following mapping, the methylation information needs to be extracted from the read alignments. Prior to this methylation calling, however, it is recommended to exclude potential biases by clipping alignment overlaps of paired-end reads (e.g., using bamutil’s clipoverlap [14]) or by excluding incompletely converted or artificially introduced cytosines with the M-bias detection method (e.g., using BSeQC [15]). Subsequently, the methylation information can be extracted using the module BAT_calling, which returns a VCF-style file that includes detailed information for each cytosine. This initial set of positions can be filtered by coverage using BAT_filtering to exclude unreliable methylation information from positions with either very low or very high coverage (e.g., in repetitive regions). Moreover, it is also possible to filter by genomic context (e.g., to restrict to CG context only). Apart from a VCF file, BAT_filtering reports the methylation level at positions passing the filter in bedGraph format for easy inspection in IGV [16] or upload to the UCSC genome browser [17]. Additionally, the module automatically produces plots showing the distribution of coverages and methylation rates for the complete and the filtered set of positions (Figure 2A), giving the user the opportunity to check and possibly fine-tune the filtering parameters.
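The filtering logic described above can be sketched in a few lines of Python. This is an illustration only and does not reproduce BAT_filtering: the function names, the tuple layout of a call, the default thresholds, and the assumption of 1-based input positions are all hypothetical.

```python
def filter_calls(calls, min_cov=10, max_cov=100, context="CG"):
    """Keep calls whose coverage lies in [min_cov, max_cov] and whose
    sequence context matches (context=None disables the context filter).

    Each call is a hypothetical tuple: (chrom, pos, context, coverage, rate).
    """
    kept = []
    for chrom, pos, ctx, cov, rate in calls:
        if not (min_cov <= cov <= max_cov):
            continue  # too low (unreliable) or too high (e.g., repeats)
        if context is not None and ctx != context:
            continue
        kept.append((chrom, pos, ctx, cov, rate))
    return kept


def to_bedgraph(calls):
    """Format calls as bedGraph lines (0-based start, exclusive end),
    assuming the input positions are 1-based."""
    return [
        f"{chrom}\t{pos - 1}\t{pos}\t{rate:.3f}"
        for chrom, pos, _, _, rate in calls
    ]


example_calls = [
    ("chr1", 100, "CG", 25, 0.8),   # passes both filters
    ("chr1", 150, "CG", 3, 1.0),    # coverage too low
    ("chr1", 200, "CHH", 30, 0.0),  # wrong context
    ("chr1", 250, "CG", 500, 0.5),  # coverage suspiciously high
]
filtered = filter_calls(example_calls)
bedgraph_lines = to_bedgraph(filtered)
print("\n".join(bedgraph_lines))
```

An upper coverage bound is used alongside the lower one because, as noted above, very highly covered positions often stem from repetitive regions and are as unreliable as lowly covered ones.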

Figure 2. Selection of figures generated on-the-fly by BAT during the analysis of the example dataset.

Annotation items are ENCODE transcription factor binding sites for the GM12878 cell line. A) Distribution of coverage. B) Circos plot showing the genome-wide methylation level of eight samples as a heatmap. C) Binned distribution of average methylation rate per CpG for each group. D) Boxplots of genome-wide mean methylation rate per group. E) Hierarchically clustered heatmap of the methylation rates of all samples over all annotation items. F) Boxplots of average methylation rate per annotation item. G) Correlating DMR plot showing methylation and expression of a DMR-gene pair. Note that all figures were produced by BAT itself, but were lightly edited afterwards to fit the limited space.

Groups

The third module facilitates the transition from single-sample analysis to groups of multiple samples. First, methylation information from individual samples is combined into groups and summarized with BAT_summarize. It reports the mean methylation rate per group and position as well as the difference between the groups’ mean methylation rates per position. The summary module can be parameterized to only report positions where each group has a minimum number of samples with sufficient coverage. For convenience, all files are exported in both bedGraph and bigWig format for inspection in the UCSC genome browser or IGV. Moreover, a circos plot containing a genome-wide methylation rate heatmap for each sample is automatically produced (Figure 2B). Based on the summary files, a number of overview statistics and plots can be generated using BAT_overview. This includes a hierarchical clustering of the samples based on their methylation profile, a plot of binned mean methylation rates per group (Figure 2C), boxplots of group-wise mean methylation rates (Figure 2D), a smoothed scatterplot showing the correlation between the groups’ mean methylation rate per position, and a barplot of the distribution of group methylation differences. Subsequently, BAT_annotation can be used to inspect the methylation of the samples in regions of interest or annotations such as transcription factor binding sites (TFBS), CpG islands, shores, or promoter regions. To this end, a hierarchically clustered heatmap of all samples (Figure 2E) is produced and the per-group and per-sample mean methylation rate is calculated (Figure 2F).
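The group summary step can be illustrated with the following Python sketch. It is not the implementation of BAT_summarize; the function name, the dictionary-based input, the use of `None` for samples without sufficient coverage, and the sign convention of the difference (first group minus second group) are assumptions chosen for this example.

```python
def summarize_groups(rates_a, rates_b, min_samples=2):
    """Compute per-position group means and their difference.

    rates_a / rates_b map a position, e.g. ("chr1", 100), to a list with one
    methylation rate per sample; None marks a sample without sufficient
    coverage (hypothetical convention). Positions are reported only if each
    group has at least min_samples usable samples.
    """
    summary = {}
    for pos in sorted(set(rates_a) & set(rates_b)):
        a = [r for r in rates_a[pos] if r is not None]
        b = [r for r in rates_b[pos] if r is not None]
        if len(a) < min_samples or len(b) < min_samples:
            continue
        mean_a = sum(a) / len(a)
        mean_b = sum(b) / len(b)
        summary[pos] = (mean_a, mean_b, mean_a - mean_b)
    return summary


control = {
    ("chr1", 100): [0.8, 0.9, None, 1.0],
    ("chr1", 200): [0.5, None, None, None],  # only one usable sample
}
case = {
    ("chr1", 100): [0.2, 0.3, 0.1, None],
    ("chr1", 200): [0.4, 0.5, 0.6, 0.7],
}
summary = summarize_groups(control, case, min_samples=3)
for pos, (mean_a, mean_b, diff) in summary.items():
    print(pos, round(mean_a, 3), round(mean_b, 3), round(diff, 3))
```

Requiring a minimum number of covered samples per group prevents a group mean from being driven by a single sample, which is why the second position is dropped in this toy input.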

Differentially methylated regions

Finally, the fourth module features the identification and analysis of differentially methylated regions (DMRs) between groups (BAT_DMRcalling). It employs the DMR calling tool metilene [18], which is based on circular binary segmentation of the group methylation difference signal in conjunction with a two-dimensional non-parametric statistical test. Afterwards, the DMRs reported by metilene can be filtered by several criteria, e.g., length (in nt or number of Cs), significance (i.e., q-value), and minimum mean methylation difference, and then converted to BED/bedGraph format. The BED file contains unique identifiers per DMR and reports regions of hyper- or hypomethylation. Additionally, the bedGraph file can be used to display the mean group methylation difference of the DMRs. Moreover, BAT_DMRcalling produces overview statistics of the set of filtered DMRs, including a histogram of the length and methylation difference of the filtered DMRs, a correlation plot of the mean methylation rate of DMRs in both groups, and a plot of the methylation difference vs. the q-value for each DMR. Last but not least, BAT_correlating allows for integration of the DMRs with expression data. Given the methylation information, expression values of genes, and an association between DMRs and genes, the correlation between both types of data can be examined in order to find correlating DMRs (cDMRs). For each DMR-gene pair, a linear and a non-linear correlation coefficient are calculated and a correlation plot (Figure 2G), showing methylation and expression of each sample, is generated.
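The DMR filtering and BED conversion described above can be sketched as follows. This is a hedged illustration rather than the code of BAT_DMRcalling or the output format of metilene: the dictionary keys, the default thresholds, the identifier scheme, and the convention that a positive mean difference is labeled as hypermethylation are all assumptions.

```python
def filter_dmrs(dmrs, min_cpgs=10, max_q=0.05, min_diff=0.1):
    """Filter DMRs and return BED lines with a unique name per region.

    Each DMR is a hypothetical dict with the keys chrom, start, end,
    q_value, mean_diff (group 1 minus group 2), and n_cpgs.
    """
    bed_lines = []
    for dmr in dmrs:
        if dmr["n_cpgs"] < min_cpgs:
            continue  # too few cytosines
        if dmr["q_value"] >= max_q:
            continue  # not significant
        if abs(dmr["mean_diff"]) < min_diff:
            continue  # methylation difference too small
        direction = "hyper" if dmr["mean_diff"] > 0 else "hypo"
        name = f"DMR_{len(bed_lines) + 1}_{direction}"
        bed_lines.append(f'{dmr["chrom"]}\t{dmr["start"]}\t{dmr["end"]}\t{name}')
    return bed_lines


example_dmrs = [
    {"chrom": "chr1", "start": 1000, "end": 1500,
     "q_value": 0.001, "mean_diff": 0.35, "n_cpgs": 24},
    {"chrom": "chr1", "start": 5000, "end": 5040,
     "q_value": 0.2, "mean_diff": -0.4, "n_cpgs": 12},   # not significant
    {"chrom": "chr2", "start": 300, "end": 900,
     "q_value": 0.01, "mean_diff": -0.25, "n_cpgs": 40},
    {"chrom": "chr2", "start": 2000, "end": 2100,
     "q_value": 0.01, "mean_diff": 0.02, "n_cpgs": 15},  # difference too small
]
bed = filter_dmrs(example_dmrs)
print("\n".join(bed))
```

Applying the three criteria in sequence keeps only regions that are long enough, significant, and show a sufficiently large difference, and the name column carries both the unique identifier and the direction of the change.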

Summary

BAT has already been applied successfully in the framework of a large cancer genome study, the ICGC MMML-Seq [11]. The streamlined processing and analysis modules improve and accelerate the analysis by reducing hands-on time and user errors. The modularity of BAT, as well as its input and output formats, makes it easy to extend or customize the default workflows. For instance, it is possible to integrate tools such as BisSNP [19] or BS-Snper [20], or alternative DMR calling tools. The custom visualizations of the methylation data facilitate data mining and make it possible to inspect the data quality at each step of the analysis. This is necessary to increase the chance of an early detection of errors, e.g., in library preparation and data handling. Therefore, quality control statistics and graphics are produced continually throughout the entire pipeline. Taken together, BAT is a collection of modular steps for analyzing bisulfite sequencing data that (i) can easily be run on various platforms due to the virtualization via Docker, (ii) can be combined with or extended by other tools, (iii) automatically generates publication-ready graphics, and (iv) supports data integration, e.g., with annotation or gene expression data.

Software and data availability

Software available from: www.bioinf.uni-leipzig.de/Software/BAT/download

Source code available from: https://github.com/helenebioinf/BAT

Archived source code as at time of publication: http://doi.org/10.5281/zenodo.838200 [21].

License: MIT

Example data available from: www.bioinf.uni-leipzig.de/Software/BAT/download/#example_data

The manuscript “BAT: Bisulfite Analysis Toolkit” presents a software pipeline for the analysis of sequencing data from bisulfite-treated DNA. It introduces the major modules of this pipeline and familiarizes the reader with their basic function, compatibilities, and output, but is obviously not intended to provide sufficient detail to allow reimplementation of the described modules. Instead, it refers to external resources such as research articles and documentation webpages, which provide most of this information. The article excels in providing a researcher who has to choose among several software pipelines for their next methylation project with the necessary information on BAT, without attempting to benchmark it against other approaches. In particular, the offer of a dockerized pipeline version and a real example dataset ensures the applicability of the software, while simultaneously supporting the claim of improved reproducibility. Another prominent claim, namely the compatibility with other modules, for instance alternatives to segemehl, is less well documented. Here the article would profit from an extended example in which some of the modules are exchanged for third-party alternatives, e.g., in the alignment step or during the grouping. Finally, the authors describe the utility of their diagnostic diagrams depicted in Figure 2 for the detection of quality problems. To this end, a supplementary figure or resource showing a number of examples of how several quality problems manifest in these diagrams is required, not only to prove this statement, but also to educate less experienced users.
Minor comment: The sentence “However, performing each step by hand is highly error prone, takes time, and impacts reproducibility” in the introduction is worded unfortunately, as it can be misread as a suggestion that someone would attempt to analyze a WGBS dataset by hand.

I have read this submission. I believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

BAT: Bisulphite analysis toolkit is a timely piece of software that provides an end-to-end solution for performing DNA methylation analysis. The toolkit follows "good software practices" and has clearly laid out workflows, efficient code, extensive documentation, and limited dependencies. Further, the ability to perform the complete analysis, from sequencing data to the actual interpretation and integration of data, as shown with the example data in the toolkit taken from the manuscript by the same authors, "DNA methylome analysis in Burkitt and follicular lymphomas identifies differentially methylated regions linked to somatic mutation and transcriptional control" [1], suggests that the method implemented in the toolkit for calling differentially methylated regions (DMRs) is not only much faster than existing solutions but also extracts biologically relevant information about methylation. I outline my reasons below:

Is the rationale for developing the new software tool clearly explained?

The rationale for developing the software is well explained, as sequencing data, especially bisulphite sequencing data, are prone to human error, and the increasing number of samples being processed for cohorts tackling complex disease phenotypes calls for streamlined, reproducible workflows. Also, ease of use is a term that is often used loosely for bioinformatics tools and is under-appreciated by the community, but the authors have done well here by providing a Docker image that removes any platform dependencies and provides an out-of-the-box solution.
Suggestion: As a rationale, it would be great if the authors could add a few lines on their method of calculating DMRs to the introduction to contrast it with existing tools. I believe this would enhance the manuscript and further convince readers to use this toolkit.

Is the description of the software tool technically sound?

Software documentation is thorough and technically sound. Moreover, Dr. Hoffmann's lab has been quite consistent in releasing regular updates for their previous tools and is responsive to bug reports.

Suggestion: It is accurate that segemehl requires 55 GB to align the entire human genome, but it would be important to also point out that the alignment could be run on individual chromosomes separately and then combined later, which significantly reduces the memory requirement of this step.

Are sufficient details of the code, methods and analysis (if applicable) provided to allow replication of the software development and its use by others?

The workflows are well laid out and broken down into the individual modules, establishing a replicable software design. Each module can be run individually or together through the Perl wrapper and comes with an appropriate description of its flags in the command line help, allowing a look under the hood of the code. Further, each tool is well documented and the code is commented, making the tool reproducible. The example provided runs well and one can quickly reproduce the plots from Kretzmer et al 2015 [1].

Is sufficient information provided to allow interpretation of the expected output datasets and any results generated using the tool?

I have read this submission. I believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

This article presents a tool aggregate which can be a useful one-stop shop and/or starting point for analyzing bisulfite data. The authors detail the package and demonstrate its usefulness in an example analysis.
Most of the information needed to decide whether to use this package is contained in the article. A notable exception is the "further modules" mentioned in the first paragraph of the Methods section. While it is clearly useful to integrate gene expression, histone modification data, and transcription factor binding site information into an analysis, the reader cannot get an impression of whether the package does this effectively. It would be useful to include either an expansion on this topic or, better yet, an example analysis with these modules as well, space permitting. If the authors are constrained by space, it would be useful to point to where an example of this can be found on the web, as I was unable to locate it on the project page.

A critical omission from the introduction is a short technical background on bisulfite sequencing and its analysis. The reader has no basis to understand why a "VCF-style file that includes detailed information for each cytosine" (in the Calling subsection) would be useful.

Some minor issues were:

The phrase "grouping of samples" in the second sentence of the Methods section does not really clarify anything about the function of the grouping module. I would suggest using "sample group analysis".

"Due to its modularity, however" is awkwardly worded and could be better expressed as "The toolkit's modularity makes it flexible, extensible and customizable for users with specific needs".

The sentence "Basic steps, e.g. ..." should say "are" instead of "is".

"Resembling a common study design," in the Use cases section does not express what I believe is the authors' intended meaning, and could be better worded as "In order to illustrate the results of using our toolkit on a common study design,".

I have read this submission. I believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

19 in total

1. Fast and sensitive mapping of bisulfite-treated sequencing data.

Authors: Christian Otto; Peter F Stadler; Steve Hoffmann
Journal: Bioinformatics Date: 2012-05-10 Impact factor: 6.937

2. BLUEPRINT: mapping human blood cell epigenomes.

Authors: Joost H A Martens; Hendrik G Stunnenberg
Journal: Haematologica Date: 2013-10 Impact factor: 9.941

Review 3. Epigenetics and human obesity.

Authors: S J van Dijk; P L Molloy; H Varinli; J L Morrison; B S Muhlhausler
Journal: Int J Obes (Lond) Date: 2014-02-25 Impact factor: 5.095

4. A long noncoding RNA maintains active chromatin to coordinate homeotic gene expression.

Authors: Kevin C Wang; Yul W Yang; Bo Liu; Amartya Sanyal; Ryan Corces-Zimmerman; Yong Chen; Bryan R Lajoie; Angeline Protacio; Ryan A Flynn; Rajnish A Gupta; Joanna Wysocka; Ming Lei; Job Dekker; Jill A Helms; Howard Y Chang
Journal: Nature Date: 2011-03-20 Impact factor: 49.962

5. The NIH Roadmap Epigenomics Mapping Consortium.

Authors: Bradley E Bernstein; John A Stamatoyannopoulos; Joseph F Costello; Bing Ren; Aleksandar Milosavljevic; Alexander Meissner; Manolis Kellis; Marco A Marra; Arthur L Beaudet; Joseph R Ecker; Peggy J Farnham; Martin Hirst; Eric S Lander; Tarjei S Mikkelsen; James A Thomson
Journal: Nat Biotechnol Date: 2010-10 Impact factor: 54.908

6. International network of cancer genome projects.

Authors: Thomas J Hudson; Warwick Anderson; Axel Artez; Anna D Barker; Cindy Bell; Rosa R Bernabé; M K Bhan; Fabien Calvo; Iiro Eerola; Daniela S Gerhard; Alan Guttmacher; Mark Guyer; Fiona M Hemsley; Jennifer L Jennings; David Kerr; Peter Klatt; Patrik Kolar; Jun Kusada; David P Lane; Frank Laplace; Lu Youyong; Gerd Nettekoven; Brad Ozenberger; Jane Peterson; T S Rao; Jacques Remacle; Alan J Schafer; Tatsuhiro Shibata; Michael R Stratton; Joseph G Vockley; Koichi Watanabe; Huanming Yang; Matthew M F Yuen; Bartha M Knoppers; Martin Bobrow; Anne Cambon-Thomsen; Lynn G Dressler; Stephanie O M Dyke; Yann Joly; Kazuto Kato; Karen L Kennedy; Pilar Nicolás; Michael J Parker; Emmanuelle Rial-Sebbag; Carlos M Romeo-Casabona; Kenna M Shaw; Susan Wallace; Georgia L Wiesner; Nikolajs Zeps; Peter Lichter; Andrew V Biankin; Christian Chabannon; Lynda Chin; Bruno Clément; Enrique de Alava; Françoise Degos; Martin L Ferguson; Peter Geary; D Neil Hayes; Thomas J Hudson; Amber L Johns; Arek Kasprzyk; Hidewaki Nakagawa; Robert Penny; Miguel A Piris; Rajiv Sarin; Aldo Scarpa; Tatsuhiro Shibata; Marc van de Vijver; P Andrew Futreal; Hiroyuki Aburatani; Mónica Bayés; David D L Botwell; Peter J Campbell; Xavier Estivill; Daniela S Gerhard; Sean M Grimmond; Ivo Gut; Martin Hirst; Carlos López-Otín; Partha Majumder; Marco Marra; John D McPherson; Hidewaki Nakagawa; Zemin Ning; Xose S Puente; Yijun Ruan; Tatsuhiro Shibata; Michael R Stratton; Hendrik G Stunnenberg; Harold Swerdlow; Victor E Velculescu; Richard K Wilson; Hong H Xue; Liu Yang; Paul T Spellman; Gary D Bader; Paul C Boutros; Peter J Campbell; Paul Flicek; Gad Getz; Roderic Guigó; Guangwu Guo; David Haussler; Simon Heath; Tim J Hubbard; Tao Jiang; Steven M Jones; Qibin Li; Nuria López-Bigas; Ruibang Luo; Lakshmi Muthuswamy; B F Francis Ouellette; John V Pearson; Xose S Puente; Victor Quesada; Benjamin J Raphael; Chris Sander; Tatsuhiro Shibata; Terence P Speed; Lincoln D Stein; Joshua M Stuart; Jon W Teague; Yasushi Totoki; 
Tatsuhiko Tsunoda; Alfonso Valencia; David A Wheeler; Honglong Wu; Shancen Zhao; Guangyu Zhou; Lincoln D Stein; Roderic Guigó; Tim J Hubbard; Yann Joly; Steven M Jones; Arek Kasprzyk; Mark Lathrop; Nuria López-Bigas; B F Francis Ouellette; Paul T Spellman; Jon W Teague; Gilles Thomas; Alfonso Valencia; Teruhiko Yoshida; Karen L Kennedy; Myles Axton; Stephanie O M Dyke; P Andrew Futreal; Daniela S Gerhard; Chris Gunter; Mark Guyer; Thomas J Hudson; John D McPherson; Linda J Miller; Brad Ozenberger; Kenna M Shaw; Arek Kasprzyk; Lincoln D Stein; Junjun Zhang; Syed A Haider; Jianxin Wang; Christina K Yung; Anthony Cros; Anthony Cross; Yong Liang; Saravanamuttu Gnaneshan; Jonathan Guberman; Jack Hsu; Martin Bobrow; Don R C Chalmers; Karl W Hasel; Yann Joly; Terry S H Kaan; Karen L Kennedy; Bartha M Knoppers; William W Lowrance; Tohru Masui; Pilar Nicolás; Emmanuelle Rial-Sebbag; Laura Lyman Rodriguez; Catherine Vergely; Teruhiko Yoshida; Sean M Grimmond; Andrew V Biankin; David D L Bowtell; Nicole Cloonan; Anna deFazio; James R Eshleman; Dariush Etemadmoghadam; Brooke B Gardiner; Brooke A Gardiner; James G Kench; Aldo Scarpa; Robert L Sutherland; Margaret A Tempero; Nicola J Waddell; Peter J Wilson; John D McPherson; Steve Gallinger; Ming-Sound Tsao; Patricia A Shaw; Gloria M Petersen; Debabrata Mukhopadhyay; Lynda Chin; Ronald A DePinho; Sarah Thayer; Lakshmi Muthuswamy; Kamran Shazand; Timothy Beck; Michelle Sam; Lee Timms; Vanessa Ballin; Youyong Lu; Jiafu Ji; Xiuqing Zhang; Feng Chen; Xueda Hu; Guangyu Zhou; Qi Yang; Geng Tian; Lianhai Zhang; Xiaofang Xing; Xianghong Li; Zhenggang Zhu; Yingyan Yu; Jun Yu; Huanming Yang; Mark Lathrop; Jörg Tost; Paul Brennan; Ivana Holcatova; David Zaridze; Alvis Brazma; Lars Egevard; Egor Prokhortchouk; Rosamonde Elizabeth Banks; Mathias Uhlén; Anne Cambon-Thomsen; Juris Viksna; Fredrik Ponten; Konstantin Skryabin; Michael R Stratton; P Andrew Futreal; Ewan Birney; Ake Borg; Anne-Lise Børresen-Dale; Carlos Caldas; John A Foekens; 
Sancha Martin; Jorge S Reis-Filho; Andrea L Richardson; Christos Sotiriou; Hendrik G Stunnenberg; Giles Thoms; Marc van de Vijver; Laura van't Veer; Fabien Calvo; Daniel Birnbaum; Hélène Blanche; Pascal Boucher; Sandrine Boyault; Christian Chabannon; Ivo Gut; Jocelyne D Masson-Jacquemier; Mark Lathrop; Iris Pauporté; Xavier Pivot; Anne Vincent-Salomon; Eric Tabone; Charles Theillet; Gilles Thomas; Jörg Tost; Isabelle Treilleux; Fabien Calvo; Paulette Bioulac-Sage; Bruno Clément; Thomas Decaens; Françoise Degos; Dominique Franco; Ivo Gut; Marta Gut; Simon Heath; Mark Lathrop; Didier Samuel; Gilles Thomas; Jessica Zucman-Rossi; Peter Lichter; Roland Eils; Benedikt Brors; Jan O Korbel; Andrey Korshunov; Pablo Landgraf; Hans Lehrach; Stefan Pfister; Bernhard Radlwimmer; Guido Reifenberger; Michael D Taylor; Christof von Kalle; Partha P Majumder; Rajiv Sarin; T S Rao; M K Bhan; Aldo Scarpa; Paolo Pederzoli; Rita A Lawlor; Massimo Delledonne; Alberto Bardelli; Andrew V Biankin; Sean M Grimmond; Thomas Gress; David Klimstra; Giuseppe Zamboni; Tatsuhiro Shibata; Yusuke Nakamura; Hidewaki Nakagawa; Jun Kusada; Tatsuhiko Tsunoda; Satoru Miyano; Hiroyuki Aburatani; Kazuto Kato; Akihiro Fujimoto; Teruhiko Yoshida; Elias Campo; Carlos López-Otín; Xavier Estivill; Roderic Guigó; Silvia de Sanjosé; Miguel A Piris; Emili Montserrat; Marcos González-Díaz; Xose S Puente; Pedro Jares; Alfonso Valencia; Heinz Himmelbauer; Heinz Himmelbaue; Victor Quesada; Silvia Bea; Michael R Stratton; P Andrew Futreal; Peter J Campbell; Anne Vincent-Salomon; Andrea L Richardson; Jorge S Reis-Filho; Marc van de Vijver; Gilles Thomas; Jocelyne D Masson-Jacquemier; Samuel Aparicio; Ake Borg; Anne-Lise Børresen-Dale; Carlos Caldas; John A Foekens; Hendrik G Stunnenberg; Laura van't Veer; Douglas F Easton; Paul T Spellman; Sancha Martin; Anna D Barker; Lynda Chin; Francis S Collins; Carolyn C Compton; Martin L Ferguson; Daniela S Gerhard; Gad Getz; Chris Gunter; Alan Guttmacher; Mark Guyer; D Neil Hayes; 
Eric S Lander; Brad Ozenberger; Robert Penny; Jane Peterson; Chris Sander; Kenna M Shaw; Terence P Speed; Paul T Spellman; Joseph G Vockley; David A Wheeler; Richard K Wilson; Thomas J Hudson; Lynda Chin; Bartha M Knoppers; Eric S Lander; Peter Lichter; Lincoln D Stein; Michael R Stratton; Warwick Anderson; Anna D Barker; Cindy Bell; Martin Bobrow; Wylie Burke; Francis S Collins; Carolyn C Compton; Ronald A DePinho; Douglas F Easton; P Andrew Futreal; Daniela S Gerhard; Anthony R Green; Mark Guyer; Stanley R Hamilton; Tim J Hubbard; Olli P Kallioniemi; Karen L Kennedy; Timothy J Ley; Edison T Liu; Youyong Lu; Partha Majumder; Marco Marra; Brad Ozenberger; Jane Peterson; Alan J Schafer; Paul T Spellman; Hendrik G Stunnenberg; Brandon J Wainwright; Richard K Wilson; Huanming Yang
Journal: Nature Date: 2010-04-15 Impact factor: 49.962

7. BSeQC: quality control of bisulfite sequencing experiments.

Authors: Xueqiu Lin; Deqiang Sun; Benjamin Rodriguez; Qian Zhao; Hanfei Sun; Yong Zhang; Wei Li
Journal: Bioinformatics Date: 2013-09-23 Impact factor: 6.937

8. Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration.

Authors: Helga Thorvaldsdóttir; James T Robinson; Jill P Mesirov
Journal: Brief Bioinform Date: 2012-04-19 Impact factor: 11.622

9. An integrated encyclopedia of DNA elements in the human genome.

Authors: ENCODE Project Consortium
Journal: Nature Date: 2012-09-06 Impact factor: 49.962

10. The UCSC Genome Browser database: 2014 update.

Authors: Donna Karolchik; Galt P Barber; Jonathan Casper; Hiram Clawson; Melissa S Cline; Mark Diekhans; Timothy R Dreszer; Pauline A Fujita; Luvina Guruvadoo; Maximilian Haeussler; Rachel A Harte; Steve Heitner; Angie S Hinrichs; Katrina Learned; Brian T Lee; Chin H Li; Brian J Raney; Brooke Rhead; Kate R Rosenbloom; Cricket A Sloan; Matthew L Speir; Ann S Zweig; David Haussler; Robert M Kuhn; W James Kent
Journal: Nucleic Acids Res Date: 2013-11-21 Impact factor: 16.971
