Kamal Kishore, Stefano de Pretis, Ryan Lister, Marco J Morelli, Valerio Bianchi, Bruno Amati, Joseph R Ecker, Mattia Pelizzola.
Abstract
BACKGROUND: Numerous methods are available to profile epigenetic marks, providing data with different genome coverage and resolution. The resulting large epigenomic datasets are often combined with other high-throughput data, including RNA-seq, ChIP-seq for transcription factor (TF) binding, and DNase-seq experiments. Despite the numerous computational tools covering specific steps in the analysis of large-scale epigenomics data, comprehensive software solutions for their integrative analysis are still missing. Multiple tools must be identified and combined to jointly analyze histone marks, TF binding and other -omics data together with DNA methylation data, which complicates both the analysis of these data and their integration with publicly available datasets.
Year: 2015 PMID: 26415965 PMCID: PMC4587815 DOI: 10.1186/s12859-015-0742-6
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1 Diagram describing input and output for the methylPipe and compEpiTools R packages. The most typical input data and output are listed for both packages. Regions of interest (ROIs) can be both input and output for these tools. For example, input ROIs can be generated in R based on the UCSC table browser, or can be based on Bioconductor gene models or reference genome-sequence packages. Output ROIs are generated by methylPipe and compEpiTools and can typically be fed back into the same tools as a new set of genomic regions to be investigated, often associated with scores or more complex data. Abbreviations: differentially methylated regions (DMRs); methyl-cytosine (mC); CpG Islands (CGIs); GeneOntology (GO); long non-coding RNAs (lncRNAs); transcription factors (TFs). The dashed arrow identifies a computational step that can be covered with additional tools (see the text for details).
Fig. 2 The integrative heatmap generated by the compEpiTools heatmapData and heatmapPlot functions. Heatmaps can easily be obtained incorporating any mixture of data and annotation tracks. Heatmap rows represent ROIs, while columns represent tracks profiled over those ROIs (or bins thereof). Data and annotation tracks can contain either quantitative (e.g. normalized read counts) or categorical (e.g. presence/absence of a ChIP-seq peak) data. If available, the significance of associated data can be incorporated by modulating colour brightness. In this example, generated as described in detail in the supplemental material, NIH Roadmap DNA methylation data were visualized together with ENCODE histone marks for a set of differentially methylated regions. ROIs were clustered based on the data available in all the displayed tracks, including gene model annotations. The schema at the top of the figure depicts the workflow leading to the heatmap. A set of standard Bioconductor objects, listed in red, is the input for the heatmapData and heatmapPlot compEpiTools functions. The underlined text points to the key analysis steps performed automatically inside the functions generating the heatmap, which call routines available in the same packages.
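The row/column layout described in the Fig. 2 caption (rows are ROIs, columns are tracks profiled over bins of those ROIs) can be illustrated with a minimal Python sketch. This is not the compEpiTools implementation: the function names (`bin_roi`, `profile_track`, `heatmap_matrix`), the per-base dictionary representation of a track, the choice of the mean as the per-bin summary, and the toy track names are all assumptions made for illustration only.

```python
from statistics import mean

def bin_roi(start, end, n_bins):
    """Split the half-open interval [start, end) into n_bins near-equal bins."""
    width = end - start
    edges = [start + (width * i) // n_bins for i in range(n_bins + 1)]
    return list(zip(edges[:-1], edges[1:]))

def profile_track(signal, roi, n_bins):
    """Average a per-base signal (dict: position -> value) within each bin of one ROI.
    Positions missing from the dict are treated as 0.0 (an assumption)."""
    start, end = roi
    row = []
    for b_start, b_end in bin_roi(start, end, n_bins):
        values = [signal.get(pos, 0.0) for pos in range(b_start, b_end)]
        row.append(mean(values) if values else 0.0)
    return row

def heatmap_matrix(rois, tracks, n_bins):
    """One row per ROI; columns are the bins of each track, concatenated track by track."""
    return [
        [v for signal in tracks.values() for v in profile_track(signal, roi, n_bins)]
        for roi in rois
    ]

# Toy data: two ROIs and two hypothetical tracks.
rois = [(0, 4), (10, 14)]
tracks = {
    "mC": {0: 1.0, 1: 1.0, 12: 0.5, 13: 0.5},
    "histone_mark": {2: 2.0, 3: 4.0, 10: 6.0, 11: 2.0},
}
matrix = heatmap_matrix(rois, tracks, n_bins=2)
for roi, row in zip(rois, matrix):
    print(roi, row)
```

Each row of `matrix` could then be clustered or drawn as one heatmap line; categorical tracks would need a different per-bin summary than the mean.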
Comparison of methylPipe and compEpiTools features with functionalities offered by other similar tools
| Package | Feature | BiSeq | M3D | Bsseq | DSS | methylKit | methPipe | radMeth | methylSig | WBSA | DMAP | Methy-Pipe |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| methylPipe | Supporting targeted BS-seq data (e.g. RRBS) | + | + | + | + | + | + | + | + | + | + | + |
| | Supporting WGBS data | - | - | - | + | - | + | + | - | (a) | (b) | - |
| | Supporting multi-samples WGBS data | - | - | - | - | - | + | + | - | - | (b) | - |
| | Non-CpG mCs | - | - | - | - | - | - | - | - | + | - | + |
| | hmCs | - | - | - | - | + | + | - | - | - | - | - |
| | Supporting low-resolution DNA methylation data | - | - | - | - | - | - | - | - | - | - | - |
| | Computing absolute methylation (mC/bp) | - | - | - | - | - | - | - | - | - | - | - |
| | Computing relative methylation (mC/C) | + | - | + | - | + | + | - | + | + | - | + |
| | Supporting ROIs binning | - | - | - | - | - | - | - | - | - | - | - |
| | Pairwise DMR analysis (45’) | + (NA) | + (NA) | + (NA) | + (3d) | + (NA) | + (36’) | + (90’) | + (NA) | + (a) | + | + (NA) |
| | Multi-groups DMR analysis | - | - | + | - | - | - | + | - | - | + | - |
| | Browser-like data plot | - | - | - | - | - | - | + | - | - | - | |
| compEpiTools | Computing Promoter-CpG content | - | - | - | - | - | - | - | - | - | - | - |
| | Routines for reads counting | - | - | - | - | - | - | - | - | - | - | - |
| | Determining Signal enrichment | - | - | - | - | - | - | - | - | - | - | - |
| | ROIs Annotation | + | - | - | - | + | - | - | + | + | + | - |
| | RNAPII stalling index | - | - | - | - | - | - | - | - | - | - | - |
| | Non-redundant GO enrichment | - | - | - | - | - | - | - | - | - | - | - |
| | Enhancers identification | - | - | - | - | - | - | - | - | - | - | - |
| | lncRNAs identification | - | - | - | - | - | - | - | - | - | - | - |
| | Integrative heatmaps | - | - | - | - | - | - | - | - | - | - | - |
| | Ref. | [ | [ | [ | [ | [ | [ | [ | [ | [ | [ | [ |
The first column lists the key features offered by methylPipe and compEpiTools. Column headers report the tool name and reference. A “+” sign indicates that the feature is provided by a given tool, while a “-” sign indicates that it is not available. The “Pairwise DMR analysis” row includes in parentheses the time (in minutes or days) needed for a complete WGBS differential analysis between two samples; NA is reported if this analysis is not supported for WGBS data. (a) WBSA is an online web service that limits fastq uploads to 2 GB, which is clearly insufficient for the analysis of a WGBS dataset; the software can be installed locally, although this requires significant effort (it needs Perl, R, MySQL, Java and a C compiler) and it is only available for Linux; the analysis of the H1 and IMR90 WGBS was reported by the authors to be completed in one week. (b) We could not use DMAP (no version details were provided for the available code) for the analysis of WGBS data at the time of this comparison, since an error was returned; a new version was provided to us, which still required details on the restriction enzymes, as needed for the analysis of targeted DNA methylation datasets; it therefore remains unclear to us whether DMAP is able to analyse WGBS data.
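The table distinguishes absolute methylation (mC/bp) from relative methylation (mC/C). The difference can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the methylPipe code: it assumes binary per-cytosine methylation calls, takes mC/bp as methylated cytosines per base pair of the region and mC/C as methylated cytosines over all profiled cytosines in the region, and the function name `methylation_levels` and the toy data are hypothetical.

```python
def methylation_levels(start, end, calls):
    """Return (absolute, relative) methylation for the half-open region [start, end).

    calls maps a cytosine position to True if that cytosine was called methylated
    (binary calls are an assumption of this sketch).
    """
    in_region = [is_mc for pos, is_mc in calls.items() if start <= pos < end]
    n_mc = sum(in_region)                 # methylated cytosines in the region
    length = end - start                  # region length in bp
    absolute = n_mc / length if length > 0 else 0.0            # mC/bp
    relative = n_mc / len(in_region) if in_region else 0.0     # mC/C
    return absolute, relative

# Toy example: four profiled cytosines fall inside a 100 bp region, three are methylated.
calls = {101: True, 105: False, 110: True, 150: True, 320: True}
absolute, relative = methylation_levels(100, 200, calls)
print(f"mC/bp = {absolute:.3f}, mC/C = {relative:.2f}")
```

The two measures can diverge: a region with few cytosines, most of them methylated, has a high mC/C but a low mC/bp.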