Marius Wöste, Elsa Leitão, Sandra Laurentino, Bernhard Horsthemke, Sven Rahmann, Christopher Schröder.
Abstract
BACKGROUND: Analysing whole genome bisulfite sequencing datasets is a data-intensive task that requires comprehensive and reproducible workflows to generate valid results. While many algorithms have been developed for tasks such as alignment, comprehensive end-to-end pipelines are still sparse. Furthermore, previous pipelines lack features or show technical deficiencies, thus impeding analyses.
Keywords: Analysis pipeline; Analysis workflow; Epigenetics; Methylation; Whole-genome bisulfite sequencing
Year: 2020 PMID: 32357829 PMCID: PMC7195798 DOI: 10.1186/s12859-020-3470-5
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1 wg-blimp workflow overview. Users only need to provide FASTQ files and a reference genome, and wg-blimp will perform alignment, deduplication, QC checks, DMR calling, segmentation and annotation. Once the pipeline results are available, users can inspect results using a web interface.
Fig. 2 Segmentation tab of wg-blimp R Shiny GUI. Once the analysis pipeline completes, users may load results into wg-blimp’s R Shiny App. The tab depicted here displays MethylSeekR results and allows users to include or exclude PMD computation by toggling a single checkbox.
Comparison of WGBS end-to-end pipelines. Most pipelines use similar software for "standard" WGBS analysis tasks such as alignment or QC. wg-blimp improves on existing pipelines by providing a more comprehensive workflow as well as an interactive user interface. A "/" marks a step that the pipeline does not cover.
| pipeline | installation | workflow management | adapter trimming | alignment | methylation calling | quality control | DMR detection | segmentation | annotation |
|---|---|---|---|---|---|---|---|---|---|
| wg-blimp | Bioconda, Docker | Snakemake | / | bwa-meth | MethylDackel | MultiQC | bsseq, camel, metilene | MethylSeekR | CGIs, genes, repetitive regions |
| BAT | manual, Docker | Perl/R/shell scripts | / | segemehl | haarz | BAT | metilene | / | arbitrary BED file |
| bicycle | manual, Docker, Live CD | Java Application | bicycle | bicycle | bicycle | bicycle | bicycle | / | arbitrary BED file |
| CpG_Me/DMRichR | manual | shell/R scripts | Trim Galore! | Bismark | Bismark | MultiQC | bsseq, dmrseq | / | CGIs, genes |
| ENCODE pipeline | DNAnexus | DNAnexus | Trim Galore! | Bismark | Bismark | SAMtools, Bismark | / | / | genes |
| Methy-Pipe | manual | Makefile | Methy-Pipe | Methy-Pipe | Methy-Pipe | Methy-Pipe | Methy-Pipe | / | genes |
| Nextflow methylseq (Bismark) | Nextflow | Nextflow | Trim Galore! | Bismark | Bismark | MultiQC | / | / | / |
| Nextflow methylseq (bwa-meth) | Nextflow | Nextflow | Trim Galore! | bwa-meth | MethylDackel | MultiQC | / | / | / |
| PiGx | GNU guix | Snakemake | Trim Galore! | Bismark | methylKit | MultiQC | methylKit | methylKit | CGIs, genes |
| snakePipes | Bioconda | Snakemake | Cutadapt, Trim Galore!, Fastp | bwa-meth | MethylDackel | MultiQC | dmrseq, DSS, metilene | / | genes |
Fig. 3 Distance from UMR/LMR centers to closest TSS for H1 ESCs. UMRs/LMRs were automatically inferred using wg-blimp’s MethylSeekR integration. UMRs and LMRs show a clear separation, with most UMRs being located in close proximity of TSSs.
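
The quantity plotted in Fig. 3, the distance from a region center to the closest TSS, can be illustrated with a minimal Python sketch. This is not wg-blimp's or MethylSeekR's implementation: the function name `distance_to_closest_tss`, the half-open integer coordinates, and the toy TSS positions are assumptions made for illustration, and the sketch handles a single chromosome only.

```python
from bisect import bisect_left


def distance_to_closest_tss(region_start, region_end, sorted_tss):
    """Return the absolute distance in bp from a region's center to the nearest TSS.

    Hypothetical helper: `sorted_tss` must be a non-empty, ascending list of
    TSS positions on the same chromosome as the region.
    """
    if not sorted_tss:
        raise ValueError("sorted_tss must contain at least one position")
    center = (region_start + region_end) // 2
    # Index of the first TSS at or to the right of the center.
    i = bisect_left(sorted_tss, center)
    # Only the neighbours directly left and right of the center can be closest.
    candidates = sorted_tss[max(i - 1, 0):i + 1]
    return min(abs(center - tss) for tss in candidates)


# Toy data (made up): three TSS positions and three regions.
tss_positions = [1000, 5000, 20000]
regions = [(900, 1100), (11000, 13000), (30000, 30100)]
distances = [distance_to_closest_tss(s, e, tss_positions) for s, e in regions]
```

Sorting the TSS positions once and using binary search keeps each lookup logarithmic in the number of TSSs, which matters when many segments are compared against a genome-wide annotation.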