Sheng-Yao Su, I-Hsuan Lu, Wen-Chih Cheng, Wei-Chun Chung, Pao-Yang Chen, Jan-Ming Ho, Shu-Hwa Chen, Chung-Yen Lin.
Abstract
BACKGROUND: DNA methylation is a crucial epigenomic mechanism in various biological processes. Using whole-genome bisulfite sequencing (WGBS) technology, methylated cytosine sites can be revealed at the single nucleotide level. However, the WGBS data analysis process is usually complicated and challenging.
Keywords: DNA methylation data analysis; Docker; Galaxy platform; WGBS pipeline
Year: 2020 PMID: 32241255 PMCID: PMC7114791 DOI: 10.1186/s12864-019-6404-8
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1 Implementation and integration of DocMethyl and EpiMOLAS_web
Fig. 2 EpiMOLAS, comprising DocMethyl and EpiMOLAS_web, is a two-phase approach for WGBS data analysis. DocMethyl requires Bisulfite-Seq read data, reference genome sequences, and gene annotation files to determine the DNA methylation status. The resulting mtable files are uploaded to the EpiMOLAS_web server through its web user interface, where users can run various data analysis and visualization modules
Fig. 3 Workflow of DocMethyl-PE for WGBS paired-end data analysis, including Trim Galore, FastQC, Bismark tools, and the in-house program EpiMOLAS.jar
Fig. 4 EpiMOLAS_web analysis modules with corresponding usage scenarios
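The paired-end workflow named in the Fig. 3 caption (Trim Galore, FastQC, Bismark) can be sketched as a chain of command lines. The sketch below only builds the commands and does not run them. It is not DocMethyl's actual configuration: the function name `build_pe_commands`, the choice of options, the deduplication step, and the file names are assumptions for illustration, and the in-house EpiMOLAS.jar step is left out because its interface is not described here. It assumes gzipped FASTQ input and a genome folder already indexed with `bismark_genome_preparation`.

```python
from pathlib import Path


def build_pe_commands(read1, read2, genome_dir, out_dir):
    """Build a generic Trim Galore -> FastQC -> Bismark command chain.

    Assumption: illustrative options only, not the DocMethyl-PE settings.
    """
    out = Path(out_dir)

    def stem(path):
        # Strip compression and FASTQ extensions: s_R1.fastq.gz -> s_R1
        name = Path(path).name
        for suffix in (".gz", ".fastq", ".fq"):
            if name.endswith(suffix):
                name = name[: -len(suffix)]
        return name

    # Trim Galore writes validated pairs as <stem>_val_1.fq.gz / _val_2.fq.gz
    # for gzipped paired-end input.
    trimmed1 = out / f"{stem(read1)}_val_1.fq.gz"
    trimmed2 = out / f"{stem(read2)}_val_2.fq.gz"

    # Bismark (Bowtie 2, paired-end) names its BAM after the first read file.
    bam = out / f"{stem(trimmed1)}_bismark_bt2_pe.bam"
    dedup_bam = out / f"{stem(trimmed1)}_bismark_bt2_pe.deduplicated.bam"

    return [
        # 1. Adapter and base quality trimming
        ["trim_galore", "--paired", "-o", str(out), str(read1), str(read2)],
        # 2. Read quality report on the trimmed reads
        ["fastqc", "-o", str(out), str(trimmed1), str(trimmed2)],
        # 3. Bisulfite-aware alignment
        ["bismark", "--genome", str(genome_dir),
         "-1", str(trimmed1), "-2", str(trimmed2), "-o", str(out)],
        # 4. Duplicate removal (an assumed step, common for WGBS)
        ["deduplicate_bismark", "--paired", "--bam",
         "--output_dir", str(out), str(bam)],
        # 5. Methylation calling in all contexts (CG, CHG, CHH)
        ["bismark_methylation_extractor", "--paired-end", "--comprehensive",
         "--bedGraph", "--CX", "-o", str(out), str(dedup_bam)],
    ]


if __name__ == "__main__":
    for cmd in build_pe_commands("s_R1.fastq.gz", "s_R2.fastq.gz",
                                 "genome", "results"):
        print(" ".join(cmd))
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` once the tools are installed. Keeping command construction separate from execution makes the chain easy to inspect before anything is run.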
Comparison of EpiMOLAS with other platforms and tools for WGBS analysis
| Feature | EpiMOLAS | BAT | ENCODE-WGBS | snakePipe | NGI-MethylSeq | Mint | MethylPipe | MethylSig | Methylkit |
|---|---|---|---|---|---|---|---|---|---|
| Environment | Docker, Galaxy web server | Docker | Shell script | Bioconda, Snakemake | Docker, Nextflow | Galaxy | R package | R package | R package |
| Sequence context | CG, CHG, CHH | CG | CG, CHG, CHH | CG, CHG, CHH | CG, CHG, CHH | CG | CG, CHG, CHH | CG, CHG, CHH | CG, CHG, CHH |
| Start with | raw reads | raw reads | raw reads | raw reads | raw reads | raw reads | methylation call file | methylation call file | methylation call file |
| Docker container | + | + | – | – | + | – | NA | NA | NA |
| Web interface | + (Galaxy) | – | + | – | – | + (Galaxy) | NA | NA | NA |
| Adapter and base quality trimming (a) | + | – | + | + | + | + | NA | NA | NA |
| QC report (b) | + | + | + | + | + | + | NA | NA | NA |
| Read mapping (c) | + | + | + | + | + | + | NA | NA | NA |
| Methylation site calling (c) | + | + | + | + | + | + | NA | NA | NA |
| Descriptive statistics | + | + | – | + | + | + | + | + | + |
| Find DMRs (d) | + (simple) | + (metilene) | – | + (metilene) | – | + (DSS) | + | + | + |
| Clustering analysis | + (heatmap) | + (heatmap) | – | + (heatmap) | – | – | – | + (heatmap) | – |
| GO term enrichment | + | – | – | – | – | – | + | – | – |
| KEGG pathway enrichment | + | – | – | – | – | – | – | – | – |
| TFBS enrichment | – | – | – | – | – | – | – | + | – |
| Genome-wide visualization | + (circos plot) | + (circos plot) | – | – | – | – | – | – | – |
| Interactive quantitative analysis | + | – | – | – | – | – | NA | NA | NA |
| Data browsing and retrieving UI | + | – | – | – | – | – | NA | NA | NA |
| Gene list with tracking logs | + | – | – | – | – | – | NA | NA | NA |
| Venn analysis on gene lists | + | – | – | – | – | – | – | – | – |
| Interplay with other high-throughput data | Protein interactome (e) | transcriptome | – | RNA-seq, ChIP-seq, ATAC-seq, Hi-C etc. | – | hydroxyl-methylation data | RNA-seq, ChIP-seq, DNase-seq | – | – |
Remarks:
(a) EpiMOLAS, ENCODE-WGBS, snakePipe, NGI-MethylSeq and Mint adopt Trim Galore/cutadapt in the workflow for adapter and base quality trimming.
(b) EpiMOLAS, snakePipe, NGI-MethylSeq and Mint integrate FastQC to report read quality. BAT includes BSeQC for BS-seq experiment quality assessment. ENCODE-WGBS collects samtools and Bismark metrics as quality reports.
(c) EpiMOLAS, ENCODE-WGBS and Mint mainly use Bismark in the workflow for alignment and methylation extraction. BAT uses segemehl and in-house BAT_calling tools. snakePipe adopts bwa-meth for alignment and MethylDackel for methylation calling. NGI-MethylSeq offers a choice of two data analysis workflows: Bismark and bwa-meth/MethylDackel.
(d) EpiMOLAS uses a simple method to identify differentially methylated genes (DMGs). BAT and snakePipe include metilene for differentially methylated regions (DMRs). Mint integrates DSS for DMRs.
(e) EpiMOLAS includes a dedicated plugin viewer to explore the protein interaction network.
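The "Sequence context" row of the table distinguishes CG, CHG and CHH cytosines, where H stands for A, C or T. As a minimal sketch of how a context is assigned (not code from EpiMOLAS or any of the compared tools; the function names are hypothetical), the following classifies forward-strand cytosines from the two downstream bases. Reverse-strand cytosines would need the reverse complement and are not handled here.

```python
from collections import Counter


def cytosine_context(seq, pos):
    """Classify the forward-strand cytosine at 0-based index pos.

    Returns "CG", "CHG" or "CHH", or None when the base is not a C or
    the downstream bases are missing or ambiguous (e.g. N).
    """
    seq = seq.upper()
    if pos < 0 or pos >= len(seq) or seq[pos] != "C":
        return None
    downstream = seq[pos + 1 : pos + 3]
    # CG needs only the next base
    if len(downstream) >= 1 and downstream[0] == "G":
        return "CG"
    # CHG and CHH need two unambiguous downstream bases, the first being H
    if (len(downstream) < 2
            or downstream[0] not in "ACT"
            or downstream[1] not in "ACGT"):
        return None
    return "CHG" if downstream[1] == "G" else "CHH"


def count_contexts(seq):
    """Count the classifiable forward-strand cytosines by context."""
    contexts = (cytosine_context(seq, i) for i in range(len(seq)))
    return Counter(c for c in contexts if c is not None)


if __name__ == "__main__":
    print(count_contexts("ACGTCAGCTTA"))
```

Cytosines too close to the end of the sequence, or followed by an ambiguous base, return `None` rather than a guessed context, so they are excluded from the counts.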