Lishi Li, Yunyun An, Li Ma, Mengqi Yang, Pengxiang Yuan, Xiaojian Liu, Xin Jin, Yu Zhao, Songfa Zhang, Xin Hong, Kun Sun.
Abstract
DNA methylation is an important epigenetic regulator that plays crucial roles in various biological processes. Recent developments in experimental approaches and the dramatic expansion of sequencing capacity have created new challenges for the analysis of large-scale, cross-species DNA methylation data. User-friendly toolkits with high usability and performance are therefore urgently needed. In this work, we present Msuite2, an easy-to-use, all-in-one, and universal toolkit for DNA methylation data analysis and visualization with high flexibility, usability, and performance. Msuite2 is among the fastest tools in read alignment (in particular, it runs up to 5x faster than its predecessor, Msuite1) with low computing resource usage. In addition, Msuite2 shows both balanced and high performance in terms of mapping efficiency and accuracy, demonstrating high potential to facilitate the investigation and application of large-scale DNA methylation analysis in various biomedical studies. Msuite2 is freely available at https://github.com/hellosunking/Msuite2/.
Keywords: Bisulfite treatment; CpG dinucleotide; Data visualization
Year: 2022 PMID: 35317233 PMCID: PMC8918723 DOI: 10.1016/j.csbj.2022.03.005
Source DB: PubMed Journal: Comput Struct Biotechnol J ISSN: 2001-0370 Impact factor: 7.271
Fig. 1. Schematic workflow of Msuite2. Msuite2 packages sequencing read preprocessing, alignment, DNA methylation calling, and data visualization.
Comparison of major features between Msuite2 and current tools.
| Feature | Msuite2 | Msuite1 | Bismark | BWA-meth |
|---|---|---|---|---|
| Underlying aligner | Bowtie2/Hisat2 | Bowtie2 | Bowtie2/Hisat2 | BWA |
| Align mode | 3-/4-letter | 3-/4-letter | 3-letter only | 3-letter only |
| Read preprocessing | Yes | Yes | No | No |
| Flexible read cycles | Yes | No | Partially$ | No |
| Quality control | Yes | Yes | No | No |
| Methylation call | Yes | Yes | Manually | No |
| Data visualization | Yes | Yes | No | No |
| Indel support | Yes | Yes | Yes | Yes |
| Multiple-file support | Yes | Yes | Yes | Yes |
| Sequencing mode | PE/SE | PE/SE | PE/SE | PE/SE |
| Output format | BAM | SAM/BAM | BAM | SAM |
| Parallelization | Yes | Yes | Yes | Yes |
Read preprocessing includes trimming of sequencing adaptors and low-quality cycles. $ Bismark allows users to skip the leading cycles.
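The "3-letter" align mode in the table refers to the usual strategy for bisulfite-type data: cytosines are collapsed to thymines in silico in both the reads and the reference, so that methylation-dependent C/T differences cannot cause mismatches during alignment. The following is a minimal sketch of that idea only. The function name `to_three_letter` and the toy sequences are illustrative assumptions, not Msuite2 code, and the 4-letter mode is not covered here.

```python
def to_three_letter(seq: str, strand: str = "+") -> str:
    """Collapse a DNA sequence to a 3-letter alphabet (hypothetical helper).

    strand "+" : C -> T (reads from the original top strand)
    strand "-" : G -> A (the complementary view of the same conversion)
    """
    seq = seq.upper()
    if strand == "+":
        return seq.replace("C", "T")
    if strand == "-":
        return seq.replace("G", "A")
    raise ValueError("strand must be '+' or '-'")


# Toy example: one reference locus and two bisulfite-style reads from it.
reference = "ACGTCCGA"
read_methylated = "ACGTTCGA"    # CpG cytosines protected, the other C read as T
read_unmethylated = "ATGTTTGA"  # every C read as T

# After conversion all three sequences are identical, so an ordinary
# aligner can place both reads regardless of methylation state.
converted = {
    to_three_letter(reference),
    to_three_letter(read_methylated),
    to_three_letter(read_unmethylated),
}
print(converted)
```

The methylation state is then recovered after alignment by comparing the original (unconverted) read bases with the original reference bases at each cytosine position.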
Fig. 2. Benchmark evaluation results of Msuite2 and current tools. (A) Running time and peak memory usage (8 threads); (B) mapping accuracy and efficiency on 10 M in silico paired-end 100 bp reads; (C) accuracy and efficiency on 10 M in silico paired-end reads simulated in CT-rich regions. For BWA-meth, BSMAP, and GEM3, both default and alternative parameters were tested. The reads were simulated following the TAPS protocol to enable the 4-letter mode of Msuite2 and Msuite1; results from 10 repeat experiments were averaged and are shown. MAPQ stands for mapping quality score.
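For simulated reads the true origin of every read is known, so mapping efficiency and accuracy can be computed directly. The sketch below assumes one common pair of definitions (efficiency as the fraction of reads that were mapped, accuracy as the fraction of mapped reads placed at their true position); the exact definitions used for the benchmark are not stated in this record, and the `SimRead` type and `mapping_stats` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class SimRead:
    """One simulated read: its true origin and where the aligner put it."""
    true_chrom: str
    true_pos: int
    mapped_chrom: Optional[str]  # None if the read was not mapped
    mapped_pos: Optional[int]


def mapping_stats(reads: List[SimRead], tolerance: int = 0) -> Tuple[float, float]:
    """Return (efficiency, accuracy) under the assumed definitions."""
    total = len(reads)
    mapped = [r for r in reads if r.mapped_chrom is not None and r.mapped_pos is not None]
    correct = sum(
        1
        for r in mapped
        if r.mapped_chrom == r.true_chrom and abs(r.mapped_pos - r.true_pos) <= tolerance
    )
    efficiency = len(mapped) / total if total else 0.0
    accuracy = correct / len(mapped) if mapped else 0.0
    return efficiency, accuracy


# Toy data: four reads, three mapped, two of those at the true position.
toy_reads = [
    SimRead("chr1", 100, "chr1", 100),
    SimRead("chr1", 500, "chr1", 500),
    SimRead("chr2", 300, "chr5", 900),
    SimRead("chr3", 700, None, None),
]
efficiency, accuracy = mapping_stats(toy_reads)
print(f"efficiency={efficiency:.2f} accuracy={accuracy:.2f}")
```

Reporting the two numbers separately matters because an aligner can raise one at the expense of the other, for example by mapping more reads with lower confidence.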
Fig. 3. Output example of Msuite2 on real data. Msuite2 reports the key statistics of the analysis, as well as various figures to help users inspect the quality of the data.
Fig. 4. DNA methylation profiles of liver tissue in human, chimpanzee, and macaque. (A) PRKACA gene, (B) EIF4EBP3 gene.