Ricardo Olanda, Mariano Pérez, Juan M Orduña, Joaquín Tárraga, Joaquín Dopazo.
Abstract
BACKGROUND: DNA methylation is an important mechanism of epigenetic regulation in development and disease. New-generation sequencers allow genome-wide measurement of methylation status by reading short stretches of the DNA sequence (Methyl-seq). Several software tools for methylation analysis have been proposed in recent years. However, current sequencers, and those expected in the near future, yield sequences of increasing length, which makes these software tools inefficient and obsolete.
Keywords: DNA methylation; High performance computing; Parallel pipeline
Year: 2017 PMID: 28274198 PMCID: PMC5343294 DOI: 10.1186/s12859-017-1574-3
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1 Alignment of a read sequence. a Using a unidirectional BWT. b Using a bidirectional BWT
Fig. 2 Parallel pipeline in the HPG-Methyl tool
Fig. 3 New parallel pipeline in the HPG-Methyl2 tool
Sensitivities yielded for a synthetic dataset with a mutation rate of 1%
| Length (nt) | HPG-Methyl2 R | HPG-Methyl2 W | HPG-Methyl R | HPG-Methyl W | Bismark R | Bismark W |
|---|---|---|---|---|---|---|
| 75 | 95.50 | 0.69 | 93.37 | 0.62 | 88.30 | 0.1 |
| 150 | 99.01 | 0.46 | 96.87 | 0.80 | 94.59 | 0.08 |
| 400 | 99.75 | 0.18 | 97.55 | 0.48 | 97.55 | 0.10 |
| 800 | 99.93 | 0.06 | 97.58 | 0.43 | 98.45 | 0.08 |
| 1600 | 99.75 | 0.06 | 96.94 | 0.48 | – | – |
| 3200 | 99.68 | 0.08 | 96.42 | 0.49 | – | – |
Execution times (min.) for processing the synthetic dataset (1% mutation rate)
| Length (nt) | HPG-Methyl2 | HPG-Methyl | Bismark |
|---|---|---|---|
| 75 | 1.288 | 1.366 | 62.579 |
| 150 | 1.550 | 1.95 | 106.173 |
| 400 | 5.041 | 10.85 | 248.107 |
| 800 | 11.260 | 50.6 | 1246.89 |
| 1600 | 34.440 | 996.567 | – |
| 3200 | 164.593 | 7733.38 | – |
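
To make the scaling trend in the execution times easier to read, the ratio between the HPG-Methyl and HPG-Methyl2 times can be computed for each read length. The following is a minimal sketch, not part of the original work: the values are copied from the table above, and the names `times_min` and `speedup` are hypothetical, chosen only for this illustration.

```python
# Execution times in minutes for the synthetic dataset (1% mutation rate),
# copied from the table above as (HPG-Methyl2, HPG-Methyl) per read length.
# The dictionary and function names are illustrative, not from the paper.
times_min = {
    75: (1.288, 1.366),
    150: (1.550, 1.95),
    400: (5.041, 10.85),
    800: (11.260, 50.6),
    1600: (34.440, 996.567),
    3200: (164.593, 7733.38),
}


def speedup(new_time: float, old_time: float) -> float:
    """Return how many times faster the new tool is than the old one."""
    return old_time / new_time


for length, (hpg_methyl2, hpg_methyl) in times_min.items():
    ratio = speedup(hpg_methyl2, hpg_methyl)
    print(f"{length:>5} nt: HPG-Methyl2 is {ratio:.1f}x faster than HPG-Methyl")
```

Under these figures the ratio grows with read length, from roughly 1.06 at 75 nt to about 47 at 3200 nt.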
Sensitivities yielded for real datasets
| Dataset | HPG-Methyl2 | HPG-Methyl | Bismark |
|---|---|---|---|
| SRR309230_1 | 88.40 | 87.71 | 71.81 |
| SRR837425_1 | 84.34 | 82.75 | 68.42 |
Execution times (min.) for processing the real datasets
| Dataset | HPG-Methyl2 | HPG-Methyl | Bismark |
|---|---|---|---|
| SRR309230_1 | 11.333 | 12.053 | 82.120 |
| SRR837425_1 | 8.404 | 19.047 | 95.194 |