Shaoliang Peng1, Shunyun Yang2, Ming Gao3, Xiangke Liao2, Jie Liu2, Canqun Yang2, Chengkun Wu4, Wenqiang Yu5.
Abstract
BACKGROUND: A growing number of studies use whole-genome DNA methylation detection, one of the most important parts of epigenetics research, to find significant relationships between DNA methylation and several typical diseases, such as cancers and diabetes. In many of these studies, mapping bisulfite-treated sequences to the whole genome has been the main method for studying DNA cytosine methylation. However, most of today's related tools suffer from inaccuracy and long running times.
Keywords: DNA methylation detection; Parallelized algorithm; Tianhe-2; Whole genome; Xeon Phi
Year: 2017 PMID: 28361696 PMCID: PMC5374730 DOI: 10.1186/s12864-017-3497-9
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1 Target sequence format. The number before each sequence is an ID, which must be consistent on the same chromosome. The third column gives the strand (positive or negative) of the DNA. The fourth column gives the chromosome on which the sequence is located. The last column gives the coordinates of the sequence in the original reference.
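The column layout described in the Fig. 1 caption can be illustrated with a small parser. This is only a sketch: the caption does not state the delimiter, how the strand is encoded, or how coordinates are written, so the whitespace-separated fields, the `+`/`-` strand symbols, and the names `TargetSequence` and `parse_target_line` are all assumptions made for illustration.

```python
from typing import NamedTuple


class TargetSequence(NamedTuple):
    """One target sequence record (hypothetical field names)."""
    seq_id: int        # ID shown before the sequence
    sequence: str      # the sequence itself
    strand: str        # "+" or "-" (assumed encoding of the DNA strand)
    chromosome: str    # chromosome the sequence is located on
    coordinates: str   # position in the original reference, kept as text


def parse_target_line(line: str) -> TargetSequence:
    """Parse one record, assuming five whitespace-separated columns."""
    fields = line.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 columns, got {len(fields)}: {line!r}")
    seq_id, sequence, strand, chromosome, coordinates = fields
    if strand not in {"+", "-"}:
        raise ValueError(f"unexpected strand symbol: {strand!r}")
    return TargetSequence(int(seq_id), sequence.upper(), strand,
                          chromosome, coordinates)


# Made-up example line, not taken from the paper's data.
record = parse_target_line("7 acgttgca + chr1 10468")
```

The coordinates are left as an opaque string because the caption does not say whether they are a single position or a range.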
Fig. 2 The whole pipeline of P-Hint-Hunt. The only data dependency is that target sequences with the same ID must be processed within the same thread to achieve the best result. Therefore, even when a sufficiently high score is obtained, we still make a corresponding judgement before we process the next sequence.
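The constraint in the Fig. 2 caption, that target sequences sharing an ID stay within one thread, can be sketched as a partitioning step. The caption does not describe how P-Hint-Hunt actually distributes work, so the greedy least-loaded assignment below is a swapped-in, generic technique, and `partition_by_id` is a hypothetical helper name.

```python
from collections import defaultdict


def partition_by_id(records, n_workers):
    """Split (seq_id, payload) records into n_workers buckets.

    All records with the same seq_id land in the same bucket, so a single
    worker thread can process the whole ID group. Groups are placed largest
    first onto the currently least-loaded bucket (assumed heuristic).
    """
    groups = defaultdict(list)
    for seq_id, payload in records:
        groups[seq_id].append(payload)

    buckets = [[] for _ in range(n_workers)]
    loads = [0] * n_workers
    for seq_id, items in sorted(groups.items(), key=lambda kv: -len(kv[1])):
        worker = loads.index(min(loads))   # least-loaded bucket so far
        buckets[worker].append((seq_id, items))
        loads[worker] += len(items)
    return buckets


# Made-up records: IDs 1 and 2 each appear twice, ID 3 once.
example = [(1, "a"), (2, "b"), (1, "c"), (3, "d"), (2, "e")]
buckets = partition_by_id(example, n_workers=2)
```

Each bucket can then be handed to one thread; because an ID never spans two buckets, the per-ID judgement described in the caption needs no cross-thread coordination.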
Computing environment in the test
| Hardware | Index |
|---|---|
| CPU architecture | x86_64 |
| CPU name | Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz |
| CPU frequency (MHz) | 2593.493 |
| CPU on each node | 2*16 cores |
| Memory on each node (GB) | 128 |
| Shared disk (TB) | 10 |
Results for multiple threads
| Number of threads | Time (s) | Speed-up |
|---|---|---|
| 2 | 37272.4 | 1.809 |
| 4 | 20259.4 | 3.548 |
| 8 | 10600.1 | 7.419 |
| 16 | 5375.5 | 15.112 |
| 32 | 3072.8 | 28.008 |
| 64 | 1828.4 | 47.070 |
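As a reading aid for the table above, parallel efficiency can be derived by dividing each reported speed-up by its thread count. The snippet is a minimal sketch that uses only the table's values; the name `parallel_efficiency` is illustrative, and the definition (speed-up divided by number of threads) is the conventional one rather than something stated in the source.

```python
# (threads -> speed-up) copied from the "Results for multiple threads" table.
SPEEDUPS = {2: 1.809, 4: 3.548, 8: 7.419, 16: 15.112, 32: 28.008, 64: 47.070}


def parallel_efficiency(speedups):
    """Return threads -> efficiency, where efficiency = speed-up / threads."""
    return {threads: s / threads for threads, s in speedups.items()}


efficiency = parallel_efficiency(SPEEDUPS)

if __name__ == "__main__":
    for threads, eff in efficiency.items():
        print(f"{threads:>3} threads: efficiency {eff:.3f}")
```

By this measure the efficiency stays between roughly 0.87 and 0.95 up to 32 threads and falls to about 0.74 at 64 threads.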
Results on multiple computing nodes
| Sample | Number of nodes | Number of threads | Time (s) | Memory allocated (GB) |
|---|---|---|---|---|
| Sample 1 | 2 | 16 | 840.25 | 28.9 |
| Sample 1 | 2 | 32 | 390.37 | 28.9 |
| Sample 1 | 4 | 16 | 468.15 | 28.9 |
| Sample 1 | 4 | 32 | 202.12 | 28.9 |
| Sample 2 | 4 | 32 | 15806.23 | 28.9 |
| Sample 2 | 6 | 32 | 11140.45 | 28.9 |
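The multi-node timings above can be compared against ideal scaling with a short calculation. This is a sketch under one assumption: ideal speed-up is taken to be the ratio of node counts at a fixed thread setting. The helper name `node_scaling` is hypothetical; the times are those reported in the table.

```python
def node_scaling(nodes_a, time_a, nodes_b, time_b):
    """Return (observed speed-up, ideal speed-up, scaling efficiency)
    when moving from nodes_a to nodes_b at the same thread setting."""
    observed = time_a / time_b
    ideal = nodes_b / nodes_a
    return observed, ideal, observed / ideal


# (nodes_a, time_a in s, nodes_b, time_b in s) taken from the table above.
RUNS = {
    "Sample 1, 16 threads": (2, 840.25, 4, 468.15),
    "Sample 1, 32 threads": (2, 390.37, 4, 202.12),
    "Sample 2, 32 threads": (4, 15806.23, 6, 11140.45),
}

scaling = {label: node_scaling(*run) for label, run in RUNS.items()}

if __name__ == "__main__":
    for label, (observed, ideal, eff) in scaling.items():
        print(f"{label}: {observed:.2f}x observed, "
              f"{ideal:.2f}x ideal, efficiency {eff:.2f}")
```

With these figures, doubling the nodes for Sample 1 gives about 1.79x at 16 threads and 1.93x at 32 threads, and going from 4 to 6 nodes for Sample 2 gives about 1.42x against an ideal of 1.5x.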
Results for coprocessors
| Sample | Number of coprocessors | Number of threads | Time (s) |
|---|---|---|---|
| Sample 1 | 1 | 112 | 2602 |
| Sample 1 | 2 | 112 | 1469 |
| Sample 2 | 1 | 112 | 139886 |
| Sample 2 | 2 | 112 | 70397 |