W B Langdon, Brian Yee Hong Lam.
Abstract
BACKGROUND: BarraCUDA is an open source C program which uses the BWA algorithm in parallel with nVidia CUDA to align short next generation DNA sequences against a reference genome. Recently its source code was optimised using "Genetic Improvement".
Keywords: Double-ended DNA sequence; GPGPU; Genetic improvement; Nextgen NGS; Parallel computing
Year: 2017 PMID: 28785314 PMCID: PMC5541657 DOI: 10.1186/s13040-017-0149-1
Source DB: PubMed Journal: BioData Min ISSN: 1756-0381 Impact factor: 2.522
Mean number of paired end sequences processed per second
| Program | Length | 12 core serverᵃ | GT 730ᵇ | 2 × K20 | K80 | Accuracy metric | Accuracy % |
|---|---|---|---|---|---|---|---|
| bwa | 36 bp | 1900±50 | – | – | – | Mapped reads | 82.05 |
| bwa | 100 bp | 4500±20 | – | – | – | GCAT | 98.91 |
| 0.6.2 | 36 bp | – | 3270±2 (1.7±0.05) | 5300±110 (2.8±0.10) | 6500±180 (3.4±0.13) | Mapped reads | 83.17 |
| 0.6.2 | 100 bp | – | 1860±4 (0.4±0.002) | 8700±140 (1.9±0.03) | 11700±100 (2.6±0.02) | GCAT | 97.49 |
| 0.7.107 | 36 bp | – | 7600±6 (4.0±0.11) | 12900±160 (6.8±0.20) | 19900±500 (10.5±0.39) | Mapped reads | 83.01 |
| 0.7.107 | 100 bp | – | 2100±14 (0.5±0.004) | 8800±70 (2.0±0.02) | 12800±270 (2.8±0.06) | GCAT | 98.43 |
| Improvement ratio Barracuda 0.7.107 over 0.6.2 | | | | | | | |
| | 36 bp | – | 2.32±0.003 | 2.43±0.06 | 3.07±0.11 | Mapped reads | –0.16 |
| | 100 bp | – | 1.13±0.01 | 1.00±0.02 | 1.09±0.02 | GCAT | 1.60 |
Speed relative to bwa 0.7.12 is given in (brackets). ± gives the standard deviation estimated from five runs. There was almost no variation in the mapping rate or in the accuracy reported by GCAT.
ᵃ 2.60 GHz, see “Darwin” in Table 2.
ᵇ Estimated for two GT 730 GPUs.
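The bracketed relative speeds and the improvement ratios can be recomputed from the mean throughputs in the table. The following is a minimal sketch, not the authors' analysis: the dictionary layout and function names are illustrative assumptions, it uses only the rounded means shown above, and it ignores the standard deviations. Because the published means are rounded, a recomputed ratio may differ from the tabulated one in the last digit.

```python
# Mean paired end sequences per second, copied from the table above.
# bwa ran on the 12 core server; BarraCUDA ran on the GPUs.
BWA_BASELINE = {"36 bp": 1900.0, "100 bp": 4500.0}

THROUGHPUT = {
    # (BarraCUDA version, read length): {GPU column: sequences per second}
    ("0.6.2", "36 bp"): {"GT 730": 3270.0, "2 x K20": 5300.0, "K80": 6500.0},
    ("0.6.2", "100 bp"): {"GT 730": 1860.0, "2 x K20": 8700.0, "K80": 11700.0},
    ("0.7.107", "36 bp"): {"GT 730": 7600.0, "2 x K20": 12900.0, "K80": 19900.0},
    ("0.7.107", "100 bp"): {"GT 730": 2100.0, "2 x K20": 8800.0, "K80": 12800.0},
}


def speed_relative_to_bwa(version, length, gpu):
    """Throughput of one BarraCUDA run divided by the bwa baseline."""
    return THROUGHPUT[(version, length)][gpu] / BWA_BASELINE[length]


def improvement_ratio(length, gpu):
    """Throughput of BarraCUDA 0.7.107 divided by that of 0.6.2."""
    return THROUGHPUT[("0.7.107", length)][gpu] / THROUGHPUT[("0.6.2", length)][gpu]


if __name__ == "__main__":
    for length in ("36 bp", "100 bp"):
        for gpu in ("GT 730", "2 x K20", "K80"):
            print(
                length,
                gpu,
                f"relative to bwa: {speed_relative_to_bwa('0.7.107', length, gpu):.1f}",
                f"0.7.107 over 0.6.2: {improvement_ratio(length, gpu):.2f}",
            )
```

For example, `improvement_ratio("36 bp", "GT 730")` divides 7600 by 3270, which agrees with the 2.32 reported in the table.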
Parallel computer graphics hardware
| GPU | Year | List price | Compute level | MP | Cores per MP | Total cores | Clock | Memory | Bandwidth |
|---|---|---|---|---|---|---|---|---|---|
| GT 730 | 2014 | £54 | 2.1 | 2 | 48 | 96 | 1.40 GHz | 4 GB | 23 GB/s |
| Tesla K20 | 2012 | £2905 | 3.5 | 13 | 192 | 2496 | 0.71 GHz | 5 GB | 140 GB/s |
| Tesla K80ᵃ | 2014 | £6261 | 3.7 | 13 | 192 | 2496 | 0.82 GHz | 11 GB | 138 GB/s |
The fourth column is the CUDA compute capability level. Each GPU chip contains 2 or 13 identical independent multiprocessors (MP, column 5). Each MP contains 48 or 192 stream processors (total in column 7). Onboard memory size and bandwidth are given in the rightmost two columns. Technical report [36] has full details.
ᵃ K80 is a dual GPU. The original total list price is followed by performance data for one half.
Fig. 1 Major components of Genetic Improvement (GI)
Fig. 2 Processing paired end DNA sequences. “aln” is run twice (once per end), potentially in parallel, and its alignments are piped or passed via intermediate .sai files (dashed blue arrows) into “sampe” (sam (pe) paired end). “sampe” may be run in parallel. It also reads the index of the reference human genome and both ends of each DNA sequence in order to give the combined alignment in sam format. In the case of BarraCUDA, the two “aln” processes each use a GPU and “sampe” uses multiple host threads. For bwa, “aln” uses multiple host threads but “sampe” is single threaded.
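The data flow described in the Fig. 2 caption can be sketched as a small driver script. This is an illustration under stated assumptions, not the authors' tooling: it uses the standard `bwa aln` and `bwa sampe` positional arguments and passes the alignments via intermediate .sai files. It assumes the aligner binary is on the PATH, that the reference index already exists, and (an unverified assumption) that a `barracuda` binary accepts the same positional arguments. All function names and file names are hypothetical.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def aln_command(aligner, index_prefix, reads):
    """Command for "aln" on one end; the .sai alignment goes to stdout."""
    return [aligner, "aln", index_prefix, reads]


def sampe_command(aligner, index_prefix, sai_1, sai_2, reads_1, reads_2):
    """Command for "sampe": both .sai files plus both read files in, SAM out."""
    return [aligner, "sampe", index_prefix, sai_1, sai_2, reads_1, reads_2]


def run_to_file(command, out_path):
    """Run a command with stdout redirected to out_path; raise on failure."""
    with open(out_path, "wb") as out:
        subprocess.run(command, stdout=out, check=True)


def align_paired_end(aligner, index_prefix, reads_1, reads_2, out_sam):
    """Run "aln" once per end in parallel, then combine with "sampe"."""
    sai_1, sai_2 = reads_1 + ".sai", reads_2 + ".sai"
    # The two "aln" runs do not depend on each other, so start both at once.
    with ThreadPoolExecutor(max_workers=2) as pool:
        jobs = [
            pool.submit(run_to_file, aln_command(aligner, index_prefix, reads), sai)
            for reads, sai in ((reads_1, sai_1), (reads_2, sai_2))
        ]
        for job in jobs:
            job.result()  # re-raises any error from an "aln" run
    # "sampe" needs both .sai files, so it starts only after both "aln" runs.
    run_to_file(
        sampe_command(aligner, index_prefix, sai_1, sai_2, reads_1, reads_2),
        out_sam,
    )
```

A call such as `align_paired_end("bwa", "ref.fa", "reads_1.fq", "reads_2.fq", "out.sam")` (hypothetical file names) would write `reads_1.fq.sai` and `reads_2.fq.sai` before producing `out.sam`. Threads are used here only to wait on the two child processes. The alignment work itself runs inside the external programs.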
Computers. The desktop computer houses one GT 730. The servers are part of the Darwin Supercomputer of the University of Cambridge and hold multiple Tesla K20 or K80 GPUs.
| Type | Intel x86 | Effective cores | Clock | Memory |
|---|---|---|---|---|
| Desktop | Core™2 CPU 6700 | 2 | 2.66 GHz | 4 GB |
| Darwin | Xeon CPU E5-2630 v2 | 12 | 2.60 GHz | 62 GB |
| NVK80 | Xeon CPU E5-2670 v3 | 24 | 2.30 GHz | 125 GB |