| Literature DB >> 25398475 |
Abstract
BACKGROUND: Programs based on hash tables and Burrows-Wheeler are very fast for mapping short reads to genomes but have low accuracy in the presence of mismatches and gaps. Such reads can be aligned accurately with the Smith-Waterman algorithm but it can take hours and days to map millions of reads even for bacteria genomes.Entities:
Mesh:
Year: 2014 PMID: 25398475 PMCID: PMC4289234 DOI: 10.1186/1471-2164-15-969
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1Overview of the MaxSSmap program. In this figure the genome is divided into six fragments which means six threads will run on the GPU. Thread with ID 0 maps the read to fragment 0, slides it across fragment 0, and stops when it has covered all of fragment 0. We account for junctions between fragments and ensure that the read is fully mapped to the genome.
Figure 2Genome sequence in transpose format to enable coalescent memory access. In MaxSSmap threads with IDs 0 through 3 would at the same time read characters A, G, G, and C of the transposed genome to compare against the read. Since the four characters are in consecutive memory locations and so are the thread IDs, our program makes just one read from global memory instead of four separate ones.
Architecture for each program compared in our study
| Program | Architecture |
|---|---|
| MaxSSmap | GPU |
| MaxSSmap_fast | GPU |
| CUDA-SW++ | GPU |
| SSW | SIMD single CPU (GPU unavailable) |
| NextGenMap | GPU |
| BWA-MEM | Multi-threaded 12 CPUs |
Stampy (version 1.0.22) parameters to simulate reads
| Genome | Stampy parameters |
|---|---|
|
| -G ecoli ecoli_K12_MG1665.fasta |
| (hash) | -g ecoli -H ecoli |
| (simulate) | -g ecoli -h ecoli -S SRR522163_1.fastq |
| Human (format genome) | -G hs_ref_GRCh37.p13_chr1 |
| hs_ref_GRCh37.p13_chr1.fa | |
| (hash) | -g hs_ref_GRCh37.p13_chr1 |
| -H hs_ref_GRCh37.p13_chr1 | |
| (simulate) | -g hs_ref_GRCh37.p13_chr1 |
| -h hs_ref_GRCh37.p13_chr1 | |
| -S ERR315985_to_ERR315997_1.fastq | |
| Divergence | |
| .1 | –substitutionrate=.1 |
| .2 | –substitutionrate=.2 |
| .3 | –substitutionrate=.3 |
| .1+gaps | –substitutionrate=.1 –simulate-minindellen=-30 |
| –simulate-maxindellen=30 –insertsize=250 | |
| –insertsd=25 | |
| .2+gaps | –substitutionrate=.2 –simulate-minindellen=-30 |
| –simulate-maxindellen=30 –insertsize=250 | |
| –insertsd=25 | |
| .3+gaps | –substitutionrate=.3 –simulate-minindellen=-30 |
| –simulate-maxindellen=30 –insertsize=250 | |
| –insertsd=25 |
Comparison of MaxSSmap and MaxSSmap_fast to a GPU and a SIMD high performance Smith-Waterman implementation
|
(a) Percent of 100,000 251 bp reads mapped correctly to the | ||||
|---|---|---|---|---|
|
|
|
|
|
|
|
| ||||
| .1 | 95 (0.4) | 96 (0.4) | 94 (0.9) | 97 (3) |
| .2 | 95 (0.6) | 95.3 (0.6) | 94 (1) | 97 (3) |
| .3 | 90 (1.1) | 94.2 (0.9) | 93 (1.3) | 96 (4) |
|
| ||||
| .1 | 92 (1.5) | 93.1 (1.9) | 94 (0.9) | 97 (3) |
| .2 | 90 (1.7) | 92.5 (2.1) | 92 (1) | 96 (4) |
| .3 | 81 (2.8) | 89.9 (3.5) | 92 (1.4) | 95 (5) |
|
| ||||
|
|
|
|
|
|
|
| ||||
| .1 | 20 | 28 | 164 | 1288 |
| .2 | 20 | 28 | 164 | 1275 |
| .3 | 20 | 28 | 164 | 1255 |
|
| ||||
| .1 | 20 | 28 | 163 | 1283 |
| .2 | 20 | 28 | 162 | 1266 |
| .3 | 20 | 28 | 162 | 1235 |
These are simulated Illumina reads and contain realistic base qualities generated from Illumina short reads. Each divergence represents the average percent of mismatches in the reads. So 0.1 means 10% mismatches on the average. The gaps are randomly chosen to occur in the read or the genome and are of length at most 30.
Comparison of meta-methods
|
(a) Percent of one million 251 bp reads mapped correctly to the | ||||
|---|---|---|---|---|
| reads and remaining are rejected | ||||
|
|
|
|
|
|
|
|
|
|
| |
|
| ||||
| .1 | 96 (1.3) | 97 (1.4) | 97 (1) | 97 (2.8) |
| .2 | 95 (1.5) | 96.7 (1.5) | 96 (1) | NA |
| .3 | 91 (1.4) | 94.8 (1.3) | 93 (1.5) | NA |
|
| ||||
| .1 | 92 (1.5) | 95.7 (1.4) | 97 (1) | 97.2 (2.8) |
| .2 | 92 (2.1) | 94 (2.5) | 95 (2) | NA |
| .3 | 82 (2.9) | 90.5 (3.5) | 92.5 (1.6) | NA |
|
| ||||
|
|
|
|
|
|
|
|
|
|
| |
|
| ||||
| .1 | 25 | 34 | 197 | 1397 |
| .2 | 57 | 77 | 537 | NA |
| .3 | 148 | 204 | 1343 | NA |
|
| ||||
| .1 | 38 | 60 | 413 | 2601 |
| .2 | 68 | 109 | 756 | NA |
| .3 | 162 | 222 | 1528 | NA |
See Table 3 caption for details about reads. NA denotes time greater than 48 hours which is our cutoff time on this data.
Comparison of meta-methods to NextGenMap and BWA
|
(a) Percent of one million 251 bp reads mapped correctly to the | ||||
|---|---|---|---|---|
|
|
|
|
|
|
|
|
| |||
|
| ||||
| .1 | 89 (1.1) | 87 (1) | 96 (1.3) | 97 (1.4) |
| .2 | 24 (0.5) | 72 (1) | 95 (1.5) | 96.7 (1.5) |
| .3 | 0.6 (0) | 26 (0.5) | 91 (1.4) | 94.8 (1.3) |
|
| ||||
| .1 | 85 (3) | 78 (1) | 92 (1.5) | 95.7 (1.4) |
| .2 | 20 (0.5) | 60 (1) | 92 (2.1) | 94 (2.5) |
| .3 | 0.5 (0) | 19 (0.4) | 82 (2.9) | 90.5 (3.5) |
|
| ||||
|
|
|
|
|
|
|
|
| |||
|
| ||||
| .1 | 0.7 | 1.2 | 25 | 34 |
| .2 | 0.5 | 1.9 | 57 | 77 |
| .3 | 0.4 | 2.1 | 148 | 204 |
|
| ||||
| .1 | 0.7 | 1.5 | 38 | 60 |
| .2 | 0.5 | 2.1 | 68 | 109 |
| .3 | 0.4 | 2.1 | 162 | 222 |
See Table 3 caption for details about reads.
Comparison of meta-methods to NextGenMap and BWA
| (a) Percent of one million 250 bp reads mapped correctly to the human chromosome one genome. Shown in parenthesis are incorrectly mapped reads and remaining are rejected | |||
|---|---|---|---|
|
|
|
|
|
|
| |||
| .1 | 96(3) | 99 (1) | 99 (1) |
| .2 | 33 (6) | 86 (6) | 94 (6.2) |
| .3 | 0.8 (6) | 37 (9) | 88 (10.5) |
|
| |||
| .1 | 89 (3) | 96 (2) | 98 (2) |
| .2 | 28 (5) | 79 (6) | 92 (6.5) |
| .3 | 0.7 (0.5) | 30 (7) | 80 (9.4) |
|
| |||
|
|
|
|
|
|
| |||
| .1 | 1.6 | 7.1 | 10.3 |
| .2 | 0.8 | 44 | 626 |
| .3 | 0.6 | 65 | 3992 |
|
| |||
| .1 | 1.5 | 9.8 | 176 |
| .2 | 0.7 | 51 | 1126 |
| .3 | 0.6 | 66 | 4611 |
See Table 3 caption for details about reads.
Percent of one million reads of lengths 36, 51, 76, 100, and 150 and of divergence 0.1 with gaps mapped correctly to the genome
| Read | BWA | NGM | NGM+ | NGM+ | NGM+ |
|---|---|---|---|---|---|
| length | MaxSSmap_fast | MaxSSmap | CUDA-SW++ | ||
| 36 | 1.2 (0) | 33.3 (10.5) | 34 (16.5) | 43.8 (18.1) | 53.2 (17.1) |
| 51 | 11.4 (0.2) | 50.5 (3) | 53.3 (8.8) | 71.2 (7.5) | 79.5 (6.2) |
| 76 | 38.4 (.7) | 70.2 (1.5) | 76.5 (4.3) | 92.9 (2.7) | 96.6 (1.8) |
| 100 | 62.4 (1) | 82.6 (1.4) | 91 (2.2) | 96.8 (1.9) | 98.1 (1.5) |
| 150 | 76 (1.1) | 82.5 (1.2) | 92 (2.2) | 93.8 (2.2) | 94.7 (2.1) |
|
| |||||
| 36 | 0.09 | 0.3 | 26.6 | 37.8 | 315.9 |
| 51 | 0.11 | 0.5 | 26.6 | 38.2 | 284.9 |
| 76 | 0.15 | 0.6 | 19.8 | 28.4 | 201.9 |
| 100 | 0.22 | 0.7 | 14.8 | 21 | 129.9 |
| 150 | 0.34 | 0.9 | 19.6 | 29.2 | 173.8 |
Shown in parenthesis are incorrectly mapped reads and remaining are rejected. We denote NextGenMap by NGM.
Percent of 100,000 ancient horse DNA reads (SRR111892) of length 76 bp mapped to the horse genome Equus_caballus EquCab2 (GCA_000002305.1) and human reference genome
| Horse genome | ||||
|---|---|---|---|---|
|
|
|
|
|
|
|
|
|
| ||
| 2.2 | 16 | 20.5 | 23.1 | 26 (estimated) |
|
| ||||
| 0.6 | 2.4 | 1609.6 | 2836 | 14820 (estimated) |
|
| ||||
|
|
|
|
|
|
|
|
|
| ||
| 0.16 | 14 | 18.6 | 21 | NA |
|
| ||||
| 0.82 | 2.9 | 2375.5 | 4108 | NA |
We ran NextGenMap+CUDA-SW++ for a maximum of 168 hours and estimated the time to align all rejected reads. Also shown is time in minutes.
Percent of 100,000 paired human reads from NA12878 in 1000 genomes (SRR016607) of length 101 bp mapped concordantly to the human genome
| NextGenMap | NextGenMap+ | NextGenMap+ |
|---|---|---|
| MaxSSmap_fast | MaxSSmap | |
| 83.5 (0) | 85.8 (0.7) | 87.6 (1.2) |
|
| ||
| 1.5 | 1295.9 | 2242.4 |
Concordant reads are pairs that are mapped within 500 base pairs. Also shown in parenthesis are discordant reads (mapped positions at least 500 bp apart) and the time in minutes. In NextGenMap+MaxSSmap we re-align pairs with MaxSSmap where at least one read in the pair was unmapped by NextGenMap or the pair is discordant. Thus, NextGenMap shows zero discordant pairs because we re-align them with MaxSSmap.
Running time comparison of CUDA and OpenCL implementations of MaxSSmap (denoted by MSS)
|
(a) Time in minutes to map 100,000 251 bp reads to the | ||||||
|---|---|---|---|---|---|---|
|
|
|
|
|
|
|
|
|
|
|
| ||||
|
| ||||||
| .1 | 20 | 28 | 17 | 27 | 17 | 27 |
| .2 | 20 | 28 | 17 | 27 | 17 | 27 |
| .3 | 20 | 28 | 17 | 27 | 17 | 27 |
|
| ||||||
| .1 | 20 | 28 | 17 | 27 | 17 | 27 |
| .2 | 20 | 28 | 17 | 27 | 17 | 27 |
| .3 | 20 | 28 | 17 | 27 | 17 | 27 |
|
| ||||||
|
| ||||||
|
|
|
|
|
|
| |
|
|
|
|
|
|
| |
|
| s |
| ||||
| 1295.9 | 2242.4 | 1183.5 | 2092.7 | 1252.7 | 2159.9 | |
|
| ||||||
|
| ||||||
|
| ||||||
|
|
|
|
|
|
| |
|
|
|
|
|
|
| |
|
|
|
| ||||
| 1609.6 | 2836 | 1515 | 2689.8 | 1561.8 | 2736.8 | |
The output from the three methods give the same accuracies and errors as given earlier but the running times vary. We find the CUDA 6.0 implementation to have the lowest runtimes followed by OpenCL and CUDA 4.2.