Yongchao Liu, Bertil Schmidt.
Abstract
MOTIVATION: The explosive growth of next-generation sequencing datasets poses a challenge to the mapping of reads to reference genomes in terms of alignment quality and execution speed. With the continuing progress of high-throughput sequencing technologies, read length is constantly increasing, and many existing aligners are becoming inefficient as generated reads grow larger.
Year: 2012 PMID: 22962447 PMCID: PMC3436841 DOI: 10.1093/bioinformatics/bts414
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. Pipeline of our aligner for the SE and the PE alignment: the dashed lines show the two additional stages for the PE alignment
Alignment results on the simulated 200-bp datasets
| Aligner | 1% Recall | 1% Prec. | 2% Recall | 2% Prec. | 4% Recall | 4% Prec. |
|---|---|---|---|---|---|---|
| SE | | | | | | |
| CUSHAW2 | | | | | | |
| BWA-SW | 90.30 | 97.74 | 90.03 | 97.50 | 88.92 | 96.43 |
| Bowtie2 | 89.99 | 97.45 | 89.44 | 96.97 | 87.41 | 96.03 |
| GASSST | 80.41 | 96.04 | 79.58 | 96.01 | 77.73 | 95.98 |
| PE | | | | | | |
| CUSHAW2 | | | | | | |
| BWA-SW | 90.51 | 97.97 | 90.42 | 97.90 | 90.19 | 97.51 |
| Bowtie2 | 90.82 | 98.32 | 90.48 | 98.03 | 89.16 | 97.58 |
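The recall and precision columns can be illustrated with a minimal sketch. It assumes the common definitions for simulated reads, which this record does not spell out: recall is the share of all reads that are aligned correctly, and precision is the share of aligned reads that are aligned correctly. The `ReadResult` type and the `recall_precision` function are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ReadResult:
    """Outcome for one simulated read (hypothetical record type)."""
    aligned: bool  # did the aligner report an alignment for this read?
    correct: bool  # does the reported position match the simulated origin?


def recall_precision(results):
    """Return (recall, precision) in percent under the assumed definitions."""
    total = len(results)
    aligned = sum(r.aligned for r in results)
    correct = sum(r.aligned and r.correct for r in results)
    recall = 100.0 * correct / total if total else 0.0
    precision = 100.0 * correct / aligned if aligned else 0.0
    return recall, precision


# Toy input (not data from the tables): 10 reads, 9 aligned, 8 of them correct.
toy = (
    [ReadResult(True, True)] * 8
    + [ReadResult(True, False)]
    + [ReadResult(False, False)]
)
recall, precision = recall_precision(toy)
print(f"recall={recall:.2f}% precision={precision:.2f}%")
```

Under these definitions an aligner that leaves difficult reads unaligned can raise its precision while lowering its recall, which is why the tables report both measures.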
Alignment results using Q30 on the simulated 200-bp datasets
| Aligner | 1% Recall | 1% Prec. | 2% Recall | 2% Prec. | 4% Recall | 4% Prec. |
|---|---|---|---|---|---|---|
| SE | | | | | | |
| CUSHAW2 | | 99.95 | | 99.94 | | 99.93 |
| BWA-SW | 85.80 | 99.94 | 85.04 | 99.94 | 82.33 | 99.93 |
| Bowtie2 | 80.76 | | 76.96 | | 71.59 | |
| GASSST | 76.15 | 99.54 | 75.47 | 99.57 | 73.92 | 99.49 |
| PE | | | | | | |
| CUSHAW2 | 86.34 | 99.95 | | 99.94 | | 99.93 |
| BWA-SW | | 99.95 | 86.09 | 99.94 | 84.25 | 99.93 |
| Bowtie2 | 83.81 | | 82.71 | | 82.71 | |
Alignment results using different percentages of indel errors
| Aligner | Measure | 20% | 40% | 60% | 80% |
|---|---|---|---|---|---|
| SE | | | | | |
| CUSHAW2 | Recall | | | | |
| | Prec. | | | | |
| BWA-SW | Recall | 90.05 | 90.05 | 90.03 | 90.05 |
| | Prec. | 97.49 | 97.52 | 97.50 | 97.49 |
| Bowtie2 | Recall | 89.45 | 89.43 | 89.45 | 89.46 |
| | Prec. | 96.96 | 96.97 | 96.99 | 96.97 |
| GASSST | Recall | 79.61 | 79.55 | 79.57 | 79.59 |
| | Prec. | 96.01 | 96.01 | 96.04 | 96.04 |
| PE | | | | | |
| CUSHAW2 | Recall | | | | |
| | Prec. | | | | |
| BWA-SW | Recall | 90.43 | 90.41 | 90.41 | 90.44 |
| | Prec. | 97.89 | 97.90 | 97.90 | 97.90 |
| Bowtie2 | Recall | 90.50 | 90.49 | 90.49 | 90.51 |
| | Prec. | 98.03 | 98.05 | 98.04 | 98.04 |
Real dataset information
| Type | Name | No. of reads | Max length | Mean length | Insert size |
|---|---|---|---|---|---|
| 454 | SRX000001 | 1 026 049 | 849 | 192 ± 58 | – |
| | SRX001829 | 2 790 032 | 4996 | 560 ± 165 | – |
| Illumina | ERX009608 | 107 967 800 | 102 | 101 ± 1 | 311 |
| | SRX028059 | 243 441 880 | 102 | 101 ± 1 | 510 |
Fig. 2. Alignment results using the 454 datasets
Fig. 3. Alignment results using the Illumina datasets
Runtime comparison (in seconds) of the tested aligners
| Dataset | CUSHAW2 | BWA-SW | Bowtie2 | GASSST |
|---|---|---|---|---|
| SE | | | | |
| D100 | 108 | 116 | 64 | 3348 |
| D200 | 238 | 302 | 147 | 3538 |
| D500 | 1157 | 842 | 2038 | 4574 |
| SRX000001 | 57 | 87 | 40 | 3273 |
| SRX001829 | 925 | 758 | 1796 | 3941 |
| ERX009608 | 2499 | 3761 | 1909 | – |
| SRX028059 | 11 551 | 16 731 | 8191 | – |
| PE | | | | |
| D100 | 110 | 123 | 77 | – |
| D200 | 241 | 320 | 240 | – |
| D500 | 1157 | 928 | 2179 | – |
| ERX009608 | 2657 | 4053 | 2048 | – |
| SRX028059 | 12 936 | 17 093 | 9741 | – |
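The runtimes are easier to compare as ratios against one aligner. The sketch below does this for the SE runtimes of the three simulated datasets, copied from the table above. The helper name `relative_runtime` and the choice of CUSHAW2 as the baseline are illustrative assumptions, not part of the original evaluation.

```python
# Single-end runtimes in seconds for the simulated datasets,
# copied from the runtime table above.
se_runtime = {
    "D100": {"CUSHAW2": 108, "BWA-SW": 116, "Bowtie2": 64},
    "D200": {"CUSHAW2": 238, "BWA-SW": 302, "Bowtie2": 147},
    "D500": {"CUSHAW2": 1157, "BWA-SW": 842, "Bowtie2": 2038},
}


def relative_runtime(table, baseline="CUSHAW2"):
    """Return each aligner's runtime divided by the baseline's, per dataset.

    A ratio above 1 means the aligner was slower than the baseline on that
    dataset; a ratio below 1 means it was faster.
    """
    return {
        dataset: {
            name: secs / times[baseline]
            for name, secs in times.items()
            if name != baseline
        }
        for dataset, times in table.items()
    }


ratios = relative_runtime(se_runtime)
for dataset, row in ratios.items():
    print(dataset, {name: round(r, 2) for name, r in row.items()})
```

Read this way, the ordering of the aligners changes with read length: Bowtie2 is the fastest of the three on D100 and D200 but the slowest on D500, where BWA-SW is the fastest.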
Fig. 4. Scalability comparison between all aligners for the SE and PE alignment