Justin Chu, Sara Sadeghi, Anthony Raymond, Shaun D Jackman, Ka Ming Nip, Richard Mar, Hamid Mohamadi, Yaron S Butterfield, A Gordon Robertson, Inanç Birol.
Abstract
Large datasets can be screened for sequences from a specific organism, quickly and with low memory requirements, by a data structure that supports time- and memory-efficient set membership queries. Bloom filters offer such queries but require that false positives be controlled. We present BioBloom Tools, a Bloom filter-based sequence-screening tool that is faster than BWA, Bowtie 2 (popular alignment algorithms) and FACS (a membership query algorithm). It delivers accuracies comparable with these tools, controls false positives and has low memory requirements.
Availability and implementation: www.bcgsc.ca/platform/bioinfo/software/biobloomtools
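The idea in the abstract, screening reads against a Bloom filter built from a reference organism's k-mers, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not BioBloom Tools' implementation: the class `BloomFilter`, the helpers `kmers`, `expected_fpr` and `hit_fraction`, the SHA-256-based double hashing, the filter size and the toy sequence are all hypothetical choices made for the example. `expected_fpr` uses the standard approximation (1 − e^(−hn/m))^h for a filter of m bits, h hash functions and n inserted items.

```python
import hashlib
import math


class BloomFilter:
    """Minimal Bloom filter over strings (illustrative only)."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        # Double hashing: derive all bit positions from two 64-bit
        # halves of a single SHA-256 digest (an assumption of this sketch).
        digest = hashlib.sha256(item.encode("ascii")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force a non-zero stride
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def kmers(seq: str, k: int):
    """Yield every overlapping k-mer of seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def expected_fpr(num_bits: int, num_hashes: int, num_items: int) -> float:
    """Standard approximation of a Bloom filter's false-positive rate."""
    return (1.0 - math.exp(-num_hashes * num_items / num_bits)) ** num_hashes


def hit_fraction(read: str, bf: BloomFilter, k: int) -> float:
    """Fraction of the read's k-mers that the filter reports as present."""
    total = len(read) - k + 1
    if total <= 0:
        return 0.0
    return sum(1 for km in kmers(read, k) if km in bf) / total


# Toy reference and read; real references are whole genomes.
reference = "ACGTTGCATGGCCATTAGCGATCGGATTACAGGCTTAACCGGTTAAGC"
k = 8
bf = BloomFilter(num_bits=1 << 16, num_hashes=3)
for km in kmers(reference, k):
    bf.add(km)

read = reference[5:30]
print(hit_fraction(read, bf, k))  # 1.0: a Bloom filter has no false negatives
```

A read taken from the reference always scores 1.0, because a Bloom filter never reports an inserted item as absent. Reads from other organisms can still score above 0 through shared k-mers or filter false positives, which is why the false-positive rate must be controlled. Strand handling (reverse complements) and any read-level decision threshold are omitted here.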
Year: 2014 PMID: 25143290 PMCID: PMC4816029 DOI: 10.1093/bioinformatics/btu558
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. Performance comparisons of BBT against FACS, BWA and BT2. Receiver operator characteristic curves of BBT and FACS using simulated 100 bp SE reads from Homo sapiens mixed with (A) E. coli and (B) Mus musculus filtered against an H. sapiens Bloom filter using a k-mer size of 25 bp; (C) CPU time benchmark comparing BT2 (for a range of built-in settings), BWA (using aln and mem settings), FACS and BBT, on one lane of human 2 × 150 bp PE Illumina HiSeq 2500 reads
Benchmarking results using simulated paired-end 2 × 150 bp reads
| Tool and Settings | FNR (…) | FDR (…) | FDR (…) |
|---|---|---|---|
| BT2 very sensitive | 1.40 × 10⁻⁵ | 2.03 × 10⁻² | 0 |
| BT2 sensitive | 7.52 × 10⁻⁴ | 9.08 × 10⁻³ | 0 |
| BT2 fast | 1.26 × 10⁻² | 5.90 × 10⁻³ | 0 |
| BT2 very fast | 1.34 × 10⁻² | 5.65 × 10⁻³ | 0 |
| BWA aln | 3.26 × 10⁻³ | 8.14 × 10⁻⁴ | 0 |
| BWA mem | 0 | 1.92 × 10⁻¹ | 1.00 × 10⁻⁴ |
| FACS | 1.22 × 10⁻¹ | 9.88 × 10⁻³ | 0 |
| BBT (…) | 8.42 × 10⁻³ | 3.78 × 10⁻³ | 0 |
Note: All reads were treated as SE reads for FACS.
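The FNR and FDR columns can be read with the standard definitions, FNR = FN / (FN + TP) and FDR = FP / (FP + TP). The sketch below computes both from read counts. It is an illustration only: the record does not say how reads were counted per species, and the function names and counts are hypothetical.

```python
def false_negative_rate(tp: int, fn: int) -> float:
    """FNR = FN / (FN + TP): share of true target reads that were missed."""
    positives = tp + fn
    return fn / positives if positives else 0.0


def false_discovery_rate(tp: int, fp: int) -> float:
    """FDR = FP / (FP + TP): share of reads called target that are not."""
    called = tp + fp
    return fp / called if called else 0.0


# Hypothetical counts, not taken from the benchmark above.
tp, fn, fp = 990, 10, 5
fnr = false_negative_rate(tp, fn)
fdr = false_discovery_rate(tp, fp)
print(f"FNR = {fnr:.2e}, FDR = {fdr:.2e}")
```

An FDR of 0 in the table therefore means that no read from the contaminating organism was classified as target in that run.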