Massimiliano Orsini, Simone Carcangiu.
Abstract
BACKGROUND: Large-scale sequence studies requiring BLAST-based analysis produce huge amounts of data to be parsed. BLAST parsers are available, but they often lack important features, such as keeping all information from the raw BLAST output, allowing direct access to single results, and performing logical operations over them.
Year: 2013 PMID: 23363699 PMCID: PMC3571973 DOI: 10.1186/1751-0473-8-4
Source DB: PubMed Journal: Source Code Biol Med ISSN: 1751-0473
BlaSTorage comparative performances
| boulder (perl) | yes | yes | 213 ± 1.5 | - | - |
| Bio::SearchIO (perl) | no | yes | 27 ± 0.3 | 1495 ± 12 | 148506 |
| MuSeqBox (C++) | no | no | 0.73 ± 0.04 | 552.1 ± 1.5 | - |
| Zerg::Report | no | no | 1.47 ± 0.12 | 122.3 ± 0.8 | - |
| zerg (perl) | no | no | 0.36 ± 0.01 | 26.7 ± 0.16 | 341 ± 6 |
| Pyzerg | no | no | 0.35 ± 0.03 | 24.43 ± 0.3 | 312 ± 13 |
| Zerg C | no | no | 0.08 ± 0.0 | 2.01 ± 0.04 | 74 ± 7 |
*) blastp output, STD over five replicates
This table shows results for some of the most popular stand-alone BLAST parsers. The time taken (in seconds) to parse different blastp output files was measured in five separate runs on a laptop with 2 GB RAM and 2 CPUs (1.83 GHz Intel Centrino processor). Missing values refer to tests in which the program crashed or was terminated after 24 hours of unproductive work (extensive tests are reported in the supplementary material).
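As an illustration of how a "mean ± STD over five replicates" timing like those in the table could be obtained, here is a minimal Python sketch. It is not the benchmarking code used by the authors: the function names `time_replicates` and `format_cell`, the use of wall-clock time, and the choice of the sample standard deviation are all assumptions made for this example.

```python
import statistics
import time


def time_replicates(parse, path, replicates=5):
    """Time parse(path) several times; return (mean, sample STD) in seconds.

    `parse` is any callable taking a file path (hypothetical stand-in for a
    BLAST parser). Wall-clock time and sample STD are assumptions here.
    """
    durations = []
    for _ in range(replicates):
        start = time.perf_counter()
        parse(path)
        durations.append(time.perf_counter() - start)
    return statistics.mean(durations), statistics.stdev(durations)


def format_cell(mean, std):
    """Format a timing as a 'mean ± std' table cell with two decimals."""
    return f"{mean:.2f} ± {std:.2f}"
```

A parser wrapped in a function that takes the blastp output path can be passed in as `parse`; a run that crashes or exceeds a time limit would be reported as a missing value rather than timed.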