Abstract
BACKGROUND: The blat program is a widely used sequence alignment tool. It is especially useful for aligning long sequences and for gapped mapping, which other fast sequence mappers designed for short reads cannot perform properly. However, blat is single-threaded, and mapping whole-genome or whole-transcriptome sequences to reference genomes can take days, making it unsuitable for large-scale sequencing projects and iterative analysis. Here, we present pblat (parallel blat), a parallelized blat algorithm with multithread and cluster computing support, which rapidly fine-maps large-scale DNA/RNA sequences against genomes.
Keywords: Cluster computing; Genome annotation; Parallel computing; Sequence alignment
Year: 2019 PMID: 30646844 PMCID: PMC6334396 DOI: 10.1186/s12859-019-2597-8
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1 Performance evaluation of pblat. a Timing benchmarks of blat and pblat using different numbers of threads (from 2 to 64). Each time represents the mean of three independent executions performed with the same arguments and on the same machine. b Speedup of pblat with different numbers of threads, compared to blat
Fig. 2 Performance evaluation of pblat-cluster. a Timing benchmarks of pblat-cluster using different numbers of computing nodes (from 1 to 15), with 12 threads per node. Each time represents the mean of three independent executions performed with the same arguments and on the same cluster. b Speedup of pblat-cluster with different numbers of computing nodes, compared to pblat with one node and 12 threads
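The speedup values in panels b of both figures can be read as a ratio of mean wall-clock times. The following is a minimal sketch of that calculation, assuming speedup is defined as the mean baseline time divided by the mean parallel time over repeated runs; the function names and the timing values are hypothetical placeholders and are not taken from the paper.

```python
from statistics import mean


def speedup(baseline_runs, parallel_runs):
    """Return mean baseline wall time divided by mean parallel wall time."""
    if not baseline_runs or not parallel_runs:
        raise ValueError("each configuration needs at least one timing")
    return mean(baseline_runs) / mean(parallel_runs)


def parallel_efficiency(baseline_runs, parallel_runs, workers):
    """Return speedup per worker (1.0 would be perfect linear scaling)."""
    if workers < 1:
        raise ValueError("workers must be a positive integer")
    return speedup(baseline_runs, parallel_runs) / workers


if __name__ == "__main__":
    # Hypothetical placeholder timings in seconds, three runs per
    # configuration. These are NOT measurements from the paper.
    baseline = [100.0, 110.0, 90.0]   # e.g. the single-threaded reference
    parallel = [26.0, 24.0, 25.0]     # e.g. a run with 8 workers

    print("speedup:", speedup(baseline, parallel))
    print("efficiency:", parallel_efficiency(baseline, parallel, 8))
```

For Fig. 1 the baseline under this reading would be blat itself, and for Fig. 2 it would be pblat on one node with 12 threads, as the captions state.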