Sven Warris1,2, N Roshan N Timal3, Marcel Kempenaar1, Arne M Poortinga1, Henri van de Geest2, Ana L Varbanescu3, Jan-Peter Nap1,2.
Abstract
BACKGROUND: Our previously published CUDA-only application PaSWAS for Smith-Waterman (SW) sequence alignment of any type of sequence on NVIDIA-based GPUs is platform-specific and has therefore been adopted less widely than it could be. OpenCL is more widely supported and can be used on a variety of hardware platforms. Moreover, there is a need to promote the adoption of parallel computing in bioinformatics by making it simpler to use and extend, through more and better use of high-level languages commonly used in bioinformatics, such as Python.
Year: 2018 PMID: 29293576 PMCID: PMC5749749 DOI: 10.1371/journal.pone.0190279
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Table 1. Options in pyPaSWAS for selecting and filtering the alignments.
| Filter name | Value range | Default | SAM descriptor | Description |
|---|---|---|---|---|
| lower_limit_score | 0.0 < x <= 1.0 | 1.0 | | Allows for more hits per alignment. All hits with a score within this fraction of the maximum score found are reported. Used during the backtracing procedure to reduce the number of alignments to be processed. |
| minimum_score | 0 < x | 30 | AS:i: | Minimum score of an alignment. Used during the backtracing procedure to reduce the number of alignments to be processed. |
| filter_factor | 0.0 < x <= 1.0 | 0.2 | AS:i: | For each alignment the theoretical maximum score is calculated: the length of the shortest sequence times the maximum score for a match (i.e. the score for a perfect alignment). Only alignments with a score above filter_factor times this theoretical maximum score are returned. |
| query_coverage | 0.0 <= x <= 1.0 | 0.2 | QC:f: | Minimum fraction of the query covered in the alignment. |
| query_identity | 0.0 <= x <= 1.0 | 0.2 | QI:f: | Minimum fraction of matches relative to the query. |
| relative_score | 0.0 < x <= score match | 2.0 | RS:f: | Minimum score relative to the shortest sequence. A full match gives a relative score equal to the match score; for DNA/RNA sequences the default match score is 5.0. |
| base_score | 0.0 < x <= score match | 2.0 | BS:f: | Score of the alignment divided by the length of the alignment. |
*Filter name: all parameters available for filtering;
** value range: the boundaries for the settings of the corresponding parameter.
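
The score-based filters in the table can be illustrated with a short Python sketch. This is not pyPaSWAS code: the `Hit` class and all function names are hypothetical, and whether each threshold is inclusive or exclusive is an assumption based on the wording of the descriptions ("minimum" is read as `>=`, "above" as `>`). The default values are taken from the table, with a match score of 5.0 as given for DNA/RNA.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical container for one alignment (not a pyPaSWAS class)."""
    score: float
    query_length: int
    target_length: int
    alignment_length: int


def theoretical_max_score(hit: Hit, match_score: float) -> float:
    # Length of the shortest sequence times the score for a match,
    # i.e. the score of a perfect alignment.
    return min(hit.query_length, hit.target_length) * match_score


def relative_score(hit: Hit) -> float:
    # Alignment score relative to the length of the shortest sequence.
    return hit.score / min(hit.query_length, hit.target_length)


def base_score(hit: Hit) -> float:
    # Alignment score divided by the length of the alignment.
    return hit.score / hit.alignment_length


def passes_score_filters(
    hit: Hit,
    match_score: float = 5.0,
    minimum_score: float = 30,
    filter_factor: float = 0.2,
    min_relative_score: float = 2.0,
    min_base_score: float = 2.0,
) -> bool:
    # Assumption: "minimum" thresholds are inclusive, "above" is exclusive.
    return (
        hit.score >= minimum_score
        and hit.score > filter_factor * theoretical_max_score(hit, match_score)
        and relative_score(hit) >= min_relative_score
        and base_score(hit) >= min_base_score
    )


# Illustrative values only, not data from the paper.
good = Hit(score=400, query_length=100, target_length=500, alignment_length=100)
weak = Hit(score=40, query_length=100, target_length=500, alignment_length=10)
print(passes_score_filters(good), passes_score_filters(weak))
```

With the default settings, the first hit exceeds the filter_factor threshold (0.2 × 500 = 100), while the second hit does not, even though it passes minimum_score.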
Table 2. Configurations for testing the performance of pyPaSWAS.
| Configuration | Hardware | Parallel device | Code optimized for | Nr. of cores | Language | Time for 2720 alignments (s) | GCUPS | Speedup compared to F |
|---|---|---|---|---|---|---|---|---|
| A | Intel i7 | CPU | CPU | 1 | OpenCL | 119.2 | 0.70 | 0.21 |
| B | Intel i7 | CPU | CPU | 8 | OpenCL | 106.4 | 0.82 | 0.21 |
| C | Intel i7 | CPU | GPU | 1 | OpenCL | 812.6 | 0.10 | 0.03 |
| D | Intel i7 | CPU | GPU | 8 | OpenCL | 192.3 | 0.44 | 0.12 |
| E | NVIDIA GTX 1070 | GPU | GPU | 1920 | OpenCL | 57.8 | 1.48 | 0.36 |
| F | NVIDIA GTX 1070 | GPU | GPU | 1920 | CUDA | 17.6 | 4.64 | 1.00 |
*Configuration (F) is equivalent to the earlier PaSWAS [1], and is therefore used as reference here. The last two columns give the amount of time spent on the largest set of alignments in the performance analysis and the speedup compared to the configuration (F).
**GCUPS: giga cell updates per second.
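
As a rough illustration of the GCUPS metric, the sketch below counts one cell update per cell of the Smith-Waterman dynamic-programming matrix (query length times target length for each pair) and divides the total by the elapsed time. The function names and the example lengths are hypothetical; the exact accounting used for the values in the table is not stated here, so this shows the common definition rather than the authors' measurement code.

```python
def cell_updates(length_pairs):
    """Total number of matrix cells filled for (query_len, target_len) pairs."""
    return sum(query_len * target_len for query_len, target_len in length_pairs)


def gcups(length_pairs, seconds):
    """Giga cell updates per second for a batch of alignments."""
    return cell_updates(length_pairs) / seconds / 1e9


# Illustrative values only: 1000 alignments of 1000 x 1000 residues in 2 seconds.
pairs = [(1000, 1000)] * 1000
print(gcups(pairs, 2.0))
```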
Fig 1. Performance of six different configurations for pyPaSWAS in Smith-Waterman (SW) alignments.
The time required (Y-axis) for processing an incremental number of alignments (X-axis) is plotted. For details of the different configurations A-F see Table 2.