Marco S Nobile, Paolo Cazzaniga, Andrea Tangherloni, Daniela Besozzi.
Abstract
Several studies in Bioinformatics, Computational Biology and Systems Biology rely on the definition of physico-chemical or mathematical models of biological systems at different scales and levels of complexity, ranging from the interaction of atoms in single molecules up to genome-wide interaction networks. Traditional computational methods and software tools developed in these research fields share a common trait: they can be computationally demanding on Central Processing Units (CPUs), which limits their applicability in many circumstances. To overcome this issue, general-purpose Graphics Processing Units (GPUs) are gaining increasing attention from the scientific community, as they can considerably reduce the running time required by standard CPU-based software and allow more intensive investigations of biological systems. In this review, we present a collection of GPU tools recently developed to perform computational analyses in life science disciplines, emphasizing the advantages and drawbacks of using these parallel architectures. The complete list of GPU-powered tools reviewed here is available at http://bit.ly/gputools.
Keywords: CUDA; bioinformatics; computational biology; graphics processing units; high-performance computing; systems biology
Year: 2017 PMID: 27402792 PMCID: PMC5862309 DOI: 10.1093/bib/bbw058
Source DB: PubMed Journal: Brief Bioinform ISSN: 1467-5463 Impact factor: 11.622
High-performance computing architectures: advantages and drawbacks
| HPC type | Architecture | Advantages | Drawbacks | Computing paradigm |
|---|---|---|---|---|
| Computer cluster | Set of interconnected computers controlled by a centralized scheduler | Requires minimal changes to the existing source code of CPU programs, with the exception of possible modifications necessary for message passing | Expensive; characterized by significant energy consumption; requires maintenance | MIMD |
| Grid computing | Set of geographically distributed and logically organized (heterogeneous) computing resources | Require minimal changes to the existing source code of CPU programs, with the exception of possible modifications necessary for message passing | Generally based on ‘volunteering’: computer owners donate resources (e.g. computing power, storage) to a specific project; no guarantee about the availability of remote computers: some allocated tasks could never be processed and need to be reassigned; remote computers might not be completely trustworthy | MIMD |
| Cloud computing | Pool of computation resources (e.g. computers, storage) offered by private companies, attainable on demand and ubiquitously over the Internet | Mitigate some problems like the costs of the infrastructure and its maintenance | Data are stored on servers owned by private companies; issues of privacy, potential piracy, espionage, international legal conflicts, continuity of the service (e.g. owing to some malfunctioning, DDoS attacks, or Internet connection problems) | MIMD |
| GPU | Dedicated parallel co-processor, formerly devoted to real-time rendering of computer graphics, nowadays present in every common computer | The high number of programmable computing units allows the execution of thousands of simultaneous threads. Availability of high-performance local memories | Based on a modified SIMD computing paradigm: conditional branches imply serialization of threads’ execution. The GPU’s peculiar architecture generally requires code rewriting and algorithm redesign | SIMD (although temporary divergence is allowed) |
| MIC | Dedicated parallel co-processor installable in common desktop computers, workstations and servers | Similar to GPUs but based on the conventional x86 instruction set: existing CPU code, in principle, might be ported without any modification. All cores are independent | Fewer cores than the latest GPUs. To achieve GPU-like performance, modifications of existing CPU code to exploit vector instructions are required | MIMD |
| FPGA | Integrated circuits containing an array of programmable logic blocks | Able to implement a digital circuit, which directly performs purpose-specific tasks (unlike general-purpose software tools). Such tasks are executed on a dedicated hardware without any computational overhead (e.g. those related to the operating system) | Generally programmed using a descriptive language (e.g. VHDL, Verilog [ | Dedicated hardware |
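The GPU drawback listed in the table above (conditional branches imply serialization of the threads' execution) can be illustrated with a toy cost model. The following Python sketch is not GPU code and is not taken from the review: the function name `warp_cost`, the group size of 32 threads and the branch costs are assumptions chosen purely for illustration. It models a group of threads running in lock-step: when all threads take the same branch only that branch is executed, whereas a divergent group pays for both branches one after the other.

```python
def warp_cost(takes_if, cost_if, cost_else):
    """Toy lock-step cost model for one group of threads (a 'warp').

    takes_if  : list of booleans, one per thread; True means the thread
                takes the 'if' branch, False means the 'else' branch.
    cost_if   : hypothetical cost (arbitrary cycles) of the 'if' branch.
    cost_else : hypothetical cost (arbitrary cycles) of the 'else' branch.
    """
    any_if = any(takes_if)
    any_else = not all(takes_if)
    # Each branch needed by at least one thread is executed once by the
    # whole group, while the threads on the other branch stay idle.
    return (cost_if if any_if else 0) + (cost_else if any_else else 0)


WARP_SIZE = 32  # assumed group size, for illustration only

uniform = [True] * WARP_SIZE                        # all threads agree
divergent = [i % 2 == 0 for i in range(WARP_SIZE)]  # threads alternate

print(warp_cost(uniform, cost_if=10, cost_else=25))    # 10
print(warp_cost(divergent, cost_if=10, cost_else=25))  # 35
```

In this simplified model the divergent group costs as much as both branches combined, which is why algorithms are often redesigned so that threads in the same group follow the same control path.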
GPU-powered tools for sequence alignment, along with the speed-up achieved and the solutions used for code parallelization
| Sequence alignment | Tool name | Speed-up | Parallel solution | Reference |
|---|---|---|---|---|
| Sequence alignment based on BWT | BarraCUDA | – | GPU | [ |
| Sequence alignment based on BWT | CUSHAWGPU | – | GPU | [ |
| Sequence alignment based on BWT | GPU-BWT | – | GPU | [ |
| Sequence alignment based on BWT | SOAP3 | – | CPU-GPU | [ |
| Sequence alignment based on hash table | SARUMAN | – | GPU | [ |
| Sequence alignment with gaps based on BWT | SOAP3-dp | – | CPU-GPU | [ |
| Tool to map SNP exploiting SOAP3-dp | G-SNPM | – | CPU-GPU | [ |
| Sequence alignment exploiting SOAP3-dp | G-CNV | 18× | CPU-GPU | [ |
| Alignment of gapped short reads with Bowtie2 algorithm | nvBowtie | 8× | GPU | [ |
| Alignment of gapped short reads with Bowtie2 algorithm | MaxSSmap | – | GPU | [ |
| Reads assembly exploiting the de Bruijn approach | GPU-Euler | 5× | GPU | [ |
| Reads assembly exploiting the de Bruijn approach | MEGAHIT | 2× | GPU | [ |
| Sequence alignment (against database) tool | – | 2× | GPU | [ |
| Sequence alignment (against database) tool | CUDA-BLASTP | 6× | GPU | [ |
| Sequence alignment (against database) tool | G-BLASTN | 14.8× | GPU | [ |
| Sequence alignment with Smith-Waterman method | SW# | – | GPU | [ |
| Sequence alignment based on suffix tree | MUMmerGPU 2.0 | 4× | GPU | [ |
| Sequence similarity detection | GPU CAST | 10× | GPU | [ |
| Sequence similarity detection based on profiled Hidden Markov Models | CUDAMPF | 11–37× | GPU | [ |
| Multiple sequence alignment with Clustal | CUDAClustal | 2× | GPU | [ |
| Multiple sequence alignment with Clustal | GPU-REMuSiC | – | GPU | [ |
GPU-powered tools for molecular dynamics, along with the speed-up achieved and the solutions used for code parallelization
| Molecular dynamics | Tool name | Speed-up | Parallel solution | Reference |
|---|---|---|---|---|
| Non-bonded short-range interactions | – | 11× | GPU | [ |
| Explicit solvent using the particle mesh Ewald scheme for the long-range electrostatic interactions | – | 2–5× | GPU | [ |
| Non-Ewald scheme for long-range electrostatic interactions | – | 100× | multi-GPU | [ |
| Standard covalent and non-covalent interactions with implicit solvent | OpenMM | – | GPU | [ |
| Non-bonded and bonded interactions, charge equilibration procedure | PuReMD | 16× | GPU | [ |
| Energy conservation for explicit solvent models | MOIL-opt | 10× | CPU-GPU | [ |
| Electrostatics and generalized Born implicit solvent model | LTMD | 5.8× | CPU-GPU | [ |
GPU-powered tools for molecular docking, along with the speed-up achieved and the solutions used for code parallelization
| Molecular docking | Tool name | Speed-up | Parallel solution | Reference |
|---|---|---|---|---|
| | – | 45× | CPU-GPU | [ |
| Conformation generation and scoring function for rigid and flexible molecules | – | 50× | CPU-GPU | [ |
| High accuracy flexible molecular docking with differential evolution | MolDock | 27.4× | GPU | [ |
| Large-scale protein structure alignment | ppsAlign | 39× | CPU-GPU | [ |
| Protein-DNA docking with Monte Carlo simulation and simulated annealing | – | 28× | GPU | [ |
| Katchalski-Katzir algorithm with traditional Fast Fourier transform rigid-docking scheme | MEGADOCK | – | GPU | [ |
| Docking approach using Ray Casting | – | 27× | CPU-GPU | [ |
GPU-powered tools to predict molecular structures, along with the speed-up achieved and the solutions used for code parallelization
| Prediction and searching of molecular structures | Tool name | Speed-up | Parallel solution | Reference |
|---|---|---|---|---|
| RNA secondary structure with dynamic programming | – | 17× | GPU | [ |
| RNA secondary structure with Zuker algorithm | – | 6.75–15.93× | CPU-GPU | [ |
| Molecular distance geometry problem with a memetic algorithm | memHPG | – | CPU-GPU | [ |
| Protein alignment | GPU-CASSERT | 180× | GPU | [ |
| Protein alignment based on Simulated Annealing | – | – | GPU | [ |
GPU-powered tools for dynamic simulation, along with the speed-up achieved and the solutions used for code parallelization
| Simulation of the spatio-temporal dynamics and applications in Systems Biology | Tool name | Speed-up | Parallel solution | Reference |
|---|---|---|---|---|
| Coarse-grain deterministic simulation with Euler method | – | 63× | GPU | [ |
| Coarse-grain deterministic simulation with LSODA | cupSODA | 86× | GPU | [ |
| Coarse-grain deterministic and stochastic simulation with LSODA and SSA | cuda-sim | 47× | GPU | [ |
| Coarse-grain stochastic simulation with SSA (with CUDA implementation of Mersenne-Twister RNG) | – | 50× | GPU | [ |
| Coarse- and fine-grain stochastic simulation with SSA | – | 130× | GPU | [ |
| Coarse-grain stochastic simulation with SSA | – | – | GPU | [ |
| Fine-grain stochastic simulation of large scale models with SSA | GPU-ODM | – | GPU | [ |
| Fine-grain stochastic simulation with τ-leaping | – | 60× | GPU | [ |
| Coarse-grain stochastic simulation with τ-leaping | cuTauLeaping | 1000× | GPU | [ |
| RD simulation with SSA | – | – | GPU | [ |
| Spatial τ-leaping simulation for crowded compartments | STAUCC | 24× | GPU | [ |
| Particle-based methods for crowded compartments | – | 200× | GPU | [ |
| Particle-based methods for crowded compartments | – | 135× | GPU | [ |
| ABM for cellular level dynamics | FLAME | – | GPU | [ |
| ABM for cellular level dynamics | – | 100× | GPU | [ |
| Coarse-grain deterministic simulation of blood coagulation cascade | coagSODA | 181× | GPU | [ |
| Simulation of large-scale models with LSODA | cupSODA*L | – | GPU | [ |
| Parameter estimation with multi-swarm PSO | – | 24× | GPU | [ |
| Reverse engineering with Cartesian Genetic Programming | cuRE | – | GPU | [ |
| Parameter estimation and model selection with approximate Bayesian computation | ABC-SysBio | – | GPU | [ |
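Several entries in the table above combine the stochastic simulation algorithm (SSA) with a coarse-grain strategy. As these terms are commonly used (an assumption here, since this section does not define them), coarse-grain means that many independent simulations run in parallel, one per thread, while fine-grain means that the calculations of a single simulation are distributed over threads. The Python sketch below shows the structure of a coarse-grain run for a hypothetical birth-death model; the model, the rate constants and the function names are illustrative and are not taken from any of the listed tools. The simulations run sequentially on the CPU here, whereas on a GPU each one would be assigned to its own thread.

```python
import random


def ssa_birth_death(k_birth, k_death, x0, t_end, seed):
    """Gillespie direct method for: 0 -> X (k_birth), X -> 0 (k_death * X)."""
    rng = random.Random(seed)  # independent random stream per simulation
    t, x = 0.0, x0
    while True:
        a_birth = k_birth
        a_death = k_death * x
        a_total = a_birth + a_death
        if a_total == 0.0:
            break  # no reaction can fire
        # The waiting time to the next reaction is exponentially distributed.
        t += rng.expovariate(a_total)
        if t > t_end:
            break
        # Choose which reaction fires, proportionally to its propensity.
        if rng.random() * a_total < a_birth:
            x += 1
        else:
            x -= 1
    return x


def run_coarse_grain(n_sims, **params):
    # One independent simulation per (hypothetical) thread; the simulation
    # index doubles as the seed so that every run is reproducible.
    return [ssa_birth_death(seed=i, **params) for i in range(n_sims)]


results = run_coarse_grain(100, k_birth=10.0, k_death=0.1, x0=0, t_end=50.0)
# At long times the sample mean should be close to k_birth / k_death = 100.
print(sum(results) / len(results))
```

Because the simulations share no state, this pattern parallelizes naturally, although threads whose simulations take different numbers of steps may still finish at different times.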
Figure 1. With advances in manufacturing processes, the architectural features of both CPUs (red dots) and GPUs (green squares) continuously improve. This figure shows the trends for both architectures by comparing the following characteristics: (A) the performance, in GFLOPS, when performing double-precision floating-point operations; (B) the power consumption; (C) the GPWR; (D) the number of cores per unit; (E) the core working frequencies. The GPUs considered in this figure are reported in Supplementary File 3, while the CPUs are the Intel Core i7 processors released in the same years (namely, from the Westmere up to the Haswell microarchitectures). A colour version of this figure is available at BIB online: https://academic.oup.com/bib.