Omid Abbaszadeh, Ali Reza Khanteymoori, Ali Azarpeyvand.
Abstract
Systems biology problems such as whole-genome network construction from large-scale gene expression data are complex and time-consuming, so sequential algorithms cannot produce a solution in an acceptable amount of time. Massively parallel computing now makes it possible to infer large-scale gene regulatory networks, and establishing such networks from large-scale datasets has recently drawn considerable attention from researchers in parallel computing and systems biology. In this paper, we provide a detailed overview of recent parallel algorithms for constructing gene regulatory networks. First, we present the fundamentals of gene regulatory network inference and the challenges posed by large-scale datasets. Second, we describe four parallel frameworks and libraries: CUDA, OpenMP, MPI, and Hadoop. Third, we review parallel algorithms. Finally, we offer conclusions and guidelines for parallel reverse engineering.
Keywords: CUDA; Gene regulatory network; Hadoop; MPI; OpenMP; Parallel algorithms; Parallel processing; Reverse engineering
Year: 2018 PMID: 30386172 PMCID: PMC6194435 DOI: 10.2174/1389202919666180601081718
Source DB: PubMed Journal: Curr Genomics ISSN: 1389-2029 Impact factor: 2.236
Parallel framework comparison.
| Framework | | | | | | |
|---|---|---|---|---|---|---|
| CUDA | SIMD | Fair | C/C++ | Moderately | More | Low |
| OpenMP | Multi-thread | Low | Most Languages | Easy | Few | Low |
| MPI | SIMD/MIMD | Fair | Most Languages | Poor | More | Medium |
| Hadoop | Distributed | High | Java | Poor | More | High |
Note: Framework complexity refers to the difficulty in using different frameworks.
Some related libraries and projects on CUDA, MPI, OpenMP, and Hadoop.
| Library or project | Description | |
|---|---|---|
| Spark | An open-source cluster-computing framework on Hadoop | |
| Pig | A query language based on Hadoop for basic calculations over large datasets | |
| Mahout | A distributed machine learning and data mining library on Hadoop | |
| OpenMPI | Most used implementation of the MPI model. Open MPI 1.7 and later is CUDA-aware | |
| MVAPICH | CUDA-aware MPI implementation. It helps to run CUDA+MPI | |
| Mars | A Map-Reduce framework on graphics processors | |
| cuBLAS | An implementation of the basic linear algebra subprograms (BLAS) on the CUDA framework | |
| JCUDA | Java bindings for CUDA libraries. It helps to run Hadoop map tasks on GPUs | |
| omp4j | An OpenMP-like library for the Java programming language | |
| mpi4py | A library for MPI programming in Python | |
| PyCUDA | A library for integrating CUDA in Python | |
Advantages and disadvantages of computational methods.
| Method | Advantages | Disadvantages |
|---|---|---|
| Bayesian network | • Facilitate the incorporation of prior knowledge and experimental data | • Feedback regulations not allowed |
| Information theory | • Easy to parallelize | • Can have a high rate of false positives in high dimensional data |
| Differential equation | • Suitable for time series and steady-state data | • Difficult to find optimal parameter values |
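The "easy to parallelize" entry for information-theoretic methods can be illustrated with a small sketch: the mutual information of each gene pair depends only on that pair's expression profiles, so the pairs can be scored independently by separate workers. The code below is a minimal illustration under stated assumptions, not the method of any tool listed in this review. The function names `mutual_information` and `infer_edges`, the toy discretized expression data, and the threshold value are all hypothetical. A thread pool is used only to show the decomposition; in CPython, a process pool, MPI, or a GPU kernel would be needed for a real speedup.

```python
import math
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def mutual_information(x, y):
    """Plug-in estimate of mutual information (in bits) for two discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    # I(X;Y) = sum over observed (a, b) of p(a,b) * log2( p(a,b) / (p(a) * p(b)) )
    return sum(
        (c / n) * math.log2(c * n / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )


def infer_edges(expr, threshold, workers=4):
    """Score all gene pairs independently and keep those at or above the threshold."""
    pairs = list(combinations(sorted(expr), 2))

    def score(pair):
        g, h = pair
        return mutual_information(expr[g], expr[h])

    # Each pair is an independent task, which is what makes the method
    # straightforward to distribute across threads, processes, or nodes.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = list(pool.map(score, pairs))
    return {pair: s for pair, s in zip(pairs, scores) if s >= threshold}


# Hypothetical, already-discretized expression profiles (one list per gene).
expr = {
    "g1": [0, 0, 1, 1, 0, 0, 1, 1],
    "g2": [0, 0, 1, 1, 0, 0, 1, 1],  # identical to g1
    "g3": [0, 1, 0, 1, 0, 1, 0, 1],  # statistically independent of g1 in this sample
}
edges = infer_edges(expr, threshold=0.5)
print(edges)
```

In this toy input only the pair (g1, g2) shares information, so it is the only candidate edge retained. The threshold step is also where the false-positive risk noted in the table arises, since finite-sample estimates for unrelated genes are rarely exactly zero on real data.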
Parallel GRN inference algorithms.
| Reference | Data type | Method | Framework | Hardware | | |
|---|---|---|---|---|---|---|
| [ | Discrete | Information Theory | CUDA | GPU | Known as CUDA-MI | |
| [ | Continuous | Information Theory | CUDA | GPU | Known as FastGCN | |
| [ | Discrete | Information Theory | CUDA | GPU | Known as CMIP | |
| [ | Continuous | Differential Equation | CUDA | GPU | - | - |
| [ | Discrete | Information Theory | CUDA-OpenMP | GPU | - | - |
| [ | Continuous | Information Theory | OpenMP | - | Known as PARMIGENE | |
| [ | Continuous | Differential Equation | MPI | - | - | - |
| [ | Continuous | Differential Equation | MPI | - | - | Known as LSGPA |
| [ | Continuous | Bayesian Network | MPI | - | Known as fastBMA | |
| [ | Continuous | B-S-L1 | MPI | - | Known as SiGN | |
| [ | Discrete | Information Theory | MPI | Intel Xeon Phi | - | Based on TINGe |
| [ | Discrete | Bayesian Network | MPI | Intel Xeon/ Intel Xeon Phi | - | - |
| [ | Continuous | Differential Equation | MPI | Intel Xeon | - | Known as DEEP |
| [ | Continuous | Information Theory | MPI | - | Known as TINGe | |
| [ | Discrete | Information Theory | MPI | Intel Xeon | - | - |
| [ | Continuous | Differential Equation | MPI | - | - | Known as Parallel NIR |
| [ | Discrete | Bayesian Network | MPI | Intel Xeon | - | |
| [ | Discrete | Bayesian Network | MPI | Cray AMD | - | - |
| [ | Continuous | Differential Equation | Hadoop | - | - | - |
| [ | Continuous | Information Theory | Hadoop | - | - | - |
| [ | Continuous | - | - | Known as LegumeGRN |
1 B-S-L: Bayesian Network, State Space Model, L1-regularization
2 Software that implements multiple well-known reverse engineering algorithms
3 https://sites.google.com/site/liuweiguohome/cuda-mi
4 http://ibi.zju.edu.cn/software/FastGCN/
5 http://www.picb.ac.cn/CMIP/
6 https://cran.r-project.org/web/packages/parmigene/index.html
7 https://github.com/lhhunghimself/fastBMA (fastBMA is a parallel implementation of ScanBMA)
8 https://www.bioconductor.org/
9 http://sign.hgc.jp/
10 http://aluru-sun.ece.iastate.edu/doku.php?id=tinge_gena
11 http://bonsai.hgc.jp/~tamada/hgc/suppl/GWGN/index.html
12 https://legumegrn.noble.org/cc.html