Meimei Liang, Futao Zhang, Gulei Jin, Jun Zhu.
Abstract
Gene co-expression networks are a valuable type of biological network. Many methods and tools have been published to construct them; however, most are inconvenient and time-consuming for large datasets. We have developed a user-friendly, accelerated and optimized tool for constructing gene co-expression networks that can fully harness the parallel nature of GPU (Graphics Processing Unit) architectures. In the raw data preprocessing step, genetic entropies were used to filter out genes with little or no change in expression. Pearson correlation coefficients were then calculated, normalized, and tested, with the False Discovery Rate used to control for multiple testing. Finally, module identification was conducted to construct the co-expression networks. All of these calculations were implemented on a GPU. We also compressed the coefficient matrix to save space. We compared the performance of the GPU implementation with that of a multi-core CPU implementation with 16 CPU threads, a single-thread C/C++ implementation and a single-thread R implementation. Our results show that the GPU implementation largely outperforms the single-thread C/C++ and single-thread R implementations, and that it outperforms the multi-core CPU implementation as the number of genes increases. On a test dataset containing 16,000 genes and 590 individuals, the GPU implementation ran more than 63 times faster than the single-thread R implementation when 50 percent of genes were filtered out, and about 80 times faster when no genes were filtered out.
Year: 2015 PMID: 25602758 PMCID: PMC4300192 DOI: 10.1371/journal.pone.0116776
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
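The pipeline outlined in the abstract (entropy filter, Pearson correlation, normalization, FDR control) can be sketched on the CPU in plain Python. This is only an illustrative sketch, not the authors' GPU code: the abstract does not specify the entropy estimator, the normalization, or the FDR procedure, so the histogram-based Shannon entropy, the Fisher z-transform, and the Benjamini-Hochberg step-up rule used here are assumptions, as are all function names and the toy data. Module identification and matrix compression are omitted.

```python
import math

def shannon_entropy(values, bins=10):
    """Shannon entropy (bits) of one gene's expression, via equal-width bins (assumed estimator)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0  # constant expression carries no information
    counts = [0] * bins
    for v in values:
        counts[min(int((v - lo) / (hi - lo) * bins), bins - 1)] += 1
    n = len(values)
    return -sum(c / n * math.log2(c / n) for c in counts if c)

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # undefined for constant input; treat as uncorrelated
    return sxy / math.sqrt(sxx * syy)

def fisher_p_value(r, n):
    """Two-sided p-value for r via the Fisher z-transform (assumed normalization; needs n > 3)."""
    r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))

def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up rule: True where a test is significant at FDR alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= alpha * rank / m:
            cutoff = rank  # largest rank that passes its threshold
    keep = [False] * m
    for i in order[:cutoff]:
        keep[i] = True
    return keep

def coexpression_edges(expr, keep_fraction=0.5, alpha=0.05):
    """expr maps gene name -> expression values, one per individual."""
    # 1. Entropy filter: keep the highest-entropy fraction of genes.
    ranked = sorted(expr, key=lambda g: shannon_entropy(expr[g]), reverse=True)
    genes = ranked[:max(2, int(len(ranked) * keep_fraction))]
    n = len(expr[genes[0]])
    # 2. Pearson correlation for every gene pair (upper triangle only).
    pairs = [(genes[i], genes[j])
             for i in range(len(genes)) for j in range(i + 1, len(genes))]
    rs = [pearson(expr[a], expr[b]) for a, b in pairs]
    # 3. Normalize, test, and control the false discovery rate.
    significant = benjamini_hochberg([fisher_p_value(r, n) for r in rs], alpha)
    return [(a, b, r) for (a, b), r, s in zip(pairs, rs, significant) if s]

# Hypothetical toy data: 4 genes measured in 20 individuals.
toy = {
    "A": [float(i) for i in range(20)],
    "B": [2.0 * i + (0.5 if i % 2 else -0.5) for i in range(20)],  # tracks A closely
    "C": [5.0] * 20,                                               # no expression change
    "D": [float((7 * i) % 5) for i in range(20)],                  # weakly related to A
}
edges = coexpression_edges(toy, keep_fraction=0.75)
print(edges)
```

Only the upper triangle of the correlation matrix is computed because Pearson correlation is symmetric; the same observation is one plausible way to halve the storage of the coefficient matrix.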
Running time (in seconds) for the GPU implementation, multi-core CPU implementation, single-thread C/C++ implementation and single-thread R implementation when 50% of the genes were filtered out during the data preprocessing stage (number of individuals = 590).
| | | | | | |
|---|---|---|---|---|---|
| | 0.624 | 0.094 | 0.375 | 1.622 | 1.295 |
| | 0.671 | 0.218 | 1.138 | 5.020 | 5.101 |
| | 0.811 | 0.39 | 2.403 | 14.102 | 15.163 |
| | 0.967 | 0.655 | 4.447 | 25.032 | 25.740 |
| | 1.202 | 0.889 | 7.301 | 41.769 | 42.260 |
| | 1.388 | 1.419 | 10.811 | 69.685 | 69.420 |
| | 1.747 | 1.965 | 14.633 | 94.731 | 95.220 |
| | 2.122 | 3.010 | 20.369 | 130.627 | 134.100 |
Multi-core CPU version ran on 16 CPU threads running on 16 CPU cores.
Running time (in seconds) for the GPU implementation, multi-core CPU implementation, single-thread C/C++ implementation and single-thread R implementation when no genes were filtered out during the data preprocessing stage (number of individuals = 590).
| | | | | | |
|---|---|---|---|---|---|
| | 0.655 | 0.218 | 1.295 | 4.529 | 3.947 |
| | 0.858 | 0.592 | 4.384 | 24.250 | 24.020 |
| | 1.295 | 1.591 | 10.311 | 63.971 | 65.400 |
| | 1.950 | 2.902 | 19.906 | 127.037 | 131.148 |
| | 3.089 | 4.68 | 31.31 | 220.112 | 221.040 |
| | 4.321 | 8.236 | 46.863 | 322.876 | 326.34 |
| | 6.771 | 10.968 | 72.758 | 480.106 | 484.56 |
| | 8.003 | 15.600 | 85.597 | 618.084 | 632.04 |
Multi-core CPU version ran on 16 CPU threads running on 16 CPU cores.
Figure 1. Curves of speedups against the single-thread R CPU implementation.
(a) Speedup curves when 50% of genes were filtered out at the data preprocessing stage. (b) Speedup curves when no genes were filtered out at the data preprocessing stage.
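A speedup here is the single-thread R running time divided by the running time of the implementation being compared. The short sketch below recomputes the two headline speedups from the last row of each running-time table. It assumes that the first timing value in a row is the GPU time and the last is the single-thread R time; the column headers are not preserved in this record, so that assignment is an assumption, although it reproduces the figures quoted in the abstract (more than 63 times and about 80 times).

```python
def speedup(baseline_seconds, implementation_seconds):
    """Ratio of baseline running time to the compared implementation's running time."""
    return baseline_seconds / implementation_seconds

# Last row of each table. ASSUMPTION: first value = GPU, last value = single-thread R.
filtered_50 = {"gpu": 2.122, "r": 134.100}   # 50% of genes filtered out
unfiltered = {"gpu": 8.003, "r": 632.04}     # no genes filtered out

speedup_filtered = speedup(filtered_50["r"], filtered_50["gpu"])
speedup_unfiltered = speedup(unfiltered["r"], unfiltered["gpu"])

print(round(speedup_filtered, 1))    # 63.2
print(round(speedup_unfiltered, 1))  # 79.0
```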
Figure 2. The co-expression network reconstructed by FastGCN using BRCA expression data with HTLV-I infection pathway.