Sungyoung Lee, Min-Seok Kwon, Taesung Park.
Abstract
In genome-wide association studies (GWAS), regression analysis is the method most commonly used to establish an association between a phenotype and genetic variants such as single nucleotide polymorphisms (SNPs). However, most applications of regression analysis have been restricted to single-marker tests because of the large computational burden. As a result, regression analysis has seen limited application to multiple SNPs, including gene-gene interaction (GGI), in large-scale GWAS data. To overcome this limitation, we propose CARAT-GxG, a toolkit oriented to GPU computing systems that performs regression analysis with GGI using CUDA (compute unified device architecture). Compared to other methods, CARAT-GxG ran almost 700-fold faster and delivered highly reliable results through our GPU-specific optimization techniques. In addition, almost-linear speed acceleration was achieved with the application of a GPU computing system implemented with the TORQUE Resource Manager. We expect that CARAT-GxG will enable large-scale regression analysis with GGI for GWAS data.
Keywords: GPU; GWAS; gene–gene interaction; graphics processing unit; logistic regression
Year: 2014 PMID: 25574130 PMCID: PMC4263399 DOI: 10.4137/CIN.S16349
Source DB: PubMed Journal: Cancer Inform ISSN: 1176-9351
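To illustrate what a single two-way test of the kind described in the abstract involves, the following is a minimal CPU-side Python sketch; it is not CARAT-GxG's CUDA implementation. It assumes an additive 0/1/2 genotype coding, a binary phenotype, and a logistic model with one multiplicative interaction term fitted by Newton-Raphson. The function names `solve`, `fit_pair`, and `scan_pairs` and the small ridge term are hypothetical choices made for this example.

```python
import math
from itertools import combinations

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit_pair(g1, g2, y, iters=25, ridge=1e-8):
    """Fit logit P(y=1) = b0 + b1*g1 + b2*g2 + b3*g1*g2 by Newton-Raphson.

    g1, g2: genotypes coded 0/1/2 (assumed coding); y: 0/1 phenotype.
    ridge: tiny diagonal term (an assumption of this sketch) that keeps
    the Hessian invertible when a SNP shows no variation.
    """
    rows = [[1.0, a, b, a * b] for a, b in zip(g1, g2)]
    beta = [0.0] * 4
    for _ in range(iters):
        grad = [0.0] * 4
        hess = [[ridge if i == j else 0.0 for j in range(4)] for i in range(4)]
        for x, yi in zip(rows, y):
            eta = sum(bj * xj for bj, xj in zip(beta, x))
            p = 1.0 / (1.0 + math.exp(-eta))
            w = p * (1.0 - p)
            for i in range(4):
                grad[i] += (yi - p) * x[i]
                for j in range(4):
                    hess[i][j] += w * x[i] * x[j]
        step = solve(hess, grad)
        beta = [bj + sj for bj, sj in zip(beta, step)]
        if max(abs(s) for s in step) < 1e-8:
            break
    return beta

def scan_pairs(genotypes, y):
    """Fit every two-way SNP combination; return {(i, j): interaction coefficient b3}."""
    return {(i, j): fit_pair(genotypes[i], genotypes[j], y)[3]
            for i, j in combinations(range(len(genotypes)), 2)}
```

Each pair is fitted independently of the others, and an exhaustive scan over m SNPs needs m(m-1)/2 such fits, which is why the workload suits parallel hardware.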
Figure 1. Execution flow of CARAT-GxG. The blue arrow indicates the execution flow between CPU and GPU. The upper and lower sides show the tasks of the CPU and GPU, respectively. Data transfer between CPU and GPU is illustrated by an arrow with a description. Tasks shown at the same time point can be executed concurrently.
Figure 2. Results of CARAT-GxG performance assessment. (A) The dotted line indicates the theoretical acceleration folds by adding a graphics card. The solid line indicates measured acceleration folds in two against one (green) and three against one (blue) graphics cards. (B) Execution time of CARAT-GxG and CPU implementations in a single SNP test. (C) Execution time of CARAT-GxG according to the number of threads and blocks with the dataset including 1,000 samples with 500 SNPs.
Execution times of CARAT-GxG and the CPU implementation in two-way testing (unit: seconds).
| SAMPLES (rows) / SNPs (columns) | 100 | 300 | 500 | 1K | 3K | 5K |
|---|---|---|---|---|---|---|
| 100 | 0.6239 (75.2351) | 3.3846 (771.0163) | 9.5516 (3147.0682) | 38.1707 (16763.22) | 327.1889 (465504.78) | 976.2125 (3636110.1325) |
| 300 | 1.564 (98.3813) | 9.1754 (980.2416) | 24.2124 (3420.8945) | 102.0229 (28763.7075) | 883.1818 (965256.6405) | 2606.218 (3908843.075) |
| 500 | 2.5504 (143.2877) | 15.4253 (2030.7632) | 44.5972 (6609.8788) | 168.7734 (40383.0765) | 1644.9804 (1008473.73) | 4593.415 (5729728.825) |
| 1,000 | 4.9348 (260.6621) | 30.0753 (2732.666) | 84.1235 (8333.9238) | 327.5836 (42924.5325) | 2995.5481 (1052311.613) | 8662.1345 (5924214.92) |
Notes: All times are in seconds. In each cell, the first value is the execution time of CARAT-GxG and the value in parentheses is the execution time of the CPU implementation.
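The acceleration implied by the table can be checked by dividing each CPU time by the corresponding CARAT-GxG time. The short Python sketch below does this for the 1,000-sample row, with the values copied from the table; the variable names and the `speedup` helper are illustrative only. For the largest configuration (1,000 samples, 5K SNPs) the ratio comes out at roughly 684-fold, in line with the "almost 700-fold" figure quoted in the abstract.

```python
# Values copied from the 1,000-sample row of the table above (seconds).
snps = [100, 300, 500, 1000, 3000, 5000]
gpu_s = [4.9348, 30.0753, 84.1235, 327.5836, 2995.5481, 8662.1345]
cpu_s = [260.6621, 2732.666, 8333.9238, 42924.5325, 1052311.613, 5924214.92]

def speedup(cpu_time, gpu_time):
    """Acceleration fold: CPU time divided by GPU time."""
    return cpu_time / gpu_time

def n_pairs(m):
    """Number of two-way SNP combinations among m SNPs."""
    return m * (m - 1) // 2

folds = [speedup(c, g) for c, g in zip(cpu_s, gpu_s)]

for m, g, c, f in zip(snps, gpu_s, cpu_s, folds):
    print(f"{m:>5} SNPs  {n_pairs(m):>10} pairs  "
          f"GPU {g:>10.2f} s  CPU {c:>12.2f} s  {f:7.1f}-fold")
```

In this row the fold increases with the number of SNPs, so the advantage is greatest on the largest pair counts.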
Figure 3. The red, blue, and green lines indicate the number of combinations that change in rank, vanish, and do not change in rank, respectively, as the number of iterations increases.
The top five significant two-way combinations from the AMD data.
| COMBINATION | P-VALUE | PAPER |
|---|---|---|
| rs994542, rs9298846 | 2.95 × 10⁻¹⁴ | [20] |
| rs380390, rs2402053 | 1.09 × 10⁻¹³ | [19, 21] |
| rs380390, rs3775640 | 4.37 × 10⁻¹³ | [19] |
| rs380390, rs10511130 | 6.25 × 10⁻¹³ | [19] |
| rs380390, rs2125743 | 7.42 × 10⁻¹³ | [19] |