Abstract
The importance of haplotype association and gene-environment interactions (GxE) in the context of rare variants has been underlined in a voluminous literature. Recently, software based on the logistic Bayesian LASSO (LBL), called LBL-GxE, was proposed for detecting GxE, where G is a rare (or common) haplotype variant (rHTV). However, it required relatively long computation times and could handle only one environmental covariate with two levels. Here we propose an improved version of LBL-GxE, which is not only computationally faster but can also handle multiple covariates, each with multiple levels. We also discuss details of the software, including input, output, and some options. We apply LBL-GxE to a lung cancer dataset and find a rare haplotype with a protective effect for current smokers. Our results indicate that LBL-GxE, especially with the improvements proposed here, is a useful and computationally viable tool for investigating rare haplotype interactions.
Keywords: GWAS; GxE; MCMC; logistic Bayesian LASSO; rHTV; rare variants; retrospective likelihood
Year: 2015 PMID: 25733797 PMCID: PMC4332044 DOI: 10.4137/CIN.S17290
Source DB: PubMed Journal: Cancer Inform ISSN: 1176-9351
Analysis of lung cancer data. The haplotype frequency estimates are obtained using Hapassoc.
| TYPE | OVERALL FREQ | CASE FREQ | CONTROL FREQ | OR | BF |
|---|---|---|---|---|---|
| AT | 0.0024 | 0.0026 | 0.0022 | 1.17 | 0.48 |
| GC | 0.0943 | 0.0938 | 0.0948 | 0.91 | 0.12 |
| GT | 0.0960 | 0.0899 | 0.1019 | 0.95 | 0.10 |
| Former smoker | 0.4300 | 0.4600 | 0.4000 | 3.53 | >100 |
| Current smoker | 0.4000 | 0.4600 | 0.3500 | 4.18 | >100 |
| AT X former smoker | – | – | – | 1.41 | 0.61 |
| GC X former smoker | – | – | – | 1.14 | 0.17 |
| GT X former smoker | – | – | – | 0.88 | 0.18 |
| AT X current smoker | – | – | – | 0.25 | 3.28 |
| GC X current smoker | – | – | – | 1.02 | 0.11 |
| GT X current smoker | – | – | – | 0.93 | 0.13 |
Note: An effect is considered significant when BF >2.
Abbreviation: Freq, frequency.
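
To read the interaction rows, it can help to combine them with the main effects. The following is a minimal sketch, assuming the usual logistic-model parameterization in which log odds ratios add, so that an interaction OR multiplies the haplotype's main-effect OR within that smoking level. The function name `stratum_or` is illustrative and not part of LBL-GxE, and the inputs are the rounded values from the table, so the results are approximate.

```python
import math

def stratum_or(main_or: float, interaction_or: float) -> float:
    """OR for a haplotype within one covariate level, assuming log-ORs add."""
    return math.exp(math.log(main_or) + math.log(interaction_or))

# Rounded values taken from the table above (haplotype AT).
at_main = 1.17        # AT main effect
at_x_current = 0.25   # AT X current smoker interaction
at_x_former = 1.41    # AT X former smoker interaction

at_in_current = stratum_or(at_main, at_x_current)
at_in_former = stratum_or(at_main, at_x_former)

print(f"AT among current smokers: OR ~ {at_in_current:.2f}")
print(f"AT among former smokers:  OR ~ {at_in_former:.2f}")
```

Under these assumptions the AT haplotype has an estimated OR of about 0.29 among current smokers, which is in line with the protective effect reported in the abstract. Of the interaction terms, only AT X current smoker has BF >2, so the former-smoker figure should not be read as evidence of an effect.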
Comparison of performance between the original (Version 1.0) and the improved (Version 1.1) versions of LBL-GxE. Mean and SD are mean and standard deviation (over 100 replicates) of the difference in regression estimates from the two versions. %(BF >2) is the difference in powers (for effects with OR >1) or type I error rates (for effects with OR = 1); it is the difference in the percentages of replicates in which each regression coefficient is found to be significant (BF >2) for the two versions.
| SETTING 1 | | | | | SETTING 2 | | | | | SETTING 3 | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| EFFECT | OR | MEAN | SD | %(BF >2) | EFFECT | OR | MEAN | SD | %(BF >2) | EFFECT | OR | MEAN | SD | %(BF >2) |
| h1 | 1 | 0.00 | 0.05 | 0.01 | h1 | 1 | 0.00 | 0.05 | 0.00 | h1 | 1 | 0.00 | 0.05 | 0.00 |
| h2 | 3 | 0.01 | 0.25 | 0.00 | h2 | 1 | 0.00 | 0.04 | 0.00 | h2 | 1 | 0.00 | 0.06 | 0.01 |
| h3 | 1 | 0.00 | 0.14 | 0.01 | h3 | 3 | 0.00 | 0.05 | 0.00 | h3 | 1 | 0.00 | 0.06 | −0.01 |
| h4 | 1 | 0.00 | 0.03 | 0.00 | h4 | 1 | 0.01 | 0.14 | −0.01 | h4 | 1 | 0.00 | 0.03 | 0.00 |
| h5 | 1 | 0.00 | 0.05 | 0.00 | h5 | 1 | 0.00 | 0.09 | 0.01 | h5 | 1 | 0.00 | 0.03 | 0.00 |
| E | 1 | 0.00 | 0.06 | 0.00 | h6 | 1 | 0.00 | 0.05 | 0.00 | h6 | 1 | 0.00 | 0.05 | 0.00 |
| h1×E | 1 | 0.00 | 0.06 | 0.00 | h7 | 1 | 0.00 | 0.06 | 0.00 | h7 | 3 | 0.00 | 0.12 | −0.02 |
| h2×E | 1 | 0.00 | 0.10 | 0.01 | h8 | 1 | 0.00 | 0.05 | 0.01 | h8 | 1 | 0.00 | 0.17 | 0.00 |
| h3×E | 3 | 0.00 | 0.10 | 0.03 | E | 1 | 0.00 | 0.04 | 0.00 | h9 | 1 | 0.00 | 0.05 | 0.00 |
| h4×E | 1 | 0.00 | 0.05 | 0.01 | h1×E | 1 | 0.00 | 0.05 | 0.00 | h10 | 1 | 0.00 | 0.04 | 0.00 |
| h5×E | 1 | 0.00 | 0.03 | 0.00 | h2×E | 1 | 0.00 | 0.04 | 0.00 | h11 | 1 | 0.00 | 0.03 | 0.00 |
| – | – | – | – | – | h3×E | 1 | 0.00 | 0.09 | 0.01 | E | 1 | 0.00 | 0.05 | −0.01 |
| – | – | – | – | – | h4×E | 1 | 0.00 | 0.18 | 0.00 | h1×E | 1 | 0.00 | 0.05 | 0.00 |
| – | – | – | – | – | h5×E | 3 | 0.00 | 0.19 | 0.00 | h2×E | 1 | 0.00 | 0.04 | 0.00 |
| – | – | – | – | – | h6×E | 1 | 0.00 | 0.04 | 0.00 | h3×E | 1 | 0.00 | 0.03 | 0.00 |
| – | – | – | – | – | h7×E | 1 | 0.00 | 0.04 | 0.00 | h4×E | 1 | 0.00 | 0.04 | −0.01 |
| – | – | – | – | – | h8×E | 1 | 0.00 | 0.05 | −0.01 | h5×E | 1 | 0.00 | 0.08 | 0.00 |
| – | – | – | – | – | – | – | – | – | – | h6×E | 1 | 0.00 | 0.03 | 0.00 |
| – | – | – | – | – | – | – | – | – | – | h7×E | 1 | 0.00 | 0.11 | 0.00 |
| – | – | – | – | – | – | – | – | – | – | h8×E | 3 | 0.00 | 0.10 | 0.02 |
| – | – | – | – | – | – | – | – | – | – | h9×E | 1 | 0.00 | 0.03 | 0.00 |
| – | – | – | – | – | – | – | – | – | – | h10×E | 1 | 0.00 | 0.03 | 0.00 |
| – | – | – | – | – | – | – | – | – | – | h11×E | 1 | 0.00 | 0.07 | 0.00 |
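
The three summary columns can be reproduced from per-replicate output along the following lines. This is only a sketch: the function name `compare_versions`, the toy numbers, the direction of the difference (Version 1.0 minus Version 1.1), and the use of the sample standard deviation are all assumptions, since the caption does not specify them. The BF >2 difference is returned as a proportion.

```python
from statistics import mean, stdev

BF_THRESHOLD = 2.0  # significance cut-off named in the caption

def compare_versions(est_v10, est_v11, bf_v10, bf_v11):
    """Summarize one regression coefficient across replicates.

    est_*: per-replicate coefficient estimates from each version
    bf_*:  per-replicate Bayes factors from each version
    """
    # Difference in estimates, replicate by replicate (assumed v1.0 - v1.1).
    diffs = [a - b for a, b in zip(est_v10, est_v11)]
    # Fraction of replicates declared significant by each version.
    frac_v10 = sum(bf > BF_THRESHOLD for bf in bf_v10) / len(bf_v10)
    frac_v11 = sum(bf > BF_THRESHOLD for bf in bf_v11) / len(bf_v11)
    return {
        "mean": mean(diffs),
        "sd": stdev(diffs),  # sample SD; an assumption
        "bf_gt2_diff": frac_v10 - frac_v11,
    }

# Hypothetical four-replicate example (the table uses 100 replicates).
summary = compare_versions(
    est_v10=[0.10, 0.20, 0.30, 0.40],
    est_v11=[0.10, 0.25, 0.25, 0.40],
    bf_v10=[0.5, 2.5, 3.0, 1.0],
    bf_v11=[0.5, 1.5, 3.0, 1.0],
)
print(summary)
```

Values near zero in all three columns, as in the table, indicate that the two versions give essentially the same estimates and the same significance calls.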
Comparison of computation time (in seconds) between the original (Version 1.0) and improved (Version 1.1) versions of LBL-GxE.
| DATA | SAMPLE SIZE | # HAPLOTYPES | VERSION 1.0 | VERSION 1.1 |
|---|---|---|---|---|
| Lung cancer data: two-level smoking | 5549 | 4 | 758 | 218 |
| Lung cancer data: three-level smoking | 5549 | 4 | – | 312 |
| Lung cancer data: three-level smoking and sex | 5549 | 4 | – | 387 |
| Simulated data 1: two-level covariate | 2000 | 6 | 341 | 127 |
| Simulated data 2: two-level covariate | 2000 | 9 | 906 | 200 |
| Simulated data 3: two-level covariate | 2000 | 12 | 2123 | 308 |
Note: Version 1.0 can only handle one covariate with two levels; a dash (–) marks analyses it could not run.
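
The timings are easier to compare as ratios. The short sketch below divides the Version 1.0 time by the Version 1.1 time for each row where both are reported; the variable names are illustrative.

```python
# (data set, Version 1.0 seconds, Version 1.1 seconds) from the table above.
# Rows without a Version 1.0 timing are omitted.
timings = [
    ("Lung cancer data: two-level smoking", 758, 218),
    ("Simulated data 1: two-level covariate", 341, 127),
    ("Simulated data 2: two-level covariate", 906, 200),
    ("Simulated data 3: two-level covariate", 2123, 308),
]

# Speed-up factor: how many times faster Version 1.1 ran.
speedups = {name: v10 / v11 for name, v10, v11 in timings}

for name, ratio in speedups.items():
    print(f"{name}: {ratio:.1f}x faster")
```

The ratios range from roughly 2.7 to 6.9, and in the simulated data the gain grows with the number of haplotypes (6, 9, and 12).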
| AFFECTED | SMOKE | M1.1 | M1.2 | M2.1 | M2.2 | M3.1 | M3.2 | M4.1 | M4.2 | M5.1 | M5.2 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 |
| 1 | 0 | 1 | 1 | 1 | 0 | 1 | 0 | 1 | 1 | 1 | 1 |
| 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 |
| 1 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 |
| 1 | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 1 | 0 |
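
The table above appears to illustrate the input layout: an affection status column, a covariate column, and two allele columns per marker (for example `M1.1` and `M1.2` for the first marker). That reading is an assumption drawn from the column names. The sketch below groups the allele columns into one genotype per marker; `parse_row` and `heterozygous_markers` are hypothetical helpers and not functions provided by LBL-GxE.

```python
header = ["AFFECTED", "SMOKE", "M1.1", "M1.2", "M2.1", "M2.2",
          "M3.1", "M3.2", "M4.1", "M4.2", "M5.1", "M5.2"]

# First two data rows of the table above.
rows = [
    [1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1],
    [1, 0, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1],
]

def parse_row(header, row):
    """Group the two allele columns of each marker into one genotype."""
    record = dict(zip(header, row))
    # Marker names are the part before the dot, e.g. "M1" from "M1.1".
    markers = sorted(
        {name.split(".")[0] for name in header if name.startswith("M")},
        key=lambda m: int(m[1:]),
    )
    genotypes = {m: (record[f"{m}.1"], record[f"{m}.2"]) for m in markers}
    return {
        "affected": record["AFFECTED"],
        "smoke": record["SMOKE"],
        "genotypes": genotypes,
    }

def heterozygous_markers(parsed):
    """Markers whose two alleles differ."""
    return [m for m, (a1, a2) in parsed["genotypes"].items() if a1 != a2]

parsed = [parse_row(header, r) for r in rows]
for p in parsed:
    print(p["affected"], p["smoke"], p["genotypes"], heterozygous_markers(p))
```

In the second row, markers M2 and M3 are heterozygous, so the phase of that individual's haplotype pair cannot be read directly from the genotypes.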