Kenneth E Westerman, Duy T Pham, Liang Hong, Ye Chen, Magdalena Sevilla-González, Yun Ju Sung, Yan V Sun, Alanna C Morrison, Han Chen, Alisa K Manning.
Abstract
MOTIVATION: Gene-environment interaction (GEI) studies provide a general framework for identifying genetic variants that modify the effects of environmental, physiological, lifestyle or treatment exposures on complex traits. Moreover, accounting for GEIs can enhance our understanding of the genetic architecture of complex diseases and traits. However, commonly used statistical software programs for GEI studies are either not applicable to testing certain types of GEI hypotheses or have not been optimized for use in large samples.
Year: 2021 PMID: 34695175 PMCID: PMC8545347 DOI: 10.1093/bioinformatics/btab223
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.931
Comparison of methods and software features implemented in GEI software tools
| Tool | Version | Algorithmic approach | Multiple interactions | Interaction covariates | Robust SEs | Multithreading available |
|---|---|---|---|---|---|---|
| GEM | 1.2 | Matrix projection | Yes | Yes | Yes | Yes |
| ProbABEL | 0.5.0 | Classical linear/logistic model | No | No | Yes | No |
| QUICKTEST | 0.95 | Classical linear model | No | No | Yes | No |
| PLINK2 | Alpha 2.3 | Classical linear/logistic model | Yes | Yes | No | Yes |
| SUGEN | 8.11 | Classical linear/logistic model | Yes | No | Yes | No |
| SPAGE | 0.1.5 | Matrix projection | No | No | No | No |
Type I error rates of GEM at α = 5 × 10⁻⁸
| Simulated exposure distribution | Interaction test, type I error (model-based SEs) | Interaction test, type I error (robust SEs) | Joint test, type I error (model-based SEs) | Joint test, type I error (robust SEs) |
|---|---|---|---|---|
| Binary exposure | 3.14 | 3.04 | 3.93 | 3.93 |
| Continuous exposure | 3.58 | 3.53 | 4.38 | |
| Log-normal exposure | 4.98 | | | 2.64 |
| Continuous exposure with quadratic effect | 3.29 | 3.43 | | |
Fig. 1. Power of the interaction test from the GEM method. Statistical power is shown on the y-axis, reflecting the fraction of interaction tests with P < 5 × 10⁻⁸ (calculated based on 1000 tests). (a) Total interaction effect (x-axis), in terms of phenotypic variance explained, is partitioned equally among K exposures (K = 1, 2, 5 and 10), and a joint interaction test is performed on exactly that set of K exposures (q = K). (b) One exposure is responsible for the full interaction effect (K = 1), and q is varied (q = 1, 2, 5 and 10). (c) Total interaction effect is partitioned equally among 10 exposures (K = 10), and q is varied within subsets of these 10 exposures (q = 1, 2, 5 and 10). (d) As in (a), with a single exposure simulated and tested, but varying the strength of the genetic main effect. GEI, gene–environment interaction; % V.E., percent variance explained.
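The power estimate described in the Fig. 1 caption is the fraction of simulated tests whose P-value falls below the genome-wide threshold. A minimal Python sketch of that calculation, assuming the P-values are already available as a list; the function name `empirical_power` and the example values are illustrative and are not taken from GEM:

```python
GENOME_WIDE_ALPHA = 5e-8  # significance threshold quoted in the figure captions


def empirical_power(p_values, alpha=GENOME_WIDE_ALPHA):
    """Return the fraction of tests with P < alpha.

    Hypothetical helper for illustration; it is not part of GEM.
    """
    if not p_values:
        raise ValueError("need at least one P-value")
    hits = sum(1 for p in p_values if p < alpha)
    return hits / len(p_values)


# Made-up P-values: three of the four fall below the threshold.
example = [1e-9, 3e-8, 4.9e-8, 0.2]
print(empirical_power(example))
```

In the setting of Fig. 1, `p_values` would hold the 1000 interaction-test P-values from one simulation scenario.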
Fig. 2. Benchmarking of GEM and other tools for GEI. Runtime (a, b) and maximum memory footprint (c, d) are shown as a function of sample size (N) and interaction testing program, using 100 000 simulated variants with the number of covariates held constant at three. The single exposure and outcome for each run were randomly simulated, with the outcome being either continuous (a, c) or binary with a case-control ratio of 1:3 (b, d). Circles and triangles correspond to programs compiled without and with Intel MKL, respectively. ‘GEM-opt’ refers to GEM runs using optimal parameters for speed, including compilation with MKL and pgen file inputs. All programs were run using a single thread and without robust standard errors. Results for ProbABEL at N > 100k were excluded because memory usage exceeded 100 GB.
Fig. 3. Results from genome-wide interaction analysis of WHR in the UK Biobank. (a) Two-sided Manhattan plot displaying association strengths for the interaction test (top) and the marginal genetic effect test (from a model with no interaction; bottom). The x-axis represents genomic position and the y-axis represents the negative logarithm of the P-value for association at that locus. (b) Comparison of marginal and joint association strengths. The x-axis and y-axis show the negative logarithm of the association P-value using the marginal test (with no interaction) and the joint test, respectively. The dotted line corresponds to y = x. For both panels, dashed lines denote genome-wide significance thresholds (P < 5 × 10⁻⁸). Variants shown in orange passed the genome-wide significance threshold for both interaction and marginal effects. Variants shown in purple passed the threshold for the interaction effect, but not for the marginal effect. Variants shown in green passed the threshold using the joint test, but not for either the interaction or the marginal effect. For visualization purposes, variants with P < 1 × 10⁻¹⁰⁰ were excluded from the Manhattan plot (from a single locus on chromosome 6 only), and variants with P < 1 × 10⁻⁵⁰ were excluded from the joint plot.
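The colour coding in the Fig. 3 caption amounts to a three-way comparison of each variant's marginal, interaction and joint P-values against the genome-wide threshold. A rough Python sketch of that rule follows; the function name `classify_variant` and the `"other"` label for variants outside the three highlighted groups are assumptions made for illustration, not part of the published analysis code:

```python
GENOME_WIDE_ALPHA = 5e-8  # significance threshold quoted in the caption


def classify_variant(p_marginal, p_interaction, p_joint, alpha=GENOME_WIDE_ALPHA):
    """Assign the caption's colour category to one variant.

    Hypothetical helper; "other" is an assumed label for variants
    that fall in none of the three highlighted groups.
    """
    sig_marginal = p_marginal < alpha
    sig_interaction = p_interaction < alpha
    sig_joint = p_joint < alpha

    if sig_interaction and sig_marginal:
        return "orange"  # significant for both interaction and marginal effects
    if sig_interaction:
        return "purple"  # significant for interaction only
    if sig_joint and not sig_marginal:
        return "green"   # found only by the joint test
    return "other"


# Made-up P-values: significant only under the joint test.
print(classify_variant(p_marginal=1e-6, p_interaction=1e-6, p_joint=1e-9))
```

The green category is the one that illustrates the value of the joint test: neither single test reaches significance on its own, but the combined evidence does.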