Bethany J Wolf, Paula S Ramos, J Madison Hyer, Viswanathan Ramakrishnan, Gary S Gilkeson, Gary Hardiman, Paul J Nietert, Diane L Kamen.
Abstract
Development and progression of many human diseases, such as systemic lupus erythematosus (SLE), are hypothesized to result from interactions between genetic and environmental factors. Current approaches to identifying and evaluating interactions are limited, most often focusing on main effects and two-way interactions. While higher order interactions associated with disease are documented, they are difficult to detect, since expanding the search space to all possible interactions of $p$ predictors means evaluating $2^p - 1$ terms. For example, data with 150 candidate predictors require considering over $10^{45}$ main effects and interactions. In this study, we present an analytical approach with three steps: selection of candidate single nucleotide polymorphisms (SNPs) and environmental and/or clinical factors; use of Logic Forest to identify predictors of disease, including higher order interactions; and confirmation, using logistic regression, of the association between the identified predictors and interactions and the disease outcome. We applied this approach to a study investigating whether smoking and/or secondhand smoke exposure interacts with candidate SNPs, resulting in elevated risk of SLE. The approach identified both genetic and environmental risk factors, with evidence suggesting potential interactions between exposure to secondhand smoke as a child and genetic variation in the ITGAM gene associated with increased risk of SLE.
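The size of the search space quoted in the abstract can be checked directly. The following is a minimal sketch, assuming each candidate term is a non-empty subset of the p predictors; the function name `n_candidate_terms` is hypothetical and is not part of the study's software.

```python
def n_candidate_terms(p: int) -> int:
    """Count the non-empty subsets of p predictors: 2**p - 1.

    Each subset stands for one main effect (size 1) or one
    interaction (size 2 or more). Hypothetical helper for illustration.
    """
    if p < 0:
        raise ValueError("p must be non-negative")
    return 2**p - 1


# With 150 candidate predictors, the count exceeds 10**45,
# as stated in the abstract.
terms_150 = n_candidate_terms(150)
print(terms_150 > 10**45)
```

Python integers have arbitrary precision, so the comparison is exact rather than a floating-point approximation.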
Keywords: candidate genes; gene–environment interactions; logic forest; systemic lupus erythematosus
Year: 2018 PMID: 30326636 PMCID: PMC6211136 DOI: 10.3390/genes9100496
Source DB: PubMed Journal: Genes (Basel) ISSN: 2073-4425 Impact factor: 4.096
Figure 1. Flowchart of the proposed analytic approach. AA: African American; SLE: Systemic lupus erythematosus; CTD: Comparative Toxicogenomics Database; SNP: Single nucleotide polymorphism.
Figure 2. Example of a logic regression tree. White boxes represent the predictor (for SNPs, the recessive effect of the minor allele), and black boxes represent the complement of that predictor (for a SNP, the dominant effect of the major allele). There are three independent predictors/predictor interactions identified within the tree: (1) exposure to passive smoking as a child and having at least one copy of the major allele of rs2359661 (A) in ITGAM; (2) having two copies of the minor allele of rs4632147 (T) in ITGAX; and (3) having two copies of the minor allele of rs11761199 (G) in IRF5.
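The three terms listed in the Figure 2 caption can be written as a Boolean expression. The sketch below is illustrative only: it assumes the three terms are joined by OR and that a value of True corresponds to predicted SLE, neither of which is stated in the caption. The function and argument names are hypothetical, and genotypes are assumed to be coded as the number of minor-allele copies (0, 1, or 2).

```python
def logic_tree_predicts_sle(
    passive_smoke_child: bool,
    rs2359661_minor_copies: int,
    rs4632147_minor_copies: int,
    rs11761199_minor_copies: int,
) -> bool:
    """Evaluate the three terms from the Figure 2 caption.

    Assumption: the terms are combined with OR (not confirmed by the source).
    """
    # Term 1: passive smoke exposure as a child AND at least one copy of the
    # major allele of rs2359661, i.e. fewer than two minor-allele copies.
    term1 = passive_smoke_child and rs2359661_minor_copies < 2
    # Term 2: two copies of the minor allele of rs4632147 (recessive effect).
    term2 = rs4632147_minor_copies == 2
    # Term 3: two copies of the minor allele of rs11761199 (recessive effect).
    term3 = rs11761199_minor_copies == 2
    return term1 or term2 or term3


# Exposed as a child, carries a major allele of rs2359661.
print(logic_tree_predicts_sle(True, 1, 0, 0))
```

Coding "at least one copy of the major allele" as "fewer than two minor-allele copies" reflects the caption's description of black boxes as the complement of the recessive minor-allele effect.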
Participant characteristics by SLE status.
| Characteristic | Control | SLE | p-value * |
|---|---|---|---|
| Age (Mean ± Std Dev) | 42.6 ± 11.7 | 38.6 ± 13.4 | 0.022 |
| Female (n (%)) | 87 (83.6) | 88 (88.0) | 0.491 |
| Passive Smoke Exposure as a Child (n (%)) | 28 (26.9) | 41 (41.0) | 0.048 |
| Passive Smoke Exposure as an Adult (n (%)) | 18 (17.3) | 20 (20.0) | 0.754 |
| Ever Smoker (n (%)) | 24 (23.1) | 24 (24.0) | 1.000 |
| Current Smoker (n (%)) | 13 (12.5) | 17 (17.0) | 0.478 |
* p-values reported in the table for the association with SLE status are based on a two-sample t-test for age and chi-square test for all categorical variables.
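As a worked illustration of the chi-square test named in the footnote, the sketch below tests passive smoke exposure as a child against SLE status. It rests on two assumptions that are not stated in the table: group sizes of 104 controls and 100 SLE cases, inferred here from the reported counts and percentages, and the use of Yates' continuity correction. The function name is hypothetical.

```python
import math


def yates_chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Chi-square test with Yates' continuity correction for a 2x2 table.

    Table layout:   exposed  unexposed
        cases          a         b
        controls       c         d
    Returns (chi-square statistic, p-value on 1 degree of freedom).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    # Continuity-corrected numerator; clamp at zero so the correction
    # never pushes the statistic past the null value.
    corrected = max(abs(a * d - b * c) - n / 2, 0.0)
    chi2 = n * corrected**2 / (row1 * row2 * col1 * col2)
    # Survival function of chi-square with 1 df: P(Z**2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Assumed counts: 41 of 100 SLE cases and 28 of 104 controls were exposed.
chi2, p_value = yates_chi_square_2x2(41, 59, 28, 76)
print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}")
```

Under these assumptions the p-value comes out close to the tabulated 0.048, which is consistent with, but does not confirm, the inferred group sizes and the use of a continuity correction.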
Figure 3. Predictor frequency by normalized predictor importance score for all predictors in the Logic Forest (LF) model. Points highlighted in red represent the predictors that have the largest combination of frequency and importance score.
Figure 4. Interaction frequency by normalized interaction importance score for all interactions identified in the LF model. Points highlighted in red represent the interactions that have the largest combination of frequency and importance score. Points in green represent additional interaction terms identified in the forest that include passive smoke exposure as a child with at least one SNP.
Odds ratios with 95% confidence intervals (CI) from a series of logistic regression models. The implied reference category for each odds ratio is the complement of the effect defined in the first column.
| Effect | Gene | Odds Ratio (95% CI) | Unadjusted p-value |
|---|---|---|---|
| Passive Smoke Exposure as Child (PSC) | | 1.88 (1.01, 3.55) | 0.039 |
| 2 copies of the minor allele of rs4632147 (T) | ITGAX | 3.09 (1.09, 10.1) | 0.023 |
| 2 copies of the minor allele of rs58408589 (C) | | 2.96 (1.23, 7.75) | 0.011 |
| 2 copies of the minor allele of rs11761199 (G) | IRF5 | 7.69 (1.01, 352) | 0.033 |
| 2 copies of the minor allele of rs11770589 (A) | | 1.65 (0.81, 3.42) | 0.179 |
| PSC & | | 2.28 (1.18, 4.48) | 0.009 |
| PSC & | | 2.46 (1.25, 4.92) | 0.005 |
| PSC & | | 2.37 (1.23, 4.66) | 0.006 |
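To show how an odds ratio of this kind relates to the underlying counts, the sketch below computes a sample odds ratio with a Wald 95% confidence interval for passive smoke exposure as a child. The counts (41 of 100 SLE cases and 28 of 104 controls exposed) are inferred from the participant characteristics table and are an assumption, as is the Wald method; the function name is hypothetical.

```python
import math


def odds_ratio_wald(
    exposed_cases: int,
    unexposed_cases: int,
    exposed_controls: int,
    unexposed_controls: int,
    z: float = 1.96,
) -> tuple[float, float, float]:
    """Sample odds ratio with a Wald confidence interval on the log scale.

    Returns (odds ratio, lower bound, upper bound). All four cells must be
    positive, otherwise the log odds ratio is undefined.
    """
    cells = (exposed_cases, unexposed_cases, exposed_controls, unexposed_controls)
    if min(cells) <= 0:
        raise ValueError("all cell counts must be positive")
    odds_ratio = (exposed_cases * unexposed_controls) / (
        unexposed_cases * exposed_controls
    )
    # Standard error of log(OR): square root of the summed reciprocal counts.
    se_log_or = math.sqrt(sum(1 / x for x in cells))
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper


# Assumed counts for passive smoke exposure as a child (PSC).
or_psc, lower_psc, upper_psc = odds_ratio_wald(41, 59, 28, 76)
print(f"OR = {or_psc:.2f} ({lower_psc:.2f}, {upper_psc:.2f})")
```

The sample odds ratio from these assumed counts is close to the 1.88 reported for PSC. The Wald interval is somewhat narrower than the tabulated (1.01, 3.55), which suggests the published interval was computed by a different method.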