Benjamin W Domingue, Sam Trejo, Emma Armstrong-Carter, Elliot M Tucker-Drob.
Abstract
Interest in the study of gene-environment interaction has recently grown due to the sudden availability of molecular genetic data (in particular, polygenic scores) in many long-running longitudinal studies. Identifying and estimating statistical interactions comes with several analytic and inferential challenges; these challenges are heightened when interaction models are used to integrate observational genomic and social science data. We articulate some of these key challenges, provide new perspectives on the study of gene-environment interactions, and end by offering some practical guidance for conducting research in this area. Given the sudden availability of well-powered polygenic scores, we anticipate a substantial increase in research testing for interaction between such scores and environments. The issues we discuss, if not properly addressed, may impact the enduring scientific value of gene-environment interaction studies.
Keywords: gene–environment interaction; polygenic score
Year: 2020 PMID: 36091972 PMCID: PMC9455807 DOI: 10.15195/v7.a19
Source DB: PubMed Journal: Sociol Sci ISSN: 2330-6696
Figure 1: The challenge of studying GxE when using dichotomous outcomes. (A) True association between PGS and continuously varying outcome Y*. Densities show distributions above the horizontal blue line for those in high and low environments. (B) True associations and those estimated using either a logistic regression model or a linear probability model when Y* has been dichotomized prior to analysis (Y = 1 when Y* > λ). The linear probability model (lpm) produces the misimpression of GxE (nonparallel regression lines). The logistic regression model does not suffer from this bias but may still suffer from large standard errors and low power when we observe low variability in the dichotomous Y variable in one of the environments.
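The pattern described in the Figure 1 caption can be reproduced with a small simulation. The sketch below is illustrative only and is not the authors' code: the function names, the parameter values (`b_g`, `b_e`, `lam`), the binary environment, and the logistic error term are all assumptions chosen so that a logistic model is correctly specified. The latent outcome has additive PGS and environment effects and no interaction; it is then dichotomized at a threshold and analyzed separately in each environment.

```python
import numpy as np

def simulate(n=200_000, b_g=0.5, b_e=2.0, lam=2.5, seed=0):
    """Latent outcome with additive PGS and ENV effects and NO interaction.

    All parameter values are assumptions made for illustration.
    """
    rng = np.random.default_rng(seed)
    pgs = rng.standard_normal(n)
    env = rng.integers(0, 2, size=n)           # 0 = low, 1 = high environment
    y_star = b_g * pgs + b_e * env + rng.logistic(size=n)
    y = (y_star > lam).astype(float)           # dichotomize at threshold lam
    return pgs, env, y

def lpm_slope(x, y):
    """Slope from a linear probability model (OLS of the 0/1 outcome on x)."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]

def logit_slope(x, y, iters=25):
    """Slope from a logistic regression fitted by Newton-Raphson."""
    X = np.column_stack([np.ones_like(x), x])
    beta = np.zeros(2)
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        grad = X.T @ (y - p)                   # score vector
        hess = (X * (p * (1.0 - p))[:, None]).T @ X   # observed information
        beta = beta + np.linalg.solve(hess, grad)
    return beta[1]

pgs, env, y = simulate()
# Fit each model separately within the low (0) and high (1) environment.
lpm = {e: lpm_slope(pgs[env == e], y[env == e]) for e in (0, 1)}
logit = {e: logit_slope(pgs[env == e], y[env == e]) for e in (0, 1)}
print("LPM slopes by environment:  ", lpm)
print("Logit slopes by environment:", logit)
```

Under these assumed parameters, the LPM slope should come out noticeably steeper in the high environment even though the true PGS effect on the latent outcome is identical in both, while the two logistic slopes should both sit near `b_g`. The logistic slope in the low environment is estimated from fewer positive cases, which is the low-variability concern the caption raises.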
Figure 2: Reduction in power as a function of measurement error in both PGS and ENV. Left and right panels focus on relatively low (alpha = 0.25) and high (alpha = 0.5) reliability polygenic scores. The data-generating equation is shown in the left-hand panel.
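A rough sense of how measurement error erodes power to detect an interaction can be had from a Monte Carlo sketch. The data-generating equation used in Figure 2 is not reproduced in this text, so the one below (standard-normal PGS and ENV, main effects of 0.2, an interaction of 0.1, unit-variance noise) is a hypothetical stand-in, as are the function names, sample size, and number of replications. Reliability is modeled here as the squared correlation between the observed and true variable, which may or may not match the caption's alpha.

```python
import numpy as np

def add_error(true, reliability, rng):
    """Return a unit-variance noisy copy whose squared correlation
    with `true` equals `reliability` (assumes `true` has unit variance)."""
    noise = rng.standard_normal(true.shape)
    return np.sqrt(reliability) * true + np.sqrt(1.0 - reliability) * noise

def interaction_power(rel_pgs, rel_env, n=1000, b_int=0.1, reps=400, seed=0):
    """Share of simulated samples in which the PGS x ENV term is significant.

    The data-generating equation is an assumption, not the one in the figure.
    """
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(reps):
        g = rng.standard_normal(n)
        e = rng.standard_normal(n)
        y = 0.2 * g + 0.2 * e + b_int * g * e + rng.standard_normal(n)
        g_obs = add_error(g, rel_pgs, rng)
        e_obs = add_error(e, rel_env, rng)
        X = np.column_stack([np.ones(n), g_obs, e_obs, g_obs * e_obs])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        sigma2 = resid @ resid / (n - X.shape[1])
        cov = sigma2 * np.linalg.inv(X.T @ X)  # classical OLS covariance
        t_stat = beta[3] / np.sqrt(cov[3, 3])
        hits += int(abs(t_stat) > 1.96)        # two-sided test at about 5%
    return hits / reps

power_clean = interaction_power(1.0, 1.0)      # no measurement error
power_noisy = interaction_power(0.5, 0.5)      # both measured with error
print(power_clean, power_noisy)
```

In this setup the interaction coefficient on the observed product term is attenuated by roughly the square root of the product of the two reliabilities, so power should fall well below the error-free case. Passing `rel_pgs=0.25` or `rel_pgs=0.5` gives a loose analogue of the two panels described in the caption.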
Figure 3: Distributions of two key environmental variables (household socioeconomic status [SES] and neighborhood disadvantage) taken from Wave I of Add Health (Harris et al. 2019). Note the reduction in variation of the distribution for the analytic sample (in red) versus that of the full sample (in blue). Reductions in the standard deviation are 11 percent for SES and 14 percent for neighborhood disadvantage; in variance terms, the reductions are 20 percent and 26 percent, respectively.
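The step from standard-deviation reductions to variance reductions in the Figure 3 caption follows from variance being the square of the standard deviation: a proportional SD reduction of r leaves (1 - r)^2 of the variance. A minimal check, with a hypothetical helper name:

```python
def variance_reduction(sd_reduction):
    """Proportional drop in variance implied by a proportional drop in SD."""
    return 1.0 - (1.0 - sd_reduction) ** 2

ses = variance_reduction(0.11)           # household SES
neighborhood = variance_reduction(0.14)  # neighborhood disadvantage
print(f"SES: {ses:.3f}, neighborhood disadvantage: {neighborhood:.3f}")
```

Starting from the rounded SD figures, this gives about 0.208 and 0.260. The second matches the caption's 26 percent; the first is slightly above the stated 20 percent, a gap that is consistent with the caption's percentages having been computed from unrounded standard deviations.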