Thomas A DiPrete, Casper A P Burik, Philipp D Koellinger.
Abstract
Identifying causal effects in nonexperimental data is an enduring challenge. One proposed solution that has recently gained popularity is to use genes as instrumental variables [i.e., Mendelian randomization (MR)]. However, this approach is problematic because many variables of interest are genetically correlated, which implies the possibility that many genes could affect both the exposure and the outcome directly or via unobserved confounding factors. Thus, pleiotropic effects of genes are themselves a source of bias in nonexperimental data that would also undermine the ability of MR to correct for endogeneity bias from nongenetic sources. Here, we propose an alternative approach, genetic instrumental variable (GIV) regression, that provides estimates for the effect of an exposure on an outcome in the presence of pleiotropy. As a valuable byproduct, GIV regression also provides accurate estimates of the chip heritability of the outcome variable. GIV regression uses polygenic scores (PGSs) for the outcome of interest, which can be constructed from genome-wide association study (GWAS) results. By splitting the GWAS sample for the outcome into nonoverlapping subsamples, we obtain multiple indicators of the outcome PGSs that can be used as instruments for each other and, in combination with other methods such as sibling fixed effects, can address endogeneity bias from both pleiotropy and the environment. In two empirical applications, we demonstrate that our approach produces reasonable estimates of the chip heritability of educational attainment (EA) and show that standard regression and MR provide upwardly biased estimates of the effect of body height on EA.
Keywords: causal effects; genetic instrumental variables; genome-wide association studies; pleiotropy; polygenic scores
Year: 2018 PMID: 29686100 PMCID: PMC5984483 DOI: 10.1073/pnas.1707388115
Source DB: PubMed Journal: Proc Natl Acad Sci U S A ISSN: 0027-8424 Impact factor: 11.205
Illustrative results from simulations, estimated coefficient for T
| Model | Parameter | OLS | MR | GIV-C | GIV-U |
| A | Pleiotropy only | 1.1004 | 1.5040 | 1.0131 | 0.9419 |
| | | (0.0001) | (0.0012) | (0.0001) | (0.0001) |
| B | Pleiotropy only | 1.2011 | 1.5024 | 1.0604 | 0.8575 |
| | | (0.0001) | (0.0004) | (0.0001) | (0.0001) |
| C | Pleiotropy only | 1.3016 | 1.5020 | 1.1573 | 0.7263 |
| | | (0.0001) | (0.0002) | (0.0001) | (0.0002) |
| D | Pleiotropy only | 1.5004 | 3.4922 | 1.0776 | 0.8689 |
| | | (0.0005) | (0.0093) | (0.0002) | (0.0002) |
| E | Genetic confounds | 1.3106 | 1.4609 | 1.1922 | 0.6422 |
| | | (0.0001) | (0.0002) | (0.0001) | (0.0002) |
| F | Nongenetic confounds | 1.4520 | 1.5032 | 1.4259 | 1.1193 |
| | | (0.0001) | (0.0002) | (0.0001) | (0.0001) |
| G | Nongenetic confounds with control | 1.3643 | 1.5064 | 1.3346 | 0.9587 |
| | | (0.0001) | (0.0002) | (0.0001) | (0.0001) |
Shown are the mean estimated coefficients for T, with SEs in parentheses, from 20 simulations using different estimation methods for several models. For all models, the genetic correlation was 0.5 and the coefficient for T was 1. The remaining model parameters are the heritabilities of y and T, the correlations with the genetic confounds, the correlation with the nongenetic confound, and the share of the confound's variance that is controlled for. These results are a selection from the supplementary tables: A, Table S2; B, Table S3; C, Table S4; D, Table S7; E, Table S9; F, Table S11; and G, Table S14 (see row 2 of each table). See the supplementary material for all results and for more details on the parameters, variance, and standardized effect size.
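The pattern in the table (OLS and MR above the true value of 1, the PGS-instrumented estimates closer to it) can be illustrated with a much simpler simulation. The following is a minimal sketch under stated assumptions, not the simulation design behind the table: the function name `simulate_pleiotropy`, the heritabilities of 0.5, and the PGS noise level are hypothetical choices; the true genetic factor for T stands in for a PGS of T in the MR step; and the "GIV-style" estimator simply controls for one noisy PGS of y while instrumenting it with a second, independently noisy PGS.

```python
import numpy as np


def simulate_pleiotropy(n=200_000, delta=1.0, rho=0.5, h2_t=0.5, h2_y=0.5,
                        pgs_noise=1.0, seed=0):
    """Compare OLS, an MR ratio, and a GIV-style 2SLS estimate of delta.

    All parameter values are illustrative assumptions, not the paper's settings.
    """
    rng = np.random.default_rng(seed)

    # Genetic factors for T and y with correlation rho (pleiotropy).
    g_t = rng.standard_normal(n)
    g_y = rho * g_t + np.sqrt(1 - rho**2) * rng.standard_normal(n)

    # Exposure T and outcome y; delta is the true effect of T on y.
    t = np.sqrt(h2_t) * g_t + np.sqrt(1 - h2_t) * rng.standard_normal(n)
    y = (delta * t + np.sqrt(h2_y) * g_y
         + np.sqrt(1 - h2_y) * rng.standard_normal(n))

    # Two noisy PGSs for y with independent errors, mimicking scores
    # built from nonoverlapping GWAS subsamples.
    s1 = g_y + np.sqrt(pgs_noise) * rng.standard_normal(n)
    s2 = g_y + np.sqrt(pgs_noise) * rng.standard_normal(n)

    ones = np.ones(n)

    # OLS of y on T (intercept included); take the slope on T.
    ols = np.linalg.lstsq(np.column_stack([ones, t]), y, rcond=None)[0][1]

    # MR as a ratio estimator, using g_t as the instrument for T.
    mr = np.cov(y, g_t)[0, 1] / np.cov(t, g_t)[0, 1]

    # GIV-style: regress y on T and s1, instrumenting s1 with s2.
    # Just-identified IV: beta = (Z'X)^{-1} Z'y.
    x = np.column_stack([ones, t, s1])
    z = np.column_stack([ones, t, s2])
    giv = np.linalg.solve(z.T @ x, z.T @ y)[1]

    return {"OLS": ols, "MR": mr, "GIV-style": giv}


if __name__ == "__main__":
    for name, value in simulate_pleiotropy().items():
        print(f"{name}: {value:.3f}")
```

With these default values, the large-sample limits are 1.25 for OLS, 1.5 for the MR ratio, and 1 for the instrumented regression. The instrumented estimate is exactly consistent here only because the sketch assumes a single genetic confounder measured with independent noise; the GIV-C and GIV-U columns in the table deviate from 1 because the paper's simulations are richer than this.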
Effects of the polygenic score for educational attainment (PGS EA) on (residualized) educational attainment in the Health and Retirement Study (HRS)
| Variable | OLS | IV1 | IV2 |
| PGS EA UKB | 0.259*** | 0.523*** | |
| | (0.0183) | (0.0385) | |
| PGS EA SSGAC | | | 0.530*** |
| | | | (0.0389) |
| Implied heritability | NA | 0.134 | 0.138 |
| | NA | (0.0197) | (0.0202) |
| N | 2,751 | 2,751 | 2,751 |
Asterisks (*, **, ***) mark significance levels. We regress the residual of EA on the different PGSs and calculate the implied heritability estimates. SEs are in parentheses. All variables have been standardized. EA is measured in years of schooling needed to obtain the highest achieved educational degree according to International Standard Classification of Education (ISCED) classifications. We use the residual of EA after a regression on birth year, birth year squared, gender, and the first 20 principal components in the genetic data. PGS EA SSGAC, PGS for EA using the meta-analysis from ref. 15, excluding data from 23andMe, UK Biobank (UKB), and HRS; PGS EA UKB, PGS for EA using UKB data. IV1 uses PGS EA SSGAC as the instrument and IV2 uses PGS EA UKB as the instrument. NA, not applicable.
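The gap between the OLS and IV columns above reflects measurement error in the PGS: a score estimated from a finite GWAS sample is a noisy proxy for the underlying genetic score, so OLS on it is attenuated, while a second score with independent noise can serve as an instrument. The sketch below illustrates that errors-in-variables logic only. The function name `pgs_attenuation_demo`, the coefficient of 0.4, and the noise variance are hypothetical, and the scores are left on the scale of the latent genetic score rather than standardized as in the table, so the numbers are not comparable to the table's coefficients.

```python
import numpy as np


def pgs_attenuation_demo(n=200_000, beta=0.4, noise_var=1.0, seed=1):
    """Return (OLS slope, IV slope) for y regressed on a noisy PGS.

    beta and noise_var are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)

    # Latent genetic score and an outcome with true slope beta on it.
    g = rng.standard_normal(n)
    y = beta * g + np.sqrt(1 - beta**2) * rng.standard_normal(n)

    # Two proxies of g with independent measurement error, as if built
    # from two nonoverlapping GWAS samples.
    s1 = g + np.sqrt(noise_var) * rng.standard_normal(n)
    s2 = g + np.sqrt(noise_var) * rng.standard_normal(n)

    # OLS slope of y on s1: attenuated toward beta / (1 + noise_var).
    ols = np.cov(y, s1)[0, 1] / np.var(s1, ddof=1)

    # IV slope with s2 as the instrument for s1: the independent errors
    # drop out of both covariances, so the ratio targets beta.
    iv = np.cov(y, s2)[0, 1] / np.cov(s1, s2)[0, 1]

    return ols, iv


if __name__ == "__main__":
    ols, iv = pgs_attenuation_demo()
    print(f"OLS: {ols:.3f}  IV: {iv:.3f}")
```

With the default values, the large-sample OLS slope is 0.2 and the IV slope is 0.4, so the instrumented estimate undoes the attenuation. Swapping the roles of `s1` and `s2` gives the mirror-image estimate, which is the analogue of reporting both IV1 and IV2.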
Estimates of the effect of height on educational attainment (EA)
| Variable | OLS | MR | GIV-C | GIV-U |
| Height | 0.136*** | 0.160*** | 0.168*** | 0.110*** |
| | (0.0262) | (0.0481) | (0.0264) | (0.0262) |
| PGS EA cond. UKB | | | 0.396*** | |
| | | | (0.0367) | |
| PGS EA uncond. UKB | | | | 0.384*** |
| | | | | (0.0354) |
| N | 2,751 | 2,751 | 2,751 | 2,751 |
Asterisks (*, **, ***) mark significance levels. Effect sizes are standardized, and SEs are in parentheses. Birth year, birth year squared, gender, EA mother, EA father, and the first 20 principal components are included as control variables. For MR, a PGS for height from UK Biobank (UKB) data was used as the instrument for height. For GIV-C and GIV-U, PGS EA SSGAC was used as the instrument.
Estimates of the effect of educational attainment (EA) on height
| Variable | OLS | MR | GIV-C | GIV-U |
| EA | 0.072*** | 0.179** | 0.050*** | 0.040*** |
| | (0.0138) | (0.0543) | (0.0119) | (0.0120) |
| PGS height cond. UKB | | | 0.448*** | |
| | | | (0.0174) | |
| PGS height uncond. UKB | | | | 0.446*** |
| | | | | (0.0175) |
| N | 2,751 | 2,751 | 2,751 | 2,751 |
Asterisks (*, **, ***) mark significance levels. Effect sizes are standardized, and SEs are in parentheses. Birth year, birth year squared, gender, EA mother, EA father, and the first 20 principal components are included as control variables. For MR, a PGS for EA from UK Biobank (UKB) data was used as the instrument for EA. For GIV-C and GIV-U, PGS height GIANT was used as the instrument.