Siyang Cai, April Hartley, Osama Mahmoud, Kate Tilling, Frank Dudbridge.
Abstract
Genome-wide association studies have provided many genetic markers that can be used as instrumental variables to adjust for confounding in epidemiological studies. Recently, the principle has been applied to other forms of bias in observational studies, especially collider bias that arises when conditioning or stratifying on a variable that is associated with the outcome of interest. An important case is in studies of disease progression and survival. Here, we clarify the links between the genetic instrumental variable methods proposed for this problem and the established methods of Mendelian randomisation developed to account for confounding. We highlight the critical importance of weak instrument bias in this context and describe a corrected weighted least-squares procedure as a simple approach to reduce this bias. We illustrate the range of available methods on two data examples. The first, waist-hip ratio adjusted for body-mass index, entails statistical adjustment for a quantitative trait. The second, smoking cessation, is a stratified analysis conditional on having initiated smoking. In both cases, we find little effect of collider bias on the primary association results, but this may propagate into more substantial effects on further analyses such as polygenic risk scoring and Mendelian randomisation.
Keywords: Mendelian randomisation; ascertainment bias; index event bias; selection bias
Year: 2022 PMID: 35583096 PMCID: PMC9544531 DOI: 10.1002/gepi.22455
Source DB: PubMed Journal: Genet Epidemiol ISSN: 0741-0395 Impact factor: 2.344
Figure 1. Directed acyclic graph showing the assumed causal structure between a single‐nucleotide polymorphism of interest G, an instrumental variable, the mediating covariate X and the outcome Y, with a confounder. Parameters associated with each pairwise association are shown next to the corresponding edges. Conditioning on X is represented by the box and induces moral edges connecting the polymorphism of interest, the instrumental variable and the confounder, shown by dashed lines, creating additional paths to Y.
Properties of some instrumental variable methods to adjust for collider bias
| Method | Condition on covariate | Selected sample | Main assumptions | Summary statistics | Total effect | Conditional effect | Very weak instruments |
|---|---|---|---|---|---|---|---|
| mtCOJO | Yes | No | (Default) InCLUDE, zero mean direct effect | Yes | Yes | No | No |
| CWLS | Yes | Yes | InCLUDE, zero mean direct effect | Yes | No | Yes | Yes |
| MR‐RAPS | Yes | Yes | InCLUDE, zero mean direct effect | Yes | Yes | Yes | Yes |
| Slope‐hunter | Yes | Yes | ZEMRA | Yes | No | Yes | No |
| MR‐Mix | Yes | Yes | ZEMRA | Yes | Yes | Yes | No |
| Weighted mode | Yes | Yes | ZEMRA | Yes | Yes | Yes | No |
| Weighted median | Yes | Yes | Weighted majority valid | Yes | Yes | Yes | No |
| Heckman selection model | No | Yes | Probit model of selection | No | No | No | No |
Note: The columns indicate (1) whether a method can be applied in GWAS conditioning on a covariate, and (2) in GWAS of selected samples; (3) the main assumptions of each method; (4) whether a method can use summary statistics rather than individual‐level data; (5) whether it requires total or (6) conditional effects of G on Y; (7) whether it accounts for bias from very weak instruments. mtCOJO, CWLS, and Slope‐hunter are developed specifically for collider bias correction and are described according to their current implementations. MR‐RAPS, MR‐Mix, weighted median, and weighted mode are developed for MR, but can be applied to collider bias correction, either by adjusting the marginal effect of G on Y, as per mtCOJO, or the conditional effect, as per CWLS and Slope‐hunter.
Abbreviations: CWLS, corrected weighted least squares; GWAS, genome‐wide association studies; MR‐RAPS, Mendelian randomisation using the robust adjusted profile score; mtCOJO, multitrait conditional and joint analysis; ZEMRA, zero modal residual assumption.
Error rates compared between methods
| Genetic correlation | Method | Type‐1 error | Power | FDR |
|---|---|---|---|---|
| 0 | Unadjusted | 0.0724 | 0.202 | 0.263 |
| | CWLS | 0.0505 | 0.164 | 0.236 |
| | MR‐RAPS | 0.0502 | 0.164 | 0.235 |
| | Slope‐hunter | 0.0573 | 0.183 | 0.238 |
| | MR‐Egger | 0.0754 | 0.205 | 0.269 |
| 0.45 | Unadjusted | 0.125 | 0.101 | 0.553 |
| | CWLS | 0.087 | 0.122 | 0.416 |
| | MR‐RAPS | 0.0872 | 0.121 | 0.419 |
| | Slope‐hunter | 0.0903 | 0.121 | 0.427 |
| | MR‐Egger | 0.128 | 0.0995 | 0.563 |
| −0.45 | Unadjusted | 0.0538 | 0.204 | 0.209 |
| | CWLS | 0.0676 | 0.0842 | 0.445 |
| | MR‐RAPS | 0.0677 | 0.0842 | 0.446 |
| | Slope‐hunter | 0.0503 | 0.165 | 0.234 |
| | MR‐Egger | 0.0556 | 0.212 | 0.207 |
Note: Type‐1 error, power (both at α ≤ 0.05) and FDR for tests of SNPs with effects on the conditioning covariate X, over 1000 simulations of quantitative X and Y with parameters given in the main text. FDR was calculated as the ratio of Type‐1 error to the sum of Type‐1 error and power, as there were 5000 SNPs both with and without direct effects on Y. Unadjusted, tests based on the conditional effects of G on Y. MR‐Egger, alleles coded to have positive effects on X.
Abbreviations: CWLS, corrected weighted least squares; FDR, false discovery rate; MR‐RAPS, Mendelian randomisation using the robust adjusted profile score; SNP, single‐nucleotide polymorphism.
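The FDR calculation described in the note can be reproduced directly from the tabulated error rates. The sketch below assumes only what the note states: equal numbers of SNPs with and without direct effects on Y, so the SNP counts cancel. The function name is illustrative, not taken from the paper.

```python
def fdr_from_error_rates(type1_error: float, power: float) -> float:
    """FDR when null and non-null SNPs are equally numerous.

    Expected false positives are proportional to the Type-1 error and
    expected true positives to the power, so the common SNP count cancels.
    """
    return type1_error / (type1_error + power)


# Unadjusted analysis at genetic correlation 0.45 (values from the table above).
fdr_unadjusted = fdr_from_error_rates(0.125, 0.101)
print(round(fdr_unadjusted, 3))  # 0.553
```

Small discrepancies in the third decimal place for other rows are to be expected, since the tabulated Type‐1 error and power are themselves rounded.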
Summaries of the estimated regression slope in the simulations of Table 2
| Genetic correlation | True slope | Method | Mean | Empirical SD | Mean SE |
|---|---|---|---|---|---|
| 0 | −0.4 | CWLS | −0.404 | 0.063 | 0.040 |
| | | MR‐RAPS | −0.400 | 0.0414 | 0.0416 |
| | | Slope‐hunter | −0.202 | 0.140 | N/A |
| | | MR‐Egger | 0.0258 | 0.004 | 0.002 |
| 0.45 | −0.5125 | CWLS | −0.178 | 0.0402 | 0.0338 |
| | | MR‐RAPS | −0.176 | 0.0338 | 0.0335 |
| | | Slope‐hunter | −0.184 | 0.146 | N/A |
| | | MR‐Egger | 0.0116 | 0.003 | 0.002 |
| −0.45 | −0.2875 | CWLS | −0.630 | 0.0861 | 0.0421 |
| | | MR‐RAPS | −0.624 | 0.0471 | 0.048 |
| | | Slope‐hunter | −0.175 | 0.0665 | N/A |
| | | MR‐Egger | 0.040 | 0.0036 | 0.003 |
Note: Genetic correlation, correlation between SNP effects on covariate X and outcome Y. Mean, mean estimated slope over 1000 simulations. Slope‐hunter computes a bootstrap SE, which we omitted from the simulation owing to time constraints. However, it was observed to be close to the empirical SD in some randomly selected replicates.
Abbreviations: CWLS, corrected weighted least squares; empirical SD, standard deviation of the estimated slope; mean SE, mean of the estimated standard error; MR‐RAPS, Mendelian randomisation using the robust adjusted profile score; N/A, not available; SNP, single‐nucleotide polymorphism.
Proportions of SNPs whose effects are nominally significant in the opposite direction after adjustment for collider bias
| Genetic correlation | Method | Correct change (X and Y) | Correct change (Y only) | Incorrect change (X and Y) | Incorrect change (Y only) |
|---|---|---|---|---|---|
| 0 | CWLS | 0.064 | 0.051 | 0.065 | 0.058 |
| | MR‐RAPS | 0.065 | 0.051 | 0.067 | 0.059 |
| | Slope‐hunter | 0.028 | 0.021 | 0.019 | 0.017 |
| | MR‐Egger | 0.121 | 0.117 | 0.371 | 0.379 |
| 0.45 | CWLS | 0.056 | 0.040 | 0.016 | 0.019 |
| | MR‐RAPS | 0.057 | 0.041 | 0.016 | 0.019 |
| | Slope‐hunter | 0.035 | 0.025 | 0.009 | 0.010 |
| | MR‐Egger | 0.166 | 0.134 | 0.335 | 0.366 |
| −0.45 | CWLS | 0.031 | 0.040 | 0.159 | 0.117 |
| | MR‐RAPS | 0.031 | 0.040 | 0.164 | 0.120 |
| | Slope‐hunter | 0.007 | 0.01 | 0.029 | 0.022 |
| | MR‐Egger | 0.113 | 0.117 | 0.383 | 0.378 |
Note: Correct (incorrect) change, the adjusted effect is in the same (opposite) direction as the true effect. X and Y (Y only), SNP has a true effect on X and Y (Y only), leading to the presence (absence) of collider bias.
Abbreviations: CWLS, corrected weighted least squares; MR‐RAPS, Mendelian randomisation using the robust adjusted profile score; SNP, single‐nucleotide polymorphism.
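To make concrete what a nominally significant change of direction means for a single SNP, the following minimal sketch may help. It rests on several assumptions not stated in this section: the adjusted effect is taken to be the conditional effect on Y minus the estimated slope times the effect on X; its variance uses a first-order approximation; and a sign change is counted when the adjusted estimate has the opposite sign to the unadjusted one and is itself significant at the chosen level. All function and parameter names are hypothetical.

```python
import math


def adjust_conditional_effect(beta_y, se_y, beta_x, se_x, slope, se_slope=0.0):
    """Adjusted effect and SE (assumed form: beta_y - slope * beta_x)."""
    adjusted = beta_y - slope * beta_x
    # First-order variance approximation; an assumption for illustration.
    variance = se_y**2 + slope**2 * se_x**2 + beta_x**2 * se_slope**2
    return adjusted, math.sqrt(variance)


def two_sided_p(beta, se):
    """Two-sided p value from a normal (Wald) test."""
    return math.erfc(abs(beta / se) / math.sqrt(2.0))


def significant_sign_change(beta_y, se_y, beta_x, se_x, slope,
                            se_slope=0.0, alpha=0.05):
    """True if the adjusted effect is significant and has the opposite sign."""
    adjusted, adjusted_se = adjust_conditional_effect(
        beta_y, se_y, beta_x, se_x, slope, se_slope)
    opposite = adjusted * beta_y < 0
    return opposite and two_sided_p(adjusted, adjusted_se) <= alpha


# Hypothetical SNP: a small positive conditional effect, a strong effect on X.
flipped = significant_sign_change(0.01, 0.01, 0.1, 0.01, slope=0.5)
same_direction = significant_sign_change(0.01, 0.01, 0.1, 0.01, slope=-0.4)
```

Whether such a change counts as correct or incorrect depends on the true effect, which is known only in simulation.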
Regression slopes for WHR adjusted for BMI
| Method | | | | |
|---|---|---|---|---|
| IVW | 0.00416 (0.00248) | −0.0511 (0.0114) | −0.0807 (0.0169) | −0.0872 (0.0237) |
| CWLS | 0.00848 (0.00506) | −0.053 (0.0118) | −0.0826 (0.0173) | −0.0888 (0.0242) |
| MR‐RAPS | 0.0355 (0.00615) | −0.0353 (0.0129) | −0.0672 (0.0159) | −0.0576 (0.0231) |
| Slope‐hunter | 0.324 (0.0335) | −0.165 (0.0495) | −0.135 (0.070) | −0.217 (0.0447) |
Note: Estimated slope (s.e.) of the regression of SNP effects on WHR adjusted for BMI on SNP effects on BMI, using SNPs selected by different p value thresholds for association with BMI.
Abbreviations: BMI, body mass index; CWLS, corrected weighted least squares; IVW, inverse variance weighted; MR‐RAPS, Mendelian randomisation using the robust adjusted profile score; SNP, single‐nucleotide polymorphism; WHR, waist–hip ratio.
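The regression described in the note can be sketched from summary statistics alone. The code below is illustrative and makes assumptions beyond this section: SNPs are selected by a normal-theory p value for their effect on the covariate; the IVW slope is a weighted regression through the origin with weights 1/se_y²; and the weak-instrument correction subtracts the sampling variance of each covariate effect from its square in the denominator. That last form is one plausible version of a corrected weighted least-squares estimator and is not confirmed by the text here. All names and the example data are hypothetical.

```python
import math


def p_value(beta, se):
    """Two-sided normal p value for a SNP effect."""
    return math.erfc(abs(beta / se) / math.sqrt(2.0))


def select_instruments(snps, threshold):
    """Keep SNPs whose effect on the covariate passes the p value threshold."""
    return [s for s in snps if p_value(s["bx"], s["sx"]) < threshold]


def ivw_slope(snps):
    """Inverse-variance weighted slope through the origin, with a fixed-effect SE."""
    num = sum(s["bx"] * s["by"] / s["sy"] ** 2 for s in snps)
    den = sum(s["bx"] ** 2 / s["sy"] ** 2 for s in snps)
    return num / den, math.sqrt(1.0 / den)


def corrected_slope(snps):
    """Slope with an assumed correction for weak-instrument attenuation."""
    num = sum(s["bx"] * s["by"] / s["sy"] ** 2 for s in snps)
    den = sum((s["bx"] ** 2 - s["sx"] ** 2) / s["sy"] ** 2 for s in snps)
    return num / den


# Hypothetical summary statistics: effects on Y are exactly -0.5 times effects on X.
example_snps = [
    {"bx": bx, "sx": 0.01, "by": -0.5 * bx, "sy": 0.01}
    for bx in (0.05, -0.08, 0.12, 0.002)
]
selected = select_instruments(example_snps, threshold=0.05)
slope, slope_se = ivw_slope(selected)
slope_corrected = corrected_slope(selected)
```

Under this form the corrected slope is always at least as large in magnitude as the IVW slope, because the denominator shrinks; the difference grows as weaker instruments are admitted by looser thresholds.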
Regression slopes for smoking cessation
| Method | | | | | |
|---|---|---|---|---|---|
| IVW | 0.180 (0.015) | 0.179 (0.016) | 0.206 (0.019) | 0.258 (0.029) | 0.225 (0.033) |
| CWLS | 0.044 (0.004) | 0.051 (0.003) | 0.114 (0.007) | 0.194 (0.014) | 0.227 (0.028) |
| MR‐RAPS | 0.198 (0.018) | 0.198 (0.018) | 0.226 (0.021) | 0.287 (0.030) | 0.222 (0.033) |
| Slope‐hunter | −0.604 (0.093) | 0.353 (0.024) | 0.335 (0.039) | 0.487 (0.026) | 0.558 (0.065) |
| MR‐Mix | 0.050 (0.200) | 0.110 (0.175) | 0.120 (0.107) | 0.430 (0.249) | 2.01 × 10⁻¹⁷ (0.705) |
Note: Estimated slope (s.e.) of the regression of SNP effects on smoking cessation on SNP effects on smoking initiation, using SNPs selected by different p value thresholds for association with smoking initiation.
Abbreviations: CWLS, corrected weighted least squares; IVW, inverse variance weighted; MR‐RAPS, Mendelian randomisation using the robust adjusted profile score; SNP, single‐nucleotide polymorphism.