Stephen Burgess, Neil M Davies, Simon G Thompson. From the (a) Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, Cambridgeshire, United Kingdom; and (b) Medical Research Council Integrative Epidemiology Unit, School of Social and Community Medicine, University of Bristol, Bristol, United Kingdom.
Abstract
BACKGROUND: Instrumental variable methods can estimate the causal effect of an exposure on an outcome using observational data. Many instrumental variable methods assume that the exposure-outcome relation is linear, but in practice this assumption is often in doubt, or perhaps the shape of the relation is a target for investigation. We investigate this issue in the context of Mendelian randomization, the use of genetic variants as instrumental variables. METHODS: Using simulations, we demonstrate the performance of a simple linear instrumental variable method when the true shape of the exposure-outcome relation is not linear. We also present a novel method for estimating the effect of the exposure on the outcome within strata of the exposure distribution. This enables the estimation of localized average causal effects within quantile groups of the exposure or as a continuous function of the exposure using a sliding window approach. RESULTS: Our simulations suggest that linear instrumental variable estimates approximate a population-averaged causal effect. This is the average difference in the outcome if the exposure for every individual in the population is increased by a fixed amount. Estimates of localized average causal effects reveal the shape of the exposure-outcome relation for a variety of models. These methods are used to investigate the relations between body mass index and a range of cardiovascular risk factors. CONCLUSIONS: Nonlinear exposure-outcome relations should not be a barrier to instrumental variable analyses. When the exposure-outcome relation is not linear, either a population-averaged causal effect or the shape of the exposure-outcome relation can be estimated.
Most methods for estimating causal effects using instrumental variables (IVs) make the assumption that the relation between the exposure and outcome is linear.[1] Although this may be approximately true in many cases, especially after transforming the exposure or outcome, in some situations, the exposure–outcome relation will be nonlinear. In this case, the shape of the relation may be a target for investigation. For example, the observed relation between body mass index (BMI) and mortality is highly nonlinear, with mortality increasing sharply as BMI increases. However, an increased risk of mortality has also been observed for individuals with low BMI.[2] It is unclear whether this merely reflects reverse causation (sick people lose weight) or confounding (underweight individuals differ in other risk factors from those of average weight) or whether there is a causal effect of low BMI on increased mortality.[3]
In a randomized trial where the exposure is the treatment received and the IV is treatment assignment, an IV analysis estimates a local average treatment effect.[4,5] This is the average change in the outcome resulting from a change in the exposure among those patients for whom treatment assignment influences the treatment received. In a trial context, such patients are known as compliers, and the local average treatment effect is also known as a complier-averaged causal effect.[6] Consistency of the IV estimator is subject to the assumption that any effect of the IV on the exposure is in the same direction for all persons (known as the monotonicity assumption).
In an observational study, the IV and the exposure may be continuous rather than dichotomous. Here, the monotonicity assumption is that the exposure is a nondecreasing function of the IV for all persons (or, equivalently, a nonincreasing function for all persons). This is plausible in the context of Mendelian randomization (the use of genetic variants as IVs) because the biological effects of genetic variants are likely to be in the same direction in each person.[7] The IV estimate can then be viewed as a weighted average of partial derivatives of the relation of the outcome with the exposure.[8] In the discrete case, these derivatives can be interpreted as local average treatment effects at different values of the exposure and the IV.
In this study, we explore the implications of nonlinear exposure–outcome relations for IV analyses, particularly in the context of Mendelian randomization. We initially consider the consequences of using linear IV models to estimate the effect of an exposure on an outcome when the true causal relation is nonlinear. We then introduce a novel approach for estimating localized average causal effects, which are IV estimates (local average treatment effects) estimated for strata of the population defined by the value of the exposure. These can provide evidence of a nonlinear effect of the exposure on the outcome. We discuss the findings and implications of our results and compare the approach introduced in this study with other parametric and nonparametric approaches to nonlinear IV analysis. We assume that the exposure and outcome are continuous; issues relating to binary outcomes are reserved for the discussion.
This study is illustrated using data on 8090 subcohort participants from the multicenter case-cohort study European Prospective Investigation into Cancer and Nutrition (EPIC)-InterAct, the diabetes-focused component of the EPIC.[9] We use data on BMI (kg/m²) and a range of cardiovascular risk factors: systolic blood pressure (mmHg), C-reactive protein (mg/L, log-transformed), uric acid (μmol/L), glycated hemoglobin (HbA1c, %), total cholesterol (mmol/L), and triglycerides (mmol/L, log-transformed). Increases in BMI have been shown to have causal effects on each of these factors in previous Mendelian randomization studies.[10-12] The observational association of each of the risk factors with BMI in a linear regression model, and with BMI and BMI-squared in a quadratic regression model, is given in Table 1. (BMI is centered before analysis, and adjustment is made for age, sex, and center.) The mean levels and 95% confidence intervals (CIs) of the risk factors for each quintile of BMI are shown in Figure 1. The observational relations of BMI with several of the risk factors are nonlinear, although this does not necessarily imply that the causal relations will be nonlinear.
TABLE 1.
Coefficients from Observational Analysis of the Association of Body Mass Index (BMI) with a Range of Cardiovascular Risk Factors in Linear and Quadratic Regression Models
FIGURE 1.
Mean level of cardiovascular risk factors stratified by quintile of body mass index against mean value of body mass index in quintile (lines are ±1.96 standard errors).
LINEAR INSTRUMENTAL VARIABLE ANALYSIS WITH NONLINEAR RELATIONS
It has been suggested that the use of linear models for IV analysis may have some value even if the underlying exposure–outcome model is nonlinear.[13] We here perform a simulation study to investigate the interpretation of linear IV estimates of nonlinear relations.
Simulation Study
We simulated data for 4000 persons on an IV G, a continuous exposure X taking positive values, a continuous outcome Y, and a confounder U. Five shapes of causal relation were considered between X and Y: a linear relation, a quadratic relation, a J-shaped relation (persons with lower levels of the exposure have slightly increased average outcomes), a U-shaped relation (persons with lower levels of the exposure have considerably increased average outcomes), and a threshold relation. Graphs showing the nonlinear relations between the exposure and mean level of the outcome are given in Figure 2.
FIGURE 2.
Nonlinear relationships between exposure and outcome for quadratic, J-shaped, U-shaped, and threshold relationship models.
The data-generating model for individual i is:
[equations missing from the source text]
where f(x) is the function associating the exposure and outcome:
[definitions of f1 to f5 missing from the source text]
The distribution of the IV (G = 0, 1, 2) was chosen to represent the number of variant alleles for a single nucleotide polymorphism with minor allele frequency 0.3. The distribution of the exposure was simulated as positively skewed, with increased values corresponding to greater values of the outcome being less common (95th percentile: 3.7). The IV explained on average 2.4% of the variance in the exposure corresponding to an average F statistic of 49.6. These simulations are repeated in the eAppendix (http://links.lww.com/EDE/A818), altering the data-generating model to allow the effect of the IV on the exposure to vary among individuals and the effect of the exposure on the outcome to vary among individuals.
For each of 10,000 simulated data sets, we calculated the ratio IV estimate (also called the Wald estimate) for the causal effect of the exposure on the outcome.[14] This is calculated as the coefficient for the association of the IV with the outcome divided by the coefficient for the association of the IV with the exposure. For functions f2 to f5, the effect of a fixed increase in the exposure differs across the distribution of the exposure. Therefore, we considered the average effects on the outcome of 2 interventions in the exposure: first, to increase the exposure for every individual by 1 unit; and second, to increase the exposure in every individual in the population by 0.25 units (the effect of a unit increase in the IV on the exposure). These quantities have been called population-averaged causal effects[15] or average partial effects,[16] as they average not only across individuals but also across the distribution of the exposure. As the data are simulated, the corresponding outcomes at these counterfactual values of the exposure can be calculated for each individual person.
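The ratio estimate and the population-averaged causal effect described above can be sketched in Python. This is a minimal illustration and not the authors' simulation code: the data-generating model, all of its coefficients, the quadratic form of f, and the function names (`simulate`, `slope`, `ratio_iv_estimate`, `population_averaged_effect`) are assumptions, chosen only so that a unit increase in G raises X by 0.25 on average.

```python
import numpy as np


def simulate(n, f, seed=0):
    """Draw (g, x, y) from a hypothetical model; every coefficient is an assumption."""
    rng = np.random.default_rng(seed)
    g = rng.binomial(2, 0.3, size=n)   # allele count, minor allele frequency 0.3
    u = rng.normal(size=n)             # unmeasured confounder
    x = 2.0 + 0.25 * g + 0.5 * u + rng.normal(scale=0.5, size=n)
    y = f(x) + 0.8 * u + rng.normal(size=n)
    return g, x, y


def slope(a, b):
    """Least-squares slope from regressing b on a (with an intercept)."""
    return np.cov(a, b)[0, 1] / np.var(a, ddof=1)


def ratio_iv_estimate(g, x, y):
    """Ratio (Wald) estimate: IV-outcome slope divided by IV-exposure slope."""
    return slope(g, y) / slope(g, x)


def population_averaged_effect(f, x, delta):
    """Average change in f(x) if every individual's exposure rose by delta."""
    return np.mean(f(x + delta) - f(x))


def quadratic(x):
    return 0.1 * x ** 2                # assumed nonlinear exposure-outcome relation


g, x, y = simulate(200_000, quadratic)
ratio = ratio_iv_estimate(g, x, y)
pace_small = population_averaged_effect(quadratic, x, 0.25)
pace_large = population_averaged_effect(quadratic, x, 1.0)

print(f"ratio estimate per 1 unit:         {ratio:.3f}")
print(f"ratio estimate per 0.25 units:     {0.25 * ratio:.3f}")
print(f"population-averaged effect, +0.25: {pace_small:.3f}")
print(f"population-averaged effect, +1:    {pace_large:.3f}")
```

Because the outcome function is known in a simulation, the counterfactual outcomes at x + delta can be evaluated directly for every individual, which is what `population_averaged_effect` does.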
Results
We present the mean values across simulations of the ratio IV estimate for a 1-unit increase in the exposure, a scaled ratio estimate for a 0.25-unit increase in the exposure, and the mean values of the average changes in the outcome (population-averaged causal effects) for 1-unit and 0.25-unit increases in the exposure (Table 2). Similar results were observed when the effects of the IV on the exposure and of the exposure on the outcome were allowed to vary among individual persons (eAppendix, eTables 1, 3, and 5, http://links.lww.com/EDE/A818).
TABLE 2.
Mean Values Across Simulations of the Ratio Instrumental Variable Estimate Assuming a Linear Exposure–Outcome Relationship and the Population-averaged Causal Effect for 1-unit and 0.25-unit Increases in the Exposure
In the linear case, the ratio estimates scaled to a 1-unit and a 0.25-unit increase in the exposure were both equal to the corresponding population-averaged causal effect. In the nonlinear cases, the population-averaged causal effect was similar to the ratio estimate for a 0.25-unit increase but was considerably different for a 1-unit increase. This difference is especially apparent for the U-shaped relation, where a 0.25-unit increase in the exposure led to a fall in the average level of the outcome, but a 1-unit increase led to a rise. Although the approximate equality may not hold for extreme shapes of the exposure–outcome relation, in the examples presented, the ratio IV estimate is close to the population-averaged causal effect for an increase in the exposure distribution of similar size to that associated with a change in the IV if the increase is small. We provide a theoretical motivation for this finding in the eAppendix (http://links.lww.com/EDE/A818). Previous theoretical results have been derived relating the linear IV estimate to a weighted average of derivatives[8]; this finding is similarly motivated, but the interpretation as a population-averaged causal effect is likely to be more familiar to an audience of applied researchers.
In general, the population-averaged causal effect cannot be generalized to the effect of an increase in the exposure for an individual.[17] Even in the linear case, under heterogeneity in the exposure–outcome model, the ratio estimate represents an average causal effect across the population.[18] In the nonlinear case (such as with the threshold relation), a large proportion of the population will have no increase in the outcome associated with a small increase in the exposure, as the increased exposure will not exceed the threshold level. With the J-shaped and U-shaped relations, a small increase in the exposure will lead to increases in the outcome for some persons and decreases for others.
POSSIBLE APPROACHES TO NONLINEAR INSTRUMENTAL VARIABLE ANALYSIS
Although in some cases ignoring any nonlinearity in the exposure–outcome relation and estimating a population-averaged causal effect may be sufficient, doing so does not give the investigator any insight into the shape of the relation. We discuss why standard IV approaches are not generally useful for investigating the shape of nonlinear relations, and we introduce a novel method for the estimation of localized average causal effects at different levels of the exposure distribution.
Unsuitability of Instrumental Variable Approaches for Estimating Nonlinear Relations
When values across the range of the distribution of BMI are compared, there is heterogeneity in the association between BMI and mortality. For example, in an observational setting, comparisons are commonly made and nonlinearities observed between groups of persons with BMI (kg/m²) in the ranges below 18.5 (underweight), 18.5 to 25 (normal weight), 25 to 30 (overweight), and over 30 (obese). In contrast, taking the genetic variant with the greatest association with BMI in the EPIC-InterAct data set (rs1421085, a variant in the FTO gene, R² = 0.39%), persons with no BMI-increasing alleles (major homozygotes) have an average BMI of 26.3, those with 1 allele (heterozygotes) an average BMI of 26.7, and those with 2 alleles (minor homozygotes) an average BMI of 27.1 (Figure 3). Thus, although an observational analysis can compare groups of persons with nonoverlapping distributions of BMI and marked differences in their average BMI values (standard deviation of BMI = 4.33), a standard IV analysis using rs1421085 compares groups of persons with overlapping distributions of BMI and slight differences in their average BMI values (standard deviation of fitted values of BMI conditional on IV = 0.26).
FIGURE 3.
Distribution of body mass index in subgroups defined by genetic variant rs1421085: solid line, major homozygotes; dashed line, heterozygotes; dotted line, minor homozygotes. Densities are smoothed using a kernel-density method with a common bandwidth of 0.8.
Although approaches have been proposed for nonlinear IV analysis,[19-22] information for estimating the effect of the exposure on the outcome is available (without additional assumptions) only for predicted values of exposure based on the IV. Nonlinearity for this reduced range of the exposure distribution is unlikely to be observed. Moreover, this range does not increase as the sample size increases. The shape of the relation outside this range can be estimated by the specification of a parametric model for the exposure–outcome relation; however, inference based on such models has been shown to be highly sensitive to the choice of parametrization.[16,22] We consider these issues further in relation to specific nonparametric IV methods in the discussion.
Stratification of the Exposure and Localized Average Causal Effects
Rather than estimating a population-averaged causal effect, we may wish to estimate an average causal effect for a stratum of the exposure distribution. These localized average causal effects will be constant in expectation if the relation is linear but will give insight into the shape of the exposure–outcome relation if it is nonlinear. As the IV is associated with the exposure, by stratifying on the exposure distribution, an association between the IV and outcome may be induced even if this was not present in the original data.[23] This can be avoided by stratifying based on the residual variation in the exposure conditional on the IV. Under the assumption that there is no heterogeneity in the average effect of the IV on the exposure at different levels of the exposure, this is equivalent to subtracting the effect of the IV from the exposure, and then stratifying individuals based on their IV-free exposure level. The IV-free exposure level represents the expected value of the exposure, which would be observed if a person’s IV value was zero. Although the IV-free exposure may appear to have a counterfactual interpretation, it is not intended to be a counterfactual variable. Rather, it is a function of the observed data. A counterfactual interpretation would require the stronger assumption that the effect of the IV on the exposure was constant for all individuals and is only necessary for the variable that is contrasted in the causal effect and not for the IV-free exposure that takes the role of a stratifying covariate.
Using the same simulated data sets as in the previous section, we stratified individuals based on their IV-free exposure using categories below 1, 1 to 2, 2 to 3, and above 3. For each of these groups, we estimated the (stratum-specific) localized average causal effect of the exposure on the outcome using the ratio method (the association of the IV with the outcome in the stratum divided by the association of the IV with the exposure in the population). The IV-free exposure was calculated based on the estimated value of the association of the IV with the exposure in the whole population. Table 3 shows that the true shape of the exposure–outcome relation is apparent from comparing the mean values of these effects across the simulated data sets.
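The stratification procedure can be sketched as follows. This is an illustrative outline under stated assumptions, not the authors' implementation: the simulated model, its coefficients, and the names `slope` and `localized_effects` are hypothetical, and only the cut points (1, 2, 3) and the ratio construction are taken from the text.

```python
import numpy as np


def slope(a, b):
    """Least-squares slope from regressing b on a (with an intercept)."""
    return np.cov(a, b)[0, 1] / np.var(a, ddof=1)


def localized_effects(g, x, y, cut_points):
    """Ratio estimates within strata of the IV-free exposure."""
    alpha = slope(g, x)                    # IV-exposure association, whole population
    x0 = x - alpha * g                     # IV-free exposure
    stratum = np.digitize(x0, cut_points)  # stratum labels 0 .. len(cut_points)
    effects = []
    for s in range(len(cut_points) + 1):
        in_s = stratum == s
        # IV-outcome association in the stratum over the population IV-exposure association
        effects.append(slope(g[in_s], y[in_s]) / alpha)
    return np.array(effects)


# Hypothetical data; every coefficient below is an assumption for illustration.
rng = np.random.default_rng(1)
n = 400_000
g = rng.binomial(2, 0.3, size=n)
u = rng.normal(size=n)                     # unmeasured confounder
x = 2.0 + 0.25 * g + 0.5 * u + rng.normal(scale=0.5, size=n)
noise = rng.normal(size=n)
y_linear = 0.4 * x + 0.8 * u + noise
y_quadratic = 0.1 * x ** 2 + 0.8 * u + noise

cuts = [1.0, 2.0, 3.0]                     # strata: below 1, 1 to 2, 2 to 3, above 3
effects_linear = localized_effects(g, x, y_linear, cuts)
effects_quadratic = localized_effects(g, x, y_quadratic, cuts)
print("linear:   ", np.round(effects_linear, 2))
print("quadratic:", np.round(effects_quadratic, 2))
```

Under the assumed linear model the four estimates should scatter around the common slope, whereas under the assumed quadratic model they should rise across strata, which is the kind of pattern the stratified estimates are meant to expose.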
TABLE 3.
Mean Values Across Simulations of the Localized Average Causal Effects of the Exposure on the Outcome Within Strata Defined by the IV-free Exposure Level
We emphasize the importance of stratifying based on the IV-free exposure; when the same effects were estimated across strata of the exposure without prior subtraction of the effect of the IV, in the linear case, the corresponding estimates were −0.150, −0.052, −0.001, and −0.003. In the directed acyclic graph of Figure 4, conditioning on the exposure X induces an association between G and U, which are both parents of X (termed moralization). In contrast, conditioning on the IV-free exposure X0 = X − αG does not induce such an association.
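The induced association can be seen in a small simulation. The sketch below uses a hypothetical model in which the exposure has no causal effect on the outcome at all, so any nonzero stratum-specific estimate reflects the association between G and U created by conditioning on X. The model, its coefficients, and the helper names `slope` and `stratum_estimates` are assumptions for illustration and do not reproduce the figures quoted above.

```python
import numpy as np


def slope(a, b):
    """Least-squares slope from regressing b on a (with an intercept)."""
    return np.cov(a, b)[0, 1] / np.var(a, ddof=1)


def stratum_estimates(g, y, stratifier, cut_points, alpha):
    """Ratio estimates within strata defined by an arbitrary stratifying variable."""
    stratum = np.digitize(stratifier, cut_points)
    return np.array([
        slope(g[stratum == s], y[stratum == s]) / alpha
        for s in range(len(cut_points) + 1)
    ])


# Hypothetical data: U confounds X and Y, and X has NO causal effect on Y.
rng = np.random.default_rng(2)
n = 400_000
g = rng.binomial(2, 0.3, size=n)
u = rng.normal(size=n)
x = 2.0 + 0.25 * g + 0.5 * u + rng.normal(scale=0.5, size=n)
y = 0.8 * u + rng.normal(size=n)

alpha = slope(g, x)
x0 = x - alpha * g                         # IV-free exposure
cuts = [1.0, 2.0, 3.0]

naive = stratum_estimates(g, y, x, cuts, alpha)     # stratify on X itself
iv_free = stratum_estimates(g, y, x0, cuts, alpha)  # stratify on X0
print("stratified on X: ", np.round(naive, 2))
print("stratified on X0:", np.round(iv_free, 2))
```

In this assumed setting, stratifying on X should give clearly negative estimates despite the null causal effect, while stratifying on X0 should give estimates close to zero.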
FIGURE 4.
Directed acyclic graph of relationships among instrumental variable (IV) G, exposure X, IV-free exposure X0, confounder U, and outcome Y.
We calculated Cochran’s Q statistic to examine possible heterogeneity in the 4 estimates. This nonparametric test assesses whether the estimates in the strata differ more than would be expected by chance. We also conducted a trend test by performing a meta-regression of the estimates on the mean values of the exposure in each stratum. This is a form of weighted regression where the variance of each value of the response variable is assumed to be known.[24] The proportions of data sets where the Cochran’s Q test rejected the null hypothesis (P < 0.05) were as follows: linear 5.2%, quadratic 25.2%, J-shaped 67.0%, U-shaped 94.7%, and threshold 85.8%. The same proportions for the trend test were linear 4.1%, quadratic 31.8%, J-shaped 73.9%, U-shaped 94.7%, and threshold 65.6%. Similar results were observed when the effects of the IV on the exposure and of the exposure on the outcome were allowed to vary among persons provided that the effect heterogeneities in the IV–exposure and exposure–outcome associations were not correlated (eAppendix, eTables 2, 4, 6, and 7, http://links.lww.com/EDE/A818). We conclude, based on this simulation example, that the heterogeneity and trend tests are at least sometimes able to detect nonlinearities in causal relations. The trend test has greater empirical power to detect nonlinearities, except in the case of the threshold relation.
A generalization of this approach is to estimate the localized average causal effect across the distribution of the IV-free exposure using a sliding window. This is performed by first ordering individuals according to their IV-free exposure. Using an example window size of 1,000, the localized average causal effect is estimated for the first 1,000 people, then for the individuals numbered 2 through to 1,001, then 3 to 1,002, and so on. The estimates can then be plotted against the median exposure value for each window. The range of the exposure distribution over which the localized average causal effect is estimated depends on the window size; however, for a fixed window size, the range expands as the sample increases. In the next section, we explore the impact of the choice of how wide to make the sliding window in a real data example.
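The sliding window procedure can be sketched as below. This is a minimal outline under stated assumptions rather than the authors' code: the simulated model and its coefficients are hypothetical, and the `step` argument is an added convenience for large samples (with `step=1` the windows advance one individual at a time, as in the description above).

```python
import numpy as np


def slope(a, b):
    """Least-squares slope from regressing b on a (with an intercept)."""
    return np.cov(a, b)[0, 1] / np.var(a, ddof=1)


def sliding_window_effects(g, x, y, window, step=1):
    """Ratio estimates over windows of individuals ordered by IV-free exposure.

    Returns the median exposure and the localized estimate for each window.
    """
    alpha = slope(g, x)                    # IV-exposure association, whole population
    order = np.argsort(x - alpha * g)      # order by IV-free exposure
    g, x, y = g[order], x[order], y[order]
    medians, effects = [], []
    for start in range(0, len(x) - window + 1, step):
        w = slice(start, start + window)
        medians.append(np.median(x[w]))
        effects.append(slope(g[w], y[w]) / alpha)
    return np.array(medians), np.array(effects)


# Hypothetical data with an assumed quadratic exposure-outcome relation.
rng = np.random.default_rng(4)
n = 400_000
g = rng.binomial(2, 0.3, size=n)
u = rng.normal(size=n)
x = 2.0 + 0.25 * g + 0.5 * u + rng.normal(scale=0.5, size=n)
y = 0.1 * x ** 2 + 0.8 * u + rng.normal(size=n)

medians, effects = sliding_window_effects(g, x, y, window=100_000, step=20_000)
for m, e in zip(medians, effects):
    print(f"median exposure {m:.2f}: localized estimate {e:.2f}")
```

The returned pairs are what would be plotted: the localized estimate on the vertical axis against the median exposure of the window on the horizontal axis.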
APPLIED EXAMPLE: EFFECT OF BODY MASS INDEX ON CARDIOVASCULAR RISK FACTORS
In this section, we apply the linear and localized IV methods to the EPIC-InterAct data set for the cardiovascular risk factors previously listed. Data were available on 29 genetic variants previously shown to be associated with BMI in a large meta-analysis.[25] Details of the variants are given in the eAppendix (http://links.lww.com/EDE/A818) and their plausibility as IVs is assessed by overidentification tests in each center (eTable 8, http://links.lww.com/EDE/A818). An allele score comprising all the variants weighted by their reported association with BMI in the meta-analysis is used as an IV.[26] If an individual i has $g_{ik}$ copies of variant allele k, that person’s allele score is $\sum_k w_k g_{ik}$, where $w_k$ is the weight of the kth variant. The score explains 0.77% of the variance in BMI.
For each outcome, we fit a linear IV model using the ratio method, adjusting for age, sex, and center in the regressions of the outcome on the allele score and of the exposure on the allele score. We estimated the localized average causal effects of BMI on each risk factor within quintiles of the distribution of BMI after the genetic component is subtracted (the IV-free BMI) and performed heterogeneity and trend tests on these values (Table 4). To account for the multiple centers, we standardized the BMI measure used to stratify the data by regressing BMI on age, sex, center, and allele score and stratifying individuals based on their residual value from this regression. Again, adjustment was made in the estimation of the causal effects for age, sex, and center. A more general sliding window approach was also used for estimating localized average causal effects, initially for HbA1c, and then for all the outcomes. Figure 5 displays the estimates for HbA1c using window sizes of 500, 1,000, 1,500, 2,000, 3,000, and 4,000. Figure 6 displays the estimates for all the outcomes using a window size of 2,000. For interpretability, the x-axis is the unstandardized BMI value corresponding to the BMI of the individual in the middle of the window.
Previous work has shown that genetic associations with BMI are similar in extremely overweight individuals to those in the general population.[27] To assess the assumption that the effect of the IV on BMI is similar across the exposure distribution, we estimated the association of the allele score with BMI in each of the quintiles of IV-free BMI. The association estimates were 0.80, 1.02, 1.02, 0.98, 1.07; standard errors 0.08, 0.03, 0.03, 0.04, 0.19; heterogeneity test P = 0.11; trend test P = 0.26. There is not sufficient evidence in this data set to reject this assumption.
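The heterogeneity and trend tests can be sketched with the association estimates and standard errors quoted above. The code assumes a fixed-effect form for both tests (inverse-variance weights, variances treated as known), which is one reading of the description and not necessarily the exact procedure used. The helper names are hypothetical, and the closed-form chi-squared tail holds only for even degrees of freedom; a library routine such as `scipy.stats.chi2.sf` would cover the general case. The stratum means of IV-free BMI are not quoted in the text, so the quintile index is used as a stand-in and the resulting trend P value is not comparable with the reported one.

```python
import math

import numpy as np


def chi2_sf_even_df(q, df):
    """Upper-tail chi-squared probability; closed form valid for even df only."""
    if df % 2 != 0:
        raise ValueError("closed form requires even degrees of freedom")
    half = q / 2.0
    return math.exp(-half) * sum(half ** j / math.factorial(j) for j in range(df // 2))


def cochran_q(estimates, std_errors):
    """Cochran's Q statistic and its degrees of freedom."""
    est = np.asarray(estimates, dtype=float)
    w = 1.0 / np.asarray(std_errors, dtype=float) ** 2   # inverse-variance weights
    pooled = np.sum(w * est) / np.sum(w)
    return float(np.sum(w * (est - pooled) ** 2)), len(est) - 1


def trend_test(estimates, std_errors, stratum_means):
    """Fixed-effect meta-regression slope and its two-sided P value."""
    est = np.asarray(estimates, dtype=float)
    m = np.asarray(stratum_means, dtype=float)
    w = 1.0 / np.asarray(std_errors, dtype=float) ** 2
    m_bar = np.sum(w * m) / np.sum(w)
    sxx = np.sum(w * (m - m_bar) ** 2)
    b = float(np.sum(w * (m - m_bar) * est) / sxx)       # weighted least-squares slope
    z = b * math.sqrt(sxx)                               # slope over its standard error
    return b, math.erfc(abs(z) / math.sqrt(2.0))


estimates = [0.80, 1.02, 1.02, 0.98, 1.07]   # allele score-BMI associations by quintile
std_errors = [0.08, 0.03, 0.03, 0.04, 0.19]

q, df = cochran_q(estimates, std_errors)
p_het = chi2_sf_even_df(q, df)
print(f"Cochran's Q = {q:.2f} on {df} df, P = {p_het:.2f}")

# Stand-in x-values (quintile index); the true stratum means are not given in the text.
b, p_trend = trend_test(estimates, std_errors, [1, 2, 3, 4, 5])
print(f"trend slope = {b:.3f}, P = {p_trend:.2f}")
```

With these rounded inputs the heterogeneity P value comes out at about 0.11, in line with the value reported above.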
TABLE 4.
Linear IV Estimates of BMI on Risk Factor Outcome from Ratio Method; Localized Average Causal Effect Estimates of BMI on Each Risk Factor Within Quintiles of Participants Stratified by Their IV-free BMI; P Values for Heterogeneity (pQ) and Trend (ptr) of Estimates
FIGURE 5.
Localized average causal effect estimates of body mass index on glycated hemoglobin at various levels of body mass index from EPIC-InterAct data set: sliding window approach with window sizes 500, 1,000, 1,500, 2,000, 3,000, and 4,000 (top-left to bottom-right). Gray lines represent pointwise 95% CIs.
FIGURE 6.
Localized average causal effect estimates of body mass index on cardiovascular risk factors at various levels of body mass index from EPIC-InterAct data set: sliding window approach with window size 2,000. Gray lines represent pointwise 95% CIs.
The graphs showing localized average causal effect estimates for HbA1c demonstrate the trade-off in choosing a window size (Figure 5). With a smaller window size, there is increased detail in the estimate of the shape of the exposure–outcome relation and estimates are obtained across a wider range of the exposure distribution. With a larger window size, precision of the estimates is increased. However, the possible threshold feature of the model, whereby the causal estimate is close to zero for low values of BMI, is lost as the window size increases. In addition, the shape of the graphs becomes closer to a straight line. It seems likely that much of the variability in the estimates with a small window size reflects random fluctuation rather than interesting information, hence the use of a moderately large window size in Figure 6. Estimates for total cholesterol and triglycerides show a similar pattern, with positive point estimates for median BMI levels less than 26, and estimates around or below zero for median BMI greater than 28. Larger sample sizes are required for more definitive applied conclusions.
DISCUSSION
In this study, we have discussed the prospects for IV analyses with a nonlinear exposure–outcome relation. Specifically, we have provided an interpretation of linear IV estimates in the presence of nonlinearity and have proposed a novel method to investigate whether causal effects depend on the level of the exposure.
Linear Estimation Approach
Simple linear IV estimators with a nonlinear exposure–outcome relation estimate a parameter that approximates a population-averaged causal effect. This is the average effect resulting from a uniform increase in the exposure for all persons in the population. However, the approximation may be poor for an IV with a large effect on the exposure and will break down if the population-averaged causal effect is considered for a change in the exposure much greater than that associated with the IV. If the exposure–outcome relation is monotonic (i.e., increasing or decreasing in the outcome for all values of the exposure), then the parameter estimated by a linear IV method will be in the same direction as the causal effect for each individual in the population. Nonlinear exposure–outcome relations should therefore not be seen as a barrier to IV analyses, such as Mendelian randomization investigations. However, if the exposure–outcome relation is not monotonic, a causal effect of the exposure on the outcome may be obscured by a linear IV estimate (potentially such as in the applied example of the effect of BMI on log triglycerides).
Parametric and Nonparametric Approaches
If the shape of the exposure–outcome relation is the subject of investigation, then a nonlinear parametric or nonparametric IV analysis can be performed. Inference from nonlinear parametric models has been previously shown to be sensitive to the choice of parametric model in a range of real scenarios.[22] Inference from nonparametric models is limited when the IV does not explain much of the variation in the exposure. For example, the series method of Newey and Powell[19] is a 2-stage nonparametric IV method. The first stage constructs a basis set of nonlinear functions (such as a set of orthogonal polynomials or B-splines) of the exposure conditional on the IV, and the second stage fits the outcome as a linear combination of these nonlinear functions. These estimates will not be reliable outside the range of the fitted values of the exposure conditional on the IV. Similar arguments can be made for the kernel and series methods of Hall and Horowitz.[20]
An alternative approach to nonlinear IV analysis is the quantile regression approach of Chernozhukov and Hansen.[21] In this case, the identifying assumption of the method is that the unmeasured confounders can be represented by a single variable that has a monotone effect on the outcome.[28] This assumption is not only restrictive and unrealistic but also inherently unverifiable, and even sensitivity analyses to investigate the assumption cannot be performed. Such assumptions should not be relied on for identifying causal effects.
Localized Estimation Approach
By stratifying the population based on the IV-free exposure, localized average causal effects of the exposure on the outcome in each of the strata can reveal the shape of the exposure–outcome relation. In practice, such estimates in the strata may be imprecise, meaning that the true shape of the relation is obscured. This approach can be extended using a sliding window to provide a continuous estimate of the exposure–outcome relation across a wide range of the exposure distribution. Unlike with other nonparametric IV approaches, this range widens as the sample size increases. Either stratum-specific or sliding window estimates can equally be obtained with a binary outcome by using a log-linear or logistic analysis model to estimate a relative risk or odds ratio parameter. A localized average causal effect is simply an IV estimate calculated for a particular stratum of the population as defined by the IV-free exposure. The provisos noted for IV estimates in general, namely that they approximate a population-averaged causal effect only for small changes in the exposure and for IVs with a small effect on the exposure, therefore apply equally to localized average causal effects.
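The stratified procedure can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: the IV-free exposure is taken to be the residual of the exposure after subtracting the fitted contribution of a single IV, strata are quantile groups of that residual, and the ratio estimate is computed separately within each stratum. The simulated data (quadratic exposure–outcome relation) and the function names `ratio_iv` and `localized_effects` are hypothetical.

```python
import numpy as np

def ratio_iv(g, x, y):
    """Ratio IV estimate: cov(G, Y) / cov(G, X)."""
    return np.cov(g, y)[0, 1] / np.cov(g, x)[0, 1]

def localized_effects(g, x, y, n_strata=5):
    """Return (mean exposure, IV estimate) for each quantile group of the IV-free exposure."""
    alpha = np.cov(g, x)[0, 1] / np.var(g, ddof=1)    # IV-exposure regression slope
    iv_free = x - alpha * g                           # exposure with the IV contribution removed
    # interior quantile cut points, e.g. 20%, 40%, 60%, 80% for quintiles
    cuts = np.quantile(iv_free, np.linspace(0, 1, n_strata + 1)[1:-1])
    stratum = np.digitize(iv_free, cuts)              # stratum labels 0 .. n_strata - 1
    results = []
    for k in range(n_strata):
        m = stratum == k
        results.append((x[m].mean(), ratio_iv(g[m], x[m], y[m])))
    return results

# Assumed data-generating model, for illustration only
rng = np.random.default_rng(2)
n = 500_000
g = rng.binomial(2, 0.3, size=n).astype(float)        # genetic variant
u = rng.normal(size=n)                                # unmeasured confounder
x = 5.0 + 0.25 * g + u + rng.normal(size=n)           # exposure
y = 0.1 * x**2 + u + rng.normal(size=n)               # outcome; true slope at x is 0.2 * x

for mean_x, estimate in localized_effects(g, x, y):
    print(f"mean exposure {mean_x:5.2f}: localized estimate {estimate:6.3f} "
          f"(slope of true curve at the mean: {0.2 * mean_x:.3f})")
```

With a quadratic relation, the localized estimates should increase across strata, tracking the slope of the true curve. Stratifying on the exposure itself rather than on the IV-free exposure would condition on a variable affected by the IV, which is why the residual is used here. A sliding window version could be sketched by replacing the fixed quantile groups with overlapping windows of the sorted IV-free exposure.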
Measurement Error
In observational analyses, coefficients in a linear regression model are affected by regression dilution bias if the exposure suffers from classical measurement error (i.e., error uncorrelated with the true value of the exposure).[29] Estimates of the shape of nonlinear exposure–outcome relations can also be distorted.[30] In contrast, IV estimates under such an error model are unbiased.[31] In the same way, since localized average causal effects are IV estimates for a given stratum of the population, the shape of the relation estimated in a localized IV analysis will not be affected by classical measurement error in the exposure.
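The contrast between regression dilution and the IV estimate can be seen in a small simulation. This is a sketch under assumed settings: a linear exposure–outcome relation with slope 0.5, no confounding (so that measurement error is the only source of bias), and measurement error with the same variance as the non-genetic part of the exposure. None of these values come from the paper.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200_000
beta = 0.5                                            # assumed true causal slope

g = rng.binomial(2, 0.3, size=n).astype(float)        # genetic variant (IV)
x_true = 0.5 * g + rng.normal(size=n)                 # true exposure
y = beta * x_true + rng.normal(size=n)                # outcome, no confounding in this sketch
x_obs = x_true + rng.normal(size=n)                   # exposure with classical measurement error

def slope(a, b):
    """Least-squares slope from regressing b on a."""
    return np.cov(a, b)[0, 1] / np.var(a, ddof=1)

ols_true = slope(x_true, y)                           # regression on the error-free exposure
ols_obs = slope(x_obs, y)                             # attenuated by regression dilution
iv_obs = np.cov(g, y)[0, 1] / np.cov(g, x_obs)[0, 1]  # ratio IV estimate using the noisy exposure

print(f"regression on true exposure:      {ols_true:.3f}")
print(f"regression on measured exposure:  {ols_obs:.3f}")
print(f"IV estimate on measured exposure: {iv_obs:.3f}")
```

Because the measurement error is independent of the IV, it leaves the IV-exposure covariance unchanged on average, so the ratio estimate should stay near the true slope while the regression on the measured exposure is attenuated toward zero.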
ACKNOWLEDGMENTS
We thank all EPIC participants and staff for their contribution to the study. We thank staff from the Technical, Field Epidemiology and Data Functional Group Teams of the MRC Epidemiology Unit in Cambridge, UK, for carrying out sample preparation, DNA provision and quality control, genotyping and data-handling work. The EPIC-InterAct study received funding from the European Union (Integrated Project LSHM-CT-2006-037197 in the Framework Programme 6 of the European Community).