A directed acyclic graph for interactions.

Anton Nilsson^1,2, Carl Bonander^3, Ulf Strömberg^3,4, Jonas Björk^1,5.

Abstract

BACKGROUND: Directed acyclic graphs (DAGs) are of great help when researchers try to understand the nature of causal relationships and the consequences of conditioning on different variables. One fundamental feature of causal relations that has not been incorporated into the standard DAG framework is interaction, i.e. when the effect of one variable (on a chosen scale) depends on the value that another variable is set to. In this paper, we propose a new type of DAG, the interaction DAG (IDAG), which can be used to understand this phenomenon.
METHODS: The IDAG works like any DAG but instead of including a node for the outcome, it includes a node for a causal effect. We introduce concepts such as confounded interaction and total, direct and indirect interaction, showing that these can be depicted in ways analogous to how similar concepts are depicted in standard DAGs. This also allows for conclusions on which treatment interactions to account for empirically. Moreover, since generalizability can be compromised in the presence of underlying interactions, the framework can be used to illustrate threats to generalizability and to identify variables to account for in order to make results valid for the target population.
CONCLUSIONS: The IDAG allows for a way of illustrating interactions that is both intuitive and stringent. It helps to distinguish between causal and non-causal mechanisms behind effect variation. Conclusions can be drawn about how to empirically estimate interactions, as well as about how to achieve generalizability in contexts where interest lies in estimating an overall effect.

Keywords: Causal inference; external validity; generalizability; interaction; internal validity; mediation

Year: 2021 PMID: 33221880 PMCID: PMC8128466 DOI: 10.1093/ije/dyaa211

Source DB: PubMed Journal: Int J Epidemiol ISSN: 0300-5771 Impact factor: 7.196

Directed acyclic graphs (DAGs) are useful in epidemiology, but the standard framework offers no way of displaying whether interactions are present (on the scale of interest). We present a new type of DAG—the interaction DAG (IDAG)—which can be used to analyse interactions. We define concepts such as confounded interaction and total, direct and indirect interaction, and show how these can easily be displayed with the IDAG. An applied researcher can use the IDAG to determine which treatment interactions to account for empirically. The IDAG can also be used to shed light on mechanisms that compromise generalizability and to determine which variables to account for in order to make results valid for the target population.

Background

Directed acyclic graphs (DAGs) are frequently used in epidemiology to shed light on causal relationships. Being composed of nodes, representing variables, and arrows, representing direct causal effects of one variable on another, DAGs can be used to illustrate concepts such as confounding, selection bias and the distinction between total, direct and indirect effects. In turn, DAGs are used to determine which variables to condition on in empirical analyses. Although DAGs are powerful tools, a fundamental feature of causal relations that has not been incorporated into the standard framework is interaction, i.e. when the effect of some variable (on a chosen scale) depends on the value to which another variable is set. Several articles have discussed interaction with reference to DAGs. The standard DAG is nonparametric and, as a result, it is of no relevance for the construction of the graph whether the determinants of an outcome interact with each other. There are some proposals on how interaction could intuitively be incorporated into DAGs, but these lack theoretical foundations.

In this article, we propose a new type of DAG, the interaction DAG (IDAG). The IDAG is both intuitive and well founded in the theory of causal inference. In brief, the IDAG works like any DAG, but instead of depicting how different variables influence the outcome, it depicts how different variables influence the size of a chosen effect measure. We describe the approach and discuss several concepts that naturally follow from the framework, such as confounded interaction and direct, indirect and total interaction. For readers unfamiliar with standard DAGs, we refer to Greenland, who provides an accessible introduction.

The IDAG

The concept of interaction employed in this article is similar to that in previous literature and refers to a joint effect. Whereas there are different ways of defining an ‘effect’, the general idea behind interaction is that the effect of one variable (on some scale) depends on the level to which another variable is set. Here, we will focus on a binary treatment A that may interact with one or several other binary variables, such as Q and X. Definitions of interaction are often expressed with potential outcomes. In structural causal models, a potential (or ‘counterfactual’) outcome is an outcome that, for a full set of predetermined background factors which characterize individual i, prevails when forcing one or several variables in the model to assume particular values. When defining interactions, at least two variables must be forced to particular values. If the outcome Y is continuous, we can say that there is additive interaction between A and Q in individual i if the following inequality holds between differences of potential outcomes:

$$Y_i(a=1, q=1) - Y_i(a=0, q=1) \neq Y_i(a=1, q=0) - Y_i(a=0, q=0) \tag{1}$$

Notice that we here define interaction at the individual level, in some contrast with previous literature, which focuses on the expected population level. The left-hand side of equation (1) is a measure of the causal effect of A on Y for Q = 1, and the right-hand side is the same measure for Q = 0. Interaction between A and Q is thus present if the size of this causal effect depends on Q. The size of the interaction is given by the difference between the left-hand and right-hand sides of (1). When outcomes are binary, focus normally lies on the probability of a positive outcome. Assuming probabilistic potential outcomes, we can say that additive interaction between A and Q is present in individual i if the following inequality holds in this individual:

$$P(Y_i(a=1, q=1) = 1) - P(Y_i(a=0, q=1) = 1) \neq P(Y_i(a=1, q=0) = 1) - P(Y_i(a=0, q=0) = 1) \tag{2}$$

Again, the left-hand side is a measure of the causal effect of A on Y for Q = 1, and the right-hand side for Q = 0; the interaction is present if the size of this causal effect depends on Q.
Henceforth, we will denote a causal effect of A on Y by $\Delta_Y^A$. Since $\Delta_Y^A$ is a variable that may depend causally on other variables, it can be included in a causal graph. We refer to a graph including $\Delta_Y^A$ as an IDAG. If there is an interaction between some variable and A, there is a directed arrow (or path) from this variable to $\Delta_Y^A$. In contrast, effect measure modification only corresponds to an association between some variable and $\Delta_Y^A$, possibly arising through unblocked backdoor paths. In the Supplementary Appendix, available as Supplementary data at IJE online, we discuss more technical details related to the IDAG, such as d-separation, and work through examples based on structural equations.

The IDAG is quite similar to the standard DAG, except that the outcome node has been replaced by a node representing a causal effect, and that the node representing the treatment variable is not included. Both figures display causal relationships between variables, and the causal effect of one variable on another is not dependent on the graph. Like any DAG, the IDAG will normally be drawn based on previous literature, which in the case of the IDAG will have to include evidence on which treatment interactions are present.

Causal effects can be measured on different scales; for example, although equations (1) and (2) defined interaction on additive scales (based on differences), multiplicative scales (based on ratios) could be used as well. Whether an interaction is present may depend on the scale and, in fact, two variables that influence an outcome will always interact on some scales. The appearance of the IDAG thus depends on the scale chosen, and certain variables may point to $\Delta_Y^A$ in some versions of the IDAG but not in others. In general, the additive scale is preferred if the goal is to evaluate interaction in a ‘mechanistic’ sense. For simplicity, we will assume that there are no interactions not involving A (on the chosen scale), and for this reason we only consider $\Delta_Y^A$ and not the causal effects of other variables.
We will also assume that interactions are constant across individuals, so the individual-level interactions defined from equations (1) and (2) are equal to conventional population-level interactions.
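The scale dependence described above can be illustrated with a small numerical sketch. The four risks below are hypothetical values for P(Y = 1) under each forced combination of A and Q, chosen only for illustration; the function names are likewise assumptions and do not come from the article. With these numbers the effect of A differs by Q on the additive scale but not on the multiplicative scale, so an arrow from Q to the effect node would appear in an additive-scale IDAG only.

```python
# Hypothetical risks P(Y = 1) under each forced combination (a, q).
# These numbers are illustrative assumptions, not taken from the article.
risk = {
    (0, 0): 0.10,
    (1, 0): 0.20,
    (0, 1): 0.20,
    (1, 1): 0.40,
}


def additive_interaction(risk):
    """Risk-difference effect of A at q = 1 minus the same effect at q = 0."""
    effect_q1 = risk[(1, 1)] - risk[(0, 1)]
    effect_q0 = risk[(1, 0)] - risk[(0, 0)]
    return effect_q1 - effect_q0


def multiplicative_interaction(risk):
    """Risk-ratio effect of A at q = 1 divided by the same effect at q = 0."""
    ratio_q1 = risk[(1, 1)] / risk[(0, 1)]
    ratio_q0 = risk[(1, 0)] / risk[(0, 0)]
    return ratio_q1 / ratio_q0


add_int = additive_interaction(risk)         # non-zero: additive interaction
mult_int = multiplicative_interaction(risk)  # equal to 1: no multiplicative interaction
print(add_int, mult_int)
```

A value of zero on the additive scale, or one on the multiplicative scale, corresponds to the absence of interaction on that scale.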

Examples

We now present several examples of IDAGs, explaining their interpretation and connection to standard DAGs. First, in Figure 1A, we provide a standard DAG. The outcome Y, say ischaemic stroke, is assumed to be influenced by a treatment A and also by Q (say, warfarin and smoking), and we want to display whether these two variables interact (say, on an additive scale). Whether there is such an interaction between the variables is not visible from the standard DAG. It can, however, be seen in the IDAG in Figure 1B, according to which the effects of A are influenced by Q.

Figure 1

An example of a standard directed acyclic graph (DAG) (panel A) and two possible interaction DAGs (IDAGs) (panels B and C). Variables A (warfarin) and Q (smoking) influence Y (ischaemic stroke). Panel B suggests that Q also influences the effect of A on Y, whereas panel C suggests that this is not the case.

The graph in Figure 1B is not the only possible IDAG to accompany the standard DAG in Figure 1A. One could also conceive of an IDAG without an arrow from Q to $\Delta_Y^A$, i.e. a scenario with no interaction between A and Q. We show this alternative in Figure 1C (in practice, Q could have been omitted from this figure). As can be noticed, a node with an arrow pointing to Y in the standard DAG does not necessarily have an arrow pointing to $\Delta_Y^A$ in the IDAG. On the other hand, there can be no arrow from Q to $\Delta_Y^A$ in the IDAG unless Q points to Y in the standard DAG. This follows because the treatment effect depends on the outcomes, so only if a variable directly influences the outcomes may it also directly influence the effect size.

Another example of a standard DAG and an accompanying IDAG is given by Figure 2. We consider a scenario where a (perhaps naïve) researcher is asking whether there is an interaction between a treatment, such as bariatric surgery, A, and hair colour, Q, on weight loss (on an additive scale). There is an unobserved variable X (genotype) that influences the outcome and that also interacts with treatment, the latter illustrated by an arrow to the causal effect in the IDAG. X also influences hair colour, which does not itself influence the outcome. The relationship between X and Q is indicated in both the standard DAG and the IDAG. Notably, since Q is influenced by X, the effects of A will vary by Q even though there is no interaction between A and Q. The phenomenon has been referred to as ‘effect modification by proxy’ and is an instance of confounded interaction, since a simple analysis of a possible interaction between A and Q will give biased estimates due to the interaction between A and X.
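Effect modification by proxy can be made concrete with a minimal sketch. It assumes a hypothetical structural model in which genotype X influences hair colour Q and the individual effect of A on Y depends on X only; the probabilities, effect sizes and function names are all illustrative assumptions. The expectations are computed exactly by enumeration rather than by simulation.

```python
# Hypothetical structural model for the Figure 2 scenario (illustrative numbers).
P_X1 = 0.5                        # P(X = 1): genotype prevalence
P_Q1_GIVEN_X = {0: 0.2, 1: 0.8}   # P(Q = 1 | X): genotype influences hair colour


def effect_of_a(x, q):
    """Individual causal effect of A on Y; q is deliberately unused (no A-Q interaction)."""
    return 2.0 + 3.0 * x


def joint(x, q):
    """P(X = x, Q = q) under the model above."""
    p_x = P_X1 if x == 1 else 1.0 - P_X1
    p_q = P_Q1_GIVEN_X[x] if q == 1 else 1.0 - P_Q1_GIVEN_X[x]
    return p_x * p_q


def mean_effect(q, x=None):
    """Mean effect of A among those with Q = q, optionally restricted to X = x."""
    xs = (0, 1) if x is None else (x,)
    mass = sum(joint(xv, q) for xv in xs)
    return sum(joint(xv, q) * effect_of_a(xv, q) for xv in xs) / mass


# The effect of A varies by Q when X is ignored ...
crude_modification = mean_effect(1) - mean_effect(0)
# ... but not within levels of X, where the proxy association is removed.
within_x = [mean_effect(1, x) - mean_effect(0, x) for x in (0, 1)]
print(crude_modification, within_x)
```

Under these assumptions the crude contrast is about 1.8, while the contrast within each level of X is zero, which is the pattern the text describes as confounded interaction.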

Figure 2

Confounded interaction or ‘effect modification by proxy’. A standard directed acyclic graph (DAG) is given in panel A and an interaction DAG (IDAG) in panel B. Variables X (genotype) and A (bariatric surgery) influence Y (weight loss), with an interaction present. The effect of A is modified by Q (hair colour), but there is no interaction between A and Q.

Further examples of standard DAGs and IDAGs are given in Figure 3, where X is assumed to influence the outcome. X could represent education and Q smoking; again A is a treatment and Y the disease outcome. We are interested in whether the benefits of treatment (on an additive scale) depend on smoking or education (i.e. interactions between treatment and smoking or education), and whether the potential impact of education on the benefits of treatment is due to the fact that education influences smoking. In Figure 3A, we assume that X has no direct impact on the outcome, whereas such an impact is allowed for in Figure 3B. Figure 3C shows an IDAG compatible with either of the two standard DAGs. Here, it becomes clear that A and Q interact; the arrow from Q to $\Delta_Y^A$ indicates direct interaction. However, there is only indirect interaction with respect to the variable X; once Q is fixed, it makes no difference for the causal effect what value X assumes. Changing educational levels would only influence the benefits of treatment to the extent that smoking is influenced.

Figure 3

Two examples of standard directed acyclic graphs (DAGs) (left) and two interaction DAGs (IDAGs) (right). The variable Y (a disease) is directly influenced by A (treatment), Q (smoking) and potentially also X (education). The DAG in panel A is compatible with the IDAG in panel C, whereas the DAG in panel B is compatible with either of the IDAGs in panels C and D.

An alternative IDAG is displayed in Figure 3D. Here, there is direct interaction with respect to both Q and X. As for the effects of X, we can distinguish between direct and total interaction, where the latter operates both directly and indirectly. Increasing educational levels could influence the benefit of treatment both indirectly, by reducing smoking, and directly, through other mechanisms omitted from the graph (e.g. adherence). Treatment decisions should here take both the individual’s educational level and smoking status into account, whereas in scenario 3C it would be enough to take smoking into consideration. Figure 3D is compatible with the DAG in Figure 3B but not with the one in Figure 3A, as in Figure 3A there is no direct impact of X on the outcome.
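The distinction between direct and indirect interaction can be read mechanically off an IDAG. The sketch below encodes the IDAGs described for Figure 3C and Figure 3D as adjacency lists and reports, for a given variable, whether a direct arrow and whether a longer directed path lead to the effect node. The data structure, the node label `EFFECT` and the function names are illustrative assumptions, not part of the article.

```python
EFFECT = "effect of A on Y"  # label for the causal-effect node of the IDAG

# Adjacency lists: each key points to the nodes it has arrows into.
idag_3c = {"X": ["Q"], "Q": [EFFECT], EFFECT: []}
idag_3d = {"X": ["Q", EFFECT], "Q": [EFFECT], EFFECT: []}


def has_directed_path(graph, start, goal):
    """True if a directed path with at least one arrow leads from start to goal."""
    stack = list(graph.get(start, []))
    seen = set()
    while stack:
        node = stack.pop()
        if node == goal:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, []))
    return False


def interaction_types(graph, variable, effect=EFFECT):
    """Classify how `variable` can interact with treatment according to the IDAG."""
    children = graph.get(variable, [])
    direct = effect in children
    indirect = any(
        child != effect and has_directed_path(graph, child, effect)
        for child in children
    )
    # Total interaction can be present if either kind of path exists.
    return {"direct": direct, "indirect": indirect, "total": direct or indirect}


print(interaction_types(idag_3c, "X"))
print(interaction_types(idag_3d, "X"))
```

For X, the first graph yields only indirect interaction and the second yields both direct and indirect interaction, matching the reading of panels C and D given in the text.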

Estimation

Typical approaches to estimate an interaction between two variables (A and Q) include stratification and estimation of one regression on the full data, including the product term A × Q. When used together with the standard DAG, the IDAG provides guidance on how to carry out estimations. Regarding confounding, a sufficient criterion for unconfoundedness in interaction models is that both interacting variables are unconfounded. For simplicity, our figures have so far ignored the possibility of confounding of the variable A, but in general, variables will need to be conditioned on to make sure that A as well as Q is unconfounded. Conclusions about which variables to condition on can be drawn from the standard DAG. However, the standard DAG is uninformative as to what extent stratification or inclusion of product terms is necessary, as opposed to simply controlling for main effects.

To illustrate this point, consider the standard DAG in Figure 3B. In order to estimate the joint effect of A and Q, it is generally necessary to account for X, for example by controlling for it in a regression model, at least including a main term. However, whether it is also necessary to stratify on X or include a product term between A and X depends on whether X influences the causal effect of A on Y (conditional on Q). In the IDAG in Figure 3D, causal effects depend on X, giving rise to a backdoor path between Q and $\Delta_Y^A$ through X. An analysis examining the interaction between A and Q also needs to account for the interaction between A and X; failure to do so would result in confounded interaction. In the IDAG in Figure 3C, however, causal effects do not depend on X conditional on Q, so it would be enough to control for X with a main term. This is reflected by the absence of a backdoor path between Q and $\Delta_Y^A$. The reasoning is similar to standard DAG logic; we refer to the Supplementary Appendix, available as Supplementary data at IJE online, for more details and elaborations.
Conclusions about what to condition on to estimate total or direct effects follow from both the standard DAG and the IDAG. In Figure 3, for example, one must not account for Q (i.e. must omit Q and the interaction between A and Q) to estimate the total effect of X and the total interaction between A and X. In contrast, if interest lies in the direct effect of X and the direct interaction between A and X, one must include Q as well as a product term between A and Q in addition to that between A and X, or, alternatively, stratify not only on X but also on Q.
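As a sketch of how these conclusions could translate into regression specifications, assume a continuous outcome, a linear model and no confounding of A. The linear form and the coefficient names are illustrative assumptions and are not taken from the article.

```latex
\begin{align}
% Figure 3B with the IDAG in Figure 3C: X enters with a main term only
E[Y \mid A, Q, X] &= \beta_0 + \beta_1 A + \beta_2 Q + \beta_3 X + \beta_4 AQ \\
% Figure 3B with the IDAG in Figure 3D: a product term between A and X is added
E[Y \mid A, Q, X] &= \beta_0 + \beta_1 A + \beta_2 Q + \beta_3 X + \beta_4 AQ + \beta_5 AX \\
% Total interaction between A and X: Q and the product term AQ are omitted
E[Y \mid A, X] &= \gamma_0 + \gamma_1 A + \gamma_2 X + \gamma_3 AX
\end{align}
```

Under these assumptions, $\beta_4$ captures the interaction between A and Q, $\beta_5$ the direct interaction between A and X, and $\gamma_3$ the total interaction between A and X. Omitting the term $\beta_5 AX$ when the IDAG in Figure 3D holds would leave the estimate of $\beta_4$ open to confounded interaction.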

IDAGs and generalizability

We now consider the situation where an investigator is not interested in examining interaction per se, but instead in determining an overall effect, such as an average causal effect. If interactions are nevertheless present, sample selection will often cause problems of generalizability, as the average causal effect in the selected sample may differ from that in the target population. In general, this problem will arise if selection depends on variables that influence the causal effect under study.

Standard DAGs can be used to show how sample selection potentially undermines the generalizability of estimates. For instance, Hernán, and also Westreich et al., considered a scenario where censoring depended on an unobserved variable that influenced the outcome, and provided DAGs with a selection node for illustration. These standard DAGs are informative about biases that could arise due to non-random sampling, regardless of the chosen effect measure. However, they are not informative about whether, for a chosen effect measure, there actually are interactions with respect to the variables that selection depends on, and thus whether generalizability is in fact compromised.

In Figure 4, we reproduce the DAGs from Hernán and from Westreich et al. and display two alternative IDAGs. The treatment of interest is given by A. The first IDAG, shown in Figure 4B, makes it clear that selection on S would compromise generalizability, a conclusion that follows since S and $\Delta_Y^A$ are not d-separated. Selected individuals would tend to have different values on X compared with non-selected individuals, and thus have different causal effects $\Delta_Y^A$. In contrast, this selection issue is not present in Figure 4C. Although S and Y are not d-separated in the DAG, S and $\Delta_Y^A$ are d-separated in the IDAG, as $\Delta_Y^A$ is not influenced by X. The estimate from the study sample would here be valid for the target population.

Figure 4

Sample selection potentially compromising generalizability. Individuals are selected based on S. X may represent socioeconomic status, A some treatment, and Y a disease. The standard directed acyclic graph (DAG) in panel A is compatible with either the interaction DAG (IDAG) in panel B or the one in panel C; generalizability is compromised only in the scenario depicted in panel B.

To restore generalizability, weighting methods are typically applied, where weights are based on the set of variables which (together with A) block all paths between S and Y. For a given effect measure, this set of variables may however be larger than necessary, as not all of these variables may be related to the effect size. This point has been highlighted by a few recent studies, but these did not provide a graphical framework for understanding the phenomenon. With our presentation, Figure 4B makes it clear that weighting needs to be done with respect to X, whereas in the scenario displayed in Figure 4C, no weighting is necessary. As noted, any path involving A is considered ‘automatically’ blocked in the generalizability framework. For instance, a path between S and Y that runs through A would not compromise validity. Conveniently, in the IDAG A is not included and this issue becomes irrelevant.
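The contrast between the two IDAGs in Figure 4 can be sketched numerically. The example assumes a binary X that is measured, hypothetical selection probabilities P(S = 1 | X), and two hypothetical effect functions: one in which the effect of A depends on X (as in panel B) and one in which it does not (as in panel C). All numbers and names are illustrative assumptions, and inverse-probability-of-selection weighting is used here as one common weighting method, not necessarily the one the authors have in mind.

```python
P_X1 = 0.5                        # P(X = 1) in the target population
P_S1_GIVEN_X = {0: 0.2, 1: 0.8}   # P(S = 1 | X): selection depends on X


def effect_4b(x):
    """Effect of A on Y when X points to the effect node (panel B scenario)."""
    return 1.0 + 2.0 * x


def effect_4c(x):
    """Effect of A on Y when X does not influence the effect (panel C scenario)."""
    return 1.5


def p_x(x):
    return P_X1 if x == 1 else 1.0 - P_X1


def target_ace(effect):
    """Average causal effect in the target population."""
    return sum(p_x(x) * effect(x) for x in (0, 1))


def sample_ace(effect, weighted=False):
    """Average causal effect among the selected, optionally reweighted by 1 / P(S = 1 | X)."""
    num = den = 0.0
    for x in (0, 1):
        mass = p_x(x) * P_S1_GIVEN_X[x]   # P(X = x, S = 1)
        if weighted:
            mass /= P_S1_GIVEN_X[x]       # inverse probability of selection weight
        num += mass * effect(x)
        den += mass
    return num / den


print(target_ace(effect_4b), sample_ace(effect_4b), sample_ace(effect_4b, weighted=True))
print(target_ace(effect_4c), sample_ace(effect_4c))
```

In the panel B scenario the unweighted sample estimate differs from the target value and weighting on X restores it; in the panel C scenario the unweighted estimate already equals the target value, so no weighting is needed.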

Discussion

Standard DAGs are highly informative but lack the ability to depict whether interactions are present on the scale of interest. As a result, their usefulness is limited in terms of understanding the reasons why causal effects vary across individuals, and which interactions to account for. This article introduced a new version of DAGs, the IDAG, to be used for these purposes. Conclusions from the IDAG can be used to achieve internal validity in the sense of unconfounded interaction estimates, and also external validity in the sense of generalizability of estimates of overall effects.

Our framework is distinct from previous attempts to incorporate interactions into DAGs. For example, Weinberg proposed illustrating interactions by letting arrows point to other arrows or merging arrows in the standard DAG. Although intuitive, this approach is not theoretically consistent with DAG theory. Another previous approach only applies to synergistic interaction (‘mechanistic’ interaction based on sufficient causes) and yet another one relies on a mediator between treatment and outcome.

Whereas standard DAGs are nonparametric, we note that the IDAG is parametric in the sense that the absence of an interaction corresponds to a choice of functional form. This makes the IDAG somewhat less general than the standard DAG. However, a functional form is inevitably imposed when conducting (parametric) estimation, and we believe it is rather an advantage that the IDAG narrows the gap between theory and estimation. As for any DAG, assumptions on how the variables in the IDAG are related must be made based on previous evidence. Several simplifying assumptions were made in this article, in particular that there were no interactions not involving the treatment variable A, and that interactions were constant across individuals. It will be an interesting avenue for future work to elaborate on more general scenarios, where these assumptions are not fulfilled.

Conclusion

DAGs are useful tools in epidemiology, but one feature of causal relationships which has not been incorporated into the standard framework is interaction. However, interactions can be viewed as ‘effects on effects’ and are therefore conveniently depicted by the IDAG. We expect that our framework will be useful to guide conversations about interaction analyses and to understand whether estimated interactions have a causal interpretation. Describing and guiding analyses in scenarios where sample selection causes lack of generalizability is another benefit.

Supplementary data

Supplementary data are available at IJE online.

Funding

This work was supported by Forskningsrådet för hälsa, arbetsliv och välfärd (FORTE) [grant number 2017–00414 to U.S.] and Vetenskapsrådet (VR) [grant number 2019–00198 to J.B.].

Author contributions

A.N. conceived the initial concept and wrote the manuscript. J.B. conceived the idea of using the framework to illustrate generalizability. C.B. contributed with theoretical insights. C.B., U.S. and J.B. contributed to the phrasing of the manuscript.

Conflict of interest

None declared.

References

1. Can DAGs clarify effect modification?

Authors: Clarice R Weinberg
Journal: Epidemiology Date: 2007-09 Impact factor: 4.822

2. Causal diagrams for epidemiologic research.

Authors: S Greenland; J Pearl; J M Robins
Journal: Epidemiology Date: 1999-01 Impact factor: 4.822

3. Effect measure modification conceptualized using selection diagrams as mediation by mechanisms of varying population-level relevance.

Authors: Priscilla M Lopez; S V Subramanian; C Mary Schooling
Journal: J Clin Epidemiol Date: 2019-05-20 Impact factor: 6.437

4. Invited Commentary: Selection Bias Without Colliders.

Authors: Miguel A Hernán
Journal: Am J Epidemiol Date: 2017-06-01 Impact factor: 4.897

5. Target Validity and the Hierarchy of Study Designs.

Authors: Daniel Westreich; Jessie K Edwards; Catherine R Lesko; Stephen R Cole; Elizabeth A Stuart
Journal: Am J Epidemiol Date: 2019-02-01 Impact factor: 4.897

6. The use of propensity scores to assess the generalizability of results from randomized trials.

Authors: Elizabeth A Stuart; Stephen R Cole; Catherine P Bradshaw; Philip J Leaf
Journal: J R Stat Soc Ser A Stat Soc Date: 2001-04-01 Impact factor: 2.483

7. Invariants and noninvariants in the concept of interdependent effects.

Authors: S Greenland; C Poole
Journal: Scand J Work Environ Health Date: 1988-04 Impact factor: 5.024

8. Generalizing evidence from randomized clinical trials to target populations: The ACTG 320 trial.

Authors: Stephen R Cole; Elizabeth A Stuart
Journal: Am J Epidemiol Date: 2010-06-14 Impact factor: 4.897

9. Evidence synthesis for constructing directed acyclic graphs (ESC-DAGs): a novel and systematic method for building directed acyclic graphs.

Authors: Karl D Ferguson; Mark McCann; Srinivasa Vittal Katikireddi; Hilary Thomson; Michael J Green; Daniel J Smith; James D Lewsey
Journal: Int J Epidemiol Date: 2020-02-01 Impact factor: 9.685

10. A new approach for investigation of person-environment interaction effects in research involving health outcomes.

Authors: Björn Slaug; Susanne Iwarsson; Jonas Björk
Journal: Eur J Ageing Date: 2018-06-12
