
Estimating peer effects in longitudinal dyadic data using instrumental variables.

A James O'Malley¹, Felix Elwert², J Niels Rosenquist³, Alan M Zaslavsky⁴, Nicholas A Christakis⁵.

Abstract

The identification of causal peer effects (also known as social contagion or induction) from observational data in social networks is challenged by two distinct sources of bias: latent homophily and unobserved confounding. In this paper, we investigate how causal peer effects of traits and behaviors can be identified using genes (or other structurally isomorphic variables) as instrumental variables (IV) in a large set of data generating models with homophily and confounding. We use directed acyclic graphs to represent these models and employ multiple IV strategies and report three main identification results. First, using a single fixed gene (or allele) as an IV will generally fail to identify peer effects if the gene affects past values of the treatment. Second, multiple fixed genes/alleles, or, more promisingly, time-varying gene expression, can identify peer effects if we instrument exclusion violations as well as the focal treatment. Third, we show that IV identification of peer effects remains possible even under multiple complications often regarded as lethal for IV identification of intra-individual effects, such as pleiotropy on observables and unobservables, homophily on past phenotype, past and ongoing homophily on genotype, inter-phenotype peer effects, population stratification, gene expression that is endogenous to past phenotype and past gene expression, and others. We apply our identification results to estimating peer effects of body mass index (BMI) among friends and spouses in the Framingham Heart Study. Results suggest a positive causal peer effect of BMI between friends.


Keywords: Body‐mass index; Causality; Directed acyclic graphs; Dyad; Genes; Homophily; Instrumental variable; Longitudinal; Mendelian randomization; Peer effect; Social network; Two‐stage least squares

Year: 2014 PMID: 24779654 PMCID: PMC4213357 DOI: 10.1111/biom.12172

Source DB: PubMed Journal: Biometrics ISSN: 0006-341X Impact factor: 2.571

1. Introduction

We develop instrumental variable (IV) methods for the estimation of causal peer effects using longitudinal dyadic data from a social network. A peer effect (social contagion, induction) occurs when a behavior, trait, or characteristic of an individual's peers (those to whom she is connected, or alters) affects her own (the ego's) health behavior. While evidence exists of associations of observed traits (phenotypes and behaviors) among groups of individuals (such as obesity (Christakis and Fowler, 2007), smoking (Christakis and Fowler, 2008), and alcohol use (Rosenquist et al., 2010)), experiments to prove that such associations are causal are often difficult or impossible due to practical or ethical limitations on randomization, albeit with a few exceptions (Wing and Jeffery, 1999; Centola, 2010; Fowler and Christakis, 2010). Observational analyses may suffer from selection bias due to non-random assignment of treatment. The challenges are magnified in network contexts as confounding takes several structurally different forms. In addition to the spread of health traits because of peer influence, clusters of similar individuals may form due to both homophily (“birds of a feather flock together”) and unmeasured common causes affecting socially connected individuals (confounding). Because each of these phenomena may lead to correlations between the phenotypes of connected individuals (Christakis and Fowler, 2007; Shalizi and Thomas, 2011), methods to parse these associations apart are required. One approach to causal inference with observational data emulates randomized trials by using an instrumental variable (IV), a variable that influences exposure but, conditional on the exposure, has no influence on the outcome (Angrist, Imbens, and Rubin, 1996). However, the literature on the use of IVs to estimate peer effects is limited. 
Randomized dorm-room assignments have been used to estimate peer effects among college students (Sacerdote, 2001) and military recruits (Carrell, Fullerton, and West, 2009). In other settings, covariates averaged over neighboring observations (contextual variables) have been used as IVs for peer effects (Fletcher, 2008). Directed acyclic graphs (DAGs) can clarify the identification problems of IV analysis for peer effects by focusing attention on the causal relationships among variables to better align the identification strategy with scientific judgments (Pearl, 2009). We use DAGs to (1) identify subtle dependencies that complicate estimation of peer effects, (2) succinctly notate causal data generating models, and (3) prove theorems about identifiability conditions for causal peer effects. We illustrate our methods using networks with a simple structure consisting of disjoint pairs of individuals (dyads), with no influence (interference) between dyads. Our motivating application concerns peer effects in the Framingham Heart Study (FHS) (Christakis and Fowler, 2007), specifically the utility of using recently sequenced genetic data to develop IVs for peer effects on body mass index among friends and spouses. The appeal of genes as IVs is that they are inherently randomized by a naturally occurring process, are assigned at conception, and are not directly visible and hence, unlikely to directly influence other individuals. Several recent methodological papers discuss Mendelian randomization as IVs (Didelez, Meng, and Sheehan, 2010; Vansteelandt et al., 2011; Palmer et al., 2012) but none consider peer effects. Our paper explores promises as well as pitfalls facing the use of Mendelian randomization as IVs in the study of peer effects. 
In Sections 2–4, we introduce DAGs to develop several increasingly general causal models for peer effects involving IVs to account for latent homophily and unmeasured confounding. Our models accommodate several other features often considered obstacles to identifying peer effects, including pleiotropy (genes affecting multiple individual characteristics), population stratification, and gene-based homophily. Section 5 outlines the potential outcomes representation of our preferred causal model. Section 6 describes estimation of these models of peer effects using longitudinal dyadic network data. Section 7 describes the FHS network of friend and spouse ties and evaluates the linked genetic alleles as potential IVs for peer effects. The Conclusion section closes with a discussion.

2. Directed Acyclic Graphs (DAGs)

We use DAGs to encode the structural (i.e., causal) assumptions of our causal models and prove their identifiability. DAGs represent variables as nodes and the direct causal effects between them as edges. Missing edges denote sharp null hypotheses of no direct causal effect. All DAGs considered in this paper are so-called causal DAGs (Pearl, 2009), which are assumed to contain all observed and unobserved common causes in the process. Paths are non-intersecting sequences of adjacent edges, regardless of the direction of the arrows. Causal paths between a treatment and an outcome contain only edges that point away from treatment and toward the outcome. All other paths are noncausal, or spurious, paths. Variable M is a collider on a path if the path contains the formation X → M ← Y (i.e., both edges point to M). All variables directly or indirectly caused by a given variable are called its descendants. Brackets around a variable indicate that the variable has been conditioned on; for example, [M].

The d-separation rule (Pearl, 1988) translates between the causal assumptions encoded in the DAG and the associations observable in data. A path is said to be d-separated or blocked if (1) it contains a non-collider variable that has been conditioned on, such as M in X → [M] → Y (where M is a mediator) or X ← [M] → Y (where M is a common cause or confounder), or if (2) it contains a collider variable, X → M ← Y, and neither the collider nor any of its descendants has been conditioned on. Paths that are not d-separated are said to be d-connected, unblocked, or open. In causal DAGs, variables that are d-separated along all paths are statistically independent, and variables that are d-connected along at least one path may be associated (Verma and Pearl, 1988). The crucial point is that conditioning on a non-collider blocks the flow of association along a path, whereas conditioning on a collider or one of its descendants may induce an association.
Under conventional axioms (Pearl, 2009; Richardson and Robins, 2013), causal DAGs and potential outcomes are equivalent notational systems for predicting statistical associations and identifying the causal effects of an intervention. Since IV is principally an identification strategy for linear models, we henceforth assume that the DAG represents a linear model, making no assumptions about the distribution of the variables (e.g., joint normality).

3. Graphical IV Criteria

We apply versions of the graphical criteria for detecting IVs for the total causal effect of treatment (variable) T on outcome (variable) Y in linear models developed by Brito and Pearl (2002).

Single-IV Criterion: Let $D$ denote the DAG that represents the assumed causal model, and let $D_{\bar{T}}$ be $D$ after removing all edges emanating from T ($D_{\bar{T}}$ represents the null hypothesis of no treatment effect). Then G is an IV for the total causal effect of T on Y conditional on a set of variables Z (the so-called conditioning set, which may be empty) if: (1) Z contains no descendant of T in $D$; (2) there is an unblocked path between G and T in $D_{\bar{T}}$ after conditioning on Z; and (3) there is no unblocked path between G and Y in $D_{\bar{T}}$ after conditioning on Z.

The first and third conditions give the exclusion restriction: except for the causal effect of T on Y, the IV G must be independent of Y given Z. (However, these conditions do not imply that G is independent of Y conditional on (T,Z); in the presence of an unmeasured cause of T and Y, conditioning on T opens a path from G to Y (Hernán and Robins, 2006).) The IV criterion generalizes to multiple treatments and multiple IVs (IV-sets).

IV-Set Criterion: For multivariate T=(T1,…,TL), let $D_{\bar{T}}$ be $D$ after removing all edges emanating from T. Then a multivariate G=(G1,…,GK) is an IV-set for the joint causal effect of T on Y conditional on a set of variables Z if: (1) Z contains no descendant of T in $D$; (2) for every l ∈ {1,…,L} there exists, for some k, an unblocked path, called path_l, between Gk ∈ G and Tl ∈ T in $D_{\bar{T}}$ after conditioning on Z, such that {path_1,…,path_L} have no nodes in common; and (3) for k ∈ {1,…,K} there are no unblocked paths between Gk ∈ G and Y in $D_{\bar{T}}$ after conditioning on Z.

It follows from condition (2) that K ≥ L for an IV-set G. Importantly, an IV-set G may exist for T even if no variable Gk ∈ G individually is a valid IV for any single variable Tl ∈ T (Brito, 2010). Note that IV sets identify not only the joint effect of T on Y but also the direct effect of each Tl on Y not mediated by the other elements of T, which may coincide with the total causal effect of Tl on Y.
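
The d-separation rule of Section 2 and the single-IV criterion can be checked mechanically on a small graph. The Python sketch below is illustrative only and is not taken from the paper: the function names (`d_separated`, `is_instrument`), the node labels, and the edge list `FIG2_Q1` (an assumed, simplified encoding of a dyad model with one follow-up period and no observed covariates) are all hypothetical. It tests d-separation with the standard moralized ancestral graph construction, and it evaluates conditions (2) and (3) in the graph with all edges emanating from the treatment removed.

```python
from itertools import combinations

def _closure(start, step):
    """Nodes reachable from `start` by repeatedly applying `step` (start included)."""
    seen, stack = set(start), list(start)
    while stack:
        for nxt in step(stack.pop()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def d_separated(edges, x, y, given=()):
    """True if x and y are d-separated by `given` in the DAG with directed `edges`."""
    given = set(given)
    parents = {}
    for a, b in edges:
        parents.setdefault(b, set()).add(a)
    # 1. Restrict to x, y, the conditioning set, and all of their ancestors.
    keep = _closure({x, y} | given, lambda v: parents.get(v, ()))
    # 2. Moralize: join each node to its parents and join co-parents to each other.
    nbrs = {v: set() for v in keep}
    for child in keep:
        ps = parents.get(child, set())
        for p in ps:
            nbrs[p].add(child)
            nbrs[child].add(p)
        for p1, p2 in combinations(ps, 2):
            nbrs[p1].add(p2)
            nbrs[p2].add(p1)
    # 3. Delete the conditioned nodes; d-separation holds iff x cannot reach y.
    return y not in _closure({x}, lambda v: nbrs[v] - given)

def is_instrument(edges, g, t, y, z=()):
    """Check the three single-IV conditions for instrument g, treatment t, outcome y."""
    children = {}
    for a, b in edges:
        children.setdefault(a, set()).add(b)
    descendants = _closure({t}, lambda v: children.get(v, ())) - {t}
    mutilated = [(a, b) for a, b in edges if a != t]  # drop edges emanating from t
    return (not (set(z) & descendants)                # (1) z has no descendant of t
            and not d_separated(mutilated, g, t, z)   # (2) g and t are d-connected
            and d_separated(mutilated, g, y, z))      # (3) g and y are d-separated

# Assumed encoding of a one-period dyad model (hypothetical node names).
FIG2_Q1 = [
    ("G2", "Y2_0"), ("G2", "Y2_1"), ("G1", "Y1_0"), ("G1", "Y1_1"),
    ("U2", "Y2_0"), ("U2", "Y2_1"), ("U1", "Y1_0"), ("U1", "Y1_1"),
    ("U1", "A12"), ("U2", "A12"),          # latent homophily
    ("C12", "Y2_0"), ("C12", "Y1_1"),      # shared environment
    ("Y2_0", "Y2_1"), ("Y1_0", "Y1_1"),    # own-phenotype persistence
    ("Y2_0", "Y1_1"), ("Y1_0", "Y2_1"),    # peer effects
]

if __name__ == "__main__":
    print(is_instrument(FIG2_Q1, "G2", "Y2_0", "Y1_1", z={"A12"}))
    # Homophily on the treatment makes the tie a descendant of the treatment.
    print(is_instrument(FIG2_Q1 + [("Y2_0", "A12")], "G2", "Y2_0", "Y1_1", z={"A12"}))
```

Adding an edge to the list and re-running `is_instrument` is a quick way to explore which relaxations of the assumed graph preserve or break the criterion.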

4. Causal Models for Peer Effects in Dyads

We first present the common core of our causal models for peer effects (on BMI for illustration) to explicate the two central identification challenges: common cause confounding and homophily bias. We then discuss a series of more realistic models for peer effects and evaluate conditions under which each model can be identified via IV analysis.

4.1. The Two Identification Problems: Confounding and Homophily Bias

Figure 1 gives the core of our causal models for a longitudinally observed population of independent dyads including individuals 1 and 2. Let Yk(t) denote BMI, the phenotype of interest, for individual k=1,2 at time t, and let q denote the number of periods before the present that the tie was formed (Figure 1 depicts the case when q=2). Current BMI may affect the same individual's subsequent BMI: Yk(t−1) → Yk(t), k=1,2, t=1,…,q. Additionally, each individual's present BMI may affect the other's subsequent BMI (peer effect): Y2(t−1) → Y1(t) and Y1(t−1) → Y2(t), t=1,…,q. We assume there were no effects of 1 and 2 on each other prior to tie-formation.

Figure 1

Directed acyclic graph (DAG) representing the common core of causal models for peer effects with observational data. The target of interest is the total causal effect of individual 2's (the alter's) phenotype on individual 1's (the ego's) subsequent phenotype, Y2(1) → Y1(2). Latent homophily bias arises from implicit conditioning on the social tie A12, which opens the noncausal path Y2(t) ← U2 → [A12] ← U1 → Y1(2), t=0,1,2. Confounding bias arises from unobserved common causes, C12, satisfying Y2(1) ← C12 → Y1(2). Although presented for the case when q=2, other cases are represented by dropping (when q=1) or adding (when q>2) Yk(t) nodes, together with edges analogous to those already shown, k=1,2. Variables U and C are unobserved; all others are observed.

BMI is affected by two more types of variables, each assumed to be at least partially unobserved. The first is a vector of individual-specific unobserved variables Uk, with Uk → Yk(t) (k=1,2, t=0,…,q), such as metabolic functioning, food preferences, etc. Second, each individual's BMI is potentially affected by shared environmental exposures, C12, such as local food sources, restaurant commercials, food fads, etc. Thus, Y2(t) ← C12 → Y1(t) for some or all of t=0,…,q; Figure 1 depicts a case where C12 corresponds to an event at t=1. Finally, A12 represents the existence of a social tie between individuals 1 and 2.

Taking the perspective of individual 1, the goal is to identify the total causal effect of Y2(t−1) on Y1(t); that is, the effect of 2's BMI at time t−1 on 1's subsequent BMI at time t=1,…,q. Without loss of generality, we focus on the peer effect from t=q−1 to t=q.

In the causal model of Figure 1, presented with q=2, treatment Y2(1) and outcome Y1(2) share three sources of association, one causal and two spurious. First, treatment may affect the outcome along the causal path Y2(1) → Y1(2), the causal effect we aim to identify. Second, they may be associated due to unobserved shared environmental confounding by C12 along the unblocked noncausal paths Y2(1) ← C12 → Y1(2) and Y2(1) ← C12 → Y1(1) → Y1(2). Third, and centrally for this investigation, treatment and outcome may be associated due to the preferential (nonrandom) formation of social ties. The status of A12 may be affected by (U1, U2), because, for example, people bond preferentially with others holding similar tastes in food (homophily: “birds of a feather flock together”) or with opposite tastes (heterophily: “opposites attract”). This preferential formation turns A12 into a collider variable. Investigating peer effects among individuals linked by a social tie necessarily implies conditioning on the social tie. Since A12 is a collider, conditioning on it opens the noncausal path Y2(1) ← U2 → [A12] ← U1 → Y1(2), and hence induces a noncausal association between treatment and outcome. Bias due to falsely considering this association as causal is generically known as homophily bias (Shalizi and Thomas, 2011) and constitutes a type of selection bias (Elwert and Christakis, 2008; Elwert, 2013). This spurious association cannot be eliminated by conditioning on any set of observed variables if the sources of tie formation are at least partially unobserved, and it will exist even if the causal effect of Y2(1) on Y1(2) is zero. In fact, using Pearl (1995), it can be shown that common cause confounding in C12 and homophily in A12 prevent non-parametric identification of the causal effect of Y2(1) on Y1(2) under the causal model of Figure 1.
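
Homophily bias can be made concrete with a small simulation. The sketch below is not from the paper; it assumes a deliberately stripped-down data generating model (hypothetical function names, standard normal latent traits, a tie that forms only when the two latent traits differ by less than an arbitrary threshold) in which there is no peer effect and no shared confounder. The phenotypes are uncorrelated across all pairs but become correlated once the sample is restricted to tied pairs, that is, once the collider A12 is conditioned on.

```python
import random

def correlation(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def simulate_homophily(n=100_000, threshold=0.5, seed=0):
    """Return (correlation in all pairs, correlation in tied pairs); no peer effect exists."""
    rng = random.Random(seed)
    all_pairs, tied_pairs = [], []
    for _ in range(n):
        u1, u2 = rng.gauss(0, 1), rng.gauss(0, 1)  # latent traits U1 and U2
        y1 = u1 + rng.gauss(0, 1)                  # ego phenotype, U1 -> Y1
        y2 = u2 + rng.gauss(0, 1)                  # alter phenotype, U2 -> Y2
        all_pairs.append((y2, y1))
        if abs(u1 - u2) < threshold:               # tie A12 forms between similar people
            tied_pairs.append((y2, y1))
    return correlation(*zip(*all_pairs)), correlation(*zip(*tied_pairs))

if __name__ == "__main__":
    r_all, r_tied = simulate_homophily()
    print(f"all pairs: {r_all:.3f}, tied pairs only: {r_tied:.3f}")
```

Because the tie depends only on the unobserved traits, no adjustment for observed variables removes the association among tied pairs in this setup.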

4.2. IV Identification for Various Causal Models of Peer Effects

We now investigate the identification of peer effects despite confounding and homophily bias in several more realistic causal models. Figures 2 and 3 elaborate on the model in Figure 1 in two ways: first, by explicitly adding the observed exogenous covariates Xk (such as gender, age, education, and the geographic distance between ego's and alter's residences) and, second, by adding Gk (such as genes or other isomorphic variables) affecting BMI but not tie-formation for k=1,2. We do not index (Xk, Uk) by t but note that these variables may contain time-varying elements.

Figure 2

DAG involving time-invariant IV G2 for causal estimation of Y2(t−1) → Y1(t), t=1,…,q. The variables Xk and Uk (k=1,2) are observed and unobserved individual predictors of Yk(t), respectively, that may also affect tie-formation. While Xk can be conditioned on, Uk cannot, necessitating the use of IV methods. When q=1 (one follow-up period), G2 instruments Y2(0); when q=2 (the case presented here), G2 instruments both Y2(1) and Y2(0); and so on until G2 instruments Y2(q−1),…,Y2(0). IV identification is reliant on Y2(0),…,Y2(q−1) being observed so that they can be instrumented (if dim(G2) ≥ q) and on Y2(0),…,Y2(q−1) not being causes of A12 (i.e., they cannot contribute to homophily).

Figure 3

DAG involving time-varying instrumental variable (IV), GX2(t−1), assumed to be a cause of Y2(t−1) through the interaction of G2 with a time-varying variable (e.g., age) in X2, t=1,…,q (presented when q=2). The variables Xk and Uk (k=1,2) are observed and unobserved individual predictors of Yk(t), respectively, that may also affect tie-formation. While Xk can be conditioned on, Uk cannot, necessitating the use of IV methods. By conditioning on G2 and X2, the noncausal pathways from GX2(t−1) to Y1(t) (e.g., GX2(t−1) ← G2 → [A12] ← U1 → Y1(t)), t=1,2, are blocked, making GX2(t−1) a valid IV. If GX2(0) → GX2(1) is added to the DAG, it is necessary to condition on GX2(0).

Figures 2 and 3 differ in only one, albeit crucial, respect. The model in Figure 2 provides for a scenario where the time-invariant (assigned at conception) gene Gk alone is the instrument, whereas Figure 3 supposes that gene expression, GXk(t), varies over time due to an interaction of Gk with a time-varying covariate in Xk. We shall refer to these as gene-alone and gene-interaction identification, respectively.

4.2.1. Gene-alone identification

We now evaluate whether G2 can serve as an IV for Y2(q−1) → Y1(q) under various conditioning strategies, where Z denotes the variables conditioned on. Figure 2 includes several different cases based on q. We first suppose the number of periods since tie-formation is q=1 and then q=2, and finally draw conclusions for general q. The case when q=1 can be thought of as estimating a single peer effect over the entire follow-up period since tie-formation at t=0, while the other cases allow the peer effect to be split into per-period increments, which is useful if there are time-varying predictors. In this section, we again focus on the peer effect from t=q−1 to t=q.

Theorem 1

Assume that q=1 in the causal model represented by Figure 2. Then G2 is an IV for the total causal effect Y2(0) → Y1(1) conditional on Z=A12.

Proof

Condition (1) of the single-IV criterion is met because A12 is not a descendant of Y2(0). Condition (2) is met because the path G2 → Y2(0) is a direct effect and hence is unblocked. Condition (3) is met because all paths from G2 to Y1(1) in $D_{\bar{T}}$, the DAG with all edges emanating from the treatment Y2(0) removed, pass through the colliders Y2(0) and Y2(1); since neither Y2(0) nor Y2(1) is conditioned on, and A12 is not a descendant of either, all paths from G2 to Y1(1) in $D_{\bar{T}}$ are blocked. □

The model in Figure 2 permits conditioning on certain additional variables.

Corollary 1

In Figure 2 with q=1, any subset of Z={X2,G1,X1,Y1(0)} can be conditioned on in addition to A12 without affecting the IV identifiability of Y2(0) → Y1(1).

Proof

The single-IV criterion is met because (1) no variable in Z descends from Y2(0); (2) is trivially met; (3) all paths from G2 to Y1(1) in $D_{\bar{T}}$, the DAG with all edges emanating from Y2(0) removed, pass through the colliders Y2(0) and Y2(1), which block these paths and are not opened by conditioning on Z since no variable in Z descends from Y2(0) or Y2(1). □

Corollary 1 is useful because all variables in Z are associated with the outcome Y1(1), conditional on A12 and the other variables in Z, such that conditioning on them will reduce variance in Y1(1) and lead to more precise estimates.

Gene-alone identification fails when q ≥ 2 and G2 is univariate in Figure 2 because no amount of conditioning can remedy several exclusion violations. For example, the open path G2 → Y2(q−2) → Y1(q−1) → Y1(q) can only be blocked by conditioning on Y2(q−2) or Y1(q−1); but doing so would necessarily induce another exclusion violation by opening the path G2 → [Y2(q−2)] ← U2 → [A12] ← U1 → Y1(q), as Y2(q−2) is a collider on this path and Y1(q−1) descends from this collider. However, the total causal effect Y2(q−1) → Y1(q) can be identified via the IV-set criterion if G2 is multivariate (e.g., representing multiple genes, or multiple alleles of the same gene, that each affect Y2(t) over t=0,…,q−1; dim(G2) ≥ q).

Theorem 2

In the causal model represented by Figure 2 with q=2, if dim(G2) ≥ 2, then G2 is an IV set for the total causal effect of Y2(1) on Y1(2) after conditioning on A12.

Proof

The IV-set criterion for the joint causal effect of Y2(1) and Y2(0) on Y1(2) is met because (1) A12 does not descend from Y2(1) or Y2(0); (2) the paths from distinct elements of G2 to Y2(1) and to Y2(0) are open and share no nodes (since G2 is multivariate); (3) all paths from G2 to Y1(2) must pass through Y2(0), Y2(1), or Y2(2), which are colliders in $D_{\bar{T}}$, the DAG with all edges emanating from Y2(1) and Y2(0) removed; since neither Y2(0), Y2(1), Y2(2), nor any of their descendants are conditioned on, all paths from G2 to Y1(2) are blocked. Finally, since the total causal effect of Y2(1) on Y1(2) is not mediated by Y2(0), IV-set identification of the joint causal effect of Y2(1) and Y2(0) on Y1(2) implies identification of the total causal effect of Y2(1) on Y1(2). □

Corollary 2

Theorem 2 generalizes to arbitrary q ≥ 2, dim(G2) ≥ q, where G2 instruments Y2(0),…,Y2(q−1) with any subset of Z={X2, G1, X1, Y1(0)} together with A12 as the conditioning set.

Proof

Directly extend the proof of Theorem 2 and Corollary 1. □

The solution to the identification problem in Figure 2 when q ≥ 2, G2 → Y2(t), t=0,…,q, and dim(G2) ≥ q involves an unusual use of IV. Whereas typically IVs are used to identify treatment effects, here G2 both identifies the treatment effect and remedies the exclusion violation that would occur if the paths Y2(t−1) → Y1(t) were not accounted for by instrumenting Y2(t−1) for t=1,…,q−1. Corollary 2 illustrates that G2 faces an increasing challenge with the duration of the social tie, as all values of the alter phenotype over 0,…,q−1 must be instrumented. Because G2 has limited dimension, this will eventually be impossible. The central limitation of gene-alone identification, however, is that it breaks down under homophily on phenotype.

Theorem 3

If Y2(t) → A12 for any t ∈ {0,…,q−1} is added to Figure 2, then G2 of any dimension is not a valid IV to identify the total causal effect of Y2(q−1) on Y1(q), conditional on A12.

Proof

Because A12 is a descendant of Y2(t), conditioning on A12 amounts to (partially) conditioning on the collider Y2(t), which opens the unblockable noncausal path G2 → [Y2(t)] ← U2 → [A12] ← U1 → Y1(q), among others, t=0,…,q−1, representing an exclusion violation. □

Therefore, we next look beyond using genes alone as IVs.
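
The q=1 result of Theorem 1 can be illustrated numerically. The sketch below is a simplified stand-in for the one-period model rather than the authors' analysis: the function names, effect sizes, allele frequency, and tie-formation rule are all assumptions chosen for illustration, and covariates X, the ego's gene G1, and the ego's baseline phenotype are omitted. Dyads enter the sample only if a tie forms (conditioning on A12), the alter's phenotype is confounded with the ego's through C12 and through homophily on (U1, U2), and a single gene G2 affects only the alter's phenotype. The naive regression slope is compared with the ratio (Wald) IV estimate.

```python
import random

def cov(xs, ys):
    """Sample covariance (divisor n) of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n

def simulate_tied_dyads(n=100_000, alpha=0.3, seed=0):
    """Return (g2, alter phenotype at t=0, ego phenotype at t=1) for tied dyads only."""
    rng = random.Random(seed)
    g2, y2_0, y1_1 = [], [], []
    for _ in range(n):
        u1, u2 = rng.gauss(0, 1), rng.gauss(0, 1)          # latent traits
        if abs(u1 - u2) >= 0.5:
            continue                                       # no tie, so never observed
        c12 = rng.gauss(0, 1)                              # shared environment
        g = (rng.random() < 0.4) + (rng.random() < 0.4)    # alter's risk-allele count
        alter = 1.0 * g + u2 + c12 + rng.gauss(0, 1)
        ego = alpha * alter + u1 + c12 + rng.gauss(0, 1)   # true peer effect is alpha
        g2.append(g)
        y2_0.append(alter)
        y1_1.append(ego)
    return g2, y2_0, y1_1

def ols_slope(t, y):
    """Naive regression slope of y on t."""
    return cov(t, y) / cov(t, t)

def wald_iv(g, t, y):
    """Ratio IV estimate: covariance of (g, y) over covariance of (g, t)."""
    return cov(g, y) / cov(g, t)

if __name__ == "__main__":
    g2, y2_0, y1_1 = simulate_tied_dyads()
    print("naive OLS:", round(ols_slope(y2_0, y1_1), 3))
    print("IV (Wald):", round(wald_iv(g2, y2_0, y1_1), 3))
```

In this setup the gene is independent of the latent traits and of tie formation, which is why the ratio estimate is not contaminated by the selection on A12; letting the gene affect U2 or the tie would break that property.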

4.2.2. Gene-interaction identification

Even though genes themselves are not time-varying, their expression often is. The causal model analogous to that of Figure 2 but with time-varying gene expression is shown in Figure 3. Let GXk(t) denote a variable representing individual k's (k=1,2) gene-by-age expression at time t (here the notation GX reflects that age is an element of X). The edges Xk → GXk(t) and Gk → GXk(t) are included at all periods to represent varying gene expression due to age.

Theorem 4

In Figure 3 the effect Y2(t−1) → Y1(t), t=1,…,q (the case q=2 is presented), is identified by using GX2(t−1) to instrument Y2(t−1) conditional on G2, X2, and A12.

Proof

Because GX2(t−1) only affects Y2(t−1), the single-IV criterion applies. Therefore, after conditioning on A12, G2, and X2, an argument analogous to that for Theorem 1 completes the proof. □

Corollary 4

Under the DAG in Figure 3, Gk → A12, GXk(t−2) → A12 and Y2(t−2) → A12 may be added for k=1,2, t=2,…,q without compromising IV-identification based on GX2(t−1).

Corollary 4 (proof omitted) illustrates that exploiting time-varying gene expression is advantageous in three ways. First, it allows genetic homophily at (or before) t−2, 2 ≤ t ≤ q. Second, it allows homophily on the phenotype of interest up to but not including t−1. This restriction appears reasonable given prior work suggesting that changes in physical appearance (e.g., BMI) have minimal impact on tie-dissolution even if initial similar appearance led to tie-formation (O'Malley and Christakis, 2011). Third, the requirements for identification do not get more onerous with q. These flexibilities centrally motivate our adoption of Figure 3 as the primary causal model in our empirical analysis.

4.2.3. Relaxing further assumptions

In observational data settings, it is important to evaluate the extent to which a given identification strategy is consistent with multiple plausible causal models. Table 1 summarizes several substantively important elaborations of the causal models in Figures 2 and 3, all of which consist of adding edges; that is, relaxing assumptions (proofs omitted).

Table 1

Extensions to DAGs and their consequence when q=2 and individual 1 is the ego

Phenomenon	Effect	Change to Z	Applies to figure
Homophily on measured phenotype (k=1,2)	Yk(0) → A12	No implication	3
	Yk(1) → A12	No remedy	2, 3
	Yk(2) → A12	No remedy	2, 3
Homophily on measured genotype (k=1,2)	Gk → A12	No implication	3
	GXk(0) → A12	No implication	3
	GXk(1) → A12	No remedy	3
Pleiotropy on observables	G2 → X2	Add X2	2
	G2 → X2	No implication	3
Pleiotropy on unobservables (a)	G2 → U2	No remedy	2
	G2 → U2	No implication	3
Population stratification (b)	PopStrat12 → Gk (k=1,2)	Add dyad fixed effects (c)	2, 3
Inter-phenotype peer effect	(X2,U2) → Y1(0)	No implication	2, 3
	(X2,U2) → Y1(1)	No implication	2, 3
	(X2,U2) → Y1(2)	No implication	2, 3
Predictor associations	X2 → X1	No implication	2, 3
	X2 → C12	Add X2	2, 3
	X2 → U1, U2	Add X2	2, 3
Confounding on genotype or gene expression	C12 → G2	No remedy	2
	C12 → GX2(1)	No remedy	3
	C12 → GX2(0)	Add GX2(0)	3
Epigenetic effects	Y2(0) → GX2(1)	Add Y2(0)	3
	Y2(1) → GX2(2)	No implication	3
Serial dependent gene-expression	GX2(0) → GX2(1)	Add GX2(0)	3
	GX2(0) → Y2(1)	Add GX2(0)	3
Relationship status (k=1,2)	A12 → Yk(0)	No implication	2, 3
	A12 → Yk(1)	No implication	2, 3
	A12 → Yk(2)	No implication	2, 3

(a) Including unmeasured prior phenotype, Yk(t) for k=1,2 and t<0.

(b) Shared ancestry of individuals 1 and 2.

(c) Add indicator variables for each dyad to Z.

First, as noted previously, homophily on the phenotype at any time is lethal for gene-alone identification with a single IV under the model of Figure 2, but homophily on phenotype prior to t−1 is not lethal for identifying the peer effect from t−1 to t under Figure 3.

Second, G2 may be pleiotropic; that is, it may affect not only BMI but also other characteristics of the individual. In Figure 2, G2 may additionally affect observed covariates X2 (necessitating conditioning on Z={A12, X2}) but not unobserved features directly affecting social-tie formation; that is, G2 → U2 is not allowed (because of the irreparable exclusion violation G2 → U2 → [A12] ← U1 → Y1(t)). By contrast, in Figure 3, adding G2 → X2, G2 → U2 and even G2 → A12 is unproblematic, as are GX2(t−2) → U2 and GX2(t−2) → A12, t=2,…,q (but not GX2(t−1) → U2 or GX2(t−1) → A12). Importantly, pleiotropy on unobservables (G2 → U2) includes effects of genes on latent pre-tie formation phenotype (which by virtue of being unobserved is an element of U2). Pleiotropy on latent pre-tie formation phenotype thus ruins IV identification only in the case of Figure 2, but it does not ruin IV identification in Figure 3.

Third, population stratification describes an association between G2 and G1 based on sharing attributes due to common ancestry (Didelez and Sheehan, 2007). To protect the exclusion restriction, one should control for race and ethnicity and ensure (to the extent possible) that members of the dyad are not directly related (e.g., using the method in Price et al. (2006)). However, because ethnic origin (e.g., Irish, German, Greek) is seldom available within general racial groups, including dyad fixed-effects is a more rigorous strategy of accounting for population stratification.

Fourth, our results also accommodate inter-phenotype peer effects; if X2 affects Y1(t), t=0,…,q, the results above hold. Even if 2's unobserved characteristics, U2, affect Y1(t), our results continue to hold.

Fifth, effects of 2's observed characteristics on unobserved shared environmental exposures (e.g., via residential choice), X2 → C12, or on 1's observed characteristics, X2 → X1, have no implications.

Sixth, epigenetic confounding on unobserved contextual factors, C12 → GX2(t−2), t=2,…,q, can be accounted for by conditioning on GX2(t−2) under Figure 3. Even under epigenetic effects due to the phenotype, which imply the addition of Yk(t−1) → GXk(t), t=1,…,q, to Figure 3, identifiability is not affected, except that if t < q then Y2(t−1) must be added to Z.

Finally, if GX2(t−1) → GX2(t), t=1,…,q (serial dependence), is added to Figure 3, it is necessary to condition on GX2(t−2) in addition to G2 and X2 for GX2(t−1) (for t ≥ 2) to be an IV. Therefore, GX2(t−1) must not be fully determined by G2, X2, and GX2(t−2). Likewise, if GX2(t−2) → Y2(t−1) is added to Figure 3, then GX2(t−2) must be added to Z.

In summary, the IV and IV-set criteria permit identification of peer effects in a surprisingly large class of causal models with latent homophily and confounding.

5. Potential Outcomes Representation

From here on, we assume the causal model of Figure 3 and its extensions, which gives IV point identification under linearity and homogeneity (Brito and Pearl, 2002). We now exhibit model form assumptions using the potential outcomes representation of the DAG in Figure 3. We explicitly allow for time-varying elements of (Xk, Uk), k=1,2, and C12 by adding the subscript (t), use bold-face font to denote vectors, and use lower-case letters to denote observed and counterfactual values of random variables.

A potential outcome is the value of an outcome Y that would be observed if a variable V were set by intervention to $v^{*}$. An observed value of V is denoted v, distinguishing it from the counterfactual $v^{*}$. Therefore, $Y_{1(t)}(y^{*}_{2(t-1)}, gx^{*}_{2(t-1)})$ denotes the potential outcome that would result for individual 1 if individual 2's phenotype at t−1 were set to $y^{*}_{2(t-1)}$ and her gene-expression were set to $gx^{*}_{2(t-1)}$. Under the DAG in Figure 3, a causal model for the potential outcomes of Y1(t) given the conditioning set Z(t) (which must include G2 and X2) is

$$Y_{1(t)}(y^{*}_{2(t-1)}, gx^{*}_{2(t-1)}) = \alpha_1 y^{*}_{2(t-1)} + \boldsymbol{\beta}^{T}\mathbf{z}_{(t)} + \boldsymbol{\gamma}_1^{T}\mathbf{u}_{1(t)} + \boldsymbol{\gamma}_2^{T}\mathbf{c}_{12(t)} + \epsilon_{1(t)}, \qquad (1)$$

where α1, β, γ1, and γ2 are coefficients and ϵ1(t) is a random error. We assume ϵ1(t) has constant variance, which simplifies estimation, but note that the assumption can be relaxed without affecting identification. The involvement of U1(t) and C12(t) in (1) illustrates that causal models make no distinction between observed and unobserved covariates. Due to the exclusion restriction, gx2(t−1) is absent from the right-hand side of (1). Therefore, the left-hand side of (1) may be denoted $Y_{1(t)}(y^{*}_{2(t-1)})$. Then the peer effect we seek to estimate satisfies $\alpha_1 = E[Y_{1(t)}(y^{*}_{2(t-1)} + 1) - Y_{1(t)}(y^{*}_{2(t-1)})]$ for any $y^{*}_{2(t-1)}$.
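
The reason an instrument recovers α1 under a linear model of this kind can be written out in one step. The derivation below is a sketch under stated assumptions, not a formula quoted from the paper: it assumes the outcome is linear in the alter's lagged phenotype with coefficient $\alpha_1$, with additive terms in $\mathbf{z}_{(t)}$, $\mathbf{u}_{1(t)}$, $\mathbf{c}_{12(t)}$ and an error $\epsilon_{1(t)}$, and that, given $\mathbf{Z}_{(t)}$, the instrument $GX_{2(t-1)}$ is uncorrelated with the unobserved terms (exclusion) and correlated with $Y_{2(t-1)}$ (relevance).

```latex
\begin{align*}
\operatorname{Cov}\!\bigl(Y_{1(t)},\, GX_{2(t-1)} \mid \mathbf{Z}_{(t)}\bigr)
  &= \alpha_1 \operatorname{Cov}\!\bigl(Y_{2(t-1)},\, GX_{2(t-1)} \mid \mathbf{Z}_{(t)}\bigr)
   + \underbrace{\operatorname{Cov}\!\bigl(\boldsymbol{\gamma}_1^{T}\mathbf{U}_{1(t)}
       + \boldsymbol{\gamma}_2^{T}\mathbf{C}_{12(t)} + \epsilon_{1(t)},\,
       GX_{2(t-1)} \mid \mathbf{Z}_{(t)}\bigr)}_{=\,0 \text{ by the exclusion restriction}} \\[4pt]
\Longrightarrow\quad
\alpha_1
  &= \frac{\operatorname{Cov}\!\bigl(Y_{1(t)},\, GX_{2(t-1)} \mid \mathbf{Z}_{(t)}\bigr)}
          {\operatorname{Cov}\!\bigl(Y_{2(t-1)},\, GX_{2(t-1)} \mid \mathbf{Z}_{(t)}\bigr)},
  \qquad \operatorname{Cov}\!\bigl(Y_{2(t-1)},\, GX_{2(t-1)} \mid \mathbf{Z}_{(t)}\bigr) \neq 0 .
\end{align*}
```

The term in $\boldsymbol{\beta}^{T}\mathbf{z}_{(t)}$ drops out because it is fixed once $\mathbf{Z}_{(t)}$ is conditioned on. The denominator is the first-stage association, so a weak instrument makes the ratio unstable.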

6. Dyadic Instrumental Variables Analysis

To implement IV analysis of the causal model in Section 5, we use a two-stage least squares (2SLS) procedure. The first stage of 2SLS regresses the endogenous variable Y2(t−1), t=1,…,q, on the IV and the exogenous variables in Z(t) (including GX2(t−2) and Y1(t−1) if conditioned on), and the fitted values of Y2(t−1) are computed from this regression. The second stage applies OLS to the causal model with Y2(t−1) replaced by its first-stage fitted value, estimating the peer effect α1. Because the alter's gene expression is an IV, under OLS estimation the second-stage error is orthogonal to the fitted values and Z(t), ensuring unbiased and statistically efficient IV-based estimates. The procedure generalizes to accommodate multiple heterogeneous effects such as two-period dependence (i.e., if the alter's phenotype two periods earlier also affects ego's current phenotype) and effect heterogeneity in observed effect modifiers (see Web Appendix).
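The two stages can be sketched in a few lines of Python. This is a minimal illustration on simulated data, not the authors' code: it assumes a single instrument, a single endogenous regressor, and no covariates or dyad fixed effects, and every name and parameter value below (`simulate`, `true_effect`, the seed) is hypothetical.

```python
import random


def mean(values):
    return sum(values) / len(values)


def slope_intercept(x, y):
    """OLS fit of y on x with an intercept; returns (slope, intercept)."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def two_stage_least_squares(instrument, treatment, outcome):
    """2SLS with one instrument, one endogenous regressor, and an intercept."""
    # First stage: regress the endogenous regressor on the instrument.
    pi1, pi0 = slope_intercept(instrument, treatment)
    fitted = [pi0 + pi1 * g for g in instrument]
    # Second stage: regress the outcome on the first-stage fitted values.
    alpha1, _ = slope_intercept(fitted, outcome)
    return alpha1


def simulate(n=20000, true_effect=0.5, seed=1):
    """Hypothetical data in which u confounds the alter's lagged trait and ego's outcome."""
    rng = random.Random(seed)
    g = [rng.gauss(0, 1) for _ in range(n)]   # instrument (stands in for GX2)
    u = [rng.gauss(0, 1) for _ in range(n)]   # unobserved confounder
    y2 = [gi + ui + rng.gauss(0, 1) for gi, ui in zip(g, u)]
    y1 = [true_effect * y2i + ui + rng.gauss(0, 1) for y2i, ui in zip(y2, u)]
    return g, y2, y1


g, y2_lag, y1 = simulate()
ols_estimate, _ = slope_intercept(y2_lag, y1)
iv_estimate = two_stage_least_squares(g, y2_lag, y1)
print(f"OLS: {ols_estimate:.3f}  2SLS: {iv_estimate:.3f}  (true effect 0.5)")
```

In this simulation the OLS slope absorbs the confounder and overshoots, whereas the 2SLS slope stays close to the true effect. Standard errors from a naive second-stage OLS are not valid for 2SLS and are deliberately not computed here.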

6.1. Variance Estimation

Standard errors are computed using results from the general theory for 2SLS. Because the peer effects are of alter's lagged as opposed to contemporaneous phenotypes, the complications posed by the simultaneous involvement of the same observation as a predictor and an outcome (VanderWeele, Ogburn, and Tchetgen Tchetgen, 2012) are avoided. To account for repeated observations made on dyads over time, as outlined in the Web Appendix, we compute robust standard errors based on sandwich estimators (White, 1982).
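The idea behind a sandwich estimator that accounts for repeated observations on the same dyad can be sketched as follows. This is a simplified illustration for an OLS slope, assuming clustering by dyad and no small-sample correction; it is not the 2SLS variance formula from the Web Appendix, and the simulated data and function names are hypothetical.

```python
import math
import random


def ols_slope_with_se(x, y, cluster):
    """OLS slope of y on x (with intercept), plus naive and cluster-robust SEs."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    xc = [xi - x_bar for xi in x]
    sxx = sum(v * v for v in xc)
    slope = sum(v * (yi - y_bar) for v, yi in zip(xc, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]

    # Naive SE assumes independent, homoskedastic errors.
    sigma2 = sum(e * e for e in resid) / (n - 2)
    naive_se = math.sqrt(sigma2 / sxx)

    # Sandwich "meat": sum the score contributions within each cluster, then square.
    scores = {}
    for v, e, c in zip(xc, resid, cluster):
        scores[c] = scores.get(c, 0.0) + v * e
    meat = sum(s * s for s in scores.values())
    robust_se = math.sqrt(meat) / sxx
    return slope, naive_se, robust_se


def simulate_dyads(n_dyads=500, n_waves=5, seed=2):
    """Hypothetical panel: each dyad shares an error component across waves."""
    rng = random.Random(seed)
    x, y, cluster = [], [], []
    for d in range(n_dyads):
        x_d = rng.gauss(0, 1)   # dyad-level predictor
        a_d = rng.gauss(0, 1)   # dyad-level error shared across waves
        for _ in range(n_waves):
            x.append(x_d)
            y.append(0.3 * x_d + a_d + rng.gauss(0, 1))
            cluster.append(d)
    return x, y, cluster


x, y, cluster = simulate_dyads()
slope, naive_se, robust_se = ols_slope_with_se(x, y, cluster)
print(f"slope={slope:.3f}  naive SE={naive_se:.4f}  cluster-robust SE={robust_se:.4f}")
```

Because the errors are positively correlated within a dyad, the naive standard error understates the uncertainty and the cluster-robust one is larger.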

7. Friend and Spouse Peer Effect Analysis of the FHS Network

We illustrate our methods using a novel social network dataset constructed from the first seven health exams of the Offspring Cohort of the Framingham Heart Study (FHS), encompassing 32 years of follow-up. The Offspring Cohort includes 5124 individuals. Genetic data was available for 3462 distinct individuals, appearing in 22,361 exams (see Web Appendix). The network ties considered here arise from participants naming friends and spouses at their health exams. Participants typically only named a single friend at each exam, which is likely to be the one with the most influence. Given the stability of the Framingham population from 1971 to 2003, approximately 50% of the nominated friend contacts were themselves also participants in the FHS and thus provided the same information, including BMI. Most spouses of FHS participants were also FHS participants. We estimate our model with a sample of 9270 unique dyads comprising spousal and nearly disjoint friendship dyads (ignoring occasional overlap of dyads when the same ego is named by multiple alters). Because the fat mass and obesity gene (FTO) and the melanocortin-4 receptor gene (MC4R) have been confirmed through original and replication studies to be strongly associated with obesity (Speliotes et al., 2010), we consider them as IVs for peer effects of BMI. There is also evidence suggesting that genetic effects may be moderated by a person's age (Lasky-Su et al., 2008), justifying consideration of age-dependent gene expression as an IV. Linearity is assumed for the data analysis and, moreover, we are interested in the linear peer effect of BMI itself. However, we note that in certain applications one might instead be interested in peer effects of obesity (BMI ≥ 30), the effect of some other nonlinear transformation of BMI, or in the extent to which the peer effect of BMI is modified by age or some other individual characteristic of the alter (or the ego). 
While many interesting specifications could be considered, for illustration, we have chosen to focus on a linear specification. We adjust for ego's gender, age, gender–age interaction, birth era, birth year, smoking status, number of siblings, geographic distance between residential locations of ego and alter at tie-formation, and gene–age interactions. Birth era accounts for whether an individual was born before 1942, between 1942 and 1948, or 1948 or later to capture possible cohort effects due to America's involvement in World War II. Because the offspring cohort is nearly 100% white, we do not adjust for race. In addition, we adjust for wave number dummies to account for secular trends in BMI. Therefore, one can think of gene-age expression as random with respect to exam timing. Inclusion of alter's smoking status provides assurance against a possible pleiotropic effect between FTO and smoking and MC4R and smoking.

7.1. Representation of Genes

Genetic alleles are represented in Gk, k=1,2, by four dummy variables for two of the three possible states of each of FTO (states AA, AT, TT) and MC4R (states CC, CT, TT). The A and C alleles have been recognized by geneticists as the risk-alleles of FTO and MC4R, respectively. Having two copies of the risk-allele is the riskiest state followed by the one-copy heterozygous state. Therefore, we also include a fifth dummy variable corresponding to FTO = AA and MC4R = CC. While we could instrument 5 waves of phenotypes using gene-alone IV identification (Figure 2 and Corollary 2), we can relax more assumptions under gene–age interaction IV identification (Figure 3, Theorem 3, and Table 1). The age-dependent association of the FTO gene with BMI is clearly evident in Figure 4 (see Web Appendix for the same for MC4R).
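The coding described above can be sketched as a small helper. The choice of TT as the reference state for both genes, the variable names, and the use of raw (uncentered) age in the interactions are assumptions for illustration; the text specifies only that two of the three states of each gene receive a dummy and that a fifth dummy marks FTO = AA together with MC4R = CC.

```python
def gene_dummies(fto, mc4r):
    """Return the five genotype indicators for one person.

    Assumption: TT is the reference state for both genes.
    """
    fto = "".join(sorted(fto))     # treat "TA" and "AT" as the same genotype
    mc4r = "".join(sorted(mc4r))   # treat "TC" and "CT" as the same genotype
    if fto not in {"AA", "AT", "TT"}:
        raise ValueError(f"unexpected FTO genotype: {fto}")
    if mc4r not in {"CC", "CT", "TT"}:
        raise ValueError(f"unexpected MC4R genotype: {mc4r}")
    return {
        "fto_AT": int(fto == "AT"),
        "fto_AA": int(fto == "AA"),
        "mc4r_CT": int(mc4r == "CT"),
        "mc4r_CC": int(mc4r == "CC"),
        "fto_AA_and_mc4r_CC": int(fto == "AA" and mc4r == "CC"),
    }


def gene_age_interactions(fto, mc4r, age):
    """Multiply each genotype indicator by age to form five gene-age terms."""
    return {
        f"{name}_x_age": value * age
        for name, value in gene_dummies(fto, mc4r).items()
    }


print(gene_dummies("AA", "CC"))
print(gene_age_interactions("TA", "TT", 40))
```

The interaction terms vary over time for the same person because age changes between exams, which is what allows them to serve as time-varying instruments.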

Figure 4

Fitted values of BMI across the i=1,…,n individuals in the FHS sample are obtained from a regression of BMI on exam (categorical), gender, birth era (categorical), year born, marital status, number of siblings, and smoking status. The smooth curves are computed using a generalized additive spline regression model with smoothing factors chosen to capture local trends without overfitting the data.

7.2. Dyadic Peer Effect Analyses

We estimate several statistical models, starting with one that is consistent with the causal model of Figure 3, as well as statistical models obtained by adding several of the Exclusions in Table 1. The four reported here condition on G1, X1, and X2 and are distinguished by whether GX2(t−2) was excluded (as permitted in Figure 3) or conditioned on (to accommodate serial dependence in gene expression, GX2(t−2) → GX2(t−1)) and by whether Y1(t−1) was excluded or conditioned on (only allowed under Figure 3) to possibly improve precision. Because population stratification is a major concern in analyses involving genes and phenotypes, we include dyad fixed effects in all analyses. The five gene–age interaction variables of the alter (individual 2) are the IVs for Y2(t−1). We also performed analyses with the analogous five gene–age² interaction variables as additional IVs; results remained essentially unchanged (not shown). We perform separate analyses for friends and spouses and use robust variance estimators to account for repeated observations over time (Section 6.1).

7.3. Estimated Peer Effects

The IV estimates are consistent with positive BMI peer effects among friends and spouses (Table 2). Under the causal model of Figure 3 with the baseline conditioning set Z(t), the estimated BMI peer effect among friends (row 1) is positive and statistically significant (estimate 0.888, 95% CI (0.063, 1.713)), whereas the BMI peer effect among spouses (row 5) is positive but not statistically significant (estimate 0.099, 95% CI (−0.324, 0.522)). In all other specifications (i.e., relaxations of Figure 3), the estimated BMI peer effects among friends and spouses are not statistically significant, although point estimates remain in the expected positive direction in most models. For many IV specifications, the corresponding OLS estimates differ appreciably, consistent with the presence of unobserved confounding and homophily bias in the OLS specifications.
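The reported intervals can be unpacked with a little arithmetic. Assuming they are symmetric Wald-type intervals built with the normal critical value 1.96 (an assumption; the construction is not stated here), the implied standard error and z-ratio for the first friend row of Table 2 are roughly:

```latex
\widehat{\mathrm{SE}} \approx \frac{1.713 - 0.063}{2 \times 1.96} \approx 0.421,
\qquad
z \approx \frac{0.888}{0.421} \approx 2.11 .
```

The same calculation for the first spouse row gives a standard error of about (0.522 + 0.324)/3.92 ≈ 0.216 and z ≈ 0.099/0.216 ≈ 0.46. The friend estimate therefore only narrowly clears the conventional 1.96 threshold.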

Table 2

Dyadic peer effect analysis of lag alter BMI using time-varying gene–age expression as an instrument

The first two columns give the discretionary Z(t) terms. Columns 3 to 5 report the IV regression (2SLS) (a); columns 6 and 7 report the regression (OLS).

GX2(t−2)	Y1(t−1)	F5 (b)	2SLS estimate	2SLS 95% CI	OLS estimate	OLS 95% CI

Nominated friend
Exclude	Exclude	2.150	0.888	(0.063, 1.713)	−0.011	(−0.121, 0.100)
Exclude	Covariate	1.731	0.874	(−0.031, 1.779)	0.009	(−0.071, 0.089)
Covariate	Exclude	1.181	0.133	(−0.796, 1.062)	−0.086	(−0.193, 0.021)
Covariate	Covariate	1.144	−0.003	(−0.911, 0.906)	−0.077	(−0.181, 0.028)

Spouse
Exclude	Exclude	4.064	0.099	(−0.324, 0.522)	0.066	(0.039, 0.094)
Exclude	Covariate	4.351	0.101	(−0.287, 0.488)	0.032	(0.008, 0.055)
Covariate	Exclude	0.268	−0.102	(−1.855, 1.652)	0.050	(0.017, 0.082)
Covariate	Covariate	0.181	0.906	(−1.832, 3.643)	0.023	(−0.006, 0.051)

(a) The baseline conditioning set Z(t) comprises ego's gene expression (GX1) and the observed characteristics of ego and alter (X1, X2), which are exogenous covariates; the alter's gene expression (GX2) is an IV in all IV analyses. The elements of Xk, k=1,2, are: gender, age, gender–age interaction, birth era, birth year, smoking status, number of siblings, and (for k=1 only) the geographic distance between residential locations of ego and alter at tie-formation. All models include dyad fixed effects. GX2(t−2) and Y1(t−1) are added to Z(t) as indicated in the two left-most columns.

(b) The F-statistic is for the overall effect of the IV, GX2, in the first-stage equation. The critical value of the Cragg-Donald F-statistic, which quantifies the power of an IV, at the 20% level ranges from 6.71 to 6.77 across the models.

The imprecision (and resulting lack of significance) of many of our IV estimates is owed to relatively weak first stages. F-statistics indicate that only the causal models of Figure 3 (see the rows of Table 2 in which GX2(t−2) is excluded) have first stages at which IV strength is modest at best by conventional standards (e.g., F5=2.150 for friends in row 1 and F5=4.064 for spouses in row 5) (Stock, Wright, and Yogo, 2002). Note, specifically, that conditioning on GX2(t−2) to account for possible serial dependence in gene expression (i.e., if GX2(t−2) → GX2(t−1) is added to Figure 3) results in a very weak first stage (e.g., F5 ≤ 0.268 for spouses). This explains the noisy estimates in all rows of Table 2 with GX2(t−2) as an additional covariate. Therefore, the absence of GX2(t−2) → GX2(t−1) is crucial to IV peer-effect estimation using FHS data. Other specifications (results not shown) yield first stages of similar strength. To improve precision, one might collect more data to increase sample size; or one might (we believe implausibly) assume the absence of unobserved population stratification, which would permit removal of the dyad fixed effects and result in a stronger first stage (results not shown).
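To make the notion of first-stage strength concrete, the sketch below computes a first-stage F-statistic for the simplest case of one instrument and no covariates. The F5 statistics in Table 2 are joint tests of five instruments in models with covariates and dyad fixed effects, so this is an illustration of the concept only; the simulated data, function names, and instrument strengths are hypothetical.

```python
import random


def first_stage_f(instrument, treatment):
    """F-statistic for a single instrument in a first stage with an intercept."""
    n = len(instrument)
    g_bar = sum(instrument) / n
    t_bar = sum(treatment) / n
    sgg = sum((g - g_bar) ** 2 for g in instrument)
    stt = sum((t - t_bar) ** 2 for t in treatment)
    sgt = sum((g - g_bar) * (t - t_bar) for g, t in zip(instrument, treatment))
    r_squared = sgt ** 2 / (sgg * stt)
    # One restriction tested; n - 2 residual degrees of freedom.
    return r_squared / ((1.0 - r_squared) / (n - 2))


def simulate_first_stage(strength, n=1000, seed=3):
    """Hypothetical first stage: treatment = strength * instrument + noise."""
    rng = random.Random(seed)
    g = [rng.gauss(0, 1) for _ in range(n)]
    y2 = [strength * gi + rng.gauss(0, 1) for gi in g]
    return g, y2


f_strong = first_stage_f(*simulate_first_stage(strength=0.5))
f_weak = first_stage_f(*simulate_first_stage(strength=0.02))
print(f"strong instrument F={f_strong:.1f}  weak instrument F={f_weak:.2f}")
```

When the instrument explains little of the variation in the endogenous regressor, the F-statistic stays small and the second-stage estimate becomes noisy, which is the pattern described above for the rows that condition on lagged gene expression.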

8. Conclusion

We derived IV methodology for the estimation of peer effects using longitudinal data. A key methodological distinction of our approach, compared to past observational approaches, is that we account for latent common causes and homophily. An important theoretical finding is that latent homophily places severe demands on IVs. Genes have appeal as IVs due to their inherent randomness, lack of visibility to peers, and ongoing influence on the phenotype. However, ongoing influence on phenotype is problematic for time-invariant IVs such as genetic alleles, because all past values of the alter's phenotype post tie-formation must be instrumented (even if they only have an indirect effect on ego's BMI). In contrast, if variation in gene expression across age is used as an IV, the dimension of the instrumented variable does not need to increase with the duration of the social tie. Using two genes widely recognized as having the strongest effects on BMI or obesity, we explored BMI peer effects among pairs of friends or spouses. Our analyses, which attempted to account for all sources of confounding, estimated large peer effects, but these lacked statistical significance in all but one case. Continued research on the use of genes as IVs for peer effects is motivated by the fact that, if this approach is successful, many important medical, sociological, and economic questions might be answered more rigorously than they have been in the past, without having to make strong assumptions about the absence of unobserved homophily or unobserved confounding. Conclusive evidence of peer effects would confirm that treatment of traits such as obesity, smoking, alcoholism, and depression could be improved by treating an individual's peers in addition to himself, or by intervening on the composition of his peer group to remove undesirable peer influences.

9. Supplementary Web Appendix

Web Appendices, Tables, and Figures referenced in Sections 6, 6.1, 7, and 7.1 and additional references are available with this paper at the Biometrics website on Wiley Online Library. Example code, example data, and associated instructions for running the code are also available as a web supplement (same website).

16 in total

1. The spread of behavior in an online social network experiment.

Authors: Damon Centola
Journal: Science Date: 2010-09-03 Impact factor: 47.728

2. Principal components analysis corrects for stratification in genome-wide association studies.

Authors: Alkes L Price; Nick J Patterson; Robert M Plenge; Michael E Weinblatt; Nancy A Shadick; David Reich
Journal: Nat Genet Date: 2006-07-23 Impact factor: 38.330

3. Instruments for causal inference: an epidemiologist's dream?

Authors: Miguel A Hernán; James M Robins
Journal: Epidemiology Date: 2006-07 Impact factor: 4.822

4. Longitudinal analysis of large social networks: estimating the effect of health traits on changes in friendship ties.

Authors: A James O'Malley; Nicholas A Christakis
Journal: Stat Med Date: 2011-02-02 Impact factor: 2.373

Review 5. Mendelian randomization as an instrumental variable approach to causal inference.

Authors: Vanessa Didelez; Nuala Sheehan
Journal: Stat Methods Med Res Date: 2007-08 Impact factor: 3.021

6. Social interactions and smoking: evidence using multiple student cohorts, instrumental variables, and school fixed effects.

Authors: Jason M Fletcher
Journal: Health Econ Date: 2010-04 Impact factor: 3.046

7. Cooperative behavior cascades in human social networks.

Authors: James H Fowler; Nicholas A Christakis
Journal: Proc Natl Acad Sci U S A Date: 2010-03-08 Impact factor: 11.205

8. Homophily and Contagion Are Generically Confounded in Observational Social Network Studies.

Authors: Cosma Rohilla Shalizi; Andrew C Thomas
Journal: Sociol Methods Res Date: 2011-05

9. Why and When "Flawed" Social Network Analyses Still Yield Valid Tests of no Contagion.

Authors: Tyler J VanderWeele; Elizabeth L Ogburn; Eric J Tchetgen Tchetgen
Journal: Stat Politics Policy Date: 2012-02-01

10. Association analyses of 249,796 individuals reveal 18 new loci associated with body mass index.

Authors: Elizabeth K Speliotes; Cristen J Willer; Sonja I Berndt; Keri L Monda; Gudmar Thorleifsson; Anne U Jackson; Hana Lango Allen; Cecilia M Lindgren; Jian'an Luan; Reedik Mägi; Joshua C Randall; Sailaja Vedantam; Thomas W Winkler; Lu Qi; Tsegaselassie Workalemahu; Iris M Heid; Valgerdur Steinthorsdottir; Heather M Stringham; Michael N Weedon; Eleanor Wheeler; Andrew R Wood; Teresa Ferreira; Robert J Weyant; Ayellet V Segrè; Karol Estrada; Liming Liang; James Nemesh; Ju-Hyun Park; Stefan Gustafsson; Tuomas O Kilpeläinen; Jian Yang; Nabila Bouatia-Naji; Tõnu Esko; Mary F Feitosa; Zoltán Kutalik; Massimo Mangino; Soumya Raychaudhuri; Andre Scherag; Albert Vernon Smith; Ryan Welch; Jing Hua Zhao; Katja K Aben; Devin M Absher; Najaf Amin; Anna L Dixon; Eva Fisher; Nicole L Glazer; Michael E Goddard; Nancy L Heard-Costa; Volker Hoesel; Jouke-Jan Hottenga; Asa Johansson; Toby Johnson; Shamika Ketkar; Claudia Lamina; Shengxu Li; Miriam F Moffatt; Richard H Myers; Narisu Narisu; John R B Perry; Marjolein J Peters; Michael Preuss; Samuli Ripatti; Fernando Rivadeneira; Camilla Sandholt; Laura J Scott; Nicholas J Timpson; Jonathan P Tyrer; Sophie van Wingerden; Richard M Watanabe; Charles C White; Fredrik Wiklund; Christina Barlassina; Daniel I Chasman; Matthew N Cooper; John-Olov Jansson; Robert W Lawrence; Niina Pellikka; Inga Prokopenko; Jianxin Shi; Elisabeth Thiering; Helene Alavere; Maria T S Alibrandi; Peter Almgren; Alice M Arnold; Thor Aspelund; Larry D Atwood; Beverley Balkau; Anthony J Balmforth; Amanda J Bennett; Yoav Ben-Shlomo; Richard N Bergman; Sven Bergmann; Heike Biebermann; Alexandra I F Blakemore; Tanja Boes; Lori L Bonnycastle; Stefan R Bornstein; Morris J Brown; Thomas A Buchanan; Fabio Busonero; Harry Campbell; Francesco P Cappuccio; Christine Cavalcanti-Proença; Yii-Der Ida Chen; Chih-Mei Chen; Peter S Chines; Robert Clarke; Lachlan Coin; John Connell; Ian N M Day; Martin den Heijer; Jubao Duan; Shah Ebrahim; Paul Elliott; Roberto Elosua; Gudny 
Eiriksdottir; Michael R Erdos; Johan G Eriksson; Maurizio F Facheris; Stephan B Felix; Pamela Fischer-Posovszky; Aaron R Folsom; Nele Friedrich; Nelson B Freimer; Mao Fu; Stefan Gaget; Pablo V Gejman; Eco J C Geus; Christian Gieger; Anette P Gjesing; Anuj Goel; Philippe Goyette; Harald Grallert; Jürgen Grässler; Danielle M Greenawalt; Christopher J Groves; Vilmundur Gudnason; Candace Guiducci; Anna-Liisa Hartikainen; Neelam Hassanali; Alistair S Hall; Aki S Havulinna; Caroline Hayward; Andrew C Heath; Christian Hengstenberg; Andrew A Hicks; Anke Hinney; Albert Hofman; Georg Homuth; Jennie Hui; Wilmar Igl; Carlos Iribarren; Bo Isomaa; Kevin B Jacobs; Ivonne Jarick; Elizabeth Jewell; Ulrich John; Torben Jørgensen; Pekka Jousilahti; Antti Jula; Marika Kaakinen; Eero Kajantie; Lee M Kaplan; Sekar Kathiresan; Johannes Kettunen; Leena Kinnunen; Joshua W Knowles; Ivana Kolcic; Inke R König; Seppo Koskinen; Peter Kovacs; Johanna Kuusisto; Peter Kraft; Kirsti Kvaløy; Jaana Laitinen; Olivier Lantieri; Chiara Lanzani; Lenore J Launer; Cecile Lecoeur; Terho Lehtimäki; Guillaume Lettre; Jianjun Liu; Marja-Liisa Lokki; Mattias Lorentzon; Robert N Luben; Barbara Ludwig; Paolo Manunta; Diana Marek; Michel Marre; Nicholas G Martin; Wendy L McArdle; Anne McCarthy; Barbara McKnight; Thomas Meitinger; Olle Melander; David Meyre; Kristian Midthjell; Grant W Montgomery; Mario A Morken; Andrew P Morris; Rosanda Mulic; Julius S Ngwa; Mari Nelis; Matt J Neville; Dale R Nyholt; Christopher J O'Donnell; Stephen O'Rahilly; Ken K Ong; Ben Oostra; Guillaume Paré; Alex N Parker; Markus Perola; Irene Pichler; Kirsi H Pietiläinen; Carl G P Platou; Ozren Polasek; Anneli Pouta; Suzanne Rafelt; Olli Raitakari; Nigel W Rayner; Martin Ridderstråle; Winfried Rief; Aimo Ruokonen; Neil R Robertson; Peter Rzehak; Veikko Salomaa; Alan R Sanders; Manjinder S Sandhu; Serena Sanna; Jouko Saramies; Markku J Savolainen; Susann Scherag; Sabine Schipf; Stefan Schreiber; Heribert Schunkert; Kaisa Silander; Juha 
Sinisalo; David S Siscovick; Jan H Smit; Nicole Soranzo; Ulla Sovio; Jonathan Stephens; Ida Surakka; Amy J Swift; Mari-Liis Tammesoo; Jean-Claude Tardif; Maris Teder-Laving; Tanya M Teslovich; John R Thompson; Brian Thomson; Anke Tönjes; Tiinamaija Tuomi; Joyce B J van Meurs; Gert-Jan van Ommen; Vincent Vatin; Jorma Viikari; Sophie Visvikis-Siest; Veronique Vitart; Carla I G Vogel; Benjamin F Voight; Lindsay L Waite; Henri Wallaschofski; G Bragi Walters; Elisabeth Widen; Susanna Wiegand; Sarah H Wild; Gonneke Willemsen; Daniel R Witte; Jacqueline C Witteman; Jianfeng Xu; Qunyuan Zhang; Lina Zgaga; Andreas Ziegler; Paavo Zitting; John P Beilby; I Sadaf Farooqi; Johannes Hebebrand; Heikki V Huikuri; Alan L James; Mika Kähönen; Douglas F Levinson; Fabio Macciardi; Markku S Nieminen; Claes Ohlsson; Lyle J Palmer; Paul M Ridker; Michael Stumvoll; Jacques S Beckmann; Heiner Boeing; Eric Boerwinkle; Dorret I Boomsma; Mark J Caulfield; Stephen J Chanock; Francis S Collins; L Adrienne Cupples; George Davey Smith; Jeanette Erdmann; Philippe Froguel; Henrik Grönberg; Ulf Gyllensten; Per Hall; Torben Hansen; Tamara B Harris; Andrew T Hattersley; Richard B Hayes; Joachim Heinrich; Frank B Hu; Kristian Hveem; Thomas Illig; Marjo-Riitta Jarvelin; Jaakko Kaprio; Fredrik Karpe; Kay-Tee Khaw; Lambertus A Kiemeney; Heiko Krude; Markku Laakso; Debbie A Lawlor; Andres Metspalu; Patricia B Munroe; Willem H Ouwehand; Oluf Pedersen; Brenda W Penninx; Annette Peters; Peter P Pramstaller; Thomas Quertermous; Thomas Reinehr; Aila Rissanen; Igor Rudan; Nilesh J Samani; Peter E H Schwarz; Alan R Shuldiner; Timothy D Spector; Jaakko Tuomilehto; Manuela Uda; André Uitterlinden; Timo T Valle; Martin Wabitsch; Gérard Waeber; Nicholas J Wareham; Hugh Watkins; James F Wilson; Alan F Wright; M Carola Zillikens; Nilanjan Chatterjee; Steven A McCarroll; Shaun Purcell; Eric E Schadt; Peter M Visscher; Themistocles L Assimes; Ingrid B Borecki; Panos Deloukas; Caroline S Fox; Leif C Groop; Talin 
Haritunians; David J Hunter; Robert C Kaplan; Karen L Mohlke; Jeffrey R O'Connell; Leena Peltonen; David Schlessinger; David P Strachan; Cornelia M van Duijn; H-Erich Wichmann; Timothy M Frayling; Unnur Thorsteinsdottir; Gonçalo R Abecasis; Inês Barroso; Michael Boehnke; Kari Stefansson; Kari E North; Mark I McCarthy; Joel N Hirschhorn; Erik Ingelsson; Ruth J F Loos
Journal: Nat Genet Date: 2010-10-10 Impact factor: 38.330
