
Evaluating indirect genetic effects of siblings using singletons.

Laurence J Howe^1,2, David M Evans^1,3,4, Gibran Hemani^1,2, George Davey Smith^1,2, Neil M Davies^1,2,5.

Abstract

Estimating effects of parental and sibling genotypes (indirect genetic effects) can provide insight into how the family environment influences phenotypic variation. There is growing molecular genetic evidence for effects of parental phenotypes on their offspring (e.g. parental educational attainment), but the extent to which siblings affect each other is currently unclear. Here we used data from samples of unrelated individuals, without (singletons) and with biological full-siblings (non-singletons), to investigate and estimate sibling effects. Indirect genetic effects of siblings increase (or decrease) the covariance between genetic variation and a phenotype. It follows that differences in genetic association estimates between singletons and non-singletons could indicate indirect genetic effects of siblings if there is no heterogeneity in other sources of genetic association between singletons and non-singletons. We used UK Biobank data to estimate polygenic score (PGS) associations for height, BMI and educational attainment in self-reported singletons (N = 50,143) and non-singletons (N = 328,549). The educational attainment PGS association estimate was 12% larger (95% C.I. 3%, 21%) in the non-singleton sample than in the singleton sample, but the height and BMI PGS associations were consistent. Birth order data suggested that the difference in educational attainment PGS associations was driven by individuals with older siblings rather than firstborns. The relationship between number of siblings and educational attainment PGS associations was non-linear; PGS associations were 24% smaller in individuals with 6 or more siblings compared to the rest of the sample (95% C.I. 11%, 38%). We estimate that a 1 SD increase in sibling educational attainment PGS corresponds to a 0.025 year increase in the index individual's years in schooling (95% C.I. 0.013, 0.036). 
Our results suggest that older siblings may influence the educational attainment of younger siblings, adding to the growing evidence that effects of the environment on phenotypic variation partially reflect social effects of germline genetic variation in relatives.

Year: 2022 PMID: 35797272 PMCID: PMC9262210 DOI: 10.1371/journal.pgen.1010247

Source DB: PubMed Journal: PLoS Genet ISSN: 1553-7390 Impact factor: 6.020

Introduction

Parents transmit genetic variation to their offspring and shape their early-life environment [1,2]. Parental effects on the family environment partially relate to effects of parental genotypes (indirect genetic effects). For example, parental genotypes that influence parental behaviour could impact the offspring’s environment [1,3,4]. Other relative classes such as siblings may also have indirect effects on their relatives [5]; older siblings could influence the school achievement of younger siblings [6] or their smoking behaviour [7]. Phenotypic data from twins (monozygotic/dizygotic), foster siblings and only children can be used to estimate sibling effects by considering differences in phenotypic variance [8,9]. A complementary approach is to use molecular genetic data from extended families to estimate indirect genetic effects of parents and siblings [1,4,10-14], which can then inform the effects of parental and sibling phenotypes. However, existing studies that have sampled family members have limited power to estimate sibling effects because of the paucity of available family data and the statistical inefficiency of within-family models, even when genotypes of missing first-degree relatives are imputed [12,13,15].

Here, we propose an alternative molecular genetics approach for evaluating indirect genetic effects of siblings which can use large samples of unrelated individuals and so is likely to have higher statistical power than within-family approaches. In a population sample of unrelated individuals, the association between a genetic variant and a phenotype reflects the effect of the genetic variant (or a correlated variant) on the phenotype in the index individual (direct effect), indirect genetic effects of relatives which strengthen (or weaken) the genotype-phenotype covariance, as well as demography (assortative mating, population stratification) [4,16,17].

One approach to evaluate indirect genetic effects is to compare genetic associations between non-adopted and adopted individuals. Adopted individuals are raised apart from their biological families so genetic associations will not capture indirect genetic effects of relatives [11,18]. Extending this intuition, we note that indirect genetic effects of siblings will not impact genetic association estimates from singletons (individuals without siblings). If other sources of genetic association are consistent between singletons and non-singletons, then differences between the singleton and non-singleton genetic association estimates could indicate indirect genetic effects of siblings. Potential group-level differences between singletons and non-singletons are unlikely to confound genetic associations unless the factor can influence genotype (e.g. ancestry).

Sources of association between genotypes and phenotypes.

The association between an individual’s genotype ($G_I$) and a phenotype ($P_I$) will capture the following: 1) Direct genetic effects (DGE) of inheriting a variant or a correlated variant. 2) Indirect genetic effects ($\mathrm{IGE}_{M,P,S}$) of parental (maternal/paternal, $G_{M,P}$) and sibling genotypes ($G_S$) on $P_I$ via the shared environment because of the correlations between $G_I$, $G_{M,P}$ and $G_S$. 3) Confounding factors ($C$) such as population stratification and assortative mating which confound the associations between genetic variants and phenotypes. If an individual has no siblings (or is raised apart from their siblings) then the association between $G_I$ and $P_I$ cannot, by definition, be affected by indirect genetic effects of siblings. Therefore, indirect genetic effects of siblings are a possible explanation for heterogeneity in genetic associations between singletons and non-singletons. Note that this figure features several simplifications such as maternal and paternal indirect genetic effects being consistent and no paths between confounding factors and parental or sibling genotypes.

Results

PGS associations in singletons and non-singletons

To evaluate potential indirect genetic effects of siblings, we explored differences in genetic association estimates between singletons and non-singletons for height, BMI and educational attainment genetic variants. We constructed polygenic scores (PGS) for these phenotypes using genome-wide association study (GWAS) summary data independent of UK Biobank [19]. We estimated associations between the PGS and the relevant phenotype in the singleton and non-singleton samples, adjusting for sex, birth year and the first 10 ancestry-informative principal components. We then estimated the difference (% attenuation) from the non-singleton to the singleton PGS association estimate for each phenotype. The non-singleton educational attainment PGS association estimate was 12% larger than the singleton estimate (95% C.I. 3%, 21%; difference P = 0.009). In contrast, we found no strong evidence for differences between the singleton and non-singleton PGS estimates for height (attenuation = 0%; 95% C.I. -3%, 2%) and BMI (attenuation = 5%; 95% C.I. -1%, 11%).
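The % attenuation described above can be sketched in a few lines. This is a minimal illustration, assuming the difference and its confidence interval are computed first (zero covariance between the two samples) and then divided by the non-singleton estimate, as the Methods describe. The function name and the input values are hypothetical and are not the study's estimates.

```python
import math


def attenuation(b_nonsingleton, se_nonsingleton, b_singleton, se_singleton, z=1.96):
    """Return (estimate, lower, upper) of the proportional attenuation.

    Assumes a positive non-singleton estimate and independent samples.
    """
    diff = b_nonsingleton - b_singleton
    # Non-overlapping samples: the covariance between the estimates is taken as zero.
    se_diff = math.sqrt(se_nonsingleton ** 2 + se_singleton ** 2)
    lower, upper = diff - z * se_diff, diff + z * se_diff
    # Rescale the difference and its interval by the non-singleton estimate.
    return tuple(x / b_nonsingleton for x in (diff, lower, upper))


# Hypothetical inputs (years of schooling per SD of PGS), chosen for illustration only.
est, lo, hi = attenuation(0.24, 0.002, 0.21, 0.005)
print(f"attenuation = {est:.1%} (95% C.I. {lo:.1%}, {hi:.1%})")
```

Treating the denominator as fixed keeps the calculation simple; a full delta-method ratio would also propagate the uncertainty in the non-singleton estimate.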

Differences between singleton and non-singleton PGS estimates.

Fig 2 illustrates the % attenuation in PGS association estimates from non-singletons to singletons for height, BMI and educational attainment.

As a sensitivity analysis, we repeated the educational attainment analysis using PGS weightings from a recent within-sibship GWAS meta-analysis of educational attainment [20]. The within-sibship weighted PGS is unlikely to capture effects of population stratification, assortative mating and indirect genetic effects [3,4,20]. However, associations between this PGS and educational attainment in our sample (of unrelated individuals) may still be affected by these sources of association. This is because genetic variants in the within-sibship PGS with direct genetic effects on educational attainment may also have non-direct genetic effects which are not controlled for in models that do not account for parental genotypes. The singleton within-sibship attenuation in the educational attainment PGS association estimate (11%; 95% C.I. -1%, 23%) was highly consistent with the estimate from the primary analysis (12%; 95% C.I. 3%, 21%) (Table A in ).

PGS associations and birth order

Birth order may influence the magnitude of sibling effects. For example, the schooling decisions of an older sibling are more likely to influence a younger sibling than the converse. We used birth order data (available in a subset of the UK Biobank cohort) to identify firstborns (individuals with only younger siblings) and non-firstborns (individuals with one or more older siblings). We then computed PGS associations, as above, in the firstborn and non-firstborn samples. PGS educational attainment association estimates were larger in non-firstborns than in firstborns (attenuation = 14%; 95% C.I. 3%, 26%; difference P = 0.013) with no strong evidence of heterogeneity for height or BMI. The firstborn PGS educational attainment association estimate was highly consistent with the singleton estimate (difference P = 0.93), suggesting that the larger PGS association in non-singletons is due to individuals with older siblings, rather than firstborns.

Educational attainment PGS and years in schooling by singleton status and birth order.

Fig 3 displays association estimates between educational attainment PGS and educational attainment in singletons, non-singletons, firstborns and non-firstborns.

PGS associations for education by number of siblings

We next evaluated evidence for a linear relationship between number of siblings and the association of the educational attainment PGS with measured educational attainment. In the full sample (N = 378,445), we found no strong evidence for an interaction (P = 0.24). However, this sample included very large families where data could be more susceptible to misclassification error and the association estimates may be affected by lower parental investment or other confounding factors. Indeed, we found that the PGS association estimate from 18,746 individuals self-reporting having 6 or more siblings was 24% smaller than the estimate from individuals with 5 or fewer siblings (95% C.I. 11%, 38%; P = 5.2 × 10^-4).

Educational attainment PGS and years in schooling by number of siblings.

Fig 4 displays association estimates between educational attainment PGS and educational attainment stratified by the number of self-reported siblings. Individuals reporting 6 or more siblings were combined into the same category.

We evaluated whether there is a non-linear relationship between number of siblings and the PGS associations by applying a quadratic model including the square of the number of siblings and a quadratic interaction term. This model provided evidence of a non-linear relationship: the linear interaction estimate indicated a 0.017 increase in the PGS association estimate per additional sibling (95% C.I. 0.008, 0.025; P = 0.0001), in the opposite direction to the quadratic interaction estimate, which indicated a 0.002 decrease in the PGS association estimate per unit increase in the square of the number of siblings (95% C.I. -0.003, -0.001; P = 0.0003).

We also repeated the linear analysis after removing outlying individuals with extreme values, considering individuals with 6 or more siblings as outliers based on an outlier threshold of 5% because this group corresponded to 4.9% of the total sample. In this sample of individuals with 5 or fewer siblings we found evidence for a linear relationship; each additional sibling corresponded to an increase of 0.012 in the PGS association estimate (95% C.I. 0.006, 0.018; P = 4.3 × 10^-5). Under assumptions discussed in the Methods, this estimate can be scaled (multiplied by two) to provide an estimate of sibling indirect genetic effects; a 1 SD increase in the educational attainment PGS of a sibling increases an index individual’s years in schooling by 0.025 years (95% C.I. 0.013, 0.036). For comparison, this estimate is 11% (95% C.I. 5%, 16%) of the magnitude of the PGS association estimate in the full sample (0.23 years; 95% C.I. 0.23, 0.24).
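The scaling step above is simple arithmetic, sketched here from the rounded figures quoted in the text. The helper name is hypothetical. Doubling the rounded per-sibling estimate gives values slightly below the reported 0.025 (0.013, 0.036), presumably because the reported values were computed from unrounded estimates.

```python
def scale_interaction(per_sibling, factor=2.0):
    """Scale a per-sibling interaction estimate (and its interval bounds)."""
    return tuple(factor * x for x in per_sibling)


# Reported per-sibling interaction: estimate, lower and upper 95% C.I. bounds (rounded).
interaction = (0.012, 0.006, 0.018)
sibling_ige = scale_interaction(interaction)

# Reported full-sample PGS association (years of schooling per SD of PGS, rounded).
full_sample_beta = 0.23
share = tuple(x / full_sample_beta for x in sibling_ige)

print("sibling IGE (years per SD):", [round(x, 3) for x in sibling_ige])
print("share of full-sample association:", [f"{x:.0%}" for x in share])
```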

Phenotypic differences between singletons and non-singletons

As discussed in the Methods, comparisons of genetic association estimates between singletons and non-singletons may be sensitive to group-level differences. We compared the sex, age, height, BMI, measured educational attainment and educational attainment PGS of singletons and non-singletons. Singletons were more likely to be male (+1.0%; 95% C.I. 0.6%, 1.5%), older (+2.6 years; 95% C.I. 2.6, 2.7) and were born further south (5.2 km; 95% C.I. 3.6, 6.8) and east (3.5 km; 95% C.I. 2.7, 4.3). After adjusting for age and sex, we found evidence that singletons are taller (+0.15 cm; 95% C.I. 0.09, 0.21), have higher BMI (+0.06 kg/m^2; 95% C.I. 0.01, 0.10) and have more years in full-time education (+0.25 years; 95% C.I. 0.23, 0.27). However, we found no strong evidence of differences between singletons and non-singletons for educational attainment PGS (-0.002 SD difference; 95% C.I. -0.011, 0.008) (Tables C and D in ). Similar differences were also observed between singletons and firstborn non-singletons for age and BMI but, in contrast, singletons were shorter (-0.23 cm; 95% C.I. -0.31, -0.14) and had fewer years in full-time education (-0.09 years; 95% C.I. -0.12, -0.06) than firstborns (Table E in ). These findings illustrate differences between singletons and non-singletons, which could relate to parental differences (e.g. education) but also birth order effects or factors influencing study participation. However, we note that group-level differences by family size are unlikely to confound our genetic association analyses, with the exception of ancestral differences, and that some of the observed differences are relatively modest (e.g. 5.2 km difference in birth coordinates).

Discussion

Here, we proposed that differences in genetic association estimates between singletons and non-singletons can be used to evaluate indirect genetic effects of siblings. Using UK Biobank data, we found that the association between the educational attainment PGS and educational attainment was larger for non-singletons than singletons. This difference was driven by individuals with older siblings rather than firstborns. We found that the relationship between number of siblings and educational attainment PGS associations was non-linear, with PGS associations attenuating substantially in larger families with more than 6 children. After removing these families, we found strong evidence for a linear relationship between number of siblings and the educational attainment PGS associations. These findings are suggestive of indirect genetic effects of siblings; older siblings influence the education of younger siblings. There are alternative explanations for our findings. First, group-level differences between singletons and non-singletons, such as for parental education [21] or health outcomes [22-24], could have led to differences in sources of genetic association [25]. For example, direct and indirect genetic effects on educational attainment could vary by socio-economic position [26-28] or by other covariates. Indeed, we observed differences between singletons and non-singletons for age, sex, and several phenotypes, consistent with parental differences, birth order effects or selection bias although we did not observe differences for the educational attainment PGS. Second, our results could have been impacted by collider bias [25,29], because stratifying on number of siblings (which is non-random) could distort associations between factors which influence number of siblings. Third, indirect genetic effects of parents may be stronger in singletons because of additional parental investment with fewer offspring. 
Similarly, birth order may influence the magnitude of direct genetic effects and how an individual is affected by their parents [30,31]. However, differences in indirect genetic effects and birth order are unlikely to explain the observed differences in PGS associations for educational attainment. Previous literature [32] suggests that singletons and firstborns are more likely to receive additional parental investment, which would likely result in larger indirect genetic effects of parents on educational attainment and a larger PGS association estimate. Inconsistent with this explanation, we observed larger PGS estimates in non-singletons and non-firstborns. A more plausible explanation for the difference in association of education and the educational attainment PGS between firstborns and non-firstborns is that older siblings influence the educational decisions of younger siblings, whereas an individual’s decision to go to university is less likely to be strongly influenced by younger siblings. Previous research has shown that PGS-phenotype associations can differ across ancestry groups as well as by other phenotypes such as socio-economic position, age and sex [27,33]. Indeed, it has been previously demonstrated that educational attainment PGS more strongly predict educational attainment in individuals with one sibling compared to individuals with no siblings, although these findings were not interpreted with respect to sibling IGEs or birth order [27]. As a sensitivity analysis, we performed analyses using an educational attainment PGS weighted by within-sibship GWAS estimates [20], which are robust against population stratification, assortative mating and indirect genetic effects of parents but not indirect genetic effects of siblings. Here we found consistent evidence that PGS association estimates are larger in non-singletons.
This suggests that our results are unlikely to be explained by effects of population stratification, assortative mating and indirect genetic effects on the PGS weightings. However, these mechanisms could still affect the association between the PGS and educational attainment differently between singletons and non-singletons because analyses in unrelated individuals do not account for variance in parental genotypes. Molecular genetic analyses of indirect genetic effects of relatives [1,4,11,12,14] have generally found evidence of imitation rather than contrast effects [8,9,34], i.e. effects of parental genotypes on the shared environment result in children being more similar to their parents. Our results are consistent with imitation effects of siblings for educational attainment as they suggest stronger rather than weaker gene-environment correlations for individuals with older siblings. For example, this could occur if an older sibling going to university increases the probability that their younger sibling will also go to university. Contrastingly, there could be more subtle effects of an individual’s behaviour on the shared family environment. For example, an individual with a higher education PGS may help younger siblings more with their homework. A previous meta-analysis estimated that parental indirect genetic effects on educational attainment are around half of the magnitude of direct genetic effects of inherited variants [14]. We were unable to estimate direct genetic effects in this study but estimated that the indirect genetic effects of one sibling on education are around a tenth of the total genetic association estimate suggesting that sibling indirect genetic effects on education are likely to be substantially smaller than parental effects. However, there are complexities specific to interpreting estimates of indirect genetic effects of siblings. 
First, our estimate was derived using all non-singletons, but birth order is likely to affect the magnitude of indirect genetic effects of siblings. For example, supported by our findings, sibling indirect genetic effects could be much larger for non-firstborns. Second, we assumed a linear additive model with each additional sibling increasing the combined magnitude of the sibling indirect genetic effects. However, we observed an attenuated genetic association estimate in larger families, suggesting that the relationship between number of siblings and genetic associations may be non-linear. The genetic association attenuation in larger families could also relate to differences in indirect genetic effects of parents or confounding. In contrast to our findings, previous studies using family-based approaches have reported limited evidence of sibling IGEs [12,13,18]. This could have been because of differences in statistical power or because of genuine heterogeneity. We compared our sibling IGE estimates to the estimates from one of these manuscripts (Kong et al [13]) and found limited evidence of heterogeneity. Our sibling IGE estimate, which was more precisely estimated, was consistent with the estimate and 95% confidence interval from Kong et al [13]. This suggests that the differences in conclusions are likely to relate to differences in statistical power. Our findings add to the growing evidence for social effects of germline genetic variation. The main limitation of our approach is that it is sensitive to systematic differences between singletons and non-singletons. For example, interactions between the PGS and covariates could have biased our estimates of sibling indirect genetic effects. An additional limitation is that (beyond including birth year as a covariate) we did not account for possible generational effects (i.e. effects of the PGS changing over time) which could induce bias in combination with changes in family size over time.
Further caveats are that our analyses may have been affected by selection bias relating to non-random participation in UK Biobank [29,35], that we did not account for possible indirect genetic effects of half siblings because these data were not available, and that we assumed random mating, although assortative mating is a likely source of bias. We also did not evaluate whether sibling indirect genetic effects vary by the sex of the index individual and their siblings (e.g. same-sex sibling pairs may be more likely to influence one another). Larger datasets of first-degree relatives will enable more precise estimation of sibling indirect genetic effects and allow the evaluation of sibling effects on a wider range of phenotypes such as smoking behaviour and alcohol consumption.

Methods

Ethics statement

UK Biobank received ethical approval from the Research Ethics Committee (11/NW/0383). Access to UK Biobank data was granted as part of application 8786 (PI: NMD).

Evaluating indirect genetic effects of siblings using singletons

There are several different mechanisms which can induce covariance between an individual’s genotype $G_I$ and their trait value $Y$ ($\mathrm{Cov}[G_I, Y]$):

1) Direct genetic effects, which we define as the effect of inheriting $G_I$ on $Y$. For simplicity, we assume that the genetic variant is causal even though variants identified in GWAS are usually in linkage disequilibrium with the true causal variants.
2) Indirect genetic effects of the maternal ($G_M$), paternal ($G_P$) and sibling ($G_S$) genotype on $Y$ via the shared environment. These effects are partially captured by $\mathrm{Cov}[G_I, Y]$ because an individual’s genotype is correlated with the genotype of their relatives (i.e. indirect genetic effects of relatives induce gene-environment correlation). Note that effects of other relatives (e.g. grandparents) could also be captured by $\mathrm{Cov}[G_I, Y]$ and the formulation below could be extended to model these effects.
3) Effects of unmeasured confounders ($C$) which influence both $G_I$ and $Y$ (e.g. population stratification, assortative mating).

We consider how indirect genetic effects of siblings would affect the covariance in non-singletons ($\mathrm{Cov}_{NS}[G_I, Y]$), who have siblings. We model $Y$ as a function of genotypes ($G$) and an error term $\epsilon$ (with mean 0). We assume a single autosomal genetic variant, additive genetic effects only, random mating, no confounding between $G_I$ and $Y$ and that non-singletons have one sibling each:

$$Y = k_D G_I + k_M G_M + k_P G_P + k_S G_S + \epsilon$$

where $k_D$, $k_M$, $k_P$ and $k_S$ are the (direct/indirect) effects of $G$ on $Y$. Under random mating the genotype of each first-degree relative has covariance $\mathrm{Var}(G)/2$ with $G_I$, so it follows that the population covariance of individuals with siblings is:

$$\mathrm{Cov}_{NS}[G_I, Y] = \mathrm{Var}(G)\left(k_D + \frac{k_M + k_P}{2} + \frac{k_S}{2}\right)$$

Further detail on the derivation of this covariance term is contained in . In contrast, in singletons, where indirect genetic effects from siblings are not possible, $k_S$ would equal zero and the population covariance ($\mathrm{Cov}_{S}[G_I, Y]$) would simplify to:

$$\mathrm{Cov}_{S}[G_I, Y] = \mathrm{Var}(G)\left(k_D + \frac{k_M + k_P}{2}\right)$$

The ordinary least squares (OLS) regression estimates of $Y$ on $G_I$ will capture indirect genetic effects of siblings. It follows that the expected OLS non-singleton ($\beta_{NS}$) and singleton ($\beta_{S}$) regression coefficients are:

$$\beta_{NS} = k_D + \frac{k_M + k_P}{2} + \frac{k_S}{2}, \qquad \beta_{S} = k_D + \frac{k_M + k_P}{2}$$

As above, we assumed that each index individual had only one sibling. However, the model could be extended to account for different numbers of siblings ($N$). For example, assuming a linear relationship:

$$\beta_{NS} = k_D + \frac{k_M + k_P}{2} + \frac{N k_S}{2}$$

Dependent on certain assumptions, twice the difference between $\beta_{NS}$ and $\beta_{S}$ can be used to provide an unbiased estimate of indirect genetic effects of siblings.

Assumption 1: No heterogeneity in direct effects ($k_D$) between non-singletons and singletons. If this assumption does not hold then the difference estimator will be biased by the difference in direct effects ($k_D^{NS} - k_D^{S}$).

Assumption 2: No heterogeneity in maternal ($k_M$) or paternal ($k_P$) indirect effects between non-singletons and singletons. If this assumption does not hold then the difference estimator will be biased by half of the differences in maternal and paternal indirect effects ($k_M^{NS} - k_M^{S}$, $k_P^{NS} - k_P^{S}$).

Assumption 3: No heterogeneity in confounders of $G_I$ and $Y$ between non-singletons and singletons. If this assumption does not hold then the difference estimator will be biased by the confounder $C$, which by definition must influence both $G_I$ and $Y$.

If these assumptions hold, then twice the difference between the two regression coefficients will be an unbiased estimator of the indirect genetic effects of siblings.

We note that group-level differences between singletons and non-singletons could lead to violations of the three assumptions. Indeed, there are well-observed group-level differences between singletons and non-singletons. For example, higher education is associated with having fewer children, so parents of singletons are likely to be more educated. However, group-level differences will not necessarily lead to differences in genetic association estimates. Genetic differences (e.g. mean or variance) between singletons and non-singletons combined with non-linear effects or interactions could lead to violation of all three assumptions. The second assumption would be violated if indirect genetic effects of parents are stronger (or weaker) for singletons because of differences in parental investment. The third assumption could be violated if there are differences in assortative mating and population structure between singletons and non-singletons.

A further caveat is that stratifying on a non-random variable (e.g. number of siblings) could distort associations between determinants of $Y$ (collider bias) [29]. In this context, collider bias is unlikely to have a large effect on genetic associations. Although number of siblings is non-random, it is unlikely to be directly influenced by $G_I$ or $Y$. However, collider bias could distort estimates of $\beta_{NS}$ and $\beta_{S}$ via paths involving parental characteristics that influence the number of siblings of the index individual. We note that if the effects of collider bias on $\beta_{NS}$ and $\beta_{S}$ are consistent, then this bias would cancel out in the difference.
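The effect of violating the second assumption can be illustrated numerically. The sketch below assumes the linear additive model with random mating, so that each first-degree relative's genotype has correlation 0.5 with the index individual's genotype. The function names and all effect sizes are hypothetical values chosen for illustration.

```python
def expected_slope(k_direct, k_maternal, k_paternal, k_sibling=0.0,
                   n_siblings=0, r_relative=0.5):
    """Expected OLS slope of Y on own genotype under the linear additive model.

    r_relative is the genotype correlation between first-degree relatives
    (0.5 under random mating).
    """
    parental = r_relative * (k_maternal + k_paternal)
    sibling = r_relative * n_siblings * k_sibling
    return k_direct + parental + sibling


def sibling_ige_estimate(beta_nonsingleton, beta_singleton):
    """Twice the difference between non-singleton and singleton slopes."""
    return 2.0 * (beta_nonsingleton - beta_singleton)


# Case 1: all assumptions hold (hypothetical effect sizes).
beta_ns = expected_slope(0.3, 0.1, 0.1, k_sibling=0.05, n_siblings=1)
beta_s = expected_slope(0.3, 0.1, 0.1)
est_ok = sibling_ige_estimate(beta_ns, beta_s)

# Case 2: parental indirect effects are stronger in singletons (0.12 vs 0.10),
# which violates Assumption 2 and biases the estimator downwards.
beta_s_violated = expected_slope(0.3, 0.12, 0.12)
est_biased = sibling_ige_estimate(beta_ns, beta_s_violated)

print(f"assumptions hold:     {est_ok:.3f} (true k_S = 0.05)")
print(f"assumption 2 violated: {est_biased:.3f}")
```

In the second case the doubled difference is off by the summed differences in parental effects, which corresponds to the simple difference being biased by half of those differences.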

Simulations

To explore properties of the proposed framework we performed simulations. First, we confirmed the theory from above that sibling indirect genetic effects will lead to differences in genetic associations between singletons and non-singletons in simulated data. Second, we evaluated how non-random mating affects parameter estimates from the framework. Code for simulations is available in simulations.R on the main branch of the LaurenceHowe/SiblingIGE repository on GitHub.

Model 1 –Random mating

In Model 1, we simulated 100,000 parent-offspring trios with singleton offspring and 100,000 parent-offspring quads with two non-singleton offspring (i.e. a sibling pair).

a) Parent-offspring trios

We simulated a normally distributed PGS for each individual using the following variance-covariance matrix (rows and columns ordered M, P, O) to characterise within-family correlations for maternal ($PGS_M$), paternal ($PGS_P$) and offspring ($PGS_O$) PGS under random mating.

$$\Sigma_{\text{trio}} = \begin{pmatrix} 1 & 0 & 0.5 \\ 0 & 1 & 0.5 \\ 0.5 & 0.5 & 1 \end{pmatrix}$$

The offspring phenotype $Y$ was simulated as a function of the direct effect of $PGS_O$, indirect effects of $PGS_M$ and $PGS_P$, effects of confounding $C$ and a normally distributed error term ($\epsilon \sim N(0, 1)$), with coefficients $k$ denoting the (direct/indirect) effects of each PGS on $Y$.

b) Parent-offspring quads

We simulated a normally distributed PGS as above for the parent-offspring trios but with two offspring (O1, O2) using the following variance-covariance matrix (rows and columns ordered M, P, O1, O2).

$$\Sigma_{\text{quad}} = \begin{pmatrix} 1 & 0 & 0.5 & 0.5 \\ 0 & 1 & 0.5 & 0.5 \\ 0.5 & 0.5 & 1 & 0.5 \\ 0.5 & 0.5 & 0.5 & 1 \end{pmatrix}$$

The phenotype of O1 was simulated as above but including an additional effect of the PGS of their sibling O2.

We then computed the OLS regression estimates of an individual’s own PGS on their phenotype, i.e. the offspring PGS on $Y$ in the singleton sample and the PGS of O1 on the phenotype of O1 in the non-singleton sample. Simulations confirmed that the OLS estimates in the singleton and non-singleton samples match the theoretical expectations given above. The simulations also confirmed the expectations of bias relating to violation of the assumptions and that the difference estimator provides an unbiased estimate of sibling indirect genetic effects if all the assumptions hold.
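A minimal sketch of Model 1 is given below. It is not the authors' R code: it is a Python re-implementation under stated assumptions, namely unit-variance PGS, offspring PGS built as the mid-parent value plus independent segregation noise (which reproduces the 0.5 parent-offspring and sibling correlations), no confounder term, smaller samples than the 100,000 families used in the paper, and hypothetical effect sizes.

```python
import math
import random


def ols_slope(x, y):
    """Slope from a simple regression of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulate_slope(n_families, with_sibling, rng,
                   k_d=0.3, k_m=0.1, k_p=0.1, k_s=0.1):
    """Simulate families under random mating and return the slope of Y on own PGS."""
    own_pgs, phenotype = [], []
    noise_sd = math.sqrt(0.5)  # keeps the offspring PGS variance at 1
    for _ in range(n_families):
        mother, father = rng.gauss(0, 1), rng.gauss(0, 1)
        midparent = 0.5 * (mother + father)
        index_pgs = midparent + noise_sd * rng.gauss(0, 1)
        y = k_d * index_pgs + k_m * mother + k_p * father + rng.gauss(0, 1)
        if with_sibling:
            sibling_pgs = midparent + noise_sd * rng.gauss(0, 1)
            y += k_s * sibling_pgs  # indirect genetic effect of the sibling
        own_pgs.append(index_pgs)
        phenotype.append(y)
    return ols_slope(own_pgs, phenotype)


rng = random.Random(2022)
beta_s = simulate_slope(50_000, with_sibling=False, rng=rng)
beta_ns = simulate_slope(50_000, with_sibling=True, rng=rng)
k_s_hat = 2 * (beta_ns - beta_s)

print(f"singleton slope     = {beta_s:.3f} (expected 0.40)")
print(f"non-singleton slope = {beta_ns:.3f} (expected 0.45)")
print(f"2 x difference      = {k_s_hat:.3f} (simulated k_S = 0.10)")
```

The doubled difference is noisy because it combines the sampling error of two slopes, which is why large samples are needed to detect small sibling effects.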

Model 2 –Non-random mating

In Model 2, we extended Model 1 by modifying the variance-covariance matrix of the within-family PGS to reflect assortative mating. This involved increasing the maternal-paternal correlations from 0 to 0.2 and the parent-offspring and sibling correlations from 0.5 to 0.6. Phenotypes were simulated as in Model 1.

a) Parent-offspring trios

$$\Sigma_{\text{trio}} = \begin{pmatrix} 1 & 0.2 & 0.6 \\ 0.2 & 1 & 0.6 \\ 0.6 & 0.6 & 1 \end{pmatrix}$$

b) Parent-offspring quads

$$\Sigma_{\text{quad}} = \begin{pmatrix} 1 & 0.2 & 0.6 & 0.6 \\ 0.2 & 1 & 0.6 & 0.6 \\ 0.6 & 0.6 & 1 & 0.6 \\ 0.6 & 0.6 & 0.6 & 1 \end{pmatrix}$$

In Model 2, the OLS regression estimates in the singleton and non-singleton samples were slightly higher than in Model 1 because of the increased covariance of the PGS within families. Notably, the difference between the non-singleton and singleton estimates was also inflated over expectation, suggesting that assortative mating, if not accounted for, could lead to overestimates of sibling indirect genetic effects from the proposed framework.
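The size of the inflation under Model 2 can be worked out directly from the correlations quoted above. The sketch below assumes a simplified construction in which the offspring PGS is the mid-parent value plus independent noise with all variances held at 1; this reproduces the 0.6 correlations for a spousal correlation of 0.2 but is not a full model of assortative mating. Function names are hypothetical.

```python
def within_family_correlations(mate_corr):
    """Parent-offspring and sibling PGS correlations for a given spousal correlation.

    With offspring = 0.5 * (mother + father) + noise and unit variances,
    both correlations equal 0.5 * (1 + mate_corr).
    """
    parent_offspring = 0.5 * (1.0 + mate_corr)
    sibling = 0.5 * (1.0 + mate_corr)
    return parent_offspring, sibling


def doubled_difference(k_sibling, sibling_corr):
    """Expected value of 2 * (beta_NS - beta_S) with one sibling per non-singleton."""
    return 2.0 * sibling_corr * k_sibling


r_po_random, r_sib_random = within_family_correlations(0.0)
r_po_assort, r_sib_assort = within_family_correlations(0.2)

k_s = 0.1  # hypothetical sibling indirect effect
print("random mating:     ", doubled_difference(k_s, r_sib_random))
print("assortative mating:", doubled_difference(k_s, r_sib_assort))
```

Under these assumptions the doubled difference overstates the sibling effect by a factor of 1.2 when the spousal correlation is 0.2, in line with the direction of bias reported for Model 2.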

UK Biobank

UK Biobank is a large prospective cohort study of 503,325 individuals, aged between 38–73 years at baseline, who were recruited between 2006 and 2010 from across the United Kingdom. UK Biobank study participants were genotyped, completed questionnaire data at baseline and have linked records with secondary care data and other health registries. The cohort has been described in detail in previous publications, including information on genotyping [19,36]. In this study, we used UK Biobank genetic data and phenotype data (height, BMI, educational attainment, number of siblings, adoption status and birth order). Height of study participants was measured at baseline using a Seca 202 device at the assessment centre (field ID: 12144–0.0). BMI was derived manually using measures of standing height and weight (field ID: 21001.0.0). Educational attainment was defined using qualification data, as in a previous study [37]. Questionnaire data included information on the highest level of educational attainment which was then used to estimate the number of years spent in full-time education (field ID: 6138). Individuals were asked the number of full brothers (field ID: 1873), full sisters (field ID: 1883) and older siblings (field ID: 5057) in the baseline questionnaire. Individuals were also asked if they were adopted as a child (field ID: 1767). North-south (field ID: 129) and East-West (field ID: 130) birth coordinates were derived from questionnaire data on place of birth. Starting with the full UK Biobank dataset with genetic data, we restricted analyses to individuals of recent European descent based on a k-means cluster analysis on the first 4 genetic principal components and also removed closely-related individuals and standard exclusions (e.g. sex mismatch). More information on the internal quality control of UK Biobank data is contained in a previous document [38]. This dataset included 385,222 individuals. 
For the purposes of our analyses, we then removed 6,025 individuals who reported being adopted as a child. We defined singletons (N = 50,143) as individuals who self-reported having both no full brothers and no full sisters. We defined non-singletons (N = 328,549) as individuals who self-reported having 1 or more full brothers or 1 or more full sisters. Individuals who self-reported zero brothers or sisters to one question but did not answer the other question were set to missing. Data on birth order was available in a subset of the cohort (N = 110,326). Firstborns (N = 43,733) were defined as individuals who self-reported no older siblings while non-firstborns (N = 66,593) reported one or more older siblings.
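The group definitions above can be sketched as a small classification routine. This is an illustration of the stated rules rather than the code used in the study. The function names are hypothetical, and `None` stands for an unanswered question; any special non-response codes in the raw data are assumed to have been recoded to `None` beforehand.

```python
from typing import Optional


def sibling_status(n_brothers: Optional[int], n_sisters: Optional[int]) -> Optional[str]:
    """Classify a participant from self-reported counts of full brothers and sisters."""
    # One or more full brothers or one or more full sisters: non-singleton.
    if (n_brothers or 0) >= 1 or (n_sisters or 0) >= 1:
        return "non-singleton"
    # A singleton must have answered zero to both questions.
    if n_brothers == 0 and n_sisters == 0:
        return "singleton"
    # Zero to one question with the other unanswered, or both unanswered: missing.
    return None


def birth_order(n_older_siblings: Optional[int]) -> Optional[str]:
    """Classify a participant as firstborn or non-firstborn."""
    if n_older_siblings is None:
        return None
    return "firstborn" if n_older_siblings == 0 else "non-firstborn"


print(sibling_status(0, 0), sibling_status(2, 0), sibling_status(0, None))
print(birth_order(0), birth_order(3))
```

A participant who reports at least one brother but leaves the sisters question unanswered is treated here as a non-singleton, since one positive answer is enough to satisfy the definition.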

PGS construction and association analyses

We constructed PGS for height, BMI and educational attainment by LD clumping summary data from previous GWAS (P < 1.0 × 10⁻⁵, r² < 0.001, clumping distance = 10,000 kb). We used data from GWAS independent of UK Biobank for height [39], BMI [40] and educational attainment [37] to minimise sample overlap. For educational attainment, we used the summary data from the discovery sample which included 23andMe. As a sensitivity analysis, we also constructed educational attainment PGS using weightings from a recent within-sibship GWAS of educational attainment [20], i.e. we included the same variants in the PGS but used the beta values from the within-sibship GWAS. This approach would reduce effects of population stratification, assortative mating and indirect genetic effects on the PGS itself. However, these sources of genetic association could still impact the association between the within-sibship PGS and educational attainment. We estimated associations between PGS and the relevant phenotype (e.g. height PGS and height) in the singleton, non-singleton, firstborn and non-firstborn samples separately, adjusting for sex, birth year and the first 10 ancestry-informative principal components. We estimated the % attenuation between the non-singleton and singleton PGS estimates using the difference of two means approach (based on the delta method) [41] and then rescaled the difference term and confidence interval as a ratio term (i.e. % attenuation = difference/non-singleton estimate). The estimates were derived in non-overlapping samples, so we assumed zero covariance. To evaluate a linear relationship between number of siblings and the educational attainment PGS estimate, we fitted a regression model including an interaction term (number of siblings multiplied by the PGS) to estimate the effect of each additional sibling on the PGS association estimate.
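The difference of two means calculation and its rescaling to a ratio can be sketched as follows. This is a minimal illustration under the zero-covariance assumption stated above. The function name is hypothetical and the input values are made up for the example; they are not estimates from this study. The denominator is treated as fixed when rescaling the interval, which follows the description above but ignores uncertainty in the non-singleton estimate.

```python
import math


def percent_attenuation(b_ns, se_ns, b_s, se_s, z=1.96):
    """Difference between two independent estimates, rescaled by the
    non-singleton estimate. Returns (ratio, lower, upper) as proportions."""
    diff = b_ns - b_s
    # Zero covariance: the two estimates come from non-overlapping samples.
    se_diff = math.sqrt(se_ns ** 2 + se_s ** 2)
    lower = diff - z * se_diff
    upper = diff + z * se_diff
    # Rescale the difference and its interval by the non-singleton estimate.
    return diff / b_ns, lower / b_ns, upper / b_ns


# Hypothetical inputs chosen only to illustrate the arithmetic.
ratio, ci_low, ci_high = percent_attenuation(b_ns=0.20, se_ns=0.003, b_s=0.18, se_s=0.004)
print(f"{100 * ratio:.1f}% (95% C.I. {100 * ci_low:.1f}%, {100 * ci_high:.1f}%)")
```

The same function applies to any pair of estimates from non-overlapping groups, for example individuals with 6 or more siblings compared with the rest of the sample.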
In the first instance, we used the full sample of individuals with complete data on education and number of siblings. In this model, PGS denotes the educational attainment PGS and N the number of siblings. To evaluate a non-linear relationship, we estimated associations between educational attainment PGS and educational attainment separately in individuals with differing numbers of siblings (0, 1, 2, 3, 4, 5, 6 or more). We compared PGS estimates between individuals with 6 or more siblings and the rest of the sample using the difference of two means approach. We extended the linear model with a quadratic term including the number of siblings squared; PGS and N are defined as before. We also repeated the linear relationship regression model after removing 18,746 outlier individuals with 6 or more siblings (4.9% of the sample). We multiplied the regression coefficient for the interaction term by two to generate an estimate of the indirect genetic effect of one sibling.

We explored group-level differences between singletons and non-singletons for sex, birth year, birth coordinates (north-south, east-west), height, BMI, educational attainment and an educational attainment PGS. First, we extracted information on the mean and standard deviations for each of these measures for singletons, non-singletons, firstborns and non-firstborns. Second, we tested for differences in sex, birth year and birth coordinates (north-south, east-west) using the difference of two means approach (based on the delta method) [41]. Third, we used linear regression, adjusting for sex and age, with a group indicator coded either (singleton 0, non-singleton 1) or (firstborn 0, non-firstborn 1), to investigate group-level differences for height, BMI, educational attainment and the educational attainment PGS.
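The interaction models described in this section can be written out as below. The coefficient symbols, the covariate vector C and the exact form of the quadratic extension are notational assumptions made for this sketch and may differ from the equations in the original manuscript. In particular, it is assumed that the squared number of siblings enters both as a main effect and in interaction with the PGS.

```latex
% Linear interaction model (symbols are assumed notation)
\[
EA = \beta_0 + \beta_1\,PGS + \beta_2\,N + \beta_3\,(PGS \times N) + \gamma^{\top} C + \varepsilon
\]

% Quadratic extension (one possible reading of "a quadratic term")
\[
EA = \beta_0 + \beta_1\,PGS + \beta_2\,N + \beta_3\,(PGS \times N)
   + \beta_4\,N^2 + \beta_5\,(PGS \times N^2) + \gamma^{\top} C + \varepsilon
\]

% Rescaling the interaction coefficient to a per-sibling indirect effect
\[
\hat{k}_{\mathrm{sib}} = 2\,\hat{\beta}_3
\]
```

The factor of two reflects the expected PGS correlation of 0.5 between full siblings under random mating (the value used in Model 1): if each additional sibling contributes half of its indirect effect to the covariance between the index individual's PGS and the phenotype, the interaction coefficient estimates half of the per-sibling effect.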

Supplementary Materials.

Table A in S1 Text. PGS association estimates for singletons and non-singletons. Table displays association estimates between height, BMI and educational attainment PGS with the same phenotype in singletons and non-singletons.
Table B in S1 Text. PGS association estimates for firstborns and non-firstborns. Table displays association estimates between height, BMI and educational attainment PGS with the same phenotype in firstborns and non-firstborns.
Table C in S1 Text. Characteristics of singletons, non-singletons, firstborns and non-firstborns in UK Biobank. Table contains descriptives of group-level characteristics of singletons, non-singletons, firstborns and non-firstborns.
Table D in S1 Text. Differences between singletons and non-singletons. Table contains estimates of differences in group-level characteristics between singletons and non-singletons.
Table E in S1 Text. Differences between firstborns and non-firstborns. Table contains estimates of differences in group-level characteristics between firstborns and non-firstborns.
(DOCX)

2 Sep 2021

Dear Dr Howe,

Thank you very much for submitting your Research Article entitled 'Evaluating indirect genetic effects of siblings using singletons' to PLOS Genetics. The manuscript was fully evaluated at the editorial level and by independent peer reviewers. The reviewers appreciated the attention to an important problem, but raised some substantial concerns about the current manuscript. Based on the reviews, we will not be able to accept this version of the manuscript, but we would be willing to review a much-revised version. We cannot, of course, promise publication at that time. Should you decide to revise the manuscript for further consideration here, your revisions should address the specific points made by each reviewer. We will also require a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript.
If you decide to revise the manuscript for further consideration at PLOS Genetics, please aim to resubmit within the next 60 days, unless it will take extra time to address the concerns of the reviewers, in which case we would appreciate an expected resubmission date by email to plosgenetics@plos.org. If present, accompanying reviewer attachments are included with this email; please notify the journal office if any appear to be missing. They will also be available for download from the link below. You can use this link to log into the system when you are ready to submit a revised version, having first consulted our Submission Checklist. To enhance the reproducibility of your results, we recommend that you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols Please be aware that our data availability policy requires that all numerical data underlying graphs or summary statistics are included with the submission, and you will need to provide this upon resubmission if not already present. In addition, we do not permit the inclusion of phrases such as "data not shown" or "unpublished results" in manuscripts. All points should be backed up by data provided with the submission. While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. 
If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org. PLOS has incorporated Similarity Check, powered by iThenticate, into its journal-wide submission system in order to screen submitted content for originality before publication. Each PLOS journal undertakes screening on a proportion of submitted articles. You will be contacted if needed following the screening process. To resubmit, use the link below and 'Revise Submission' in the 'Submissions Needing Revision' folder. [LINK] We are sorry that we cannot be more positive about your manuscript at this stage. Please do not hesitate to contact us if you have any concerns or questions.

Yours sincerely,
Heather J Cordell
Associate Editor
PLOS Genetics
Gregory Barsh
Editor-in-Chief
PLOS Genetics

Reviewer's Responses to Questions

Comments to the Authors: Please note here if the review is uploaded as an attachment.

Reviewer #1: Review is uploaded as an attachment

Reviewer #2: This paper focuses on a topic that has attracted a lot of attention, i.e. the estimation of indirect effects in within-family designs. The paper focuses on a specific type of indirect effects, i.e. indirect effects that originate in siblings rather than parents. It exploits data from both siblings and singletons to estimate such effects and provides a valuable addition to the field.

General comments

The authors justify their approach by a gain in power from including singletons, which seems fairly intuitive. However, because the estimation methods seem different (here test for a difference in association strength) is it worth justifying this point further either by developing the argumentation of why there is a gain in power (compared to which design) or by simulations? Is it possible to include in the article the estimates of sibling effects from other designs (e.g. in adopted siblings or any comparable design using polygenic scores?)
to verify whether effects are consistent with the approach used by the authors? At present, we have a suggested method but no benchmark, either by simulation or by other designs, showing whether this approach is reliable. What about sex of the sibling? Does it make a difference whether the sibling is same-sex or different-sex? If the authors interpret the findings as sibling influences on educational decisions, we could imagine that having a same-sex sibling going to uni would make more of a difference?

Comment on specific sections

Data from twin design etc. are not only usable “in principle”, they can and are used to estimate these effects. It is fair to say that these approaches are sensitive to modelling assumptions and measurement error, but so is the molecular genetic approach. So maybe present this as a complement to triangulate findings rather than an alternative. I know that it is common in the UKB to adjust for 10 PCs. The Kong paper adjusted for many more + interactions with birth year. Can the authors justify their choice? It is interesting that the authors used the results from a within-sibship GWAS as sensitivity analysis. Please provide estimates using within-sibship GWAS weights in singletons and sibling families in addition to the attenuation. I think that the sentence “However, associations between this PGS and educational attainment in our sample (of unrelated individuals) may still be affected by these sources of association” requires a bit more explanation. The reader may assume that the point of within-sibship GWAS is to obtain SNP weights that are unaffected by these issues even when implemented in singleton samples (e.g. a SNP whose effect is entirely explained by pop strat or indirect effect would be given a weight of 0 and thus not enter the composition of the polygenic score). (Reading the discussion, it appears I misunderstood this first sentence as the authors do use the within-sibship GWAS to exclude these sources of bias as expected.
Maybe the sentence can be clarified and developed to avoid misunderstanding. In particular, it may be good to mention more explicitly that the within-sibship GWAS controls for parental indirect effects but does still retain signal from sibling indirect effects. If it did not then we would expect findings to be radically different).

PGS associations and birth order

Sentence: “suggesting that the larger PGS association in non-singletons is due to individuals with older siblings, rather than firstborns” makes sense but Figure 3 shows that PGS association is higher even in firstborns compared to non-singletons, so technically both would contribute to larger association in the larger non-singleton sample. Maybe worth considering/discussing why firstborn and non-firstborn associations are higher than in the rest of the sample (e.g. selection effects)?

“we found limited evidence for an interaction (P = 0.24)” I know that P values are not the best indicator, but this is what the authors are using here, and a P value of 0.24 is “no evidence” rather than “limited”.

The whole approach relies on comparing the strengths of association in families with singletons vs non-singletons. And the difference is attributed to indirect genetic effects. This assumes that those families do not differ by any other characteristics, which may be a stretch. Same applies to the finding that the association is non-linear and different in families with 6+ children. There must be plenty of other factors that distinguish fairly rare families nowadays (6+ children) from others (e.g. religion, income available for each child). To what extent do these possible systematic differences affect the findings? The authors mention “lower investment or other confounders” in passing but not how these could affect estimates. (This is discussed more in the discussion section but it may be worth flagging earlier in the intro or the results that this is an issue discussed at more length in the methods and discussion.)
Looking at Figure 4 makes me question the decision to fit a linear effect after removing 6 or more siblings, which seems post hoc and arbitrary. There seems to be a flattening already from 3 siblings. I think it is worth first fitting a model including both a quadratic and a linear term for number of siblings to check whether there is an overall significant pattern and only conduct linear analyses as follow-up analysis, restricting those analyses to the linear part of the fitted curve (or better, use the linear term in the quadratic model to estimate interaction with PGS). My intuition would be that such a linear term accounting for the quadratic trend should be larger than the one that the authors estimated.

One question that may not be pertinent. The analysis seems to suggest that non-singletons and non-firstborns (with not too many siblings) benefit from education-increasing indirect genetic effects via their sibling. How does it translate at the phenotypic level? Does it only make a difference in variances or also a difference of means (i.e. having a sibling helps with schooling)? In an MR framework this association could be consistent with a causal effect of sibling education on the focal child’s education. I guess the effect would only increase the focal child’s education at the phenotypic level if the sibling had a higher polygenic load for education, and would decrease it if they had a lower polygenic load, leading to no expected differences in average between singletons and non-singletons. Is it possible to provide the descriptives and formally test the differences in both variances and means as well as discuss what we should expect and what we observe at the phenotypic level? Reading the rest of the results, I see that the authors provided some of the descriptives between singleton/non-singletons and firstborns and non-firstborns, including mean differences in education.
I think it would be worth also formally testing for differences of variance in education and discussing these findings in a systematic way, i.e. spell out earlier in the text rather than in the discussion whether we expect differences in means and variances or not, and if yes, in which direction. And discuss whether findings are consistent or in contradiction to what would be expected under the indirect effect model?

Methods

“Indirect genetic effects of the maternal, paternal and sibling genotype on Y via the shared environment. These effects are captured by Cov[G1, Y]” Maybe “partially” captured? Cov[G1,Y] only captures indirect effects of transmitted rather than untransmitted parental alleles? Linked to that, in the second equation we get 0.5Kns,m from KmGm in the first equation. Readers less familiar with these models may benefit from an explicit demonstration of this (i.e. it comes from cov(G1,GM) being 0.5) as well as a more detailed derivation (maybe as supplementary). Also, not sure I get why *Var[G1] applies to the whole expression in brackets rather than only the first term (i.e. cov(G1,k1G1) = k1Var[G1]), which I could understand with the full derivation. This is important as the expression for the beta depends on Var[G1] applying to the full expression (although not sure it matters practically as variances of polygenic scores are unity).

“We constructed PGS for height, BMI and educational attainment by LD Clumping (P < 1.0 × 10⁻⁵, r² < 0.001, clumping distance = 10000 kb).” This is a fairly basic way of computing the polygenic score. Why choose this p-value, any justification? Why not choose a more advanced method to account for LD like LDpred? This would have the benefit of increasing the accuracy of polygenic scores.

Reviewer #3: In this paper, the authors use a creative design to try and estimate the magnitude of the sibling indirect effects in a PGS analysis for educational attainment, BMI, and height.
Namely, they estimate the association of the PGS with its corresponding outcome in four subsamples of the UK Biobank: (a) those who report having no full siblings, (b) those who report having at least one sibling, (c) those who report having only younger siblings, and (d) those who report having at least one older sibling. Under the assumption that the direct and indirect genetic effects are the same across these different groups, a comparison of the PGS associations between these groups might shed light on whether a person's outcomes may be affected by the PGS of their sibling. While I liked the idea behind the design of this study, I find myself very concerned about the limitations of this approach. I really appreciated that many of these limitations are brought up in a careful way in the discussion and methods section, though a bit more discussion on how important these limitations are would be useful. I describe these concerns in more detail below:

1. The authors show in the methods section how a key assumption of their design is that the direct genetic effects and indirect genetic effects of parents are the same in the different groups they consider. They state that one piece of evidence in support of their assumptions would be if the groups were similar on other observable characteristics. However, when they make comparisons across these different groups, they find significant differences in means on a number of variables. So how likely is it that their key assumption holds? One thing they may consider is to take the variables where they detect significant differences (e.g., men vs women) and test whether the PGS has a different coefficient when you divide the sample that way. They may also consider other variables that are fixed early in life like birth coordinates.

2. The PGS the authors use are based on population-level GWAS that are blind to the family size of the individuals in the sample.
If there is GxE on family size and the GWAS sample has more individuals with at least one sibling, then we would expect the PGS to be more predictive in a sample of people with siblings, wouldn't we? Given that most families have at least two children, it seems sensible that this could be a problem.

3. I was uncomfortable with the authors' decision to drop estimates that correspond to families with 6 or more siblings. Once you acknowledge that environmental factors (like parental investment) may be influencing the patterns seen in the results, why couldn't it be that all the results are being driven by these factors?

4. In the methods section, they work out the math to infer how large the indirect effect from siblings is under the assumptions of their method. Eyeballing Figure 3, the difference between singletons and non-firstborns is about .07, suggesting the coefficient on the sibling PGS would be .14, almost as big as the association in singletons. Is this plausible?

5. Overall, I'd bring the assumptions and limitations much more front and center and expand the discussion of whether those assumptions are justified. And if they aren't justified, how large do you anticipate the biases would be as a result of the confounds?

**********

Have all data underlying the figures and results presented in the manuscript been provided? Large-scale datasets should be made available via a public repository as described in the PLOS Genetics data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: Yes
Reviewer #2: No
Reviewer #3: Yes

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public.
Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No
Reviewer #2: No
Reviewer #3: No

Submitted filename: PLOSgeneticsD2100943Review.docx

9 Dec 2021

Submitted filename: PLOSGenetics_Response_Dec2021.docx

12 Jan 2022

Dear Dr Howe,

Thank you very much for submitting your Research Article entitled 'Evaluating indirect genetic effects of siblings using singletons' to PLOS Genetics. The manuscript was fully evaluated at the editorial level and by independent peer reviewers. The reviewers appreciated the attention to an important problem, but one reviewer raised some substantial remaining concerns about the current version of the manuscript. Specifically, the reviewer points out that it is hard to disentangle your results from confounding due to gene-environment interactions. Given that there is existing evidence (albeit in a preprint) that goes against your results, the fear is that your findings could be largely explained by this confounding. Based on these reviews, we will not be able to accept this version of the manuscript, but we would be willing to review a much-revised version. We cannot, of course, promise publication at that time. Should you decide to revise the manuscript for further consideration here, your revisions should address the specific points made by the reviewer. We will also require a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript. If you decide to revise the manuscript for further consideration at PLOS Genetics, please aim to resubmit within the next 60 days, unless it will take extra time to address the concerns of the reviewers, in which case we would appreciate an expected resubmission date by email to plosgenetics@plos.org.
We are sorry that we cannot be more positive about your manuscript at this stage. Please do not hesitate to contact us if you have any concerns or questions.

Yours sincerely,
Heather J Cordell
Associate Editor
PLOS Genetics
Gregory Barsh
Editor-in-Chief
PLOS Genetics

Reviewer's Responses to Questions

Comments to the Authors: Please note here if the review is uploaded as an attachment.

Reviewer #1: The authors have thoroughly addressed my concerns. The revised manuscript is ready for publication in my opinion.

Reviewer #2: The authors carefully answered my concerns.

Reviewer #3: The authors have made a large number of changes to the manuscript, highlighting how violations of the assumptions of their model affect their estimates and arguing why they believe their results are valid. Unfortunately, I still find their arguments unsatisfying. I describe my two main reasons below.

1. In their response letter (and the revised manuscript), the authors claim, "Group-level differences will only induce confounding bias if they influence both genotype and the outcome which is not possible for sex, age, and many other phenotypes which cannot plausibly influence germline genotype." While this may be true for each individual PGS association, it is not true for the test for indirect effects that the authors conduct in this paper. For example, if singletons are on average older, and older individuals have a weaker association between the PGS and EA, then you will detect a weaker EA/PGS association in singletons. Using the method proposed in this paper, the authors would then falsely conclude that there are significant indirect effects from siblings even if there are none.
The authors acknowledge that "Family size is inversely associated with education-level because more education generally leads to having children later." This seems like a major unmeasured confounder of their test of sibling indirect effects. As a result, it is difficult to tell if the results reported in this paper are driven by gene-environment interactions or if they are driven by indirect effects of siblings. One thing the authors may consider, which I described (admittedly poorly) in my previous review, is to look at whether there are significant differences in the PGS association for EA across the observed variables for which they see differences between singletons and non-singletons. For example, the authors report that the singletons in their data are more often male and are born earlier. What is the association between the PGS and EA in men vs women or between the older and younger individuals in their sample? If the associations are the same, this is some evidence that GxE is not driving their results, at least for the observed variables that they are able to test.

2. I appreciate the correction to Figure 3, but even a sibling indirect effect of .06 seems implausibly large to me. In response to Reviewer 1, the authors mention Young et al. (2020) as an underpowered example of estimating indirect sibling effects. However, the more relevant paper is Kong et al. (2020), which uses a design where they regress EA on the proband, sibling, and (imputed) parental PGSs in the UK Biobank. The coefficient on the proband is .117 and the coefficient on the sibling is -.001 (SE=.013). This is quite precise and substantially smaller than the implied estimate from comparing singletons and non-singletons. Importantly, the data from Kong et al. are from the same dataset as the data used in this paper, so the environmental contexts should be similar. I find the estimates of Kong et al.
much more reliable since they are direct estimates and don't require as strong assumptions. Can the authors justify why their results are different from those found in Kong et al.?

References:
Young, AI, Nehzati, SM, Lee, C., Benonisdottir, S., Cesarini, D., Benjamin, DJ, ... & Kong, A. (2020). Mendelian imputation of parental genotypes for genome-wide estimation of direct and indirect genetic effects. bioRxiv.
Kong, A., Benonisdottir, S., & Young, A. I. (2020). Family analysis with Mendelian imputations. bioRxiv.

**********

Have all data underlying the figures and results presented in the manuscript been provided? Large-scale datasets should be made available via a public repository as described in the PLOS Genetics data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: Yes
Reviewer #2: Yes
Reviewer #3: Yes

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No
Reviewer #2: No
Reviewer #3: No

1 Mar 2022

Submitted filename: Figure1_Dec2021.tif

21 Mar 2022

Dear Dr Howe,

Thank you very much for submitting your Research Article entitled 'Evaluating indirect genetic effects of siblings using singletons' to PLOS Genetics. The manuscript was fully evaluated at the editorial level and by independent peer reviewers. The reviewers appreciated your improvements and revisions, but identified some remaining concerns that we ask you address in a revised manuscript.
Specifically, as mentioned by one reviewer, a sentence in the discussion and a short supplementary note describing how your results compare to existing estimates using more direct methods (something along the lines of what you wrote in the response) would be valuable to readers. We therefore ask you to modify the manuscript according to the review recommendations. In addition we ask that you: 1) Provide a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript. 2) Upload a Striking Image with a corresponding caption to accompany your manuscript if one is available (either a new image or an existing one from within your manuscript). If this image is judged to be suitable, it may be featured on our website. Images should ideally be high resolution, eye-catching, single panel square images. For examples, please browse our archive. If your image is from someone other than yourself, please ensure that the artist has read and agreed to the terms and conditions of the Creative Commons Attribution License. Note: we cannot publish copyrighted images. We hope to receive your revised manuscript within the next 30 days. If you anticipate any delay in its return, we would ask you to let us know the expected resubmission date by email to plosgenetics@plos.org. If present, accompanying reviewer attachments should be included with this email; please notify the journal office if any appear to be missing. They will also be available for download from the link below. You can use this link to log into the system when you are ready to submit a revised version, having first consulted our Submission Checklist. While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. 
Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

Please be aware that our data availability policy requires that all numerical data underlying graphs or summary statistics are included with the submission, and you will need to provide this upon resubmission if not already present. In addition, we do not permit the inclusion of phrases such as "data not shown" or "unpublished results" in manuscripts. All points should be backed up by data provided with the submission.

To enhance the reproducibility of your results, we recommend that you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.

PLOS has incorporated Similarity Check, powered by iThenticate, into its journal-wide submission system in order to screen submitted content for originality before publication. Each PLOS journal undertakes screening on a proportion of submitted articles. You will be contacted if needed following the screening process.
To resubmit, you will need to go to the link below and 'Revise Submission' in the 'Submissions Needing Revision' folder.

Please let us know if you have any questions while making these revisions.

Yours sincerely,

Heather J Cordell
Associate Editor
PLOS Genetics

Gregory Barsh
Editor-in-Chief
PLOS Genetics

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: I have no additional concerns and recommend the acceptance of the manuscript.

Reviewer #2: Manuscript ready for publication

Reviewer #3: I thank the authors for their clarifying comments; I am substantially less concerned than I was before. The point about the units of educational attainment is well taken. I'd note, however, that the PGS in Kong et al. is substantially more predictive than the PGS used in this paper. (I believe the implied R² in this paper is roughly 1% while it is over 5% in Kong et al.) That said, if you assess the magnitude of the indirect sibling effect as a fraction of the population effect, I calculate a 95% confidence interval of (-11%, 10%) for Kong et al. and (5%, 15%) for this paper, so they remain largely consistent even when accounting for differences in predictive power of the PGS. If the editor agrees, I think a sentence in the discussion and a short supplemental note on how your estimates compare to the noisier estimates of indirect genetic effects that use sibling data would be valuable to readers.

**********

Have all data underlying the figures and results presented in the manuscript been provided?

Large-scale datasets should be made available via a public repository as described in the PLOS Genetics data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.
Reviewer #1: Yes
Reviewer #2: Yes
Reviewer #3: Yes

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No
Reviewer #2: No
Reviewer #3: No

9 May 2022

Submitted filename: SiblingIGE_response_May2022 (1).docx

10 May 2022

Dear Dr Howe,

We are pleased to inform you that your manuscript entitled "Evaluating indirect genetic effects of siblings using singletons" has been editorially accepted for publication in PLOS Genetics. Congratulations!

Before your submission can be formally accepted and sent to production you will need to complete our formatting changes, which you will receive in a follow up email. Please be aware that it may take several days for you to receive this email; during this time no action is required by you. Please note: the accept date on your published article will reflect the date of this provisional acceptance, but your manuscript will not be scheduled for publication until the required changes have been made.

Once your paper is formally accepted, an uncorrected proof of your manuscript will be published online ahead of the final version, unless you’ve already opted out via the online submission form. If, for any reason, you do not want an earlier version of your manuscript published online or are unsure if you have already indicated as such, please let the journal staff know immediately at plosgenetics@plos.org.
In the meantime, please log into Editorial Manager at https://www.editorialmanager.com/pgenetics/, click the "Update My Information" link at the top of the page, and update your user information to ensure an efficient production and billing process.

Note that PLOS requires an ORCID iD for all corresponding authors. Therefore, please ensure that you have an ORCID iD and that it is validated in Editorial Manager. To do this, go to ‘Update my Information’ (in the upper left-hand corner of the main menu), and click on the Fetch/Validate link next to the ORCID field. This will take you to the ORCID site and allow you to create a new iD or authenticate a pre-existing iD in Editorial Manager.

If you have a press-related query, or would like to know about making your underlying data available (as you will be aware, this is required for publication), please see the end of this email. If your institution or institutions have a press office, please notify them about your upcoming article at this point, to enable them to help maximise its impact. Inform journal staff as soon as possible if you are preparing a press release for your article and need a publication date.

Thank you again for supporting open-access publishing; we are looking forward to publishing your work in PLOS Genetics!

Yours sincerely,

Heather J Cordell
Associate Editor
PLOS Genetics

Gregory Barsh
Editor-in-Chief
PLOS Genetics

www.plosgenetics.org
Twitter: @PLOSGenetics

----------------------------------------------------

Comments from the reviewers (if applicable):

----------------------------------------------------

Data Deposition

If you have submitted a Research Article or Front Matter that has associated data that are not suitable for deposition in a subject-specific public repository (such as GenBank or ArrayExpress), one way to make that data available is to deposit it in the Dryad Digital Repository. As you may recall, we ask all authors to agree to make data available; this is one way to achieve that.
A full list of recommended repositories can be found on our website.

The following link will take you to the Dryad record for your article, so you won't have to re-enter its bibliographic information, and can upload your files directly: http://datadryad.org/submit?journalID=pgenetics&manu=PGENETICS-D-21-00943R3

More information about depositing data in Dryad is available at http://www.datadryad.org/depositing. If you experience any difficulties in submitting your data, please contact help@datadryad.org for support.

Additionally, please be aware that our data availability policy requires that all numerical data underlying display items are included with the submission, and you will need to provide this before we can formally accept your manuscript, if not already present.

----------------------------------------------------

Press Queries

If you or your institution will be preparing press materials for this manuscript, or if you need to know your paper's publication date for media purposes, please inform the journal staff as soon as possible so that your submission can be scheduled accordingly. Your manuscript will remain under a strict press embargo until the publication date and time. This means an early version of your manuscript will not be published ahead of your final version. PLOS Genetics may also choose to issue a press release for your article. If there's anything the journal should know or you'd like more information, please get in touch via plosgenetics@plos.org.

13 Jun 2022

PGENETICS-D-21-00943R3

Evaluating indirect genetic effects of siblings using singletons

Dear Dr Howe,

We are pleased to inform you that your manuscript entitled "Evaluating indirect genetic effects of siblings using singletons" has been formally accepted for publication in PLOS Genetics! Your manuscript is now with our production department and you will be notified of the publication date in due course.
The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.

Soon after your final files are uploaded, unless you have opted out or your manuscript is a front-matter piece, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

Thank you again for supporting PLOS Genetics and open-access publishing. We are looking forward to publishing your work!

With kind regards,

Anita Estes
PLOS Genetics

On behalf of:
The PLOS Genetics Team
Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom
plosgenetics@plos.org | +44 (0) 1223-442823
plosgenetics.org | Twitter: @PLOSGenetics
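
Editorial illustration: Reviewer #3's consistency check in the second review round, which compares a 95% interval of (-11%, 10%) attributed to Kong et al. with an interval of (5%, 15%), can be sketched numerically. The snippet below is a minimal sketch and not the reviewer's or the authors' actual calculation. It assumes both intervals are symmetric normal-theory intervals and that the two estimates come from independent samples, and it applies the standard z-test for the difference between two estimates. The function names `estimate_and_se` and `difference_test` are hypothetical and introduced here only for illustration.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def estimate_and_se(ci_low, ci_high, level=0.95):
    """Recover a point estimate and standard error from a symmetric normal CI.

    Assumption: the interval was built as estimate +/- z_crit * SE.
    """
    z_crit = STD_NORMAL.inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    estimate = (ci_low + ci_high) / 2
    se = (ci_high - ci_low) / (2 * z_crit)
    return estimate, se


def difference_test(ci_a, ci_b, level=0.95):
    """z-test for the difference between two estimates given only their CIs.

    Assumption: the two estimates are independent, so their variances add.
    """
    est_a, se_a = estimate_and_se(*ci_a, level=level)
    est_b, se_b = estimate_and_se(*ci_b, level=level)
    diff = est_a - est_b
    se_diff = sqrt(se_a**2 + se_b**2)
    z = diff / se_diff
    p = 2 * (1 - STD_NORMAL.cdf(abs(z)))  # two-sided p-value
    return diff, se_diff, z, p


# Intervals quoted by Reviewer #3, expressed as fractions of the population effect.
kong_ci = (-0.11, 0.10)
this_paper_ci = (0.05, 0.15)

diff, se_diff, z, p = difference_test(this_paper_ci, kong_ci)
print(f"difference = {diff:.3f}, SE = {se_diff:.3f}, z = {z:.2f}, p = {p:.3f}")
```

Under these assumptions the difference between the two estimates does not reach significance at the 5% level, which is in line with the reviewer's reading that the intervals are largely consistent. If the two samples overlapped, the independence assumption would fail and the standard error of the difference would need a covariance term.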
