
Investigating the genetic architecture of noncognitive skills using GWAS-by-subtraction.

Perline A Demange1,2,3, Margherita Malanchini4,5,6, Travis T Mallard6, Pietro Biroli7, Simon R Cox8, Andrew D Grotzinger6, Elliot M Tucker-Drob6,9, Abdel Abdellaoui1,10, Louise Arseneault5, Elsje van Bergen1,3, Dorret I Boomsma1, Avshalom Caspi5,11,12,13, David L Corcoran12, Benjamin W Domingue14, Kathleen Mullan Harris15, Hill F Ip1, Colter Mitchell16, Terrie E Moffitt5,11,12,13, Richie Poulton17, Joseph A Prinz12, Karen Sugden11, Jasmin Wertz11, Benjamin S Williams11, Eveline L de Zeeuw1,3, Daniel W Belsky18,19, K Paige Harden20, Michel G Nivard21.   

Abstract

Little is known about the genetic architecture of traits affecting educational attainment other than cognitive ability. We used genomic structural equation modeling and prior genome-wide association studies (GWASs) of educational attainment (n = 1,131,881) and cognitive test performance (n = 257,841) to estimate SNP associations with educational attainment variation that is independent of cognitive ability. We identified 157 genome-wide-significant loci and a polygenic architecture accounting for 57% of genetic variance in educational attainment. Noncognitive genetics were enriched in the same brain tissues and cell types as cognitive performance, but showed different associations with gray-matter brain volumes. Noncognitive genetics were further distinguished by associations with personality traits, less risky behavior and increased risk for certain psychiatric disorders. For socioeconomic success and longevity, noncognitive and cognitive-performance genetics demonstrated associations of similar magnitude. By conducting a GWAS of a phenotype that was not directly measured, we offer a view of genetic architecture of noncognitive skills influencing educational success.

Year:  2021        PMID: 33414549      PMCID: PMC7116735          DOI: 10.1038/s41588-020-00754-2

Source DB:  PubMed          Journal:  Nat Genet        ISSN: 1061-4036            Impact factor:   38.330


“It takes something more than intelligence to act intelligently.” – Fyodor Dostoyevsky, Crime and Punishment

Success in school, and in life, depends on skills beyond cognitive ability[1-4]. Randomized trials of early-life education interventions find substantial benefits to educational outcomes, employment, and adult health, even though the interventions have no lasting effects on children’s cognitive functions[5,6]. These results have captured the attention of educators and policy makers, motivating interest in so-called “non-cognitive skills”[7-9]. Non-cognitive skills suspected to be important for educational success include motivation, curiosity, persistence, and self-control[1,10-13]. However, questions have been raised about the substance of these skills and the magnitudes of their impacts on life outcomes[14]. Twin studies find evidence that non-cognitive skills are heritable[3,15-18]. Genetic analysis could help clarify the contribution of these skills to educational attainment and elucidate their connections with other traits. However, the lack of consistent and reliable measurements of non-cognitive skills in existing genetic datasets poses challenges[19]. To overcome these challenges, we designed a GWAS of a latent trait, i.e. a trait not measured in any of the genotyped subjects[20]. We borrowed the strategy used in the original analysis of non-cognitive skills within the discipline of economics[21,22]: we defined genetic influences on non-cognitive skills as the genetic variation in educational attainment that was not explained by cognitive skills. We then performed GWAS on this residual “non-cognitive” genetic variation in educational attainment. This approach is a necessarily imperfect representation of the true relationship between cognitive and non-cognitive skills; in human development, cognitive abilities and other skills relevant for educational attainment likely interact dynamically, each influencing the other[23]. 
Our analysis excludes genetic influences on education-relevant skills that also influence measured cognitive abilities. The value of this imperfect approach is to make a quantity otherwise difficult to study tractable for analysis. We conducted analysis using Genomic Structural Equation Modeling (Genomic-SEM)[24] applied to published GWAS summary statistics for educational attainment and cognitive performance[25]. Our analysis used these summary statistics to “subtract” genetic influence on cognitive performance from the association of each single-nucleotide polymorphism (SNP) with educational attainment. The remaining associations of each SNP with educational attainment formed a new GWAS of a non-cognitive skills phenotype that was never directly measured. We call this novel statistical approach GWAS-by-subtraction. We used results from the GWAS-by-subtraction of non-cognitive skills to conduct two sets of analyses. First, we conducted hypothesis-driven analysis using the phenotypic annotation approach[26]. We used genetic correlation and polygenic score analysis to test the hypothesis that non-cognitive skills influence educational and economic attainments and longevity and to investigate traits and behaviors that constitute non-cognitive skills. Second, we conducted hypothesis-free bioinformatic annotation analysis to explore the tissues, cell-types, and brain structures that might distinguish the biology of non-cognitive skills from the biology mediating cognitive influences on educational attainment.

Results

GWAS-by-subtraction identifies genetic associations with non-cognitive variance in educational attainment

The term “non-cognitive skills” was originally coined by economists studying individuals who were equivalent in cognitive ability but who differed in educational attainment[22]. Our analysis of non-cognitive skills was designed to mirror this original approach: we focused on genetic variation in educational outcomes not explained by genetic variation in cognitive ability. Specifically, we applied Genomic Structural Equation Modeling (Genomic-SEM)[24] to summary statistics from GWASs of educational attainment[25] and cognitive performance[25]. Both phenotypes were regressed on a latent factor representing genetic variance in cognitive performance (hereafter “Cog”). Educational attainment was further regressed on a second latent factor representing the residual genetic variance in educational attainment left over after regressing-out variance related to cognitive performance (hereafter “NonCog”). By construction, NonCog genetic variance was independent of Cog genetic variance (r g = 0). In other words, the NonCog factor represents genetic variation in educational attainment that is not accounted for by the Cog factor. These two latent factors were then regressed on individual SNPs, yielding a GWAS of the latent constructs NonCog and Cog. A graphical representation of the model is presented in . Parameters are derived in terms of the observed moments of the joint distribution of educational attainment, cognitive performance, and a SNP (see ). The NonCog latent factor accounted for 57% of total genetic variance in educational attainment. Using LD Score regression[27], we estimated SNP-heritability for NonCog to be h² = 0.0637 (SE = 0.0021). After conventional GWAS significance threshold correction, GWAS of NonCog identified 157 independent genome-wide significant lead SNPs (independent SNPs defined as outside a 250-kb window, or within a 250-kb window and r² < 0.1). The results from the NonCog GWAS are graphed as a Manhattan plot in . 
NonCog and Cog GWAS details are reported in , and the . In addition, we report a series of sensitivity analyses as follows: analysis of potential biases due to cohort differences ( and ); analysis of impact of allowing for positive genetic correlations between NonCog and Cog ( and , and and ; analysis of impact of allowing for a moderate causal effect of educational attainment on cognitive performance[28] ( and ).
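
The logic of the two-factor model can be illustrated for a single SNP. The sketch below is a simplified algebraic reading of the model described in this section, not the Genomic-SEM implementation: it assumes unit-variance, uncorrelated Cog and NonCog factors, derives the loadings from a 2 × 2 genetic covariance matrix of cognitive performance (CP) and educational attainment (EA) by Cholesky decomposition, and solves for the SNP effects on each factor. It ignores standard errors and sample overlap, and the function names and any numeric inputs are hypothetical.

```python
import math


def cholesky_loadings(var_cp, var_ea, cov_cp_ea):
    """Loadings for the just-identified model (assumed parameterization):
    CP = lam_cp * Cog
    EA = lam_ea_cog * Cog + lam_ea_noncog * NonCog
    with Cog and NonCog uncorrelated and of unit variance."""
    lam_cp = math.sqrt(var_cp)
    lam_ea_cog = cov_cp_ea / lam_cp
    residual = var_ea - lam_ea_cog ** 2  # EA genetic variance not shared with CP
    if residual <= 0:
        raise ValueError("no residual genetic variance left in EA")
    lam_ea_noncog = math.sqrt(residual)
    return lam_cp, lam_ea_cog, lam_ea_noncog


def subtract_snp(beta_cp, beta_ea, loadings):
    """Solve beta_cp = lam_cp * b_cog and
    beta_ea = lam_ea_cog * b_cog + lam_ea_noncog * b_noncog
    for the SNP effects on the two latent factors."""
    lam_cp, lam_ea_cog, lam_ea_noncog = loadings
    b_cog = beta_cp / lam_cp
    b_noncog = (beta_ea - lam_ea_cog * b_cog) / lam_ea_noncog
    return b_cog, b_noncog


def noncog_share(loadings):
    """Fraction of EA genetic variance carried by the NonCog factor."""
    _, lam_ea_cog, lam_ea_noncog = loadings
    return lam_ea_noncog ** 2 / (lam_ea_cog ** 2 + lam_ea_noncog ** 2)
```

Under these assumptions the NonCog share equals one minus the squared genetic correlation between EA and CP, so the reported share of 57% would correspond to a genetic correlation of roughly 0.66 between the two input GWASs. This is a back-of-the-envelope consequence of the sketch, not a figure taken from the article.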

Phenotypic annotation analysis elucidates behavioral, psychological and psychiatric correlates of non-cognitive skills genetics

Our phenotypic annotation analyses proceeded in two steps. First, we conducted polygenic score (PGS) and genetic correlation (rG) analysis to test whether our GWAS-by-subtraction succeeded in identifying genetic influences that were important to educational attainment and also distinct from genetic influences on cognitive ability. Second, we conducted PGS and rG analyses to explore how NonCog related to a network of phenotypes that psychology and economics research suggests might form the basis of non-cognitive influences on educational attainment. NonCog genetics are distinct from cognitive performance and are important to education, socioeconomic attainment, and longevity. To establish whether the Genomic-SEM GWAS-by-subtraction succeeded in isolating genetic variance in education that was independent of cognitive function, we compared genetic associations of NonCog and Cog with educational attainment and cognitive test performance. Results for analysis of education and cognitive test phenotypes are graphed in . We conducted PGS analysis of educational attainment in the Netherlands Twin Register[29] (NTR), National Longitudinal Study of Adolescent to Adult Health[30] (AddHealth), Dunedin Longitudinal Study[31], E-Risk[32], and Wisconsin Longitudinal Study[33] (WLS) cohorts (meta-analysis n = 24,056; cohort descriptions in and and ). PGS effect-sizes were the same for NonCog and Cog (NonCog β = 0.24 (SE = 0.03), Cog β = 0.24 (SE = 0.02), P diff = 0.702; all PGS results are reported in and ). We conducted complementary genetic correlation analysis using Genomic SEM and GWAS summary statistics from a hold-out-sample GWAS of educational attainment (Supplementary Note). This analysis allowed us to compute an out-of-sample genetic correlation of NonCog with educational attainment. 
NonCog showed a stronger genetic correlation with educational attainment as compared to Cog (NonCog r = 0.71 (SE = 0.02), Cog r = 0.57 (SE = 0.02), P diff < 0.0001; all genetic correlation results are reported in and ). We conducted PGS analysis of cognitive test performance in the NTR, Texas Twin Project[34], Dunedin, E-Risk, and WLS cohorts (combined n = 11,351). The goal of our GWAS-by-subtraction analysis was to exclude, as much as possible, genetic variance in cognitive ability from genetic variance in skills relevant for education. Consistent with this goal, effect-sizes for NonCog PGS associations with full-scale IQ were smaller by half as compared to Cog PGS associations (NonCog β = 0.17 (SE = 0.02), Cog β = 0.29 (SE = 0.03); P diff < 0.0001). However, the non-zero correlation between the NonCog PGS and full-scale IQ is a reminder that the cognitive performance GWAS used in our GWAS-by-subtraction analyses does not capture the entirety of genetic influences on all forms of cognitive tests measured at all points in the lifespan. Additional PGS analyses of IQ subscales are reported in and and . We conducted complementary genetic correlation analysis using results from a published GWAS of childhood IQ[35]. Parallel to PGS analysis, the NonCog genetic correlation with childhood IQ was smaller by more than half as compared to the Cog genetic correlation (NonCog r g = 0.31 (SE = 0.06), Cog r g = 0.75 (SE = 0.08), P diff_fdr < 0.0001). Of the total genetic correlation between childhood IQ and educational attainment, 31% of the covariance was explained by NonCog and 69% by Cog. We next examined downstream economic and health outcomes associated with greater educational attainment[36,37]. In PGS analysis in the AddHealth and Dunedin cohorts (n = 6,358), NonCog and Cog PGSs showed similar associations with occupational attainment (NonCog β = 0.21 (SE = 0.01), Cog β = 0.21 (SE = 0.01), P diff = 0.902). 
In genetic correlation analysis, NonCog showed a similar relationship to income[38] as Cog (NonCog r g = 0.62 (SE = 0.04), Cog r g = 0.62 (SE = 0.04), P diff_fdr = 0.947) and a stronger relationship with neighborhood deprivation[38], a measure related to where a person can afford to live (NonCog r g = -0.51 (SE = 0.05), Cog r g = -0.32 (SE = 0.04), P diff_fdr = 0.001). In Genomic-SEM analysis, NonCog explained 53% of the genetic correlation between educational attainment and income and 65% of the genetic correlation between educational attainment and neighborhood deprivation (). We conducted genetic correlation analysis of longevity based on GWAS of parental lifespan[39]. Genetic correlations were stronger for NonCog as compared to Cog (NonCog r g = 0.37 (SE = 0.03); Cog r g = 0.27 (SE = 0.03); P diff_fdr = 0.024). In Genomic-SEM analysis, NonCog explained 61% of the genetic correlation between educational attainment and longevity. In sum, NonCog and Cog genetics showed similar relationships with educational attainment and its long-term outcomes, despite NonCog genetics having a much weaker relationship to measured cognitive test performance than Cog genetics. These findings broadly support the hypothesis that non-cognitive skills distinct from cognitive abilities are an important contributor to success across the life course. We next conducted a series of genetic correlation analyses to explore the network of phenotypes to which NonCog was genetically correlated. To develop understanding of the substance of non-cognitive skills, we tested where in that network of phenotypes genetic correlations with NonCog diverged from genetic correlations with Cog. Our analysis was organized around four themes: decision-making preferences, health-risk and fertility behaviors, personality traits, and psychiatric disorders. Results of genetic correlation analyses are graphed in and . Results are reported in . NonCog genetics were associated with decision-making preferences. 
In economics, non-cognitive influences on achievement and health are often studied in relation to decision-making preferences[40-43]. NonCog was genetically correlated with higher tolerance of risks[44] (r g = 0.10 (SE = 0.03)) and willingness to forego immediate gratification in favor of a larger reward at a later time[45] (delay discounting r g = -0.52 (SE = 0.08)). In contrast, Cog was genetically correlated with generally more cautious decision-making characterized by lower levels of risk tolerance (r g = -0.35 (SE = 0.07), P diff_fdr < 0.0001) and delay discounting (r g = -0.35 (SE = 0.07), P diff_fdr = 0.082). NonCog genetics were associated with less health-risk behavior and delayed fertility. An alternative approach to studying specific non-cognitive skills is to infer individual differences in non-cognitive skills from patterns of health-risk behavior. NonCog was genetically correlated with less health-risk behavior as indicated by analysis of obesity[46], substance use[44,47-50], and sexual behaviors and early fertility[44,51,52] (r g range 0.2-0.5), with the exception that the r g with alcohol use was not different from zero and r g with cannabis use was positive. Genetic correlations for Cog were generally in the same direction but of smaller magnitude. NonCog genetics were associated with a broad spectrum of personality characteristics linked with social and professional competency. In psychology, non-cognitive influences on achievement are conceptualized as personality traits, i.e. patterns of stable individual differences in emotion and behavior. The model of personality that has received the most attention in genetics is a five-factor model referred to as the Big Five. 
Genetic correlation analysis of the Big Five personality traits[53-55] revealed NonCog genetics were most strongly associated with Openness to Experience (being curious and eager to learn; r g = 0.30 (SE = 0.04)) and were further associated with a pattern of personality characteristic of changes that occur as people mature in adulthood[56]. Specifically, NonCog showed a positive r g with Conscientiousness (being industrious and orderly; r g = 0.13 (SE = 0.03)), Extraversion (being enthusiastic and assertive; r g = 0.14 (SE = 0.03)), and Agreeableness (being polite and compassionate; r g = 0.14 (SE = 0.05)), and negative r g with Neuroticism (being emotionally volatile; r g = -0.15 (SE = 0.04)). Genetic correlations of Cog with Openness to Experience and Neuroticism were similar to those for NonCog (P diff_fdr-Openness = 0.040, P diff_fdr-Neuroticism = 0.470). In contrast, genetic correlations of Cog with Conscientiousness, Extraversion, and Agreeableness were in the opposite direction (r g = -0.25 to -0.12, P diff_fdr < 0.0005). PGS analysis of personality traits is reported in , , and the . NonCog genetics were associated with higher risk for multiple psychiatric disorders. In clinical psychology and psychiatry, research is focused on mental disorders. Mental disorders are generally associated with impairments in academic achievement and social role functioning[57,58]. However, positive genetic correlations with educational attainment and creativity have been reported for some disorders[59,60]. We therefore tested NonCog r g with psychiatric disorders based on published case-control GWAS of mental disorders[61-67]. NonCog was associated with higher risk for multiple clinically defined disorders, including anorexia nervosa (r g = 0.26 (SE = 0.04)), obsessive-compulsive disorder (r g = 0.31 (SE = 0.06)), bipolar disorder (r g = 0.27 (SE = 0.03)), and schizophrenia (r g = 0.26 (SE = 0.02)). 
Genetic correlations between Cog and psychiatric disorders were either smaller in magnitude (anorexia nervosa r g = 0.08 (SE = 0.03), P diff_fdr < 0.001; obsessive-compulsive disorder r g = 0.05 (SE = 0.05), P diff_fdr = 0.002) or in the opposite direction (bipolar disorder r g = -0.07 (SE = 0.03), P diff_fdr < 0.001; schizophrenia r g = -0.22 (SE = 0.02), P diff_fdr < 0.001). Both NonCog and Cog showed negative genetic correlations with attention-deficit/hyperactivity disorder (NonCog r g = -0.37 (SE = 0.03), Cog r g = -0.37 (SE = 0.04), P diff_fdr = 0.947). In sum, NonCog genetics were associated with phenotypes from economics and psychology thought to mediate non-cognitive influences on educational success. These associations contrasted with associations for Cog genetics, supporting distinct pathways of influence on achievement in school and later in life. Opposing patterns of association were also observed for psychiatric disorders, suggesting that the unexpected positive genetic correlation between educational attainment and mental health problems uncovered in previous studies[60,68,69] arises from non-cognitive genetic influences on educational attainment.

Biological annotation analyses reveal shared and specific neurobiological correlates

The goal of biological annotation of GWAS discoveries is to elucidate molecular mechanisms mediating genetic influences on the phenotype of interest. Our biological annotation analysis proceeded in two steps. First, we conducted enrichment analysis to test whether some tissues and cell-types were more likely to mediate NonCog and Cog heritabilities than others. Second, we conducted genetic correlation analysis to explore how NonCog and Cog genetics related to different brain structures. NonCog and Cog genetics were enriched in similar tissues and cells. We tested whether common variants in genes specifically expressed in 53 GTEx tissues[70] or in 152 tissues captured in a previous aggregation of RNA-seq studies[71,72] were enriched in their effects on Cog or NonCog. Genes predominantly expressed in the brain rather than peripheral tissues were enriched in both NonCog and Cog (). To examine expression patterns at a more granular level of analysis, we used MAGMA[73] and stratified LD score regression[74] to test enrichment of common variants in 265 nervous system cell-type-specific gene-sets[75] (). In MAGMA analysis, common variants in 95 of 265 gene-sets were enriched for association with NonCog. The enriched cell-types were predominantly neurons (97%), with enrichment most pronounced for telencephalon-projecting neurons, di- and mesencephalon neurons, and to a lesser extent, telencephalon interneurons ( and ). Enrichment for Cog was similar to NonCog (correlation between Z-statistics Pearson’s r = 0.85), and there were no differences in cell-type-specific enrichment, suggesting that the same types of brain cells mediate genetic influences on NonCog and Cog (). Stratified LDSC results were similar to results from MAGMA (, , and ). The absence of differences in cell-type specific enrichment is surprising given that NonCog and Cog are genetically uncorrelated. We therefore used the TWAS/Fusion tool[76] to conduct gene-level analysis. 
This analysis revealed a mixture of concordant and discordant gene effects on NonCog and Cog consistent with the genetic correlation of zero (, , and ). NonCog and Cog genetics show diverging associations with total and regional brain volumes. Educational attainment has previously been found to be genetically correlated with greater total brain volume[77,78]. We therefore used a GWAS of regional brain volume to compare the r g of NonCog and Cog with total brain volume and with 100 regional brain volumes (99 gray matter volumes and white matter volume) controlling for total brain volume ()[79]. For total brain volume, genetic correlation was stronger for Cog as compared to NonCog (Cog r g = 0.22 (SE = 0.04), NonCog r g = 0.07 (SE = 0.03), P diff = 0.005). Total gray matter volume, controlling for total brain volume, was not associated with either NonCog or Cog (NonCog: r g = 0.07 (SE = 0.04); Cog: r g = 0.06 (SE = 0.04)). For total white matter volume, conditional on total brain volume, genetic correlation was weakly negative for NonCog as compared to Cog (NonCog r g = -0.12 (SE = 0.04), Cog r g = -0.01 (SE = 0.04), P diff = 0.04). NonCog was not associated with any of the regional gray-matter volumes after FDR correction. In contrast, Cog was significantly associated with regional gray-matter volumes for the bilateral fusiform, insula and posterior cingulate (r g range 0.11-0.17), as well as left superior temporal (r g = 0.11 (SE = 0.04)), left pericalcarine (r g = -0.16 (SE = 0.05)) and right superior parietal volumes (r g = -0.22 (SE = 0.06)) (). Finally, we tested genetic correlation of NonCog and Cog with white matter tract integrity as measured using diffusion tensor imaging (DTI)[80]. Analyses included 5 DTI parameters in each of 22 white matter tracts (). 
NonCog was positively associated with the mode of anisotropy parameter (which denotes a more tubular, as opposed to planar, water diffusion) in the corticospinal tract, retrolenticular limb of the internal capsule, and splenium of the corpus callosum (). However, all correlations were small (0.10 < r g < 0.14), and we detected no genetic correlations that differed between NonCog and Cog ().

Discussion

GWAS of non-cognitive influences on educational attainment identified 157 independent loci and polygenic architecture accounting for more than half the genetic variance in educational attainment. In genetic correlation and PGS analysis, these non-cognitive (NonCog) genetics showed similar magnitude of associations with educational attainment, economic attainment, and longevity to genetics associated with cognitive influences on educational attainment (Cog). As expected, NonCog genetics had much weaker associations with cognition phenotypes as compared to Cog genetics. These results contribute new GWAS evidence in support of the hypothesis that heritable non-cognitive skills influence educational attainment and downstream life-course economic and health outcomes. Phenotypic and biological annotation analyses shed light on the substance of heritable non-cognitive skills influencing education. Economists hypothesize that preferences that guide decision-making in the face of risk and delayed rewards represent non-cognitive influences on educational attainment. Consistent with this hypothesis, NonCog genetics were associated with higher risk tolerance and lower time discounting. These decision-making preferences are associated with financial wealth, whereas opposite preferences are hypothesized to contribute to a feedback loop perpetuating poverty[81]. Consistent with results from analysis of decision-making preferences, NonCog genetics were also associated with healthier behavior and later fertility. Psychologists hypothesize that the Big Five personality characteristics of conscientiousness and openness are the two “pillars of educational success”[2,3,82]. Our results provide some support for this hypothesis, with the strongest genetic correlation evident for openness. However, they also show that non-cognitive skills encompass the full range of personality traits, including agreeableness, extraversion, and the absence of neuroticism. 
This pattern mirrors the pattern of personality change that occurs as young people mature into adulthood[56]. Thus, non-cognitive skills share genetic etiology with what might be termed as “mature personality”. The absolute magnitudes of genetic correlations between NonCog and individual personality traits are modest. This result suggests that the personality traits described by psychologists capture some, but not all, genetic influence on non-cognitive skills. Although the general pattern of findings in our phenotypic annotation analysis indicated non-cognitive skills were genetically related to socially desirable characteristics and behaviors, there was an important exception. Genetic correlation analysis of psychiatric disorder GWAS revealed positive associations of NonCog genetics with schizophrenia, bipolar disorder, anorexia nervosa, and obsessive-compulsive disorder. Previously, these psychiatric disorders have been shown to have a positive r g with educational attainment, a result that has been characterized as paradoxical given the impairments in educational and occupational functioning typical of serious mental illness. Our results clarify that these associations are driven by non-cognitive factors associated with success in education. These results align with the theory that clinically defined psychiatric disorders represent extreme manifestations of dimensional psychological traits, which might be associated with adaptive functioning within the normal range[83-85]. Finally, biological annotation analyses suggested that genetic variants contributing to educational attainment not mediated through cognitive abilities are enriched in genes expressed in the brain, specifically in neurons. Even though NonCog and Cog were genetically uncorrelated, variants in the same neuron-specific gene-sets were enriched for both traits. 
Although we found some evidence of differences between NonCog and Cog in associations with gray matter volumes, moderate sample sizes in neuroimaging GWAS mean these results must be treated as preliminary, requiring replication with data from larger-scale GWAS of white-matter and gray-matter phenotypes. Limited differentiation of NonCog and Cog in biological annotation analyses focused at the levels of tissue and cell type highlights need for finer-grained molecular data resources to inform these analyses and the complementary value of phenotypic annotation analyses focused at the level of psychology and behavior. We acknowledge limitations. Cognitive and non-cognitive skills develop in interaction with one another. For example, the dynamic mutualism hypothesis[86] proposes that non-cognitive characteristics shape investments of time and effort, leading to differences in the pace of cognitive development[87,88]. However, in Genomic-SEM analysis, the NonCog factor is, by construction, uncorrelated with genetic influences on adult cognition as measured in the Cog GWAS. Our statistical separation of NonCog from cognition is thus a simplified representation of development. Longitudinal studies with repeated measures of cognitive and candidate non-cognitive skills are needed to study their reciprocal relationships across development[89,90]. Our statistical separation of NonCog from cognition is also incomplete. The ability to control statistically for any variable, genetic or otherwise, depends on how well and comprehensively that variable is measured[91]. The tests of cognitive performance included in the Cog GWAS likely do not capture all genetic influences on all forms of cognitive ability across the lifespan[92,93]. Despite these limitations, our simplified and incomplete statistical separation of NonCog from Cog allowed us to test whether heritable traits other than cognitive ability influenced educational attainment and to explore what those traits might be. 
Because our analysis was based on GWAS of educational attainment, non-cognitive genetics identified here may differ from non-cognitive genetics affecting other socioeconomic attainments like income, or traits and behaviors that mediate responses to early childhood interventions, to the extent that those genetics do not affect educational attainment. Parallel analysis of alternative attainment phenotypes will clarify the specificity of discovered non-cognitive genetics. In the case of GWAS of educational attainment, the included samples were drawn mainly from Western Europe and the U.S., and participants completed their education in the late 20th and early 21st centuries. The phenotype of educational attainment reflects an interaction between an individual and the social system in which they are educated. Differences across social systems, including education policy, culture, and historical context, may result in different heritable traits influencing educational attainment[94]. Results therefore may not generalize beyond the times and places GWAS samples were collected. Generalization of the NonCog factor is also limited by restriction of included GWAS to individuals of European ancestry. Lack of methods for integrating genome-scale genetic data across populations with different ancestries[95,96] requires this restriction, but raises threats to external validity. GWAS of other ancestries and development of methods for trans-ancestry analysis can enable analysis of (Non)Cog in non-European populations. Within the bounds of these limitations, results illustrate the application of Genomic-SEM to conduct GWAS of a phenotype not directly measured in GWAS databases. This application could have broad utility beyond the genetics of educational attainment. The GWAS-by-subtraction method allowed us to study a previously hard-to-interpret residual value. Our analysis provides a first view of the genetic architecture of non-cognitive skills influencing educational success. 
These skills are central to theories of human capital formation within the social and behavioral sciences and are increasingly the targets of social policy interventions. Our results establish that non-cognitive skills are central to the heritability of educational attainment and illuminate connections between genetic influences on these skills and social and behavioral science phenotypes.

Methods

Meta-analysis of educational attainment GWAS

We reproduced the Social Science Genetic Association Consortium (SSGAC) 2018 GWAS of educational attainment[25] by meta-analyzing published summary statistics for n = 766,345 (www.thessgac.org/data) with summary statistics obtained from 23andMe, Inc. (n = 365,538). We included SNPs with sample size > 500,000 and MAF > 0.005 in the 1000 Genomes reference set (10,101,243 SNPs). We did not apply genomic control, as standard errors of publicly available and 23andMe summary statistics were already corrected[25]. Meta-analysis was performed using METAL[100].
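
As an illustration of how two sets of per-SNP summary statistics can be combined, the sketch below implements a fixed-effect, inverse-variance-weighted meta-analysis for one SNP. METAL offers both a sample-size-weighted and an inverse-variance-weighted scheme, and the text does not say which was used, so the choice of inverse-variance weighting here is an assumption, as are the function name and the example values.

```python
import math


def ivw_meta(betas, ses):
    """Fixed-effect inverse-variance-weighted meta-analysis of one SNP.

    betas: per-study effect estimates for the same effect allele
    ses:   matching standard errors
    Returns (pooled beta, pooled SE, Z statistic)."""
    if len(betas) != len(ses) or not betas:
        raise ValueError("betas and ses must be non-empty and of equal length")
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    return beta, se, beta / se


# Hypothetical effect of one SNP in the public and 23andMe summary statistics.
pooled_beta, pooled_se, z = ivw_meta([0.010, 0.020], [0.002, 0.002])
```

In practice, alleles must be aligned to the same effect allele across studies before pooling, a step this sketch leaves out.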

GWAS-by-subtraction

The objective of our GWAS-by-subtraction analysis was to estimate, for each SNP, the association with educational attainment that was independent of that SNP’s association with cognition (hereafter, the NonCog SNP effect). We used Genomic-SEM[24] in R 3.4.3 to analyze GWAS summary statistics for the educational attainment and cognitive performance phenotypes in the SSGAC’s 2018 GWAS[25]. The model regressed the educational-attainment and cognitive-performance summary statistics on two latent variables, Cog and NonCog (). Cog and NonCog were then regressed on each SNP in the genome. This analysis allowed for two paths of association with educational attainment for each SNP. One path was fully mediated by Cog. The other path was independent of Cog and measured the non-cognitive SNP effect, NonCog. To identify independent hits with P < 5 × 10⁻⁸ (the customary P-value threshold to approximate an alpha value of 0.05 in GWAS), we pruned the results using a radius of 250 kb and an LD threshold of r² < 0.1 (). We explore alternative lead-SNP and locus definitions in . The parameters estimated in a GWAS-by-subtraction and their derivation in terms of the genetic covariance are described in the (model specification), and practical analysis steps are further described in the (SNP filtering). The effective sample sizes of the NonCog and Cog GWAS were estimated to be 510,795 and 257,700, respectively (see ). We investigated biases from unaccounted-for heterogeneity in overlap across SNPs in the educational attainment and cognitive performance GWAS and describe a possible strategy for dealing with it (). We investigated potential biases due to cohort differences in SNP heritability in the . We evaluated the consequences of modifying r g (NonCog, Cog) = 0 by evaluating r g = 0.1, 0.2 or 0.3, and we investigated the consequences of a violation of the assumed causation between cognitive performance and educational attainment in the .
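
The pruning rule stated above (genome-wide significance, a 250-kb radius, and an LD threshold of 0.1) can be sketched as a greedy procedure: take SNPs in order of increasing P value and keep each one unless it lies within the radius of an already-kept lead SNP and is in LD with it at or above the threshold. This is a minimal sketch, not the pipeline used in the study. The input layout, the function name, and the rule that SNP pairs missing from the LD table are treated as unlinked are all assumptions.

```python
def select_lead_snps(snps, r2, p_threshold=5e-8, radius_bp=250_000,
                     r2_threshold=0.1):
    """Greedy selection of independent genome-wide-significant lead SNPs.

    snps: list of dicts with keys "id", "chrom", "pos", "p" (hypothetical layout)
    r2:   dict mapping frozenset({id_a, id_b}) to pairwise LD r-squared;
          pairs missing from the dict are assumed to be unlinked (r2 = 0)
    """
    leads = []
    for snp in sorted(snps, key=lambda s: s["p"]):
        if snp["p"] >= p_threshold:
            break  # sorted by P, so no later SNP is significant either
        independent = True
        for lead in leads:
            in_window = (lead["chrom"] == snp["chrom"]
                         and abs(lead["pos"] - snp["pos"]) <= radius_bp)
            ld = r2.get(frozenset((lead["id"], snp["id"])), 0.0)
            if in_window and ld >= r2_threshold:
                independent = False
                break
        if independent:
            leads.append(snp)
    return [s["id"] for s in leads]
```

A SNP inside the window of a lead SNP is therefore still counted as independent when its LD with that lead is below the threshold, matching the definition given in the Results.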

Genetic correlations

We used Genomic-SEM to compute genetic correlations of Cog and NonCog with other education-linked traits for which well-powered GWAS data were available (SNP-h² z-statistics > 2) and to test whether genetic correlations with these traits differed between Cog and NonCog. Specifically, models tested the null hypothesis that trait genetic correlations with Cog and NonCog could be constrained to be equal, using a chi-squared test with FDR adjustment to correct for multiple testing. The FDR adjustment was conducted across all genetic correlation analyses reported in the article, excluding the analyses of brain volumes described below. Finally, we used Genomic-SEM analysis of genetic correlations to estimate the percentage of the genetic covariance between educational attainment and the target traits that was explained by Cog and by NonCog.
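The testing step can be sketched as follows. The text specifies a chi-squared test and an FDR adjustment but not the exact procedure, so two assumptions are made here: the equality constraint costs one degree of freedom, and the FDR adjustment is Benjamini-Hochberg. The chi-squared statistics are hypothetical.

```python
import math

def chi2_1df_pvalue(stat):
    """Upper-tail P-value of a chi-squared statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical chi-squared statistics, one per trait, for the test that
# r_g(trait, Cog) equals r_g(trait, NonCog).
stats = [12.3, 0.8, 5.1, 3.9]
pvals = [chi2_1df_pvalue(s) for s in stats]
for s, p, q in zip(stats, pvals, benjamini_hochberg(pvals)):
    print(f"chi2={s:5.1f}  P={p:.4f}  FDR-adjusted P={q:.4f}")
```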

Polygenic score analysis

Polygenic score analyses were conducted in data drawn from six population-based cohorts from the Netherlands, the U.K., the U.S., and New Zealand: (1) the Netherlands Twin Register (NTR)[29,101]; (2) E-Risk[32]; (3) the Texas Twin Project[34]; (4) the National Longitudinal Study of Adolescent to Adult Health (AddHealth)[30,102], dbGaP accession phs001367.v1.p1; (5) the Wisconsin Longitudinal Study on Aging (WLS)[33], dbGaP accession phs001157.v1.p1; and (6) the Dunedin Multidisciplinary Health and Development Study[31]. Cohort-specific metrics and a short description of the cohorts’ populations and recruitment are reported separately. Only participants with European ancestry were included in the analysis, due to the low portability of polygenic scores between populations of different ancestry. Polygenic scores were computed with PLINK based on weights derived using the LD-pred[103] software with an infinitesimal prior and the 1000 Genomes phase 3 sample as a reference for the LD structure. LD-pred weights were computed in a shared pipeline to ensure comparability between cohorts. Each outcome (e.g., IQ score) was regressed on the Cog and NonCog polygenic scores and a set of control variables (sex, 10 principal components derived from the genetic data and, for cohorts in which these quantities varied, genotyping chip and age), using Stata 14 for WLS, Stata 15 for E-Risk and the Dunedin Study, and R (versions 3.4.3 and newer) for NTR, AddHealth, and the Texas Twin Project. In cohorts containing related individuals, non-independence of observations from relatives was accounted for using generalized estimating equations (GEE) or by clustering of standard errors at the family level. We used a random-effects meta-analysis to aggregate the results across the cohorts. This analysis allows a cohort-specific random intercept. Individual cohort results and meta-analytic estimates are reported separately.
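The aggregation across cohorts can be sketched as a random-effects meta-analysis of the cohort-specific regression coefficients. The text does not name the estimator, so the DerSimonian-Laird estimate of between-cohort variance used here is an assumption, and the cohort estimates are hypothetical.

```python
import math

def random_effects_meta(betas, ses):
    """Random-effects pooled estimate (DerSimonian-Laird, assumed estimator).

    Returns (pooled beta, its standard error, between-cohort variance tau^2).
    """
    k = len(betas)
    w = [1.0 / s ** 2 for s in ses]
    sw = sum(w)
    fixed = sum(wi * b for wi, b in zip(w, betas)) / sw
    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = sum(wi * (b - fixed) ** 2 for wi, b in zip(w, betas))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    w_star = [1.0 / (s ** 2 + tau2) for s in ses]
    sw_star = sum(w_star)
    pooled = sum(wi * b for wi, b in zip(w_star, betas)) / sw_star
    return pooled, math.sqrt(1.0 / sw_star), tau2

# Hypothetical standardized effects of the NonCog polygenic score on one
# outcome in six cohorts.
betas = [0.21, 0.25, 0.18, 0.23, 0.27, 0.20]
ses = [0.02, 0.03, 0.05, 0.02, 0.03, 0.04]
pooled, se, tau2 = random_effects_meta(betas, ses)
print(f"pooled={pooled:.3f}, se={se:.3f}, tau2={tau2:.5f}")
```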

Biological annotation

Enrichment of tissue-specific gene expression. We used gene-sets defined in Finucane et al.[104] to test for the enrichment of genes specifically expressed in one of 53 GTEx tissues[70], or 152 tissues captured by the Franke et al. aggregation of RNA-seq studies[71,72]. This analysis seeks to confirm the role of brain tissues in mediating Cog and NonCog influences on educational attainment. The exact analysis pipeline used is available online (https://github.com/bulik/ldsc/wiki/Cell-type-specific-analyses).

Enrichment of cell-type specific expression. We leveraged single cell RNA sequencing (scRNA-seq) data of cells sampled from the mouse nervous system[75] to identify cell-type specific RNA expression. Zeisel et al.[75] sequenced cells obtained from 19 regions in the contiguous anatomical regions in the peripheral sensory, enteric, and sympathetic nervous system. After initial QC, they retained 492,949 cells, which were sampled down to 160,796 high quality cells. These cells were further grouped into clusters representing 265 broad cell-types. We analyzed the dataset published by Zeisel et al. containing mean transcript counts for all genes with count >1 for each of the 265 clusters. We restricted analysis to genes with expression levels above the 25th percentile. For each gene in each cell-type, we computed the cell-type specific proportion of reads for the gene (normalizing the expression within cell-type). We then computed the proportion of proportions over the 265 cell-types (computing the specificity of the gene to a specific cell-type). We ranked the 12,119 genes retained in terms of specificity to each cell-type and then retained the 10% of genes most specific to a cell-type as the “cell-type specific” gene-set. We then tested whether any of the 265 cell-type specific gene-sets were enriched in the Cog or NonCog GWAS. This analysis sought to identify specific cell-types and specific regions in the brain involved in the etiology of Cog and NonCog.
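The construction of the “cell-type specific” gene-sets can be sketched as below, assuming the input is a table of mean transcript counts per gene and cell type. The data structure, function names and toy values are hypothetical, and the preliminary filter on expression above the 25th percentile is omitted for brevity.

```python
def specificity(counts):
    """Specificity of each gene to each cell type.

    counts: {gene: {cell_type: mean transcript count}}
    Step 1: normalize within cell type (proportion of that type's reads).
    Step 2: divide by the gene's summed proportions across cell types.
    """
    cell_types = sorted({c for row in counts.values() for c in row})
    totals = {c: sum(row.get(c, 0.0) for row in counts.values())
              for c in cell_types}
    spec = {}
    for gene, row in counts.items():
        prop = {c: (row.get(c, 0.0) / totals[c] if totals[c] > 0 else 0.0)
                for c in cell_types}
        total_prop = sum(prop.values())
        spec[gene] = {c: (prop[c] / total_prop if total_prop > 0 else 0.0)
                      for c in cell_types}
    return spec

def top_specific_genes(spec, cell_type, fraction=0.10):
    """Genes in the top `fraction` of specificity for one cell type."""
    ranked = sorted(spec, key=lambda g: spec[g][cell_type], reverse=True)
    n_keep = max(1, int(len(ranked) * fraction))
    return set(ranked[:n_keep])

# Toy example with two genes and two cell types.
toy = {
    "geneA": {"neuron": 90.0, "glia": 10.0},
    "geneB": {"neuron": 10.0, "glia": 90.0},
}
spec = specificity(toy)
print(spec["geneA"]["neuron"])
print(top_specific_genes(spec, "glia", fraction=0.5))
```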
We further computed the difference in enrichment for Cog and NonCog to test whether any cell types were specific to either trait. For these analyses, we leveraged two widely used enrichment analysis tools, MAGMA[73] and stratified LD score regression[74], with the European reference panel from 1000 Genomes Project Phase 3 as the SNP location and LD structure reference, Gencode release 19 as the gene location reference and the human-mouse homology reference from MGI (http://www.informatics.jax.org/downloads/reports/HOM_MouseHumanSequence.rpt).

MAGMA. We used MAGMA (v1.07b[73]), a program for gene-set analysis based on GWAS summary statistics. We computed gene-level association statistics using a window of 10 kb around the gene for both Cog and NonCog. We then used MAGMA to run a competitive gene-set analysis, using the gene P-values and gene correlation matrix (reflecting LD structure) produced in the gene-level analysis. The competitive gene-set analysis tests whether the genes within the cell-type-specific gene-set described above are more strongly associated with Cog/NonCog than other genes.

Stratified LD-score regression. We used LD-score regression to compute LD scores for the SNPs in each of our “cell-type specific” gene-sets. Parallel to the MAGMA analysis, we added a 10-kb window around each gene. We ran partitioned LD-score regression to compute the contribution of each gene-set to the heritability of Cog and NonCog. To guard against inflation, we used LD score best practices, and included the LD score baseline model (baselineLD.v2.2) in the analysis. We judged the statistical significance of the enrichment based on the P-value associated with the tau coefficient.

Difference in enrichment between Cog and NonCog.
To compute differences in enrichment, we computed a standardized difference between the per-annotation enrichment for Cog and NonCog as:

$$Z_{\mathrm{diff}} = \frac{e_{\mathrm{Cog}} - e_{\mathrm{NonCog}}}{\sqrt{se_{\mathrm{Cog}}^{2} + se_{\mathrm{NonCog}}^{2} - 2 \times \mathrm{CTI} \times se_{\mathrm{Cog}} \times se_{\mathrm{NonCog}}}} \qquad (1)$$

where $e_{\mathrm{Cog}}$ is the enrichment of a particular gene-set for Cog, $e_{\mathrm{NonCog}}$ is the enrichment for the same gene-set for NonCog, $se_{\mathrm{Cog}}$ is the standard error of the enrichment for Cog, $se_{\mathrm{NonCog}}$ is the standard error of the enrichment for NonCog, and CTI is the LD score cross-trait intercept, a metric of dependence between the GWASs of Cog and NonCog. We investigated the significance of the difference between the Cog and NonCog tau coefficients using Equation 1 as well as by computing jackknifed standard errors. From the jackknifed estimates of the coefficient output by the LDSC software, we computed the jackknifed estimates and standard errors of the difference between Cog and NonCog tau coefficients, as well as a z-statistic for each annotation.

Enrichment of gene expression in the brain. We performed a transcriptome-wide association study (TWAS) using FUSION[76] (http://gusevlab.org/projects/fusion/). We used pre-computed brain-gene-expression weights available on the FUSION website, generated from 452 human individuals as part of the CommonMind Consortium. We then superimposed the bivariate distribution of the results of the TWAS for Cog and NonCog over the bivariate distribution expected given the sample overlap between educational attainment and cognitive performance (the GWASs on which our GWASs of Cog and NonCog are based).
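The standardized difference in enrichment between Cog and NonCog can be computed as in the sketch below. It assumes the usual form for the difference of two dependent estimates, with the cross-trait intercept (CTI) standing in for the correlation between the two estimates; the function names and input values are hypothetical.

```python
import math

def enrichment_z_diff(e_cog, e_noncog, se_cog, se_noncog, cti):
    """Standardized difference between two dependent enrichment estimates.

    Assumed form: the variance of the difference is
    se_cog^2 + se_noncog^2 - 2 * CTI * se_cog * se_noncog.
    """
    var_diff = se_cog ** 2 + se_noncog ** 2 - 2.0 * cti * se_cog * se_noncog
    if var_diff <= 0:
        raise ValueError("non-positive variance for the difference")
    return (e_cog - e_noncog) / math.sqrt(var_diff)

def two_sided_p(z):
    """Two-sided P-value for a standard normal z-statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# Hypothetical enrichment estimates for one cell-type gene-set.
z = enrichment_z_diff(e_cog=1.5, e_noncog=1.0, se_cog=0.3, se_noncog=0.4, cti=0.5)
print(f"z={z:.3f}, P={two_sided_p(z):.3f}")
```

A positive CTI shrinks the variance of the difference, so ignoring the dependence between the two GWASs would make the test conservative.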

Brain modalities

Brain volumes. We conducted genetic correlation analysis of brain volumes using GWAS results published by Zhao et al.[79], who performed GWAS of total brain volume and 100 regional brain volumes, including 99 gray matter volumes and total white matter volume. Analyses included covariate adjustment for sex, age, their square interaction and 20 principal components. Analyses of regional brain volumes additionally included covariate adjustment for total brain volume. GWAS summary statistics for these 101 brain volumes were obtained from https://med.sites.unc.edu/bigs2/data/gwas-summary-statistics/. Summary statistics were filtered and pre-processed using Genomic-SEM’s “munge” function, retaining all HapMap3 SNPs with allele frequency > 0.01 outside the MHC region. We used Genomic-SEM to compute the genetic correlations between Cog, NonCog and brain volumes. Analyses of regional volumes controlled for total brain volume. For each volume, we tested whether correlations differed between Cog and NonCog. Specifically, we used a chi-squared test to evaluate the null hypothesis that the two genetic correlations were equal. We used FDR adjustment to correct for multiple testing. The FDR adjustment was applied to the results for all gray matter volumes for Cog and NonCog separately.

White matter structures. We conducted genetic-correlation analysis of white-matter structures using GWAS results published by Zhao et al.[80], who performed GWAS of diffusion tensor imaging (DTI) measures of the integrity of white-matter tracts. DTI parameters were derived for fractional anisotropy (FA), mean diffusivity (MD), axial diffusivity (AD), radial diffusivity (RD), and mode of anisotropy (MO). Each of these parameters was measured for 22 white matter tracts of interest, resulting in 110 GWAS. GWAS summary statistics for these 110 GWAS were obtained from https://med.sites.unc.edu/bigs2/data/gwas-summary-statistics/.
Summary statistics were filtered and processed using Genomic-SEM’s “munge” function, retaining all HapMap3 SNPs with allele frequency > 0.01 outside the MHC region. For each white matter structure, we tested whether genetic correlations differed between Cog and NonCog. Specifically, we used a chi-squared test to evaluate the null hypothesis that the two genetic correlations were equal. We used FDR adjustment to correct for multiple testing. As these different diffusion parameters are statistically and logically interdependent, having been derived from the same tensor, FDR adjustment was applied to the results for each type of white matter diffusion parameter separately. FDR correction was applied separately for Cog and NonCog.
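The SNP filter applied before computing genetic correlations (HapMap3 SNPs, allele frequency > 0.01, outside the MHC region) can be sketched as below. This is not Genomic-SEM’s “munge” function. The MHC boundaries (chromosome 6, 25 to 35 Mb) are an assumed, build-dependent convention, the frequency threshold is applied to the minor allele, and the record layout and SNP identifiers are hypothetical.

```python
# Assumed MHC boundaries; the text does not give coordinates.
MHC_CHROM = "6"
MHC_START = 25_000_000
MHC_END = 35_000_000

def keep_for_rg(snp, hapmap3_ids, min_maf=0.01):
    """Return True if a SNP passes the filter described in the text.

    snp: dict with keys "id", "chr", "pos" and "freq" (effect-allele frequency).
    hapmap3_ids: set of SNP identifiers in the HapMap3 reference list.
    """
    in_mhc = snp["chr"] == MHC_CHROM and MHC_START <= snp["pos"] <= MHC_END
    maf = min(snp["freq"], 1.0 - snp["freq"])
    return snp["id"] in hapmap3_ids and maf > min_maf and not in_mhc

# Hypothetical records.
hapmap3 = {"rs1", "rs2", "rs3"}
snps = [
    {"id": "rs1", "chr": "1", "pos": 1_000_000, "freq": 0.30},   # kept
    {"id": "rs2", "chr": "6", "pos": 30_000_000, "freq": 0.30},  # in MHC
    {"id": "rs3", "chr": "2", "pos": 5_000_000, "freq": 0.995},  # MAF too low
    {"id": "rs4", "chr": "3", "pos": 7_000_000, "freq": 0.40},   # not HapMap3
]
print([s["id"] for s in snps if keep_for_rg(s, hapmap3)])
```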

Additional Resources

A FAQ on why, how and what we studied is available here: https://medium.com/@kph3k/investigating-the-genetic-architecture-of-non-cognitive-skills-using-gwas-by-subtraction-b8743773ce44
A tutorial on how to perform GWAS-by-subtraction: http://rpubs.com/MichelNivard/565885
Additional resources for the Genomic SEM software:
A wiki including numerous tutorials: https://github.com/MichelNivard/GenomicSEM/wiki
A Genomic SEM user group for specific questions relating to models and software: https://groups.google.com/g/genomic-sem-users
A venue to report technical issues: https://github.com/MichelNivard/GenomicSEM/issues