Oleksandr Frei, Dominic Holland, Olav B Smeland, Alexey A Shadrin, Chun Chieh Fan, Steffen Maeland, Kevin S O'Connell, Yunpeng Wang, Srdjan Djurovic, Wesley K Thompson, Ole A Andreassen, Anders M Dale.
Abstract
Accumulating evidence from genome-wide association studies (GWAS) suggests an abundance of shared genetic influences among complex human traits and disorders, such as mental disorders. Here we introduce a statistical tool, MiXeR, which quantifies polygenic overlap irrespective of genetic correlation, using GWAS summary statistics. MiXeR results are presented as a Venn diagram of unique and shared polygenic components across traits. At 90% of SNP-heritability explained for each phenotype, MiXeR estimates that 8.3 K variants causally influence schizophrenia and 6.4 K influence bipolar disorder. Among these variants, 6.2 K are shared between the disorders, which have a high genetic correlation. Further, MiXeR uncovers polygenic overlap between schizophrenia and educational attainment. Despite a genetic correlation close to zero, the phenotypes share 8.3 K causal variants, while 2.5 K additional variants influence only educational attainment. By considering the polygenicity, discoverability and heritability of complex phenotypes, MiXeR analysis may improve our understanding of cross-trait genetic architectures.
Year: 2019 PMID: 31160569 PMCID: PMC6547727 DOI: 10.1038/s41467-019-10310-0
Source DB: PubMed Journal: Nat Commun ISSN: 2041-1723 Impact factor: 14.919
Fig. 1 Components of the bivariate mixture in three scenarios of polygenic overlap. All figures are generated from synthetic data, where causal variants were drawn from the MiXeR model, the total polygenicity in each trait is set to 0.01%, SNP heritability is set to 0.4, and GWAS N = 100,000. The first column shows two traits whose causal variants do not overlap. The second column adds a component of causal variants affecting both traits in the same (concordant) direction. The third column shows a scenario of polygenic overlap without genetic correlation. The top row shows the simulated bivariate density of additive effects of allele substitution (β1, β2); the bottom row shows the bivariate density of GWAS signed test statistics (z1, z2) for GWAS SNPs (genotyped or imputed). Due to linkage disequilibrium, a substantially larger number of SNPs appear associated with the phenotype at the level of GWAS signed test statistics than at the causal level. The aim of the MiXeR model is to infer the distribution of causal effects (top row), using GWAS data (bottom row) as input. Figures are generated on a regular grid of 100 × 100 bins; the color histogram indicates log10(N), where N is the number of SNPs projected into a bin.
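The three scenarios in Fig. 1 can be illustrated with a small simulation. The following is a minimal sketch, not the MiXeR implementation: it assumes each SNP falls into one of four components (null, trait 1 only, trait 2 only, shared) and that effects in the shared component are bivariate normal with correlation `rho12`. The function name, the split of the 0.01% polygenicity between unique and shared components, the value `rho12 = 0.9`, and the unit effect-size scales are hypothetical choices made for illustration.

```python
import math
import random


def simulate_causal_effects(m, pi1, pi2, pi12, rho12,
                            sigma1=1.0, sigma2=1.0, seed=0):
    """Draw (beta1, beta2) for m SNPs from a four-component mixture.

    pi1, pi2 : fraction of SNPs causal for trait 1 only / trait 2 only
    pi12     : fraction of SNPs causal for both traits
    rho12    : correlation of effect sizes within the shared component
    sigma1/2 : effect-size standard deviations (arbitrary units here)
    """
    rng = random.Random(seed)
    effects = []
    for _ in range(m):
        u = rng.random()
        if u < pi12:
            # Shared component: correlated bivariate normal effects.
            z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
            b1 = sigma1 * z1
            b2 = sigma2 * (rho12 * z1 + math.sqrt(1.0 - rho12 ** 2) * z2)
        elif u < pi12 + pi1:
            b1, b2 = rng.gauss(0.0, sigma1), 0.0   # trait 1 only
        elif u < pi12 + pi1 + pi2:
            b1, b2 = 0.0, rng.gauss(0.0, sigma2)   # trait 2 only
        else:
            b1, b2 = 0.0, 0.0                      # null SNP
        effects.append((b1, b2))
    return effects


# Hypothetical parameter sets; each keeps total polygenicity per trait
# (unique + shared) at 1e-4, i.e. the 0.01% quoted in the caption.
SCENARIOS = {
    "no overlap": dict(pi1=1e-4, pi2=1e-4, pi12=0.0, rho12=0.0),
    "concordant overlap": dict(pi1=5e-5, pi2=5e-5, pi12=5e-5, rho12=0.9),
    "overlap without correlation": dict(pi1=5e-5, pi2=5e-5, pi12=5e-5, rho12=0.0),
}
```

For example, `simulate_causal_effects(1_000_000, **SCENARIOS["no overlap"])` yields on the order of 100 causal variants per trait, none of them shared. Turning these causal effects into GWAS z-scores (the bottom row of the figure) additionally requires an LD model and sampling noise, which this sketch leaves out.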
Fig. 2 Selected simulations with the bivariate model: (a) the estimates of polygenic overlap; (b) the estimates of correlation of the effect sizes in the shared polygenic component; (c) the estimates of genetic correlation. The bars in blue indicate the average value of model estimates across ten simulation runs. The bars in cyan show the true (simulated) parameters. Error bars represent the standard deviation of the model estimate across ten simulation runs. Individual simulation runs are shown as dots. Different bars correspond to levels of polygenic overlap, from zero (no overlap) to complete polygenic overlap. Simulated heritability is 0.4; the simulated fraction of causal variants is 0.03% in both traits.
The results of cross-trait analysis with the MiXeR model for schizophrenia (SCZ), bipolar disorder (BIP), educational attainment (EDU) and height
| Trait 1 | Trait 2 | n12 (se) | n1 (se) | n2 (se) | ρ12 (se) | rg (se) | rgLDSR (se) |
|---|---|---|---|---|---|---|---|
| SCZ | BIP | 6.19 (0.99) | 2.10 (1.26) | 0.21 (0.44) | 0.853 (0.019) | 0.725 (0.071) | 0.725 (0.024) |
| SCZ | EDU | 8.29 (0.84) | 0.00 (0.04) | 2.54 (1.02) | 0.071 (0.015) | 0.062 (0.014) | 0.079 (0.022) |
| SCZ | Height | 0.83 (0.10) | 7.46 (0.87) | 2.29 (0.12) | −0.045 (0.060) | −0.007 (0.010) | −0.008 (0.019) |
| BIP | EDU | 5.72 (1.46) | 0.68 (1.16) | 5.11 (1.58) | 0.278 (0.051) | 0.191 (0.036) | 0.188 (0.023) |
| BIP | Height | 0.83 (0.11) | 5.57 (1.11) | 2.29 (0.13) | 0.001 (0.067) | 0.000 (0.013) | −0.014 (0.021) |
| EDU | Height | 1.76 (0.11) | 9.07 (0.58) | 1.37 (0.10) | 0.519 (0.040) | 0.157 (0.010) | 0.141 (0.012) |
Columns: n12 – estimated number of shared causal variants, reported in thousands; n1 (n2) – estimated number of causal variants unique to trait 1 (trait 2), reported in thousands; ρ12 – correlation of effect sizes in the shared polygenic component; rg – genetic correlation (see Online Methods); rgLDSR – estimate of genetic correlation from LD Score Regression. The numbers of variants (n12, n1, and n2) are adjusted to explain 90% of heritability in the corresponding component. Parameters are fitted using approximately 1.1 M HapMap3 SNPs.
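The table can be sanity-checked numerically. The expression for rg is not spelled out in the legend above, so the sketch below assumes rg = ρ12 · n12 / sqrt((n1 + n12)(n2 + n12)), i.e. the shared-component correlation scaled by the shared fraction of each trait's causal variants. This formula is an assumption rather than a quotation from the text, although it reproduces the reported rg column to within rounding. The function and variable names are hypothetical.

```python
import math

# (trait 1, trait 2, n12, n1, n2, rho12, reported rg); counts in thousands,
# copied from the table above.
ROWS = [
    ("SCZ", "BIP",    6.19, 2.10, 0.21,  0.853,  0.725),
    ("SCZ", "EDU",    8.29, 0.00, 2.54,  0.071,  0.062),
    ("SCZ", "Height", 0.83, 7.46, 2.29, -0.045, -0.007),
    ("BIP", "EDU",    5.72, 0.68, 5.11,  0.278,  0.191),
    ("BIP", "Height", 0.83, 5.57, 2.29,  0.001,  0.000),
    ("EDU", "Height", 1.76, 9.07, 1.37,  0.519,  0.157),
]


def genetic_correlation(n12, n1, n2, rho12):
    """Assumed relation between rho12, component sizes, and rg."""
    total1 = n1 + n12  # all causal variants of trait 1
    total2 = n2 + n12  # all causal variants of trait 2
    return rho12 * n12 / math.sqrt(total1 * total2)


def compare_with_table(rows=ROWS):
    """Return (trait 1, trait 2, recomputed rg, reported rg) per row."""
    return [
        (t1, t2, genetic_correlation(n12, n1, n2, rho12), reported)
        for t1, t2, n12, n1, n2, rho12, reported in rows
    ]
```

The same sums connect the table to the abstract: for SCZ versus BIP, n1 + n12 = 2.10 + 6.19 = 8.29 (the 8.3 K variants influencing schizophrenia) and n2 + n12 = 0.21 + 6.19 = 6.40 (the 6.4 K influencing bipolar disorder).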
Fig. 3 Venn diagrams of unique and shared polygenic components at the causal level, showing polygenic overlap (gray) between schizophrenia (SCZ, blue), bipolar disorder (BIP, orange), educational attainment (EDU, green), and height (red). The numbers indicate the estimated quantity of causal variants (in thousands) per component, explaining 90% of SNP heritability in each phenotype, followed by the standard error. The size of the circles reflects the degree of polygenicity.
Fig. 4 The top row shows the bivariate density of the observed GWAS signed test statistics (z1, z2); the middle row shows the predicted density from the MiXeR model. The bottom row shows the estimated bivariate density of additive causal effects (β1, β2) that underlie the model prediction. The three columns represent schizophrenia (SCZ) versus bipolar disorder (BIP), educational attainment (EDU), and height GWAS. Density is visualized using a regular grid of 100 × 100 bins; color indicates log10(N), where N is the observed number (for the top row) or the expected number (for the middle and bottom rows) of SNPs projected into a bin.
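The binning described in the caption is straightforward to reproduce for observed data. The sketch below counts (z1, z2) pairs on a regular square grid and converts non-empty bins to log10(N). The axis limit `lim` is an assumption, since the captions do not state the plotted range, and the function name is hypothetical.

```python
import math


def log10_density_grid(pairs, lim, bins=100):
    """Bin (z1, z2) pairs on a bins x bins grid over [-lim, lim)^2.

    Returns a list of rows (indexed by the z2 bin), each a list over z1
    bins, holding log10(count) for non-empty bins and None for empty ones.
    Pairs outside the plotted range are ignored.
    """
    counts = [[0] * bins for _ in range(bins)]
    scale = bins / (2.0 * lim)
    for z1, z2 in pairs:
        if -lim <= z1 < lim and -lim <= z2 < lim:
            # min() guards against floating-point rounding at the upper edge.
            i = min(bins - 1, int((z1 + lim) * scale))
            j = min(bins - 1, int((z2 + lim) * scale))
            counts[j][i] += 1
    return [[math.log10(c) if c > 0 else None for c in row] for row in counts]
```

Empty bins are returned as `None` rather than negative infinity so that a plotting routine can leave them uncolored.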
Fig. 5 Conditional Q–Q plots of observed versus expected −log10 p-values in the primary trait as a function of significance of association with a secondary trait at the level of p ≤ 0.1 (orange lines), p ≤ 0.01 (green lines), and p ≤ 0.001 (red lines). The blue line indicates all SNPs. Dotted lines in blue, orange, green, and red indicate model predictions for each stratum. The black dotted line is the expected Q–Q plot under the null (no SNPs associated with the phenotype). Points on the Q–Q plot are weighted according to LD structure, using n = 64 iterations of random pruning at LD threshold r2 = 0.1.
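The stratified curves in a conditional Q–Q plot can be computed as follows. This is a simplified, unweighted sketch: it omits the LD-based weighting by random pruning mentioned in the caption, and it uses the common plotting position (rank − 0.5)/n for the expected p-values, which is an assumption rather than something the caption specifies. The function name is hypothetical.

```python
import math


def conditional_qq(p_primary, p_secondary, threshold=1.0):
    """Q-Q points for the primary trait within one secondary-trait stratum.

    Keeps SNPs whose secondary-trait p-value is <= threshold, then pairs
    each sorted primary p-value with its expected value under the null.
    Returns a list of (expected, observed) -log10 p-values; all p-values
    are assumed to be strictly positive.
    """
    selected = sorted(
        p1 for p1, p2 in zip(p_primary, p_secondary) if p2 <= threshold
    )
    n = len(selected)
    points = []
    for rank, p in enumerate(selected, start=1):
        expected = -math.log10((rank - 0.5) / n)
        observed = -math.log10(p)
        points.append((expected, observed))
    return points
```

Calling the function with thresholds 1.0, 0.1, 0.01, and 0.001 gives the four strata of the figure. A curve that deflects further above the diagonal as the threshold tightens indicates enrichment of primary-trait associations among SNPs associated with the secondary trait.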