Peter M Visscher, Gibran Hemani, Anna A E Vinkhuyzen, Guo-Bo Chen, Sang Hong Lee, Naomi R Wray, Michael E Goddard, Jian Yang.
Abstract
We have recently developed analysis methods (GREML) to estimate the genetic variance of a complex trait or disease, and the genetic correlation between two complex traits or diseases, using genome-wide single nucleotide polymorphism (SNP) data from unrelated individuals. Here we use analytical derivations and simulations to quantify the sampling variance of the estimate of the proportion of phenotypic variance captured by all SNPs, for both quantitative traits and case-control studies. We also derive the approximate sampling variance of the estimate of a genetic correlation in a bivariate analysis, when the two traits are measured either on the same or on different individuals. We show that the sampling variance is inversely proportional to the number of pairwise contrasts in the analysis and to the variance in SNP-derived genetic relationships. For bivariate analysis, the sampling variance of the genetic correlation additionally depends on the harmonic mean of the proportions of variance explained by the SNPs for the two traits and on the genetic correlation between the traits; it also depends on the phenotypic correlation when the traits are measured on the same individuals. We provide an online tool for calculating the power to detect genetic (co)variation using genome-wide SNP data. The new theory and online tool will help in planning experimental designs to estimate the missing heritability that has not yet been fully revealed through genome-wide association studies, and to estimate the genetic overlap between complex traits (diseases), in particular when the traits (diseases) are not measured on the same samples.
Year: 2014 PMID: 24721987 PMCID: PMC3983037 DOI: 10.1371/journal.pgen.1004269
Source DB: PubMed Journal: PLoS Genet ISSN: 1553-7390 Impact factor: 5.917
Figure 1. Standard error of the estimate of variance explained by all SNPs vs. sample size.
The first three columns are the averaged standard error observed from 100 simulations under three heritability levels. The last column is the predicted standard error from our approximation theory. The plotted data can be found in Table S1.
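The predicted standard error in Figure 1 rests on the relationship stated in the abstract: the sampling variance is inversely proportional to the number of pairwise contrasts and to the variance in SNP-derived genetic relationships. The sketch below turns that statement into a small calculator. It is an illustration under assumptions that are not given in this record: the proportionality constant is taken to be 1, the default variance of the off-diagonal relationships (`var_rel = 2e-5`) is an assumed typical value for common SNPs in unrelated individuals, and `approx_se_h2` is a hypothetical helper name.

```python
import math

def approx_se_h2(n: int, var_rel: float = 2e-5) -> float:
    """Approximate SE of the SNP-based variance-explained estimate.

    Assumes var(estimate) = 1 / (number of pairs * var_rel), where
    var_rel is the variance of the off-diagonal genetic relationships.
    Both the constant and the default var_rel are assumptions.
    """
    n_pairs = n * (n - 1) / 2  # number of pairwise contrasts
    return math.sqrt(1.0 / (n_pairs * var_rel))

# Predicted SE shrinks roughly as 1/N because the number of pairs grows as N^2.
for n in (2000, 4000, 6000, 8000, 10000):
    print(f"N = {n:>6}: predicted SE ~ {approx_se_h2(n):.3f}")
```

Under these assumptions the standard error is close to 316/N, so doubling the sample size roughly halves the standard error rather than reducing it by a factor of the square root of two.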
Standard error of the estimate of genetic correlation from a bivariate analysis of two traits measured on the same or different samples using genome-wide SNP data.
| Same sample | | | | | Different samples | | | | | |
| Sample size | Est. | SE (Obs.) | s.e.m. | SE (Approx.) | Sample size 1 | Sample size 2 | Est. | SE (Obs.) | s.e.m. | SE (Approx.) |
| 4000 | 0.00 | 0.128 | 0.0024 | 0.114 | 1000 | 3000 | 0.04 | 0.288 | 0.0126 | 0.264 |
| 6000 | 0.00 | 0.085 | 0.0009 | 0.076 | 2000 | 4000 | 0.00 | 0.191 | 0.0062 | 0.161 |
| 8000 | −0.01 | 0.065 | 0.0006 | 0.057 | 3000 | 5000 | −0.01 | 0.129 | 0.0021 | 0.118 |
| 10000 | 0.00 | 0.053 | 0.0004 | 0.046 | 4000 | 6000 | −0.01 | 0.103 | 0.0014 | 0.093 |
| | | | | | | | | | | |
| 4000 | 0.42 | 0.112 | 0.0019 | 0.108 | 1000 | 3000 | 0.32 | 0.295 | 0.0124 | 0.309 |
| 6000 | 0.39 | 0.076 | 0.0009 | 0.072 | 2000 | 4000 | 0.39 | 0.230 | 0.0141 | 0.182 |
| 8000 | 0.38 | 0.057 | 0.0006 | 0.054 | 3000 | 5000 | 0.37 | 0.136 | 0.0036 | 0.131 |
| 10000 | 0.39 | 0.046 | 0.0005 | 0.043 | 4000 | 6000 | 0.38 | 0.107 | 0.0015 | 0.103 |
| | | | | | | | | | | |
| 4000 | 0.80 | 0.081 | 0.0024 | 0.091 | 1000 | 3000 | 0.62 | 0.417 | 0.0274 | 0.418 |
| 6000 | 0.80 | 0.056 | 0.0017 | 0.061 | 2000 | 4000 | 0.83 | 0.248 | 0.0127 | 0.232 |
| 8000 | 0.80 | 0.042 | 0.0012 | 0.046 | 3000 | 5000 | 0.86 | 0.198 | 0.0069 | 0.164 |
| 10000 | 0.79 | 0.036 | 0.0010 | 0.036 | 4000 | 6000 | 0.83 | 0.133 | 0.0034 | 0.127 |
Same sample: the two traits are measured on the same set of samples. Different samples: the two traits are measured on different sets of samples. rG: parameter of genetic correlation (i.e. proportion of simulated causal variants shared between the two traits). Est.: estimate of genetic correlation from 100 simulations. SE(Obs.): mean of the observed standard errors from 100 simulations. s.e.m.: standard error of that mean (i.e. of SE(Obs.)). SE(Approx.): standard error calculated from our approximation theory.
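One way to read the table against the abstract's claim is to check how the standard error scales with sample size. If the sampling variance is inversely proportional to the number of pairwise contrasts, which grows roughly as the square of the sample size, then the product of sample size and standard error should stay roughly constant. The sketch below checks this for the same-sample rows of the first block of the table (estimates near zero); the variable names are illustrative only.

```python
# (sample size, SE(Obs.), SE(Approx.)) for the same-sample design,
# first block of rows in the table (estimated correlation close to 0).
rows = [
    (4000, 0.128, 0.114),
    (6000, 0.085, 0.076),
    (8000, 0.065, 0.057),
    (10000, 0.053, 0.046),
]

# If SE is proportional to 1/N, then N * SE is constant across rows.
obs_products = [n * se_obs for n, se_obs, _ in rows]
approx_products = [n * se_approx for n, _, se_approx in rows]

spread_obs = max(obs_products) / min(obs_products)
spread_approx = max(approx_products) / min(approx_products)

print("N * SE(Obs.):   ", [round(p) for p in obs_products])
print("N * SE(Approx.):", [round(p) for p in approx_products])
print(f"max/min ratio: observed {spread_obs:.3f}, approximate {spread_approx:.3f}")
```

For these rows the products vary by only a few percent, and the observed standard errors sit somewhat above the approximate ones at every sample size.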
Standard errors of the estimates of variance explained by all SNPs on the observed scale from published analyses of case-control studies for a number of diseases vs. those predicted from the approximation theory.
| N cases | N controls | Prevalence (K) | Variance explained (observed scale) | SE(Obs.) | SE(Approx.) |
| 1604 | 1953 | 0.001 | 0.851 | 0.088 | 0.089 |
| 3290 | 3849 | 0.020 | 0.364 | 0.049 | 0.044 |
| 3154 | 6981 | 0.080 | 0.231 | 0.036 | 0.031 |
| 9087 | 12171 | 0.010 | 0.410 | 0.015 | 0.015 |
| 6704 | 9031 | 0.010 | 0.441 | 0.021 | 0.020 |
| 9041 | 9381 | 0.150 | 0.177 | 0.017 | 0.017 |
| 3303 | 3428 | 0.010 | 0.310 | 0.046 | 0.047 |
| 4163 | 12040 | 0.050 | 0.253 | 0.020 | 0.020 |
N cases: number of cases. N controls: number of controls. SE(Obs.): reported standard error of the estimate of variance explained by all SNPs on the observed scale, from real data analysis. SE(Approx.): standard error of the same estimate calculated from our approximation theory. MDD: major depressive disorder. ASD: autism spectrum disorders. ADHD: attention-deficit/hyperactivity disorder.
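The SE(Approx.) column can be compared with a simple calculation that uses only the total sample size. The sketch below assumes the sampling variance is 2 / (N² × var_rel), with N the total number of cases and controls and var_rel the variance of the off-diagonal genetic relationships. The value `var_rel = 2e-5` is an assumption, not a number given in this record, and `approx_se_observed` is a hypothetical helper name.

```python
import math

VAR_REL = 2e-5  # assumed variance of SNP-derived relationships (not from the record)

# (N cases, N controls, SE(Approx.) as listed in the table)
table = [
    (1604, 1953, 0.089),
    (3290, 3849, 0.044),
    (3154, 6981, 0.031),
    (9087, 12171, 0.015),
    (6704, 9031, 0.020),
    (9041, 9381, 0.017),
    (3303, 3428, 0.047),
    (4163, 12040, 0.020),
]

def approx_se_observed(n_total: int, var_rel: float = VAR_REL) -> float:
    """Approximate SE on the observed scale, assuming var = 2 / (N^2 * var_rel)."""
    return math.sqrt(2.0 / (n_total ** 2 * var_rel))

predicted = [approx_se_observed(cases + controls) for cases, controls, _ in table]

for (cases, controls, listed), pred in zip(table, predicted):
    print(f"N = {cases + controls:>6}: sketch {pred:.4f}, table {listed:.3f}")
```

Under this assumed value of var_rel, the sketch stays within about 0.0005 of the listed SE(Approx.) for every row, which suggests that on the observed scale the predicted standard error depends on the total sample size and not on the case-control split.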
Figure 2. Standard error (SE) of the estimate of variance explained by all SNPs on the underlying scale from a univariate analysis of a case-control study vs. total number of cases and controls (sample size).
The SE is predicted from the approximation theory given different levels of disease prevalence (K) and proportion of cases in the sample (v).
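To illustrate how prevalence (K) and the proportion of cases in the sample (v) could enter such a prediction, the sketch below rescales an observed-scale standard error to the underlying (liability) scale. It assumes the commonly used liability-threshold transformation, in which the observed-scale quantity is multiplied by K²(1 − K)² / (z² v(1 − v)), with z the height of the standard normal density at the threshold that cuts off a proportion K. That transformation, the value `var_rel = 2e-5`, and the function names are assumptions that do not appear in this record.

```python
import math
from statistics import NormalDist

def liability_scale_factor(K: float, v: float) -> float:
    """Multiplier from observed scale to liability scale (assumed transformation)."""
    std_normal = NormalDist()
    threshold = std_normal.inv_cdf(1.0 - K)  # liability threshold for prevalence K
    z = std_normal.pdf(threshold)            # normal density at the threshold
    return (K * (1.0 - K)) ** 2 / (z ** 2 * v * (1.0 - v))

def approx_se_liability(n_total: int, K: float, v: float, var_rel: float = 2e-5) -> float:
    """Approximate SE on the liability scale for a case-control sample."""
    se_observed = math.sqrt(2.0 / (n_total ** 2 * var_rel))  # assumed observed-scale SE
    return se_observed * liability_scale_factor(K, v)

for K in (0.001, 0.01, 0.1):
    for v in (0.25, 0.5):
        se = approx_se_liability(10000, K, v)
        print(f"K = {K:<5} v = {v:<4} N = 10000: SE ~ {se:.4f}")
```

Because the factor has v(1 − v) in the denominator, a balanced design (v = 0.5) gives the smallest standard error for a fixed total sample size under this sketch.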
Standard errors of the estimates of genetic correlations from published bivariate analyses of case-control studies for psychiatric diseases [7] vs. those predicted from the approximation theory.
| First disease | | | | Second disease | | | | | | |
| K | N cases | N controls | Variance explained (observed scale) | K | N cases | N controls | Variance explained (observed scale) | Genetic correlation | SE (Obs.) | SE (Approx.) |
| 0.01 | 9032 | 7980 | 0.40 | 0.01 | 6664 | 5258 | 0.39 | 0.68 | 0.044 | 0.049 |
| 0.01 | 9051 | 10385 | 0.38 | 0.15 | 8998 | 7823 | 0.16 | 0.43 | 0.055 | 0.057 |
| 0.01 | 9111 | 12146 | 0.41 | 0.01 | 3226 | 3308 | 0.29 | 0.16 | 0.059 | 0.057 |
| 0.01 | 9013 | 10115 | 0.42 | 0.05 | 4108 | 9936 | 0.22 | 0.08 | 0.046 | 0.045 |
| 0.01 | 6665 | 7408 | 0.42 | 0.15 | 8997 | 7680 | 0.17 | 0.47 | 0.061 | 0.063 |
| 0.01 | 6704 | 9030 | 0.43 | 0.01 | 3207 | 3294 | 0.31 | 0.04 | 0.065 | 0.061 |
| 0.01 | 6656 | 7041 | 0.38 | 0.05 | 4099 | 9873 | 0.25 | 0.05 | 0.053 | 0.052 |
| 0.15 | 9031 | 9370 | 0.17 | 0.01 | 3239 | 3331 | 0.31 | 0.05 | 0.089 | 0.090 |
| 0.15 | 8936 | 8668 | 0.16 | 0.05 | 4098 | 11233 | 0.24 | 0.32 | 0.071 | 0.073 |
| 0.01 | 3156 | 3254 | 0.27 | 0.05 | 4181 | 12022 | 0.23 | −0.13 | 0.087 | 0.090 |
SCZ: schizophrenia. BPD: bipolar disorder. MDD: major depressive disorder. ASD: autism spectrum disorders. ADHD: attention-deficit/hyperactivity disorder. N cases: number of cases. N controls: number of controls. K: disease prevalence. The estimate of variance explained by all SNPs on the observed scale was calculated from the reported estimate and the disease prevalence in Supplementary Table 1 of Lee et al. [7]. SE(Obs.): reported standard error of the estimate of the genetic correlation from real data analysis. SE(Approx.): standard error of the genetic correlation estimate calculated from our approximation theory.
Figure 3. Statistical power of detecting genetic variance (correlation) under different study designs.
a) Univariate analysis of a quantitative trait. b) Univariate analysis of a case-control study assuming equal numbers of cases and controls (v = 0.5) and a heritability of liability of 0.2. c) Bivariate analysis of two quantitative traits measured on the same set of individuals, assuming a heritability of 0.2 for both traits. d) Bivariate analysis of two case-control studies on independent sets of samples, assuming equal numbers of cases and controls for each disease, and equal sample size (total number of cases and controls), equal heritability of liability (0.2) and equal prevalence (K = 0.01) for both diseases.
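A power curve of the kind shown in panel a) can be sketched from a predicted standard error. The example below is a minimal illustration, not the online tool's implementation: it assumes a two-sided Wald test at a significance level of 0.05, an approximate standard error of sqrt(2 / (N² × var_rel)) with an assumed `var_rel = 2e-5`, and hypothetical function names `power_wald` and `power_h2`.

```python
import math
from statistics import NormalDist

def power_wald(param: float, se: float, alpha: float = 0.05) -> float:
    """Power of a two-sided Wald test of param = 0, given the true param and its SE."""
    std_normal = NormalDist()
    crit = std_normal.inv_cdf(1.0 - alpha / 2.0)  # critical value of the z statistic
    delta = param / se                            # expected z statistic
    # The estimate / SE is approximately N(delta, 1); reject when |z| > crit.
    return (1.0 - std_normal.cdf(crit - delta)) + std_normal.cdf(-crit - delta)

def power_h2(h2: float, n: int, var_rel: float = 2e-5, alpha: float = 0.05) -> float:
    """Power to detect variance explained h2 in a quantitative trait (assumed SE formula)."""
    se = math.sqrt(2.0 / (n ** 2 * var_rel))
    return power_wald(h2, se, alpha)

for n in (1000, 2000, 3000, 5000, 10000):
    print(f"N = {n:>6}: power to detect h2 = 0.2 ~ {power_h2(0.2, n):.2f}")
```

The same `power_wald` helper applies to a genetic correlation once a standard error for that estimate is supplied, so the bivariate panels differ from the univariate ones only in how the standard error is predicted.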