Robert Maier, Gerhard Moser, Guo-Bo Chen, Stephan Ripke, William Coryell, James B Potash, William A Scheftner, Jianxin Shi, Myrna M Weissman, Christina M Hultman, Mikael Landén, Douglas F Levinson, Kenneth S Kendler, Jordan W Smoller, Naomi R Wray, S Hong Lee.
Abstract
Genetic risk prediction has several potential applications in medical research and clinical practice and could be used, for example, to stratify a heterogeneous population of patients by their predicted genetic risk. However, for polygenic traits, such as psychiatric disorders, the accuracy of risk prediction is low. Here we use a multivariate linear mixed model and apply multi-trait genomic best linear unbiased prediction for genetic risk prediction. This method exploits correlations between disorders and simultaneously evaluates individual risk for each disorder. We show that the multivariate approach significantly increases the prediction accuracy for schizophrenia, bipolar disorder, and major depressive disorder in the discovery as well as in independent validation datasets. By grouping SNPs based on genome annotation and fitting multiple random effects, we show that the prediction accuracy could be further improved. The gain in prediction accuracy of the multivariate approach is equivalent to an increase in sample size of 34% for schizophrenia, 68% for bipolar disorder, and 76% for major depressive disorder using single-trait models. Because our approach can be readily applied to any number of GWAS datasets of correlated traits, it is a flexible and powerful tool to maximize prediction accuracy. With current sample sizes, risk predictors are not useful in a clinical setting but are already a valuable research tool, for example in experimental designs comparing cases with high and low polygenic risk.
Year: 2015 PMID: 25640677 PMCID: PMC4320268 DOI: 10.1016/j.ajhg.2014.12.006
Source DB: PubMed Journal: Am J Hum Genet ISSN: 0002-9297 Impact factor: 11.043
Estimates of SNP Heritability and Genetic Correlations from Multivariate Analysis of Five Psychiatric Disorders
| Disorder or disorder pair | Cases | Controls | SNP heritability / genetic correlation | SE |
| --- | --- | --- | --- | --- |
| SCZ | 8,826 | 6,106 | 0.235 | 0.011 |
| BIP | 5,867 | 3,328 | 0.218 | 0.017 |
| MDD | 8,770 | 6,506 | 0.286 | 0.023 |
| ASD | 3,086 | 3,163 | 0.130 | 0.024 |
| ADHD | 3,997 | 8,479 | 0.281 | 0.022 |
| BIP/SCZ | 5,867/8,826 | 3,328/6,106 | 0.590 | 0.048 |
| MDD/SCZ | 8,770/8,826 | 6,506/6,106 | 0.365 | 0.047 |
| MDD/BIP | 8,770/5,867 | 6,506/3,328 | 0.371 | 0.060 |
| ASD/SCZ | 3,086/8,826 | 3,163/6,106 | 0.194 | 0.071 |
| ASD/BIP | 3,086/5,867 | 3,163/3,328 | 0.084 | 0.089 |
| ASD/MDD | 3,086/8,770 | 3,163/6,506 | 0.054 | 0.089 |
| ADHD/SCZ | 3,997/8,826 | 8,479/6,106 | 0.055 | 0.046 |
| ADHD/BIP | 3,997/5,867 | 8,479/3,328 | 0.160 | 0.059 |
| ADHD/MDD | 3,997/8,770 | 8,479/6,506 | 0.242 | 0.059 |
| ADHD/ASD | 3,997/3,086 | 8,479/3,163 | −0.044 | 0.088 |
Abbreviations are as follows: SE, standard error; SCZ, schizophrenia; BIP, bipolar disorder; MDD, major depressive disorder; ASD, autism spectrum disorder; ADHD, attention deficit hyperactivity disorder.
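To help read the estimates against their standard errors, the following is a minimal sketch that turns an estimate and its SE into an approximate 95% interval. It assumes the last two columns of the table are the estimate and its standard error and that a normal approximation is adequate; the function name `wald_interval` is illustrative and this is not the authors' own inference procedure.

```python
# Approximate 95% Wald intervals: estimate +/- 1.96 * SE.
# Assumption: estimates are roughly normally distributed. This is an
# illustration only, not the procedure used in the paper.
Z_95 = 1.96


def wald_interval(estimate, se, z=Z_95):
    """Return (lower, upper) bounds of a normal-approximation interval."""
    return (estimate - z * se, estimate + z * se)


# Three genetic correlations and their SEs, copied from the table above.
genetic_correlations = {
    "BIP/SCZ": (0.590, 0.048),
    "ASD/BIP": (0.084, 0.089),
    "ADHD/ASD": (-0.044, 0.088),
}

for pair, (rg, se) in genetic_correlations.items():
    lower, upper = wald_interval(rg, se)
    excludes_zero = lower > 0 or upper < 0
    print(f"{pair}: {rg:+.3f} [{lower:+.3f}, {upper:+.3f}] "
          f"excludes zero: {excludes_zero}")
```

Under this approximation, an interval that spans zero suggests the corresponding genetic correlation is not clearly distinguishable from zero at these sample sizes.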
Numbers of Cases and Controls in the Independent Validation Data Sets before and after Removing Related Individuals
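| | SCZ cases | SCZ controls | BIP cases | BIP controls | MDD cases | MDD controls |
| --- | --- | --- | --- | --- | --- | --- |
| All | 5,193 | 6,391 | 2,208 | 6,056 | 831 | 474 |
| After cut-off QC | 4,068 | 5,471 | 2,029 | 5,338 | 822 | 466 |
| Number of SNPs | 745,631 (SCZ) | | 645,237 (BIP) | | 673,109 (MDD) | |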
Abbreviations are as follows: SCZ, Swedish schizophrenia GWAS; BIP, Swedish bipolar disorder GWAS; MDD, GENRED2 GWAS.
Prediction Accuracy for Schizophrenia, Bipolar Disorder, and Major Depressive Disorder in Independent Validation Data Sets
| Method | Correlation (SCZ) | Correlation (BIP) | Correlation (MDD) | Regression slope (SCZ) | Regression slope (BIP) | Regression slope (MDD) |
| --- | --- | --- | --- | --- | --- | --- |
| STGBLUP | 0.198 | 0.129 | 0.045 | 0.784 | 0.709 | 0.304 |
| MTGBLUP | 0.222 | 0.159 | 0.075 | 0.815 | 0.697 | 0.466 |
| STGBLUP-CNS | 0.203 | 0.132 | 0.045 | 0.789 | 0.719 | 0.306 |
| MTGBLUP-CNS | 0.224 | 0.162 | 0.076 | 0.807 | 0.690 | 0.476 |
Prediction accuracy is given as the correlation coefficient between the observed disease status and the predicted genomic risk score in the validation data. A regression slope that deviates from one reflects the degree of bias of the risk scores.
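The two quantities reported in this table can be sketched as follows. This is a minimal illustration with made-up toy data, assuming disease status is coded 0/1 and that the regression is of observed status on the predicted score; the function names are illustrative and the direction of the regression is an assumption, since it is not stated here.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def regression_slope(predictor, outcome):
    """Ordinary least squares slope of outcome regressed on predictor."""
    n = len(predictor)
    mean_p, mean_o = sum(predictor) / n, sum(outcome) / n
    spo = sum((p - mean_p) * (o - mean_o) for p, o in zip(predictor, outcome))
    spp = sum((p - mean_p) ** 2 for p in predictor)
    return spo / spp


# Hypothetical toy data: 0 = control, 1 = case, with a made-up risk score.
status = [0, 0, 1, 0, 1, 1, 0, 1]
risk_score = [-0.8, -0.3, -0.1, 0.0, 0.2, 0.5, 0.6, 0.9]

print("prediction accuracy (r):", round(pearson_r(status, risk_score), 3))
print("regression slope:", round(regression_slope(risk_score, status), 3))
```

A slope below one would indicate that the scores are spread too widely relative to the outcome, and a slope above one that they are too compressed.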
p Values from the Likelihood Ratio Test Comparing Different Models
| x1 | x2 | SCZ | BIP | MDD |
| --- | --- | --- | --- | --- |
| STGBLUP | MTGBLUP | 2.4 × 10⁻²⁴ | 6.6 × 10⁻¹⁶ | 1.0 × 10⁻² |
| STGBLUP | STGBLUP-CNS | 9.1 × 10⁻⁶ | 4.6 × 10⁻³ | 5.8 × 10⁻¹ |
| MTGBLUP | MTGBLUP-CNS | 2.4 × 10⁻³ | 5.3 × 10⁻³ | 3.3 × 10⁻¹ |
| STGBLUP | MTGBLUP-CNS | 6.7 × 10⁻²⁶ | 1.3 × 10⁻¹⁷ | 7.3 × 10⁻³ |
Likelihood ratio LR = −2 [logL(x1) − logL(x1 + x2)], where logL(x1) is the log likelihood from a logistic regression with case-control status as the dependent variable and x1 as the only explanatory variable, and logL(x1 + x2) is the log likelihood from the same regression with both x1 and x2 as explanatory variables.
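The likelihood ratio statistic defined above can be converted to a p value as in the following sketch. It assumes the two log likelihoods have already been obtained from fitted logistic regressions and that the statistic is compared with a chi-square distribution with one degree of freedom (one added predictor, x2); the degrees of freedom and the example log-likelihood values are assumptions for illustration only.

```python
import math


def likelihood_ratio_test(loglik_x1, loglik_x1_x2):
    """Return (LR, p) for nested models differing by one predictor.

    LR = -2 * [logL(x1) - logL(x1 + x2)].
    Assumption: LR follows a chi-square distribution with 1 degree of
    freedom, whose upper tail probability is erfc(sqrt(LR / 2)).
    """
    lr = -2.0 * (loglik_x1 - loglik_x1_x2)
    # Guard against tiny negative values caused by numerical noise.
    p_value = math.erfc(math.sqrt(max(lr, 0.0) / 2.0))
    return lr, p_value


# Hypothetical log likelihoods, not values from the paper.
lr, p = likelihood_ratio_test(loglik_x1=-100.0, loglik_x1_x2=-98.0)
print(f"LR = {lr:.2f}, p = {p:.4f}")
```

Using the closed form for one degree of freedom keeps the sketch free of external dependencies; a general chi-square survival function would be needed if more than one predictor were added.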
Figure 1. Odds Ratios of Individuals Stratified into Deciles Based on GBLUP Genetic Risk in Independent Samples, Using the Decile with the Lowest Risk as the Baseline
The vertical error bars denote 95% CI. We note that the estimates for the different methods are highly correlated, and therefore the vertical error bars cannot be used to infer significance of difference between the methods (see Appendix C).
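The decile stratification described in this caption can be sketched as below. This is a minimal illustration, assuming status is coded 0/1 and individuals are split into equal-sized groups by ranked score; the 0.5 continuity correction is an addition made here to avoid division by zero and is not taken from the paper, and the function name is illustrative.

```python
def decile_odds_ratios(scores, status, n_groups=10):
    """Odds ratio of being a case in each score group vs the lowest group.

    scores: predicted genetic risk per individual.
    status: 1 for case, 0 for control, in the same order as scores.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    n = len(order)
    odds = []
    for g in range(n_groups):
        members = order[g * n // n_groups:(g + 1) * n // n_groups]
        cases = sum(status[i] for i in members)
        controls = len(members) - cases
        # Assumption: add 0.5 to both counts so empty cells do not
        # produce a zero or undefined odds.
        odds.append((cases + 0.5) / (controls + 0.5))
    baseline = odds[0]
    return [o / baseline for o in odds]


# Hypothetical toy data: risk of being a case rises with the score.
toy_scores = list(range(100))
toy_status = [1 if i % 10 < i // 10 else 0 for i in range(100)]

for decile, ratio in enumerate(decile_odds_ratios(toy_scores, toy_status), 1):
    print(f"decile {decile}: OR = {ratio:.2f}")
```

Because every group is compared with the same baseline decile, the resulting odds ratios share sampling error, which is one reason their error bars are not independent.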
Figure 2. Theoretical and Observed Prediction Accuracy of STGBLUP and MTGBLUP Depending on Sample Size
The solid line shows the theoretical prediction accuracy as a function of sample size, the red dot shows the observed accuracy achieved by STGBLUP at the actual sample size, and the blue dot shows the observed accuracy achieved by MTGBLUP at the inferred sample size. The increase from MTGBLUP equates to ∼4,660 samples for schizophrenia, ∼5,550 samples for bipolar disorder, and ∼10,940 samples for major depressive disorder. The vertical error bars denote 95% CI. We note that the estimates for the different methods are highly correlated, and therefore the vertical error bars cannot be used to infer significance of difference between the methods (see Appendix C).
| | m | s | y |
| --- | --- | --- | --- |
| m | 1 | 0.927 | 0.222 |
| s | 0.927 | 1 | 0.198 |
| y | 0.222 | 0.198 | 1 |