Robert M Maier, Zhihong Zhu, Sang Hong Lee, Maciej Trzaskowski, Douglas M Ruderfer, Eli A Stahl, Stephan Ripke, Naomi R Wray, Jian Yang, Peter M Visscher, Matthew R Robinson.
Abstract
Genomic prediction has the potential to contribute to precision medicine. However, to date, the utility of such predictors is limited due to low accuracy for most traits. Here, theory and simulation studies are used to demonstrate that widespread pleiotropy among phenotypes can be utilised to improve genomic risk prediction. We show how a genetic predictor can be created as a weighted index that combines published genome-wide association study (GWAS) summary statistics across many different traits. We apply this framework to predict risk of schizophrenia and bipolar disorder in the Psychiatric Genomics Consortium data, finding substantial heterogeneity in prediction accuracy increases across cohorts. For six additional phenotypes in the UK Biobank data, we find increases in prediction accuracy ranging from 0.7% for height to 47% for type 2 diabetes when using a multi-trait predictor that combines published summary statistics from multiple traits, as compared to a predictor based only on one trait.
Year: 2018 PMID: 29515099 PMCID: PMC5841449 DOI: 10.1038/s41467-017-02769-6
Source DB: PubMed Journal: Nat Commun ISSN: 2041-1723 Impact factor: 14.919
Fig. 1 Schematic of the methods. a Data and programs used to create predictors. b Terminology to refer to different types of predictors. OLS, ordinary least squares: the most common GWAS approach, which estimates the effect size of one SNP at a time using linear regression. BLUP, best linear unbiased prediction: SNP effects are estimated simultaneously for all SNPs. The BLUP estimates depend on the other SNPs included in the analysis, since the contribution from correlated SNPs will be shared between them
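The OLS versus BLUP distinction in the Fig. 1 caption can be illustrated with a small numerical sketch. This is not the authors' software: the function names, the toy data, and the shrinkage parameter `lambda = M(1 - h2) / h2` (the usual ridge form for standardised genotypes with an assumed SNP heritability `h2`) are assumptions made for illustration only.

```python
import numpy as np

def ols_marginal_effects(X, y):
    """One-SNP-at-a-time regression slopes (columns of X are centred)."""
    return (X.T @ y) / (X ** 2).sum(axis=0)

def blup_effects(X, y, h2):
    """Joint estimates for all SNPs with ridge-type shrinkage.

    Assumes standardised genotypes and uses lambda = M * (1 - h2) / h2,
    where M is the number of SNPs and h2 an assumed SNP heritability.
    """
    m = X.shape[1]
    lam = m * (1.0 - h2) / h2
    return np.linalg.solve(X.T @ X + lam * np.eye(m), X.T @ y)

# Toy data (hypothetical): 500 individuals, 20 SNPs, SNP 1 duplicates SNP 0.
rng = np.random.default_rng(0)
n, m = 500, 20
X = rng.standard_normal((n, m))
X[:, 1] = X[:, 0]                           # two perfectly correlated SNPs
X = (X - X.mean(axis=0)) / X.std(axis=0)    # standardise each SNP
beta = np.zeros(m)
beta[0] = 0.5                               # only SNP 0 is causal
y = X @ beta + rng.standard_normal(n)
y = y - y.mean()

b_ols = ols_marginal_effects(X, y)
b_blup = blup_effects(X, y, h2=0.2)
```

In this toy setting the two duplicated SNPs each receive the full marginal effect under OLS, whereas the joint estimator splits the signal between them, which is the sharing behaviour the caption describes.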
Fig. 2 Improving prediction accuracy using information from multiple traits. a Expected gain from multi-trait vs cross-trait predictors as a function of rG. Two traits are considered. The first trait has a sample size of 20,000 and a SNP heritability of 0.5. The sample size and SNP heritability of the second trait vary between panels. The blue line shows the expected prediction accuracy of a single-trait predictor. The black line shows the expected prediction accuracy of a multi-trait predictor. The purple line shows the expected prediction accuracy of a cross-trait predictor (using only trait 2 to predict trait 1). The advantage of a multi-trait predictor over a cross-trait predictor decreases with increasing rG, h², and sample size of the second trait. b Simulation results. Prediction accuracy is shown as the correlation between simulated genetic value and predicted phenotype of individuals. Genotypes from European individuals in the GERA cohort were used for simulation. Boxplots show results across six replicates. In the left panels, the LD structure was removed by permuting dosage values for each SNP across all individuals. In the right panels, the original genotypes were used for simulation. Expected prediction accuracies were derived for the case of unlinked genotypes and are shown as red horizontal bars. In each section, the prediction accuracy of three predictors is shown: (1) single-trait BLUP, (2) multi-trait BLUP (MT-BLUP), and (3) weighted approximate BLUP (summary statistic-based multi-trait predictor: wMT-SBLUP). Simulation in genotypes without LD yields prediction accuracies that conform to expectations. In the presence of LD, the expected prediction accuracy depends strongly on the choice of Meff
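The LD-removal step mentioned in the Fig. 2 caption (permuting dosage values for each SNP across all individuals) can be sketched as follows. This is a minimal illustration on toy data, not the authors' pipeline; the function name `permute_within_snp` and the simulated dosages are hypothetical.

```python
import numpy as np

def permute_within_snp(dosages, rng):
    """Shuffle each SNP column independently across individuals.

    Per-SNP allele frequencies are preserved, while correlations
    between SNPs (LD) are destroyed.
    """
    return np.column_stack([rng.permutation(col) for col in dosages.T])

# Toy data (hypothetical): two SNPs in strong LD for 2000 individuals.
rng = np.random.default_rng(1)
n = 2000
snp_a = rng.binomial(2, 0.3, size=n)
# SNP b copies SNP a for about 90% of individuals.
copy_mask = rng.random(n) < 0.9
snp_b = np.where(copy_mask, snp_a, rng.binomial(2, 0.3, size=n))
dosages = np.column_stack([snp_a, snp_b])

unlinked = permute_within_snp(dosages, rng)

# Correlation between the two SNPs before and after permutation.
r_before = np.corrcoef(dosages.T)[0, 1]
r_after = np.corrcoef(unlinked.T)[0, 1]
```

Because each column is only reordered, the marginal distribution of every SNP is unchanged, so any difference in prediction accuracy between the permuted and original genotypes can be attributed to LD.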
Fig. 3 Prediction accuracy for schizophrenia and bipolar disorder from several single-trait and multi-trait predictors. Prediction accuracy of seven different types of predictors using PGC1 schizophrenia and bipolar disorder data. Single-trait predictors (lighter colours) are on the left, multi-trait predictors (darker colours) are on the right. Black error bars indicate correlation coefficient standard errors, calculated as
Fig. 4 Prediction accuracy for single-trait and multi-trait predictors in UK Biobank traits. Prediction accuracy for six traits in the UK Biobank for multi-trait predictors (light blue bars, wMT-SBLUP) and single-trait predictors (colourful bars on the right, SBLUP). Black bars show the correlation coefficient standard error. The multi-trait predictors for each trait are composed of all traits for which colourful bars are shown (rG p-value < 0.05). Smaller bars on the right show, from top to bottom, sample size, SNP heritability, rG, and weights (given by Eq. (15)) for each trait
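The idea of combining per-trait predictors into one weighted index can be illustrated with a generic sketch. Note the substitution: the weights in the paper are given by its Eq. (15), which is not reproduced here. The code below instead estimates weights empirically by least squares in a hypothetical tuning sample, and all names and simulated values are assumptions.

```python
import numpy as np

def index_weights(predictors, target):
    """Least-squares weights for combining centred predictors into one index.

    Generic stand-in for analytically derived index weights; it needs a
    tuning sample with observed phenotypes.
    """
    P = predictors - predictors.mean(axis=0)
    t = target - target.mean()
    w, *_ = np.linalg.lstsq(P, t, rcond=None)
    return w

# Toy data (hypothetical): a focal-trait genetic value g, a noisy
# single-trait predictor, and a predictor from a genetically correlated trait.
rng = np.random.default_rng(2)
n = 1000
g = rng.standard_normal(n)
y = g + rng.standard_normal(n)                    # observed phenotype
p_single = g + 1.5 * rng.standard_normal(n)
p_cross = 0.6 * g + 1.5 * rng.standard_normal(n)
P = np.column_stack([p_single, p_cross])

w = index_weights(P, y)
combined = (P - P.mean(axis=0)) @ w

# In-sample correlations with the phenotype.
r_single = np.corrcoef(p_single, y)[0, 1]
r_cross = np.corrcoef(p_cross, y)[0, 1]
r_comb = np.corrcoef(combined, y)[0, 1]
```

Within the tuning sample the combined index cannot correlate less strongly with the phenotype than either input predictor. Out-of-sample gains depend on how well the weights generalise, which is one motivation for deriving them from summary-level quantities such as sample size, SNP heritability, and rG.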