Shuang Song, Lin Hou, Jun S Liu.
Abstract
MOTIVATION: Polygenic risk score (PRS) has been widely exploited for genetic risk prediction due to its accuracy and conceptual simplicity. We introduce a unified Bayesian regression framework, NeuPred, for PRS construction, which accommodates varying genetic architectures and improves overall prediction accuracy for complex diseases by allowing for a wide class of prior choices. To take full advantage of the framework, we propose a summary-statistics-based cross-validation strategy to automatically select suitable chromosome-level priors, which demonstrates a striking variability of the prior preference of each chromosome, for the same complex disease, and further significantly improves the prediction accuracy.
Year: 2022 PMID: 35020805 PMCID: PMC8963326 DOI: 10.1093/bioinformatics/btac024
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. Comparison of three neuronized priors applied to the seven WTCCC complex diseases. (a) The predictive r² is estimated from fivefold CV. The orange solid and dashed lines indicate the mean predictive r² and standard deviations, respectively, estimated by NeuPred with priors automatically selected. (b) The predictive r² of NeuPred under three neuronized priors for each chromosome for CAD and HT. In each CV, NeuPred automatically selected a prior, and the prior that is most frequently selected is marked with a star, which varies greatly among diseases and chromosomes.
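The selection logic described in the caption of Fig. 1 (one prior chosen per chromosome in each CV fold, with the most frequently chosen prior starred) can be illustrated with a minimal sketch. This is not NeuPred's implementation: the nested `cv_r2` layout, the function names, the prior labels `prior_A`/`prior_B`/`prior_C` and all numbers are hypothetical placeholders, and the scores are assumed to be already computed.

```python
from collections import Counter

# Hypothetical CV results: cv_r2[fold][chromosome][prior] = predictive r2.
# Prior labels and numbers are placeholders, not values from the paper.
cv_r2 = [
    {1: {"prior_A": 0.010, "prior_B": 0.012, "prior_C": 0.008},
     2: {"prior_A": 0.006, "prior_B": 0.004, "prior_C": 0.005}},
    {1: {"prior_A": 0.011, "prior_B": 0.013, "prior_C": 0.009},
     2: {"prior_A": 0.003, "prior_B": 0.004, "prior_C": 0.007}},
    {1: {"prior_A": 0.012, "prior_B": 0.010, "prior_C": 0.009},
     2: {"prior_A": 0.008, "prior_B": 0.005, "prior_C": 0.006}},
]


def select_prior_per_fold(cv_r2):
    """For each fold and chromosome, pick the prior with the highest score."""
    return [
        {chrom: max(scores, key=scores.get) for chrom, scores in fold.items()}
        for fold in cv_r2
    ]


def most_frequent_prior(selections):
    """For each chromosome, return the prior chosen in the most folds."""
    chroms = selections[0].keys()
    return {
        chrom: Counter(fold[chrom] for fold in selections).most_common(1)[0][0]
        for chrom in chroms
    }


selections = select_prior_per_fold(cv_r2)
starred = most_frequent_prior(selections)
```

Here `starred` maps each chromosome to the label that would carry the star in a plot like Fig. 1(b). Ties between priors are broken by first occurrence in this sketch; the source does not say how ties are handled.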
Fig. 2. Comparison of prediction accuracy between NeuPred and 12 other methods in real data experiments. (a) Predictive r² on the four diseases (CD, CEL, RA and T2D) with large-scale GWAS studies. PRSs were derived from summary statistics of GWAS studies, and the AUC was evaluated based on independent test datasets. The LD matrix was externally estimated from the 1000G reference panel. (b) Predictive r² on the four diseases (ATH, HT, BMI and HGT) with GWAS studies from UKBB. The AUC was evaluated based on independent test datasets. The LD matrix was estimated from the UKBB European individuals.
AUC of summary-statistics-based PRS methods for four diseases with large-scale GWAS studies and independent test data
| Trait | Without parameter tuning | | | | | | | | | With parameter tuning | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | NeuPred | Unadj PRS | LDpred-inf | SBayesR | SBayesC | RSS | PRS-CS-auto | LDpred2-inf | LDpred2-auto | P + T | LDpred | PRS-CS | LDpred2-grid |
| CD | | 0.632 | 0.623 | 0.692 | 0.698 | 0.584 | 0.584 | 0.631 | 0.633 | 0.679 | 0.661 | 0.707 | 0.632 |
| CEL | | 0.594 | 0.585 | 0.618 | 0.617 | 0.508 | 0.587 | 0.571 | 0.607 | 0.572 | 0.606 | 0.584 | 0.625 |
| RA | | 0.645 | 0.625 | 0.598 | 0.608 | 0.636 | 0.706 | 0.654 | 0.656 | 0.688 | 0.662 | 0.704 | 0.596 |
| T2D | | 0.587 | 0.581 | 0.604 | 0.619 | 0.523 | 0.616 | 0.575 | 0.565 | 0.567 | 0.614 | 0.584 | 0.614 |
Note: The four diseases are Crohn’s disease (CD), celiac disease (CEL), rheumatoid arthritis (RA) and type 2 diabetes (T2D). The LD matrix was externally estimated from the 1000G. The UKBB data were used for post hoc parameter tuning for P + T, LDpred, PRS-CS and LDpred2-grid. The highest AUC is highlighted in boldface.
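As a rough illustration of how an AUC like those in the table is obtained on independent test data, the sketch below computes a PRS as a weighted sum of allele counts and scores it against case/control labels with the rank-based (Mann-Whitney) definition of AUC. It is a generic example rather than the evaluation code used in the study; the function names and the toy genotypes, effect sizes and labels are all made up for illustration.

```python
import numpy as np


def polygenic_score(genotypes, effect_sizes):
    """PRS per individual: (individuals x SNPs) allele counts times SNP weights."""
    return np.asarray(genotypes, dtype=float) @ np.asarray(effect_sizes, dtype=float)


def auc(scores, labels):
    """AUC as P(case score > control score), counting ties as one half."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    cases, controls = scores[labels], scores[~labels]
    # All case-control pairs; simple but O(n_cases * n_controls) in memory.
    diff = cases[:, None] - controls[None, :]
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size


# Toy test set (placeholder values): 4 individuals, 3 SNPs.
genotypes = np.array([[0, 1, 2],
                      [2, 0, 1],
                      [1, 1, 0],
                      [0, 0, 1]])
effect_sizes = np.array([0.2, -0.1, 0.3])  # hypothetical posterior effect sizes
labels = np.array([1, 1, 1, 0])            # 1 = case, 0 = control

prs = polygenic_score(genotypes, effect_sizes)
test_auc = auc(prs, labels)
```

The pairwise comparison keeps the definition of AUC visible; for large test sets a rank-based implementation would be the more practical choice.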