Wonil Chung
Abstract
Predicting individual traits and diseases from genetic variants is critical to fulfilling the promise of personalized medicine. The genetic variants from genome-wide association studies (GWAS), including variants well below GWAS significance, can be aggregated into highly significant predictions across a wide range of complex traits and diseases. The recent arrival of large-sample public biobanks enables highly accurate polygenic predictions based on genetic variants across the whole genome. Various statistical methodologies and diverse computational tools have been introduced and developed to compute the polygenic risk score (PRS) more accurately. However, many researchers use PRS tools without a thorough understanding of the underlying model or of how to specify the parameters for the best performance. It is therefore advantageous to study the statistical models implemented in computational tools for PRS estimation and the formulas for the parameters that must be specified. Here, we review a variety of recent statistical methodologies and computational tools for PRS computation.
Keywords: PRS models; computational tools; polygenic risk score
Year: 2021 PMID: 35012283 PMCID: PMC8752975 DOI: 10.5808/gi.21053
Source DB: PubMed Journal: Genomics Inform ISSN: 1598-866X
List of PRS methods, underlying statistical models, computational tools, and required data
| Trait/Ethnicity | Method | Statistical model | Computational tool | Required data |
|---|---|---|---|---|
| Single trait, single ethnicity | PRS | Linear model | PLINK, PRSice, PRSice-2 | Summary data |
| | LDpred | Bayesian model | LDpred, LDpred-2 | Summary data |
| | GBLUP | LMM | GCTA | Individual data |
| | SBLUP | LMM | GCTA | Summary data |
| | BayesR | Bayesian model | GCTB | Individual data |
| | SBayesR | Bayesian model | GCTB | Summary data |
| | Penalized Regression | Penalized regression | glmnet | Individual data |
| | Lassosum | Penalized regression | lassosum | Summary data |
| Multiple traits, single ethnicity | MTGBLUP | Multivariate LMM | MTG | Individual data |
| | wMT-SBLUP | Multivariate LMM | wMT-SBLUP | Summary data |
| | CTPR | Multivariate penalized regression | CTPR | Individual data |
| Single trait, multiple ethnicities | XP-BLUP | Two-component LMM | XP-BLUP | Individual data |
| | Multi-ethnic PRS | Linear mixture approaches | multi-ethnic PRS | Summary data |
| | Multi-ancestry PRS | Linear mixture approaches | multi-ancestry PRS | Summary data |
PRS, polygenic risk score; BLUP, best linear unbiased prediction; GBLUP, genomic BLUP; LMM, linear mixed model; GCTA, Genome-wide Complex Trait Analysis; SBLUP, summary-statistics BLUP; GCTB, Genome-wide Complex Trait Bayesian Analysis; MTGBLUP, multi-trait GBLUP; wMT-SBLUP, weighted multi-trait SBLUP; CTPR, cross-trait penalized regression.
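As an illustration of the simplest entry in the table (the linear-model PRS computed from summary data), the score for one individual is the sum of their allele dosages weighted by per-variant GWAS effect sizes, optionally restricted to variants below a p-value threshold. The following is a minimal sketch under those assumptions, not the implementation used by PLINK or PRSice; the function name `polygenic_risk_score` and its parameters are hypothetical, and linkage disequilibrium clumping is deliberately left out.

```python
from typing import Optional, Sequence


def polygenic_risk_score(
    dosages: Sequence[float],
    effect_sizes: Sequence[float],
    p_values: Optional[Sequence[float]] = None,
    p_threshold: float = 1.0,
) -> float:
    """Weighted sum of allele dosages for one individual (hypothetical helper).

    dosages      -- effect-allele counts per variant (0, 1, or 2, or imputed values)
    effect_sizes -- per-variant effect estimates from GWAS summary statistics
    p_values     -- optional GWAS p-values, used only for thresholding
    p_threshold  -- variants with p-value above this cutoff are skipped
    """
    if len(dosages) != len(effect_sizes):
        raise ValueError("dosages and effect_sizes must have the same length")
    if p_values is not None and len(p_values) != len(dosages):
        raise ValueError("p_values must have the same length as dosages")

    score = 0.0
    for j, (x, beta) in enumerate(zip(dosages, effect_sizes)):
        # Skip variants that do not pass the p-value threshold, if one is given.
        if p_values is not None and p_values[j] > p_threshold:
            continue
        score += beta * x
    return score


if __name__ == "__main__":
    # Toy values for illustration only; they are not taken from any study.
    toy_dosages = [0, 1, 2]
    toy_betas = [0.1, -0.2, 0.3]
    toy_p = [0.5, 1e-9, 0.2]
    print(polygenic_risk_score(toy_dosages, toy_betas))
    print(polygenic_risk_score(toy_dosages, toy_betas, toy_p, p_threshold=0.05))
```

The other methods in the table differ mainly in how the per-variant weights are estimated (for example by shrinkage or penalization), while the final score remains a weighted sum of this form.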
Fig. 1. Best statistical models and software based on data type, sample size, LD reference panel, and the number of traits and ethnicities. CTPR, cross-trait penalized regression; GBLUP, genomic BLUP; GWAS, genome-wide association studies; LD, linkage disequilibrium; MTGBLUP, multi-trait GBLUP; SBLUP, summary-statistics BLUP; wMT-SBLUP, weighted multi-trait SBLUP.