Lei Clifton, Jennifer A Collister, Xiaonan Liu, Thomas J Littlejohns, David J Hunter.
Abstract
Polygenic risk scores (PRS) are proposed for use in clinical and research settings for risk stratification. However, there are limited investigations on how different PRS diverge from each other in risk prediction of individuals. We compared two recently published PRS for each of three conditions, breast cancer, hypertension and dementia, to assess the stability of using these algorithms for risk prediction in a single large population. We used imputed genotyping data from the UK Biobank prospective cohort, limited to the White British subset. We found that: (1) 20% or more of SNPs in the first PRS were not represented in the more recent PRS for all three diseases, by the same SNP or a surrogate with R2 > 0.8 by linkage disequilibrium (LD). (2) Although the difference in the area under the receiver operating characteristic curve (AUC) obtained using the two PRS is hardly appreciable for all three diseases, there were large differences in individual risk prediction between the two PRS. For instance, for each disease, of those classified in the top 5% of risk by the first PRS, over 60% were not so classified by the second PRS. We found substantial discordance between different PRS for the same disease, indicating that individuals could receive different medical advice depending on which PRS is used to assess their genetic susceptibility. It is desirable to resolve this uncertainty before using PRS for risk stratification in clinical settings.
Year: 2022 PMID: 35896674 PMCID: PMC9329440 DOI: 10.1038/s41598-022-17012-6
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.996
PRS chosen for each condition.
| Disease | PRS | nSNPs | Source (PGS ID) | Trait | Construction |
|---|---|---|---|---|---|
| Breast cancer | A | 313 | Mavaddat2019 (PGS000004) | Overall breast cancer | Hard-thresholding and stepwise forward regression, |
| Breast cancer | B | 118,388 | Fritsche2020 (PGS000511) | Overall breast cancer | Lassosum, s = 0.5, lambda = 0.004281 |
| Hypertension | A | 267 | Warren2017 (Not in PGS Catalog) | Medication-adjusted BP traits (SBP, DBP, PP) | Pairwise-independent, LD-filtered (r2 < 0.2) previously reported and novel variants |
| Hypertension | B | 884 | Evangelou2018 (PGS002257) | Medication-adjusted BP traits (SBP, DBP, PP) | Pairwise-independent, LD-filtered (r2 < 0.1) previously reported and novel variants |
| Dementia | A | 57 | Najar (Jan 2021) (PGS000812) | Clinically defined Alzheimer’s disease | Variants significant at |
| Dementia | B | 39 | Ebenau (Sep 2021) (PGS001775) | Alzheimer’s disease | Genome-wide significant variants |
The Source column lists the paper in which each PRS was derived, along with the PGS ID (Polygenic score ID in PGS Catalog) where available. The “Construction” column outlines the derivation method of each PRS, as described in its source paper. For a brief summary of the dataset(s) where each score was derived, see Supplementary Table 1. For more detailed information about each PRS, please see its source paper.
PRS compared for each outcome and their performance characteristics in the UK Biobank.
| Disease (N) | PRS | nSNPs | OR (95% CI) | Crude-AUC (95% CI) | Multi-AUC (95% CI) | NRI (95% CI) | Pearson correlation | LD: R2 > 0.8 (% overlap) |
|---|---|---|---|---|---|---|---|---|
| Breast cancer (171,490) | A | 313 | 3.41 (2.89, 4.03) | 0.63 (0.63, 0.64) | 0.64 (0.63, 0.64) | 0.03 (0.00, 0.05) | 0.65 | 225 (72%) |
| | B | 118,388 | 3.94 (3.36, 4.61) | 0.64 (0.63, 0.64) | 0.64 (0.63, 0.65) | | | |
| Hypertension (317,581) | A | 267 | 1.83 (1.70, 1.98) | 0.57 (0.56, 0.57) | 0.69 (0.69, 0.69) | 0.16 (0.16, 0.17) | 0.66 | 210 (79%) |
| | B | 884 | 2.18 (2.02, 2.35) | 0.59 (0.59, 0.59) | 0.70 (0.70, 0.70) | | | |
| Dementia (335,689) | A | 57 | 1.67 (1.30, 2.13) | 0.55 (0.54, 0.56) | 0.80 (0.79, 0.80) | 0.13 (0.10, 0.16) | 0.51 | 13 (23%) |
| | B | 39 | 2.30 (1.85, 2.87) | 0.57 (0.56, 0.57) | 0.80 (0.79, 0.81) | | | |
N: number of participants for whom a PRS was obtained. nSNPs: number of SNPs in the PRS prior to genetic quality control. OR: odds ratio for the top 1% versus the middle quintile of the PRS, from a multivariable logistic regression model adjusted for age, sex, genotyping array and the first 5 genetic principal components (PCs). AUC: area under the receiver operating characteristic curve. Crude-AUC: only the continuous PRS was fitted in the regression model. Multi-AUC: the continuous PRS was fitted, further adjusted for age and sex. NRI: continuous net reclassification index obtained from the risks predicted by two multivariable logistic regression models that contain age, sex, the continuous PRS for this disease, genotyping array and the first 5 PCs; the model containing PRS-B is considered the “updated” model. Pearson correlation (the value preceding the LD column): Pearson correlation coefficient between the two continuous PRS for this disease. LD: number (%) of SNPs in PRS-A which either appear in, or are in linkage disequilibrium (R2 > 0.8) with, SNPs in PRS-B. NRI, Pearson correlation and LD are reported once per disease because each compares PRS-A with PRS-B. Breast cancer models are not adjusted for sex because the breast cancer population is restricted to females. 95% CIs for AUC and NRI were calculated by bootstrapping.
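As an illustration of the NRI definition in the note above, the following is a minimal sketch of the standard category-free (continuous) net reclassification index. It is not taken from the source paper, which does not state the software used: the function name `continuous_nri` and the toy risks are assumptions for illustration only, and the predicted risks would in practice come from the two fitted logistic regression models.

```python
def continuous_nri(risk_old, risk_new, outcome):
    """Category-free NRI for an updated model versus an old model.

    risk_old / risk_new: predicted risks from the old and updated models.
    outcome: 1 for participants with the disease (events), 0 otherwise.
    NRI = [P(up | event) - P(down | event)]
        + [P(down | non-event) - P(up | non-event)]
    """
    up_e = down_e = n_e = 0      # counters among events
    up_ne = down_ne = n_ne = 0   # counters among non-events
    for old, new, y in zip(risk_old, risk_new, outcome):
        if y == 1:
            n_e += 1
            up_e += new > old
            down_e += new < old
        else:
            n_ne += 1
            up_ne += new > old
            down_ne += new < old
    if n_e == 0 or n_ne == 0:
        raise ValueError("need at least one event and one non-event")
    return (up_e - down_e) / n_e + (down_ne - up_ne) / n_ne


# Toy data (hypothetical): three events followed by three non-events.
outcome = [1, 1, 1, 0, 0, 0]
risk_a = [0.2, 0.3, 0.4, 0.2, 0.3, 0.4]   # stands in for the PRS-A model
risk_b = [0.3, 0.4, 0.3, 0.1, 0.2, 0.5]   # stands in for the PRS-B ("updated") model
nri = continuous_nri(risk_a, risk_b, outcome)
```

A positive value means the updated model tends to move events to higher predicted risk and non-events to lower predicted risk. The confidence intervals in the table were obtained by bootstrapping, which this sketch does not attempt.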
Figure 1. ROC plots obtained from predictions from multivariable logistic regression of age, sex, continuous PRS, genotyping array and first 5 PCs against disease outcome.
Cross-classification of predicted risk of breast cancer among the whole study population, according to the percentiles of each PRS.
| PRS-A percentile (%) | PRS-B < 1% | PRS-B 1–20% | PRS-B 20–40% | PRS-B 40–60% | PRS-B 60–80% | PRS-B 80–99% | PRS-B ≥ 99% |
|---|---|---|---|---|---|---|---|
| < 1 | 345 (20.1, 20.1, 0.2) | 1140 (66.5, 3.5, 0.7) | 176 (10.3, 0.5, 0.1) | 44 (2.6, 0.1, 0.0) | 9 (0.5, 0.0, 0.0) | 1 (0.1, 0.0, 0.0) | 0 (0.0, 0.0, 0.0) |
| 1–20 | 1117 (3.4, 65.1, 0.7) | 15,409 (47.3, 47.3, 9.0) | 8818 (27.1, 25.7, 5.1) | 4696 (14.4, 13.7, 2.7) | 2052 (6.3, 6.0, 1.2) | 490 (1.5, 1.5, 0.3) | 1 (0.0, 0.1, 0.0) |
| 20–40 | 198 (0.6, 11.5, 0.1) | 8788 (25.6, 27.0, 5.1) | 9989 (29.1, 29.1, 5.8) | 8053 (23.5, 23.5, 4.7) | 5210 (15.2, 15.2, 3.0) | 2051 (6.0, 6.3, 1.2) | 9 (0.0, 0.5, 0.0) |
| 40–60 | 43 (0.1, 2.5, 0.0) | 4648 (13.6, 14.3, 2.7) | 7998 (23.3, 23.3, 4.7) | 8908 (26.0, 26.0, 5.2) | 8050 (23.5, 23.5, 4.7) | 4607 (13.4, 14.1, 2.7) | 44 (0.1, 2.6, 0.0) |
| 60–80 | 11 (0.0, 0.6, 0.0) | 2098 (6.1, 6.4, 1.2) | 5296 (15.4, 15.4, 3.1) | 7988 (23.3, 23.3, 4.7) | 10,047 (29.3, 29.3, 5.9) | 8688 (25.3, 26.7, 5.1) | 170 (0.5, 9.9, 0.1) |
| 80–99 | 1 (0.0, 0.1, 0.0) | 497 (1.5, 1.5, 0.3) | 2006 (6.2, 5.8, 1.2) | 4574 (14.0, 13.3, 2.7) | 8736 (26.8, 25.5, 5.1) | 15,674 (48.1, 48.1, 9.1) | 1095 (3.4, 63.8, 0.6) |
| ≥ 99 | 0 (0.0, 0.0, 0.0) | 3 (0.2, 0.0, 0.0) | 15 (0.9, 0.0, 0.0) | 35 (2.0, 0.1, 0.0) | 194 (11.3, 0.6, 0.1) | 1072 (62.5, 3.3, 0.6) | 396 (23.1, 23.1, 0.2) |
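Numbers of participants are shown as n (row%, col%, cell%) with PRS-A percentile bands in rows and PRS-B percentile bands in columns: the first percentage is the share of the PRS-A band, the second is the share of the PRS-B band, and the third is the share of the whole study population. Higher percentiles of PRS indicate increased risk of breast cancer; the “≥ 99%” percentile corresponds to the top 1% of risk.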
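The three percentages in each cell can be recomputed from the counts alone. The sketch below copies the counts from the table above (rows are PRS-A bands, columns are PRS-B bands) and derives, for any cell, its share of the PRS-A band, its share of the PRS-B band, and its share of the whole study population. The helper names `cell_percentages` and `top_band_discordance` are hypothetical and are not from the source paper.

```python
# Counts from the breast cancer cross-classification table.
# Rows: PRS-A percentile bands; columns: PRS-B percentile bands.
BANDS = ["<1", "1-20", "20-40", "40-60", "60-80", "80-99", ">=99"]
COUNTS = [
    [345, 1140, 176, 44, 9, 1, 0],
    [1117, 15409, 8818, 4696, 2052, 490, 1],
    [198, 8788, 9989, 8053, 5210, 2051, 9],
    [43, 4648, 7998, 8908, 8050, 4607, 44],
    [11, 2098, 5296, 7988, 10047, 8688, 170],
    [1, 497, 2006, 4574, 8736, 15674, 1095],
    [0, 3, 15, 35, 194, 1072, 396],
]


def cell_percentages(counts, i, j):
    """Return (% of PRS-A band i, % of PRS-B band j, % of everyone) for one cell."""
    n = counts[i][j]
    band_a_total = sum(counts[i])                   # everyone in PRS-A band i
    band_b_total = sum(row[j] for row in counts)    # everyone in PRS-B band j
    grand_total = sum(sum(row) for row in counts)   # whole study population
    return (100 * n / band_a_total,
            100 * n / band_b_total,
            100 * n / grand_total)


def top_band_discordance(counts):
    """Percentage of the top PRS-A band that is NOT in the top PRS-B band."""
    top_row = counts[-1]
    return 100 * (1 - top_row[-1] / sum(top_row))


total = sum(sum(row) for row in COUNTS)              # 171,490 participants
pct = cell_percentages(COUNTS, 0, 1)                 # PRS-A < 1%, PRS-B 1-20%
discordant_top1 = top_band_discordance(COUNTS)
```

For the cell at PRS-A < 1% and PRS-B 1–20%, this reproduces the printed 66.5, 3.5 and 0.7. By the same counts, about 76.9% of participants in the top 1% by PRS-A fall outside the top 1% by PRS-B.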