Nina Mars, Sini Kerminen, Yen-Chen A Feng, Masahiro Kanai, Kristi Läll, Laurent F Thomas, Anne Heidi Skogholt, Pietro Della Briotta Parolo, Benjamin M Neale, Jordan W Smoller, Maiken E Gabrielsen, Kristian Hveem, Reedik Mägi, Koichi Matsuda, Yukinori Okada, Matti Pirinen, Aarno Palotie, Andrea Ganna, Alicia R Martin, Samuli Ripatti.
Abstract
Polygenic risk scores (PRSs) measure genetic disease susceptibility by combining risk effects across the genome. For coronary artery disease (CAD), type 2 diabetes (T2D), and breast and prostate cancer, we performed cross-ancestry evaluation of genome-wide PRSs in six biobanks in Europe, the United States, and Asia. We studied transferability of these highly polygenic, genome-wide PRSs across global ancestries, within European populations with different health-care systems, and local population substructures in a population isolate. All four PRSs had similar accuracy across European and Asian populations, with poorer transferability in the smaller group of individuals of African ancestry. The PRSs had highly similar effect sizes in different populations of European ancestry, and in early- and late-settlement regions with different recent population bottlenecks in Finland. Comparing genome-wide PRSs to PRSs containing a smaller number of variants, the highly polygenic, genome-wide PRSs generally displayed higher effect sizes and better transferability across global ancestries. Our findings indicate that in the populations investigated, the current genome-wide polygenic scores for common diseases have potential for clinical utility within different health-care settings for individuals of European ancestry, but that the utility in individuals of African ancestry is currently much lower.
Keywords: ancestry; breast cancer; coronary artery disease; diabetes; global health; health disparities; polygenic risk score; precision medicine; prostate cancer; risk prediction
Year: 2022 PMID: 35591975 PMCID: PMC9010308 DOI: 10.1016/j.xgen.2022.100118
Source DB: PubMed Journal: Cell Genom ISSN: 2666-979X
Table 1. Study characteristics
| Biobank | Country | Ancestry | Sample size | Age, mean (SD) | Women, % | CAD cases (total N = 88,830) | CAD AAO, mean (SD) | T2D cases (total N = 110,685) | T2D AAO, mean (SD) | Breast cancer cases (total N = 32,922) | Breast cancer AAO, mean (SD) | Prostate cancer cases (total N = 26,700) | Prostate cancer AAO, mean (SD) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Estonian Biobank | Estonia | EUR | 110,597 | 43.4 (16.0) | 67.3 | 5,064 | 67.5 (11.9) | 7,066 | 60.9 (12.7) | 1,379 | 59.5 (13.4) | 1,202 | 68.7 (9.2) |
| FinnGen | Finland | EUR | 258,402 | 60.3 (17.1)∗ | 56.5 | 25,706 | 64.9 (11.8) | 37,001 | 60.1 (11.9) | 11,573 | 59.0 (11.6) | 8,709 | 68.5 (8.1) |
| HUNT | Norway | EUR | 69,422 | 50.8 (17.0) | 53.0 | 6,594 | 69.0 (12.5) | 5,228 | 68.1 (13.4) | 1,731 | 61.7 (13.4) | 2,224 | 70.6 (9.2) |
| MGB Biobank | United States | EUR | 25,696 | 60.0 (16.5) | 53.1 | 3,206 | – | 5,182 | – | 1,513 | – | 1,593 | – |
| UK Biobank | United Kingdom | EUR | 343,676 | 56.9 (8.0) | 53.7 | 17,986 | 62.1 (8.9) | 13,616 | 54.6 (8.5) | 11,075 | 54.0 (8.1) | 7,429 | 59.7 (6.2) |
| BioBank Japan | Japan | EAS | 178,726 | 63.1 (14.0) | 46.3 | 29,080 | 61.7 | 40,121 | 56.2 | 5,316 | 56.1 | 5,192 | 71.1 |
| MGB Biobank | United States | AFR | 1,535 | 54.1 (16.3) | 61.4 | 285 | – | 660 | – | 64 | – | 80 | – |
| UK Biobank | United Kingdom | AFR | 7,618 | 51.9 (8.1) | 57.0 | 169 | 56.9 (10.3) | 691 | 50.2 (8.9) | 132 | 50.2 (9.1) | 199 | 57.4 (7.4) |
| UK Biobank | United Kingdom | SAS | 7,628 | 53.4 (8.5) | 46.1 | 740 | 58.6 (9.7) | 1,120 | 50.0 (8.7) | 139 | 51.2 (7.9) | 72 | 59.6 (7.2) |
EUR = European, EAS = East Asian, AFR = African (self-reported African/Caribbean in UK Biobank), SAS = South Asian, CAD = coronary artery disease, T2D = type 2 diabetes, AAO = age at onset, SD = standard deviation. Disease definitions are listed by cohort in STAR Methods. In HUNT, we show the age at baseline for those participating in either HUNT2 or HUNT3, and a mean of these baseline ages for individuals participating in both. ∗Age at the end of follow-up.
Figure 1. Effect sizes of polygenic risk scores (PRSs) across ancestries
(A) The results across ancestry groups, with “European” representing a pooled OR of effect sizes from (B).
(B) The results across different populations with European ancestry.
(C) The results across early- and late-settlement regions in Finland (FinnGen).
Odds ratios (ORs) with 95% confidence intervals (CIs) are shown per 1 SD increase in PRS. See Table 1 for the respective numbers of cases and controls. The pooled OR (“European”) in (A) was obtained by random-effects meta-analysis of the effects shown in (B). In (C), of the 258,402 individuals in FinnGen, 8,117 were excluded: 3,157 born abroad, 4,304 born in regions ceded to the Soviet Union, 182 born in the Åland Islands, and 474 with missing data. Detailed information on the Finnish regions in (C) is provided in the description of FinnGen in STAR Methods. CAD = coronary artery disease, T2D = type 2 diabetes.
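The pooling step described in the caption can be illustrated with a minimal sketch. The caption states only that a random-effects meta-analysis was used; the DerSimonian-Laird estimator below is an assumption chosen because it is a common default, and the function name `pool_random_effects` and the per-cohort numbers are hypothetical, not values from the study.

```python
import math

def pool_random_effects(estimates):
    """Pool (OR, ci_low, ci_high) tuples with a DerSimonian-Laird model (assumed estimator)."""
    z = 1.959964  # normal quantile for a two-sided 95% CI
    # Work on the log-OR scale; recover each standard error from the CI width.
    y = [math.log(or_) for or_, lo, hi in estimates]
    se = [(math.log(hi) - math.log(lo)) / (2 * z) for or_, lo, hi in estimates]
    w = [1 / s**2 for s in se]  # fixed-effect (inverse-variance) weights

    # Cochran's Q quantifies between-cohort heterogeneity.
    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))
    k = len(y)
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random-effects weights add the between-cohort variance tau^2.
    w_re = [1 / (s**2 + tau2) for s in se]
    y_re = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se_re = math.sqrt(1 / sum(w_re))
    return (math.exp(y_re), math.exp(y_re - z * se_re),
            math.exp(y_re + z * se_re), tau2)

# Hypothetical per-cohort ORs per 1 SD of PRS (illustrative only).
cohorts = [(1.60, 1.55, 1.65), (1.70, 1.66, 1.74), (1.55, 1.48, 1.62)]
pooled_or, ci_low, ci_high, tau2 = pool_random_effects(cohorts)
print(f"pooled OR = {pooled_or:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f}), tau^2 = {tau2:.4f}")
```

When the cohorts disagree more than their individual CIs would suggest, `tau2` becomes positive and the pooled CI widens relative to a fixed-effect analysis.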
Figure 2. Comparison of polygenic risk scores (PRSs) generated with different methods
The figure shows a comparison of three types of PRSs in UK Biobank: previously published PRSs using a smaller number of variants (“limited-variant PRS”), PRSs generated with LDpred, and PRSs generated with PRS-CS. ORs with 95% CIs are shown across ancestries for 1 SD increase in the PRS. Detailed effect size comparisons are in Table S3. CAD = coronary artery disease, T2D = type 2 diabetes. Table 1 shows the respective number of cases and controls.
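As a rough illustration of what "1 SD increase in the PRS" refers to, the sketch below computes a PRS as a weighted sum of effect-allele dosages and then standardizes it. Methods such as LDpred and PRS-CS estimate the per-variant weights; this sketch covers only the scoring step. The toy dosages, the weights, the function names, and the use of the population SD are all assumptions for illustration.

```python
from statistics import mean, pstdev

def polygenic_scores(dosages, weights):
    """Raw PRS per individual: sum over variants of weight * effect-allele dosage."""
    return [sum(w * d for w, d in zip(weights, row)) for row in dosages]

def standardize(scores):
    """Scale scores to mean 0 and SD 1 so effects can be reported per 1 SD."""
    mu, sd = mean(scores), pstdev(scores)
    return [(s - mu) / sd for s in scores]

# Hypothetical toy data: 4 individuals x 3 variants, dosages in {0, 1, 2}.
dosages = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [0, 2, 2]]
weights = [0.10, -0.05, 0.20]  # hypothetical per-allele log-OR weights

raw = polygenic_scores(dosages, weights)
z = standardize(raw)
print([round(v, 2) for v in raw])
print([round(v, 2) for v in z])
```

An OR reported per 1 SD then compares individuals whose standardized score differs by one unit, which puts scores with very different numbers of variants on the same scale.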
| REAGENT or RESOURCE | SOURCE | IDENTIFIER |
|---|---|---|
| The FinnGen data may be accessed through Finnish Biobanks’ FinBB portal | - | |
| GWAS genotype data of BioBank Japan are available at the National Bioscience Database Center Human Database | Nagai et al., 2017 | Research ID: hum0014; |
| UK Biobank data are available through a procedure described at | Bycroft et al., 2018 | |
| The Trøndelag Health Study (HUNT). The HUNT data may be accessed by application to the HUNT Research Centre. | Krokstad et al., 2013 | |
| De-identified data of the MGB Biobank that support this study are available from the MGB Biobank portal. Restrictions apply to the availability of these data, which are available to MGB-affiliated researchers via a formal application. | Karlson et al., 2016 | |
| Estonian Biobank. Researchers interested in Estonian Biobank can request access at | Leitsalu et al., 2015 | |
| PGS Catalog/LDpred polygenic risk scores | This paper | |
| PGS Catalog/PRS-CS polygenic risk score for prostate cancer | This paper | |
| PGS Catalog/PRS-CS polygenic risk score for breast cancer | Mars et al., 2020 | |
| PGS Catalog/PRS-CS polygenic risk score for coronary artery disease and type 2 diabetes | Tamlander et al., 2022 | |
| PRS-CS (version Sep 10, 2020) | Ge et al., 2019 | |
| PLINK v2.00a2.3LM | Chang et al., 2015 | |
| STEROID 0.1 | - | |
| Eagle v2.3.5 | Loh et al., 2016 | |
| R statistical programming v3.2.0 or later | - | |
| LDpred v1.0.7 | Vilhjálmsson et al., 2015 | |
| PRS-CS pipeline | - | |
| Project code | This paper | |