Lars G Fritsche, Ying Ma, Daiwei Zhang, Maxwell Salvatore, Seunggeun Lee, Xiang Zhou, Bhramar Mukherjee.
Abstract
Polygenic risk scores (PRS) can provide useful information for personalized risk stratification and disease risk assessment, especially when combined with non-genetic risk factors. However, their construction depends on the availability of summary statistics from genome-wide association studies (GWAS) that are independent of the target sample. For best compatibility, it has been reported that the GWAS and the target sample should match in ancestry. Yet GWAS, especially in the field of cancer, often lack diversity and are dominated by European ancestry. This bias is a limiting factor in PRS research. Using electronic health records and genetic data from the UK Biobank, we contrast the utility of breast and prostate cancer PRS derived from external European-ancestry-based GWAS across African, East Asian, European, and South Asian ancestry groups. We highlight differences in the PRS distributions of these groups that are amplified when PRS methods condense hundreds of thousands of variants into a single score. While European-GWAS-derived PRS were not directly transferable across ancestries on an absolute scale, we establish their predictive potential when considering them separately within each group. For example, the top 10% of the breast cancer PRS distribution within each ancestry group showed significant enrichment of breast cancer cases compared to the bottom 90% (odds ratio of 2.81 [95% CI: 2.69, 2.93] in European, 2.88 [1.85, 4.48] in African, 2.60 [1.25, 5.40] in East Asian, and 2.33 [1.55, 3.51] in South Asian individuals). Our findings highlight a compromise solution for PRS research to compensate for the lack of diversity in well-powered European GWAS efforts while recruitment of diverse participants in the field catches up.
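The two ideas in the abstract, condensing many variants into one score and evaluating that score within each ancestry group rather than on an absolute scale, can be illustrated with a minimal Python sketch. The function names, weights, and scores below are hypothetical and chosen only for illustration; they are not the authors' pipeline or data.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of risk-allele dosages (0, 1, or 2) for one person."""
    return sum(d * w for d, w in zip(dosages, weights))


def top_fraction_flags(scores, fraction=0.10):
    """Flag the highest-scoring `fraction` of a list.

    Ties at the cutoff are all flagged, so slightly more than
    `fraction` can be flagged when scores repeat.
    """
    n_top = max(1, round(len(scores) * fraction))
    cutoff = sorted(scores, reverse=True)[n_top - 1]
    return [s >= cutoff for s in scores]


# Hypothetical per-person score from three variants (illustrative weights).
example_score = polygenic_score([1, 1, 2], [0.2, -0.1, 0.05])

# Hypothetical score distributions: group B is shifted upward relative to A.
scores_by_group = {
    "A": [0.1 * i for i in range(20)],
    "B": [5.0 + 0.1 * i for i in range(20)],
}

# Within-group cutoffs flag the top 10% of each group separately.
within = {g: top_fraction_flags(s) for g, s in scores_by_group.items()}

# A single pooled cutoff flags only members of the shifted group.
pooled = top_fraction_flags(scores_by_group["A"] + scores_by_group["B"])
pooled_by_group = {"A": pooled[:20], "B": pooled[20:]}

for group in scores_by_group:
    print(group, "within-group:", sum(within[group]),
          "pooled:", sum(pooled_by_group[group]))
```

With a shifted distribution, the pooled cutoff places every flagged individual in one group, which is why a threshold defined within each group is the more meaningful comparison in this toy setting.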
Year: 2021 PMID: 34529658 PMCID: PMC8445431 DOI: 10.1371/journal.pgen.1009670
Source DB: PubMed Journal: PLoS Genet ISSN: 1553-7390 Impact factor: 6.020
Fig 1. Violin plots of the breast and prostate cancer PRS distributions.
Breast cancer (left) and prostate cancer (right) GPRS (GWAS hit-based PRS, top) and CSPRS (PRS-CS-based PRS, bottom) stratified by ancestry group are shown. Black vertical lines indicate the 25%, 50%, and 75% quantiles within the ancestry-specific case (orange) and control (green) distributions. Red lines indicate 10% quantiles of the corresponding UKB PRS distribution in all controls. Sample sizes for each subset can be found in Table 1.
Association and evaluation of cancer PRS across ancestry groups.
| GWAS Trait/Outcome | PRS Method (SNPs) | Ancestry Group | n Cases | n Controls | PRS Association: Odds Ratio* (95% CI) | PRS Association: P | PRS Evaluation: AAUC (95% CI) |
|---|---|---|---|---|---|---|---|
| Overall Breast Cancer | GPRS (334) | EUR | 14109 | 214163 | 1.594 (1.567, 1.622) | 1.7×10^-613 | 0.628 (0.623, 0.632) |
| Overall Breast Cancer | GPRS (334) | SAS | 149 | 3598 | 1.451 (1.231, 1.71) | 8.80×10^-6 | 0.603 (0.557, 0.651) |
| Overall Breast Cancer | GPRS (334) | AFR | 116 | 3666 | 1.442 (1.199, 1.735) | 0.00011 | 0.6 (0.546, 0.65) |
| Overall Breast Cancer | GPRS (334) | EAS | 45 | 1069 | 1.852 (1.385, 2.475) | 3.20×10^-5 | 0.676 (0.59, 0.757) |
| Overall Breast Cancer | CSPRS (1,120,410) | EUR | 14109 | 214163 | 1.771 (1.739, 1.803) | 5.4×10^-857 | 0.653 (0.648, 0.657) |
| Overall Breast Cancer | CSPRS (1,120,410) | SAS | 149 | 3598 | 1.656 (1.4, 1.958) | 4.00×10^-9 | 0.641 (0.594, 0.687) |
| Overall Breast Cancer | CSPRS (1,120,410) | AFR | 116 | 3666 | 1.761 (1.453, 2.134) | 7.90×10^-9 | 0.651 (0.598, 0.701) |
| Overall Breast Cancer | CSPRS (1,120,410) | EAS | 45 | 1069 | 1.761 (1.297, 2.39) | 0.00029 | 0.66 (0.585, 0.735) |
| Prostate Cancer | GPRS (377) | EUR | 6561 | 182590 | 1.943 (1.894, 1.993) | 3.1×10^-566 | 0.68 (0.674, 0.687) |
| Prostate Cancer | GPRS (377) | SAS | 51 | 4305 | 1.785 (1.389, 2.295) | 6.00×10^-6 | 0.652 (0.576, 0.726) |
| Prostate Cancer | GPRS (377) | AFR | 144 | 2681 | 1.501 (1.254, 1.796) | 9.70×10^-6 | 0.615 (0.567, 0.66) |
| Prostate Cancer | GPRS (377) | EAS | 7 | 622 | 1.63 (0.83, 3.205) | 0.16 | 0.619 (0.442, 0.811) |
| Prostate Cancer | CSPRS (1,120,596) | EUR | 6561 | 182590 | 2.14 (2.085, 2.197) | 4.2×10^-711 | 0.702 (0.695, 0.708) |
| Prostate Cancer | CSPRS (1,120,596) | SAS | 51 | 4305 | 2.383 (1.826, 3.111) | 1.70×10^-10 | 0.745 (0.684, 0.8) |
| Prostate Cancer | CSPRS (1,120,596) | AFR | 144 | 2681 | 1.325 (1.107, 1.586) | 0.0021 | 0.579 (0.527, 0.63) |
| Prostate Cancer | CSPRS (1,120,596) | EAS | 7 | 622 | 1.943 (1.005, 3.755) | 0.048 | 0.626 (0.385, 0.853) |
Analyses were adjusted for birth year, genotyping array, and first ten principal components. *Odds ratios are given per standard deviation within ethnic group. Abbreviations: AAUC, covariate-adjusted area under the receiver-operator characteristics curve; CI, confidence interval; GWAS, genome-wide association study; PRS, polygenic risk score; GPRS, GWAS hits-based PRS; CSPRS, PRS-CS-based PRS; SNP, single nucleotide polymorphism; AFR, African; EAS, East Asian; EUR, European; SAS, South Asian.
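As a reading aid, the reported odds ratios, confidence intervals, and P-values can be cross-checked against each other. The sketch below assumes the intervals are Wald intervals that are symmetric on the log-odds scale, which the source does not state; the function name is hypothetical. It recovers the standard error of the log odds ratio from the interval width and derives an approximate two-sided P-value, using the AFR row of the first GPRS block (odds ratio 1.442, 95% CI 1.199 to 1.735, reported P 0.00011).

```python
import math


def p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Approximate a two-sided Wald P-value from an OR and its 95% CI.

    Assumes the interval is exp(log(OR) +/- z_crit * SE), i.e. symmetric
    on the log scale. This is an assumption, not a stated property of the
    source analysis.
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


se, z, p = p_from_or_ci(1.442, 1.199, 1.735)
print(f"SE(log OR) = {se:.4f}, z = {z:.2f}, P = {p:.1e}")
```

Under this assumption the derived P-value comes out on the order of 1×10^-4, in line with the reported 0.00011; small differences are expected from rounding of the published estimates.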
Fig 2. Observed case proportion across CSPRS (PRS-CS-based PRS) risk deciles.
Proportions of breast cancer cases (A) and prostate cancer cases (B) stratified by ancestry groups are shown. Total case counts per ancestry group are given in parentheses. Underlying sample counts and corresponding Cochran-Armitage Test for Trend P-values are reported in S3 and S4 Tables.
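For readers unfamiliar with the Cochran-Armitage test for trend mentioned in the caption, the following is a minimal sketch of its usual large-sample form with equally spaced decile scores. The counts are hypothetical, not those of S3 and S4 Tables, and the function name is an assumption made for illustration.

```python
import math


def cochran_armitage_trend(cases, totals, scores=None):
    """Large-sample Cochran-Armitage trend test for ordered groups.

    cases[i]  : number of cases in group i
    totals[i] : number of individuals in group i
    scores[i] : ordinal score of group i (defaults to 1, 2, ..., k)
    Returns (z, two_sided_p).
    """
    if scores is None:
        scores = list(range(1, len(cases) + 1))
    n_total = sum(totals)
    p_bar = sum(cases) / n_total  # overall case proportion

    # Score-weighted deviation of observed from expected case counts.
    t_stat = sum(t * (r - n * p_bar) for t, r, n in zip(scores, cases, totals))

    # Variance of the statistic under the null hypothesis of no trend.
    sum_nt = sum(n * t for n, t in zip(totals, scores))
    sum_nt2 = sum(n * t * t for n, t in zip(totals, scores))
    variance = p_bar * (1 - p_bar) * (sum_nt2 - sum_nt ** 2 / n_total)

    z = t_stat / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical case counts across ten PRS deciles of 100 people each.
rising_z, rising_p = cochran_armitage_trend(
    cases=[5, 6, 8, 9, 10, 12, 13, 15, 18, 24], totals=[100] * 10
)
flat_z, flat_p = cochran_armitage_trend(cases=[10] * 10, totals=[100] * 10)
print(f"rising: z = {rising_z:.2f}, P = {rising_p:.2g}")
print(f"flat:   z = {flat_z:.2f}, P = {flat_p:.2g}")
```

A positive z indicates that the case proportion increases with decile; identical proportions in every decile give z = 0.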
Case enrichment in breast and prostate cancer PRS top 10% versus bottom 90%.
| GWAS Trait/Outcome | Ancestry Group | GPRS: OR Top 10% (95% CI) | GPRS: P | CSPRS: OR Top 10% (95% CI) | CSPRS: P |
|---|---|---|---|---|---|
| Overall Breast Cancer | EUR | 2.36 (2.26, 2.47) | 1.0×10^-328 | 2.81 (2.69, 2.93) | 2.5×10^-499 |
| Overall Breast Cancer | SAS | 2.54 (1.70, 3.79) | 5.98×10^-6 | 2.33 (1.55, 3.51) | 5.01×10^-5 |
| Overall Breast Cancer | AFR | 2.18 (1.36, 3.49) | 0.00116 | 2.88 (1.85, 4.48) | 2.75×10^-6 |
| Overall Breast Cancer | EAS | 3.52 (1.79, 6.9) | 0.000254 | 2.60 (1.25, 5.40) | 0.0106 |
| Prostate Cancer | EUR | 3.32 (3.13, 3.52) | 1.26×10^-346 | 4.00 (3.78, 4.23) | 3.77×10^-495 |
| Prostate Cancer | SAS | 3.11 (1.66, 5.84) | 0.000409 | 4.41 (2.43, 8.04) | 1.18×10^-6 |
| Prostate Cancer | AFR | 1.41 (0.85, 2.34) | 0.179 | 1.78 (1.09, 2.92) | 0.0223 |
| Prostate Cancer | EAS | 4.89 (1.26, 19.0) | 0.0219 | 6.53 (1.71, 25.0) | 0.00614 |
Abbreviations: PRS, polygenic risk score; GPRS, GWAS hits-based PRS; CSPRS, PRS-CS-based PRS; AFR, African; EAS, East Asian; EUR, European; SAS, South Asian.
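To make the top 10% versus bottom 90% comparison concrete, here is a minimal sketch of an unadjusted odds ratio with a Wald 95% confidence interval computed from a 2×2 table. The counts and the function name are hypothetical, and the published estimates may come from covariate-adjusted models, so this crude calculation is illustrative only and is not expected to reproduce the table.

```python
import math


def odds_ratio_wald(cases_top, controls_top, cases_rest, controls_rest,
                    z_crit=1.959964):
    """Unadjusted odds ratio and Wald 95% CI from a 2x2 table.

    Compares the odds of being a case in the top PRS stratum with the
    odds in the remaining stratum. All four counts must be positive.
    """
    odds_ratio = (cases_top * controls_rest) / (controls_top * cases_rest)
    # Standard error of log(OR) from the four cell counts.
    se = math.sqrt(1 / cases_top + 1 / controls_top
                   + 1 / cases_rest + 1 / controls_rest)
    low = math.exp(math.log(odds_ratio) - z_crit * se)
    high = math.exp(math.log(odds_ratio) + z_crit * se)
    return odds_ratio, low, high


# Hypothetical counts: 100 people in the top 10%, 900 in the bottom 90%.
or_est, ci_low, ci_high = odds_ratio_wald(
    cases_top=30, controls_top=70, cases_rest=130, controls_rest=770
)
print(f"OR = {or_est:.2f} (95% CI: {ci_low:.2f}, {ci_high:.2f})")
```

Small groups produce wide intervals under this formula, which matches the pattern in the table where the East Asian estimates, based on few cases, carry the widest confidence intervals.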