Lars G Fritsche, Snehal Patil, Lauren J Beesley, Peter VandeHaar, Maxwell Salvatore, Ying Ma, Robert B Peng, Daniel Taliun, Xiang Zhou, Bhramar Mukherjee.
Abstract
To facilitate scientific collaboration on polygenic risk score (PRS) research, we created an extensive PRS online repository for 35 common cancer traits integrating freely available genome-wide association study (GWAS) summary statistics from three sources: published GWASs, the NHGRI-EBI GWAS Catalog, and UK Biobank-based GWASs. Our framework condenses these summary statistics into PRSs using various approaches such as linkage disequilibrium pruning/p value thresholding (fixed or data-adaptively optimized thresholds) and penalized, genome-wide effect size weighting. We evaluated the PRSs in two biobanks: the Michigan Genomics Initiative (MGI), a longitudinal biorepository effort at Michigan Medicine, and the population-based UK Biobank (UKB). For each PRS construct, we provide measures of predictive performance and discrimination. Besides PRS evaluation, the Cancer-PRSweb platform features construct downloads and phenome-wide PRS association study results (PRS-PheWAS) for predictive PRSs. We expect this integrated platform to accelerate PRS-related cancer research.
Keywords: EHR; GWAS; PRS; PheWAS; cancer genetics; complex traits; electronic health records; genome-wide association studies; phenome-wide association studies; polygenic risk scores
Year: 2020 PMID: 32991828 PMCID: PMC7675001 DOI: 10.1016/j.ajhg.2020.08.025
Source DB: PubMed Journal: Am J Hum Genet ISSN: 0002-9297 Impact factor: 11.025
Demographics and Clinical Characteristics of the Analytic Datasets
| Characteristic | MGI | UKB |
| Total participants | 38,360 | 408,595 |
| Females, n (%) | 20,141 (52.5%) | 220,896 (54.1%) |
| Mean age, years (SD) | 56.8 (16.2) | 56.9 (8.0) |
| Median number of visits per participant | 45 | not available |
| Median time (years) between first and last visit | 5.5 | not available |
| Median number of unique ICD9 codes | 36 | 2 |
| Median number of unique ICD10 codes | 31 | 6 |
| Number of PheCodes with more than 50 cases | 1,689 | 1,419 |
| Any cancer diagnosis | 20,751 (54.1%) | 69,190 (16.9%) |
| Basal cell carcinoma (172.21) | 2,988 (7.79%) | not available |
| Melanomas of skin, dx or hx (172.1) | 2,701 (7.04%) | 2,682 (0.66%) |
| Breast cancer [female] (174.1) | 2,605 (12.93%) | 12,483 (5.65%) |
| Cancer of prostate (185) | 2,432 (13.35%) | 5,977 (3.18%) |
| Squamous cell carcinoma (172.22) | 1,917 (5.00%) | not available |
| Cancer of bladder (189.2) | 1,575 (4.11%) | 2,413 (0.59%) |
| Colorectal cancer (153) | 1,196 (3.12%) | 4,585 (1.12%) |
| Non-Hodgkins lymphoma (202.2) | 1,141 (2.97%) | 1,810 (0.44%) |
| Cancer of connective tissue (170.2) | 1,097 (2.86%) | 331 (0.08%) |
| Malignant neoplasm of kidney, except pelvis (189.11) | 1,083 (2.82%) | 1,033 (0.25%) |
| Colon cancer (153.2) | 941 (2.45%) | 3,108 (0.76%) |
| Myeloproliferative disease (200) | 886 (2.31%) | 992 (0.24%) |
| Cancer of bronchus; lung (165.1) | 874 (2.28%) | 2,232 (0.55%) |
| Thyroid cancer (193) | 798 (2.08%) | 347 (0.08%) |
| Malignant neoplasm of rectum, rectosigmoid junction, and anus (153.3) | 669 (1.74%) | 2,167 (0.53%) |
| Malignant neoplasm of uterus (182) | 643 (3.19%) | 1,285 (0.58%) |
| Nodular lymphoma (202.21) | 632 (1.65%) | 365 (0.09%) |
| Cancer of tongue (145.2) | 550 (1.43%) | 310 (0.08%) |
| Leukemia (204) | 545 (1.42%) | 1,665 (0.41%) |
| Cancer of brain (191.11) | 483 (1.26%) | 525 (0.13%) |
The provided characteristics are based on the European subjects in MGI and white British subjects in UKB for which phenotype and imputed genotype data were available. SD, standard deviation.
ICD9/10-CM codes
Skin cancer sub-types
Figure 1. Schematic Overview of PRS Generation and Analysis
Overview of GWAS Sources and PRS Construction Methods
| GWAS Catalog | yes | yes | no | |
| Large GWAS | yes | yes | yes, if full GWAS | |
| UKB GWAS | yes | yes | yes | |
| yes | yes | yes | ||
| yes | yes | yes | ||
| yes | yes | yes | ||
Multiple PRSs were constructed per trait of interest depending on availability of GWAS summary statistics.
Uncorrelated variants with p value ≤ 5 × 10⁻⁵, 5 × 10⁻⁶, 5 × 10⁻⁷, 5 × 10⁻⁸ (“GWAS Hits”), or 5 × 10⁻⁹
LD pruning and p value thresholding
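The fixed-threshold construction described in the notes above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes the variants have already been LD-pruned, that each variant has a GWAS effect size (log odds ratio) and p value, and that the PRS is the weighted sum of risk-allele dosages. The function name `threshold_prs` and all toy values are hypothetical.

```python
import numpy as np

# Fixed p value thresholds listed in the table note above.
FIXED_THRESHOLDS = (5e-5, 5e-6, 5e-7, 5e-8, 5e-9)


def threshold_prs(dosages, betas, pvalues, threshold):
    """Return (scores, n_variants) for variants with p <= threshold.

    dosages: (n_samples, n_variants) risk-allele dosages in [0, 2]
    betas:   (n_variants,) GWAS effect sizes (log odds ratios)
    pvalues: (n_variants,) GWAS p values
    Assumption: the variants were already LD-pruned (uncorrelated).
    """
    dosages = np.asarray(dosages, dtype=float)
    betas = np.asarray(betas, dtype=float)
    keep = np.asarray(pvalues, dtype=float) <= threshold
    # Weighted allele sum over the retained variants only.
    return dosages[:, keep] @ betas[keep], int(keep.sum())


# Toy data (hypothetical values, for illustration only).
toy_dosages = np.array([[0, 1, 2], [2, 0, 1]])
toy_betas = np.array([0.10, 0.20, -0.05])
toy_pvalues = np.array([1e-9, 3e-6, 1e-3])

# One PRS per fixed threshold, as in the multi-threshold design.
scores_by_threshold = {
    t: threshold_prs(toy_dosages, toy_betas, toy_pvalues, t)
    for t in FIXED_THRESHOLDS
}
```

Stricter thresholds retain fewer variants, which is why the tables report a different variant count for each threshold.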
Figure 2. Distribution of Breast Cancer and Chronic Lymphoid Leukemia PRSs in MGI and UKB
Breast cancer (A, B) and chronic lymphoid leukemia (C, D) PRSs in matched case-control samples in MGI (A, C) and UKB (B, D) are shown. The five top PRS percentiles (1%, 2%, 5%, 10%, and 25% [defined in control subjects]) are indicated by the shaded areas under the density curves, while the corresponding odds ratios (OR) and their 95% confidence intervals are given in the top right corner of each plot. PRSs were standardized.
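A rough sketch of the enrichment measure described in this caption: standardize the PRS, take the percentile cutoff from control subjects only, and compare the odds of being a case above versus below that cutoff. The version below is an unadjusted 2×2 odds ratio with a Wald 95% confidence interval; whether the published ORs were covariate-adjusted is not stated here, so treat this as a simplification. The function names and toy data are hypothetical.

```python
import math
import numpy as np


def standardize(prs):
    """Scale a PRS to mean 0 and SD 1 (population SD; an assumption)."""
    prs = np.asarray(prs, dtype=float)
    return (prs - prs.mean()) / prs.std()


def top_percent_odds_ratio(prs, is_case, top_percent=1.0):
    """Unadjusted OR (top x% versus rest) with a Wald 95% CI.

    The cutoff is the (100 - top_percent)th percentile of the PRS
    among control subjects only.
    """
    prs = np.asarray(prs, dtype=float)
    is_case = np.asarray(is_case, dtype=bool)
    cutoff = np.percentile(prs[~is_case], 100.0 - top_percent)
    in_top = prs > cutoff
    a = int(np.sum(in_top & is_case))     # cases in the top group
    b = int(np.sum(in_top & ~is_case))    # controls in the top group
    c = int(np.sum(~in_top & is_case))    # cases in the rest
    d = int(np.sum(~in_top & ~is_case))   # controls in the rest
    if min(a, b, c, d) == 0:
        raise ValueError("empty cell in the 2x2 table; OR is undefined")
    odds_ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - 1.96 * se)
    upper = math.exp(math.log(odds_ratio) + 1.96 * se)
    return odds_ratio, lower, upper


# Toy example (hypothetical values): 100 controls and 20 cases.
controls = np.arange(100.0)
cases = np.array([95.0] * 10 + [50.0] * 10)
toy_prs = standardize(np.concatenate([controls, cases]))
toy_is_case = np.array([False] * 100 + [True] * 20)
toy_or, toy_lo, toy_hi = top_percent_odds_ratio(
    toy_prs, toy_is_case, top_percent=10.0
)
```

Defining the cutoff in controls keeps the reference distribution independent of how many cases were sampled, which matters in matched case-control sets.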
Comparison of PRS Methods on Breast Cancer PRS Performance in MGI and UKB
| Lassosum: s = 0.5, λ = 0.0043 | 2.48 (1.63,3.77) | |||||
| P&T: p ≤ 4 × 10⁻⁴ | 3,038 | 0.0532 | 0.135 | 0.635 (0.618,0.651) | 1.64 (1.54,1.75) | 2.84 (1.88,4.28) |
| Fixed threshold: p ≤ 5 × 10⁻⁵ | 1,307 | 0.0521 | 0.135 | 0.634 (0.618,0.65) | 1.63 (1.53,1.74) | 2.75 (1.81,4.17) |
| Fixed threshold: p ≤ 5 × 10⁻⁶ | 712 | 0.0484 | 0.135 | 0.629 (0.612,0.645) | 1.60 (1.50,1.70) | 3.06 (2.04,4.59) |
| Fixed threshold: p ≤ 5 × 10⁻⁷ | 464 | 0.0476 | 0.135 | 0.627 (0.611,0.644) | 1.60 (1.50,1.70) | |
| Fixed threshold: p ≤ 5 × 10⁻⁸ | 334 | 0.0462 | 0.136 | 0.625 (0.609,0.641) | 1.58 (1.49,1.69) | 3.32 (2.24,4.93) |
| Fixed threshold: p ≤ 5 × 10⁻⁹ | 264 | 0.0455 | 0.136 | 0.624 (0.608,0.64) | 1.58 (1.48,1.68) | 2.56 (1.68,3.90) |
| Lassosum: s = 0.9, λ = 0.0043 | ||||||
| P&T: p ≤ 1 × 10⁻⁴ | 1,682 | 0.0401 | 0.0811 | 0.63 (0.623,0.637) | 1.61 (1.57,1.66) | 3.69 (3.16,4.31) |
| Fixed threshold: p ≤ 5 × 10⁻⁶ | 712 | 0.0402 | 0.0811 | 0.628 (0.62,0.635) | 1.61 (1.57,1.65) | 3.32 (2.83,3.90) |
| Fixed threshold: p ≤ 5 × 10⁻⁵ | 1,307 | 0.0392 | 0.0812 | 0.627 (0.62,0.634) | 1.60 (1.56,1.64) | 3.49 (2.98,4.08) |
| Fixed threshold: p ≤ 5 × 10⁻⁷ | 464 | 0.0384 | 0.0812 | 0.626 (0.618,0.633) | 1.59 (1.55,1.63) | 3.81 (3.27,4.44) |
| Fixed threshold: p ≤ 5 × 10⁻⁸ | 334 | 0.0361 | 0.0813 | 0.622 (0.615,0.63) | 1.57 (1.53,1.61) | 3.69 (3.16,4.31) |
| Fixed threshold: p ≤ 5 × 10⁻⁹ | 264 | 0.0347 | 0.0813 | 0.62 (0.612,0.627) | 1.55 (1.51,1.59) | 3.28 (2.79,3.86) |
PRSs are based on the BCAC Consortium GWAS on overall breast cancer. Italic values indicate best performing PRSs according to the corresponding metrics for MGI or UKB.
Influence of GWAS Sources on Breast Cancer PRS Performance in MGI
| Large GWAS Michailidou et al. | Lassosum: s = 0.5, λ = 0.0043 | 118,388 | 2.48 (1.63,3.77) | ||||
| GWAS Catalog (N/A) | P&T: p | 62 | 0.0346 | 0.136 | 0.607 (0.589,0.623) | 1.49 (1.40,1.58) | |
| UKB GWAS PHECODE (23,839) | Lassosum: s = 0.5, λ = 0.014 | 6,977 | 0.0340 | 0.137 | 0.609 (0.592,0.624) | 1.48 (1.39,1.57) | 1.94 (1.21,3.11) |
| UKB GWAS FINNGEN (18,376) | Lassosum: s = 0.5, λ = 0.018 | 2,267 | 0.0300 | 0.137 | 0.600 (0.584,0.616) | 1.44 (1.35,1.53) | 2.37 (1.54,3.65) |
| UKB GWAS ICD10 (15,792) | Lassosum: s = 0.5, λ = 0.018 | 4,047 | 0.0264 | 0.137 | 0.595 (0.579,0.610) | 1.40 (1.32,1.49) | 2.05 (1.30,3.24) |
| UKB GWAS PHESANT (15,282) | Fixed threshold: p | 22 | 0.0204 | 0.138 | 0.579 (0.561,0.597) | 1.34 (1.27,1.43) | 2.32 (1.49,3.61) |
| Large GWAS Michailidou et al. | Lassosum: s = 0.9, λ = 0.0043 | 286,144 | |||||
| GWAS Catalog (N/A) | P&T: p | 79 | 0.0226 | 0.0819 | 0.598 (0.59,0.605) | 1.43 (1.39,1.46) | 2.68 (2.25,3.18) |
Italic values indicate best performing PRS according to the corresponding metrics for MGI or UKB.
Effective sample size: 4 / (1/#cases + 1/#controls); n/a: not available; references of studies contributing to GWAS Catalog PRS are listed in Table S4.
PRSs were scaled to mean = 0 and SD = 1.
Top 1% versus rest.
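The effective sample size given in the table note, 4 / (1/#cases + 1/#controls), can be computed directly. The helper below is a small sketch of that formula; the function name `effective_sample_size` and the example counts are illustrative only and not taken from the tables.

```python
def effective_sample_size(n_cases, n_controls):
    """Effective sample size of a case-control GWAS.

    Implements 4 / (1/#cases + 1/#controls) from the table note.
    """
    if n_cases <= 0 or n_controls <= 0:
        raise ValueError("case and control counts must be positive")
    return 4.0 / (1.0 / n_cases + 1.0 / n_controls)


# Balanced design: the effective size equals the total sample size.
balanced = effective_sample_size(100, 100)

# Unbalanced design (hypothetical counts): extra controls add
# progressively less information than extra cases would.
unbalanced = effective_sample_size(1000, 3000)
```

For a fixed number of cases, the value approaches four times the case count as the number of controls grows, so it penalizes strongly unbalanced designs relative to the raw total.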
Top PRSs for the 20 Most Common Cancer Traits in MGI
| Basal cell carcinoma (172.21) | Large GWAS: Chahal et al. | P&T: p ≤ 4 × 10⁻⁸ | 27 | 0.106 | 0.632 (0.616,0.647) | 1.66 (1.57,1.76) | 3.79 (2.68,5.35) |
| Melanomas of skin (172.1) | UKB GWAS PHECODE | P&T: p ≤ 2 × 10⁻⁷ | 15 | 0.0952 | 0.604 (0.587,0.62) | 1.49 (1.4,1.57) | 2.97 (2.04,4.34) |
| Breast cancer [female] (174.1) | Large GWAS: Michailidou et al. | Lassosum: s = 0.5, λ = 0.0043 | 118,388 | 0.134 | 0.641 (0.625,0.656) | 1.70 (1.59,1.81) | 2.48 (1.63,3.77) |
| Cancer of prostate (185) | Large GWAS: Schumacher et al. | Lassosum: s = 0.5, λ = 0.007 | 26,418 | 0.145 | 0.665 (0.647,0.684) | 1.91 (1.77,2.05) | 4.92 (3.21,7.55) |
| Squamous cell carcinoma (172.22) | GWAS Catalog | P&T: p ≤ 1 × 10⁻¹¹ | 7 | 0.0977 | 0.593 (0.573,0.613) | 1.45 (1.36,1.55) | 3.74 (2.46,5.68) |
| Cancer of bladder (189.2) | GWAS Catalog | P&T: p ≤ 5 × 10⁻⁸ | 13 | 0.0917 | 0.572 (0.55,0.594) | 1.29 (1.2,1.39) | 1.47 (0.779,2.77) |
| Colorectal cancer (153) | Large GWAS: Huyghe et al. | P&T: p ≤ 4 × 10⁻⁷ | 81 | 0.0828 | 0.553 (0.525,0.577) | 1.21 (1.12,1.32) | 3.04 (1.79,5.17) |
| Colon cancer (153.2) | UKB GWAS PHECODE | Lassosum: s = 0.2, λ = 0.038 | 150 | 0.083 | 0.567 (0.54,0.594) | 1.25 (1.13,1.37) | 1.17 (0.477,2.87) |
| Cancer of bronchus/lung (165.1) | GWAS Catalog | P&T: p ≤ 1 × 10⁻¹⁰ | 14 | 0.0827 | 0.529 (0.503,0.558) | 1.12 (1.01,1.24) | 1.75 (0.796,3.85) |
| Thyroid cancer (193) | GWAS Catalog | P&T: p ≤ 3.2 × 10⁻¹⁰ | 8 | 0.0812 | 0.618 (0.587,0.647) | 1.57 (1.41,1.74) | 5.14 (2.94,8.99) |
| Nodular lymphoma (202.21) | UKB GWAS FINNGEN | Lassosum: s = 1, λ = 0.018 | 2,209,179 | 0.0825 | 0.538 (0.504,0.573) | 1.15 (1.02,1.29) | 1.48 (0.538,4.05) |
| Cancer of brain (191.11) | UKB GWAS ICD10 | Lassosum: s = 0.9, λ = 0.1 | 522 | 0.0824 | 0.546 (0.504,0.587) | 1.20 (1.04,1.37) | 1.42 (0.453,4.47) |
| Cancer of esophagus (150) | UKB GWAS ICD10 | Lassosum: s = 1, λ = 0.078 | 2,001 | 0.0826 | 0.551 (0.51,0.588) | 1.20 (1.04,1.39) | 1.81 (0.56,5.82) |
| Cancer of larynx (149.4) | UKB GWAS ICD10 | Lassosum: s = 0.9, λ = 0.1 | 25,920 | 0.0822 | 0.570 (0.522,0.618) | 1.28 (1.09,1.51) | 2.14 (0.649,7.06) |
| Cancer of other male genital organs (187) | UKB GWAS FINNGEN | P&T: p ≤ 4 × 10⁻⁶ | 97 | 0.083 | 0.558 (0.506,0.606) | 1.23 (1.03,1.46) | 1.04 (0.183,5.95) |
| Lymphoid leukemia (204.1) | UKB GWAS FINNGEN | P&T: p ≤ 1 × 10⁻⁶ | 6 | 0.0819 | 0.578 (0.517,0.642) | 1.36 (1.11,1.66) | 3.69 (1.01,13.4) |
| Multiple myeloma (204.4) | UKB GWAS ICD10 | P&T: p ≤ 7.9 × 10⁻⁶ | 27 | 0.0823 | 0.547 (0.479,0.613) | 1.24 (1,1.53) | 2.6 (0.593,11.4) |
| Cancer of testis (187.2) | UKB GWAS PHESANT | Lassosum: s = 0.9, λ = 0.078 | 771 | 0.084 | 0.656 (0.593,0.717) | 1.67 (1.3,2.14) | 2.72 (0.568,13.1) |
| Hodgkin’s disease (201) | GWAS Catalog | P&T: p ≤ 1 × 10⁻⁶ | 20 | 0.0821 | 0.620 (0.559,0.688) | 1.48 (1.15,1.89) | 2.64 (0.572,12.2) |
| Lymphoid leukemia, chronic (204.12) | GWAS Catalog | P&T: p ≤ 7 × 10⁻⁶ | 44 | 0.0776 | 0.696 (0.621,0.764) | 2.12 (1.65,2.74) | 12.9 (4.45,37.6) |
Cancer traits are sorted by observed case counts in MGI; references of studies contributing to GWAS Catalog PRSs are listed in Table S4.
PRSs were scaled to mean = 0 and SD = 1.
Top 1% versus rest.
Best Performing PRSs for the 20 Analyzed Cancer Traits in UKB
| Breast cancer [female] (174.1) | Large GWAS: Michailidou et al. | Lassosum: s = 0.9, λ = 0.0043 | 286,144 | 0.0807 | 0.643 (0.637,0.65) | 1.70 (1.65,1.75) | 4.02 (3.46,4.67) |
| Cancer of prostate (185) | Large GWAS: Schumacher et al. | Lassosum: s = 0.9, λ = 0.0055 | 178,259 | 0.0794 | 0.699 (0.690,0.710) | 2.13 (2.04,2.22) | 5.88 (4.85,7.14) |
| Colorectal cancer (153) | Large GWAS: Huyghe et al. | P&T: p ≤ 7.8 × 10⁻⁶ | 87 | 0.0812 | 0.617 (0.605,0.630) | 1.55 (1.48,1.62) | 4.00 (3.11,5.13) |
| Melanomas of skin (172.1) | GWAS Catalog | P&T: p ≤ 5 × 10⁻⁷ | 27 | 0.0812 | 0.619 (0.603,0.634) | 1.56 (1.48,1.66) | 3.12 (2.18,4.47) |
| Cancer of bladder (189.2) | GWAS Catalog | P&T: p ≤ 7 × 10⁻⁷ | 15 | 0.0821 | 0.571 (0.555,0.588) | 1.30 (1.23,1.38) | 2.91 (1.99,4.24) |
| Cancer of other lymphoid, histiocytic tissue (202) | GWAS Catalog | P&T: p ≤ 5 × 10⁻⁷ | 5 | 0.0822 | 0.490 (0.476,0.505) | 1.15 (1.09,1.21) | 1.97 (1.25,3.12) |
| Cancer of bronchus/lung (165.1) | GWAS Catalog | P&T: p ≤ 2.5 × 10⁻⁸ | 19 | 0.0824 | 0.552 (0.534,0.569) | 1.22 (1.15,1.30) | 1.94 (1.22,3.10) |
| Non-Hodgkins lymphoma (202.2) | GWAS Catalog | P&T: p ≤ 1 × 10⁻⁹ | 10 | 0.082 | 0.547 (0.527,0.566) | 1.24 (1.16,1.32) | 2.05 (1.24,3.40) |
| Cancer of uterus (182) | GWAS Catalog | P&T: p ≤ 1 × 10⁻⁷ | 20 | 0.082 | 0.572 (0.549,0.596) | 1.30 (1.20,1.41) | 2.60 (1.50,4.51) |
| Cancer of kidney, except pelvis (189.11) | GWAS Catalog | P&T: p ≤ 5 × 10⁻⁸ | 12 | 0.0825 | 0.517 (0.492,0.540) | 1.15 (1.06,1.25) | 2.17 (1.13,4.14) |
| Cancer of ovary (184.11) | Large GWAS: Phelan et al. | P&T: p ≤ 1.3 × 10⁻⁹ | 12 | 0.0824 | 0.558 (0.530,0.586) | 1.23 (1.12,1.35) | 1.55 (0.71,3.38) |
| Pancreatic cancer (157) | GWAS Catalog | P&T: p ≤ 5 × 10⁻⁹ | 10 | 0.0822 | 0.579 (0.548,0.611) | 1.34 (1.20,1.50) | 1.64 (0.655,4.12) |
| Cancer of brain and nervous system (191.1) | GWAS Catalog | P&T: p ≤ 3.2 × 10⁻⁹ | 19 | 0.0812 | 0.622 (0.590,0.653) | 1.56 (1.40,1.75) | 2.93 (1.34,6.41) |
| Multiple myeloma (204.4) | GWAS Catalog | P&T: p ≤ 2.5 × 10⁻⁸ | 21 | 0.0818 | 0.576 (0.536,0.616) | 1.32 (1.16,1.50) | 2.20 (0.854,5.66) |
| Cancer of brain (191.11) | GWAS Catalog | P&T: p ≤ 5 × 10⁻²⁹ | 5 | 0.0813 | 0.606 (0.568,0.642) | 1.52 (1.34,1.71) | 4.15 (2.04,8.41) |
| Lymphoid leukemia, chronic (204.12) | GWAS Catalog | P&T: p ≤ 2.5 × 10⁻⁸ | 27 | 0.0796 | 0.672 (0.637,0.703) | 1.85 (1.62,2.11) | 2.52 (1.04,6.08) |
| Thyroid cancer (193) | GWAS Catalog | P&T: p ≤ 1 × 10⁻¹⁶ | 5 | 0.0804 | 0.628 (0.582,0.675) | 1.61 (1.38,1.88) | 4.41 (1.81,10.7) |
| Cancer of testis (187.2) | GWAS Catalog | P&T: p ≤ 5 × 10⁻⁶ | 44 | 0.0793 | 0.703 (0.659,0.745) | 2.11 (1.73,2.56) | 4.60 (1.75,12.1) |
| Basal cell carcinoma (172.21) | GWAS Catalog | P&T: p ≤ 5 × 10⁻⁹ | 24 | 0.0813 | 0.615 (0.608,0.623) | 1.53 (1.48,1.57) | 3.05 (2.55,3.64) |
| Squamous cell carcinoma (172.22) | Large GWAS: Chahal et al. | P&T: p ≤ 1.6 × 10⁻⁸ | 9 | 0.0819 | 0.571 (0.563,0.579) | 1.33 (1.30,1.37) | 1.93 (1.56,2.39) |
Cancer traits are sorted by observed case counts in UKB; references of studies contributing to GWAS Catalog PRSs are listed in Table S4.
PRSs were scaled to mean = 0 and SD = 1.
Top 1% versus rest.
Figure 3. Case Enrichment of the Top Ranked PRSs for 13 Cancers that Were Available in MGI and UKB
Odds ratios (OR, top 10% versus bottom 90% of PRS distribution [defined in control subjects]) and their 95% confidence intervals are shown for MGI (left) and UKB (right).
Figure 4. Example Exclusion PRS-PheWAS in the MGI and UKB Phenomes
The plots show Exclusion PheWAS for the thyroid cancer PRS (A, B) and for breast cancer [female] (C, D). The horizontal line indicates phenome-wide significance. Only the strongest and phenome-wide significantly associated traits within a category are labeled. Directional triangles indicate whether a phenome-wide significant trait was positively (pointing up) or negatively (pointing down) associated with the PRS.