Weang-Kee Ho, Mei-Chee Tai, Joe Dennis, Xiang Shu, Jingmei Li, Peh Joo Ho, Iona Y Millwood, Kuang Lin, Yon-Ho Jee, Su-Hyun Lee, Nasim Mavaddat, Manjeet K Bolla, Qin Wang, Kyriaki Michailidou, Jirong Long, Eldarina Azfar Wijaya, Tiara Hassan, Kartini Rahmat, Veronique Kiak Mien Tan, Benita Kiat Tee Tan, Su Ming Tan, Ern Yu Tan, Swee Ho Lim, Yu-Tang Gao, Ying Zheng, Daehee Kang, Ji-Yeob Choi, Wonshik Han, Han-Byoel Lee, Michiki Kubo, Yukinori Okada, Shinichi Namba, Sue K Park, Sung-Won Kim, Chen-Yang Shen, Pei-Ei Wu, Boyoung Park, Kenneth R Muir, Artitaya Lophatananon, Anna H Wu, Chiu-Chen Tseng, Keitaro Matsuo, Hidemi Ito, Ava Kwong, Tsun L Chan, Esther M John, Allison W Kurian, Motoki Iwasaki, Taiki Yamaji, Sun-Seog Kweon, Kristan J Aronson, Rachel A Murphy, Woon-Puay Koh, Chiea-Chuen Khor, Jian-Min Yuan, Rajkumar Dorajoo, Robin G Walters, Zhengming Chen, Liming Li, Jun Lv, Keum-Ji Jung, Peter Kraft, Paul D B Pharoah, Alison M Dunning, Jacques Simard, Xiao-Ou Shu, Cheng-Har Yip, Nur Aishah Mohd Taib, Antonis C Antoniou, Wei Zheng, Mikael Hartman, Douglas F Easton, Soo-Hwang Teo.
Abstract
PURPOSE: Non-European populations are under-represented in genetics studies, hindering clinical implementation of breast cancer polygenic risk scores (PRSs). We aimed to develop PRSs using the largest available studies of Asian ancestry and to assess the transferability of PRSs across ethnic subgroups.
Keywords: Breast cancer; Genetic; Polygenic risk score; Risk prediction
Year: 2021 PMID: 34906514 PMCID: PMC7612481 DOI: 10.1016/j.gim.2021.11.008
Source DB: PubMed Journal: Genet Med ISSN: 1098-3600 Impact factor: 8.864
Figure 1. Overview of methods for PRS development.
Inputs are summary statistics from meta-analyses of multiple GWAS data sets: BCAC ASN + ABCC denotes training data set 1, BCAC ASN + BBJ denotes training data set 2, and BCAC-EUR denotes training data set 3, as described in the Methods section. LD ref: BCAC ASN denotes that the BCAC Asian OncoArray studies were used as the reference panel; LD ref: BCAC EUR denotes that the BCAC studies of European ancestry were used as the reference panel; 1000G ASN and 1000G EUR denote the Asian and European samples, respectively, in the 1000 Genomes Project. Figure 1 shows the methods using East Asian–ancestry women (Chinese and Malay) as an example; the same methods were applied to South Asian–ancestry women in the validation data set. ABCC, Asia Breast Cancer Consortium; ASN, Asian; BBJ, The BioBank Japan Project; BCAC, Breast Cancer Association Consortium; C + T, clumping and thresholding; EUR, European; GWAS, genome-wide association study; LD ref, reference panel for linkage disequilibrium; PRS, polygenic risk score; SNV, single-nucleotide variation.
Figure 2. Principal components analysis and mean of PRS46 + PRS287_EB according to country and ethnicity.
(A) PCs plotted according to country. PC analysis of samples genotyped with OncoArray, as listed in Supplemental Table 1. The samples were grouped according to country (Thailand, Taiwan, Hong Kong, China, Korea, and Japan). For M + S, the samples were further categorized by their self-reported ethnic origin (Chinese, Malay, and Indian). (B) Mean of standardized PRS46 + PRS287_EB in controls according to country. The PRS was standardized according to the control SD of each study. Error bars represent 95% CIs. The mean of standardized PRS46 + PRS287_EB in European controls was included for reference. EB, Empirical Bayes; M + S, Malaysia and Singapore; PC, principal component; PRS, polygenic risk score.
Table 1. Mean, SD, and the association of PRSs with breast cancer risk in women of East Asian ancestry
| Method | Validation set: cases, mean (SD) | Validation set: controls, mean (SD) | Validation set: OR per SD (95% CI) | Validation set: AUC | Test set: cases, mean (SD) | Test set: controls, mean (SD) | Test set: HR per SD (95% CI) | Test set: AUC |
|---|---|---|---|---|---|---|---|---|
| (1) Clumping and thresholding | −0.387 (0.446) | −0.538 (0.443) | 1.37 (1.32-1.42) | 0.589 | −0.299 (0.433) | −0.444 (0.438) | 1.40 (1.25-1.56) | 0.600 |
| (2) Penalized regression | 0.075 (0.455) | −0.082 (0.452) | 1.41 (1.37-1.47) | 0.598 | 0.107 (0.460) | −0.059 (0.458) | 1.45 (1.31-1.61) | 0.608 |
| (3) EUR SNVs + EUR weights | 0.865 (0.548) | 0.640 (0.549) | 1.50 (1.45-1.56) | 0.615 | 0.876 (0.549) | 0.679 (0.541) | 1.46 (1.34-1.60) | 0.609 |
| (4) EUR SNVs + ASN weights | −0.533 (0.445) | −0.714 (0.447) | 1.50 (1.45-1.56) | 0.614 | −0.552 (0.448) | −0.731 (0.441) | 1.49 (1.33-1.66) | 0.608 |
| (5) EUR SNVs + EB weights | 0.343 (0.491) | 0.135 (0.492) | 1.53 (1.47-1.58) | 0.620 | 0.341 (0.493) | 0.153 (0.485) | 1.50 (1.35-1.65) | 0.609 |
| Combine (1) + (3) | 0.058 (0.440) | −0.134 (0.437) | 1.54 (1.49-1.60) | 0.623 | 0.103 (0.442) | −0.075 (0.436) | 1.52 (1.36-1.70) | 0.620 |
| Combine (2) + (3) | 0.062 (0.447) | −0.139 (0.444) | 1.56 (1.50-1.61) | 0.626 | 0.080 (0.454) | −0.106 (0.447) | 1.54 (1.38-1.72) | 0.622 |
| Combine (1) + (4) | 0.052 (0.425) | −0.127 (0.423) | 1.52 (1.47-1.58) | 0.619 | 0.070 (0.425) | −0.113 (0.421) | 1.52 (1.35-1.70) | 0.621 |
| Combine (2) + (4) | 0.055 (0.430) | −0.130 (0.430) | 1.54 (1.48-1.60) | 0.621 | 0.057 (0.435) | −0.135 (0.427) | 1.53 (1.37-1.72) | 0.623 |
| Combine (1) + (5) | 0.061 (0.446) | −0.137 (0.443) | 1.55 (1.50-1.61) | 0.625 | 0.089 (0.447) | −0.089 (0.441) | 1.53 (1.37-1.71) | 0.621 |
| Combine (2) + (5) | 0.063 (0.451) | −0.139 (0.449) | 1.56 (1.51-1.62) | 0.627 | 0.077 (0.455) | −0.120 (0.447) | 1.55 (1.39-1.72) | 0.623 |
| (6) PRS-CSx | 0.082 (0.493) | −0.159 (0.489) | 1.62 (1.52-1.68) | 0.636 | −0.145 (0.511) | −0.388 (0.511) | 1.62 (1.46-1.80) | 0.635 |
ASN, Asian; AUC, area under the receiver operating characteristic curve; CKB, China Kadoorie Biobank; EB, Empirical Bayes; EUR, European; HR, hazard ratio; KCPS-II, Korean Cancer Prevention Study-II Biobank; MYBRCA, Malaysian Breast Cancer Genetic Study; OR, odds ratio; PRS, polygenic risk score; SCHS, Singapore Chinese Health Study; SGBCC, Singapore Breast Cancer Cohort; SNV, single-nucleotide variation.
The validation set consisted of 6392 breast cancer cases and 6638 controls of Chinese and Malay ancestry from MYBRCA and SGBCC (Supplemental Table 1).
The test set consisted of 1592 breast cancer cases and 89,898 controls from 3 prospective cohorts: SCHS, CKB, and KCPS-II (Supplemental Table 1).
Adjusted for the first 10 principal components and study; each PRS was standardized to its SD in controls.
Fixed-effect meta-analysis of the 3 prospective cohorts (SCHS, CKB, and KCPS-II). The HR per SD and AUC of the individual studies can be found in Supplemental Figure 5.
PRSs were derived using 46, 2985, and 287 selected SNVs, respectively, as described in the Methods section.
Combined PRSs were generated using the formula $\alpha_0 + \alpha_1\,\mathrm{PRS}_1 + \alpha_2\,\mathrm{PRS}_2$, where $\alpha_0$, $\alpha_1$, and $\alpha_2$ are the weights obtained by fitting a logistic regression model on the validation data set, with breast cancer as the outcome and $\mathrm{PRS}_1$ and $\mathrm{PRS}_2$ as explanatory variables. The weights for the considered combinations of PRSs can be found in Supplemental Table 5.
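The test-set HRs in Table 1 come from a fixed-effect meta-analysis of 3 cohorts. The record does not say how the pooling was implemented; a minimal sketch, assuming the usual inverse-variance weighting on the log-HR scale with standard errors recovered from 95% CIs, might look like the following. The function name `fixed_effect_meta` and the per-cohort numbers are hypothetical placeholders, not the values in Supplemental Figure 5.

```python
import math

Z_95 = 1.959964  # two-sided 95% normal quantile


def fixed_effect_meta(estimates):
    """Pool ratio estimates by inverse-variance weighting on the log scale.

    `estimates` is a list of (hr, ci_lower, ci_upper) tuples with 95% CIs.
    Returns the pooled (hr, ci_lower, ci_upper).
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for hr, lower, upper in estimates:
        log_hr = math.log(hr)
        # Recover the standard error of log(HR) from the CI width.
        se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
        weight = 1.0 / se ** 2
        weighted_sum += weight * log_hr
        total_weight += weight
    pooled = weighted_sum / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return (
        math.exp(pooled),
        math.exp(pooled - Z_95 * pooled_se),
        math.exp(pooled + Z_95 * pooled_se),
    )


# Hypothetical per-cohort HRs per SD with 95% CIs (illustration only).
cohorts = [(1.50, 1.30, 1.73), (1.60, 1.35, 1.90), (1.55, 1.25, 1.92)]
pooled_hr, pooled_lower, pooled_upper = fixed_effect_meta(cohorts)
print(f"pooled HR per SD = {pooled_hr:.2f} ({pooled_lower:.2f}-{pooled_upper:.2f})")
```

Cohorts with narrower CIs receive more weight, so the pooled CI is narrower than that of any single cohort.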
Table 2. Mean, SD, and the association of PRSs with breast cancer risk in women of South Asian ancestry
| Method (PRSs developed on the basis of the East Asian data set) | Validation set: cases, mean (SD) | Validation set: controls, mean (SD) | Validation set: OR per SD (95% CI) | Validation set: AUC |
|---|---|---|---|---|
| (1) Clumping and thresholding | −0.490 (0.388) | −0.548 (0.387) | 1.18 (1.06-1.31) | 0.546 |
| (2) Penalized regression | 0.059 (0.381) | −0.048 (0.376) | 1.32 (1.19-1.46) | 0.581 |
| (3) EUR SNVs + EUR weights | 0.482 (0.570) | 0.251 (0.608) | 1.49 (1.34-1.67) | 0.614 |
| (4) EUR SNVs + ASN weights | −0.552 (0.493) | −0.720 (0.479) | 1.43 (1.28-1.58) | 0.592 |
| (5) EUR SNVs + EB weights | 0.084 (0.521) | −0.127 (0.545) | 1.50 (1.35-1.67) | 0.613 |
| Combine (1) + (3) | −0.212 (0.420) | −0.376 (0.444) | 1.48 (1.33-1.65) | 0.611 |
| Combine (2) + (3) | −0.166 (0.419) | −0.347 (0.441) | 1.53 (1.37-1.71) | 0.620 |
| Combine (1) + (4) | 0.008 (0.431) | −0.135 (0.420) | 1.42 (1.28-1.57) | 0.591 |
| Combine (2) + (4) | 0.036 (0.425) | −0.121 (0.413) | 1.46 (1.32-1.62) | 0.602 |
| Combine (1) + (5) | −0.157 (0.438) | −0.328 (0.455) | 1.49 (1.33-1.66) | 0.610 |
| Combine (2) + (5) | −0.119 (0.434) | −0.304 (0.449) | 1.52 (1.37-1.70) | 0.618 |
| (6) PRS-CSx | −0.308 (0.501) | −0.546 (0.502) | 1.62 (1.46-1.81) | 0.633 |
ASN, Asian; AUC, area under the receiver operating characteristic curve; EB, Empirical Bayes; EUR, European; MYBRCA, Malaysian Breast Cancer Genetic Study; OR, odds ratio; PRS, polygenic risk score; SGBCC, Singapore Breast Cancer Cohort; SNV, single-nucleotide variation.
PRSs were developed using the Chinese- and Malay-ancestry women from MYBRCA and SGBCC in the validation data set, as described in Table 1.
PRS performance was evaluated in 585 breast cancer cases and 1018 controls of Indian ancestry in the validation data set (Supplemental Table 1).
Adjusted for the first 10 principal components and study; each PRS was standardized to its SD in controls.
Combined PRSs were generated using the formula $\alpha_0 + \alpha_1\,\mathrm{PRS}_1 + \alpha_2\,\mathrm{PRS}_2$, where $\alpha_0$, $\alpha_1$, and $\alpha_2$ are the weights estimated from East Asian–ancestry women as described in Table 1. The weights for the considered combinations of PRSs can be found in Supplemental Table 5.
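To make the combination step concrete, the sketch below fits the three weights by logistic regression in one group and then applies the fixed weights to individuals from another group, mirroring how weights estimated in East Asian–ancestry women were carried over to South Asian–ancestry women. It is an illustration under stated assumptions, not the authors' pipeline: the data are simulated, the Newton-Raphson fitter and the names `fit_logistic`, `combined_prs`, and `true_alpha` are hypothetical, and no covariates such as principal components are included.

```python
import math
import random


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_logistic(scores, outcomes, iterations=15):
    """Newton-Raphson fit of logit P(case) = a0 + a1*PRS1 + a2*PRS2."""
    alpha = [0.0, 0.0, 0.0]
    for _ in range(iterations):
        grad = [0.0] * 3
        hess = [[0.0] * 3 for _ in range(3)]
        for (prs1, prs2), y in zip(scores, outcomes):
            x = (1.0, prs1, prs2)
            eta = sum(a * v for a, v in zip(alpha, x))
            mu = 1.0 / (1.0 + math.exp(-eta))
            for i in range(3):
                grad[i] += (y - mu) * x[i]
                for j in range(3):
                    hess[i][j] += mu * (1.0 - mu) * x[i] * x[j]
        step = solve_linear(hess, grad)
        alpha = [a + s for a, s in zip(alpha, step)]
    return alpha


def combined_prs(alpha, prs1, prs2):
    """Linear predictor alpha0 + alpha1*PRS1 + alpha2*PRS2."""
    return alpha[0] + alpha[1] * prs1 + alpha[2] * prs2


# Simulated training data (hypothetical; stands in for the data used to fit the weights).
rng = random.Random(0)
true_alpha = (-0.2, 0.3, 0.4)
scores, outcomes = [], []
for _ in range(5000):
    p1, p2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    prob = 1.0 / (1.0 + math.exp(-combined_prs(true_alpha, p1, p2)))
    scores.append((p1, p2))
    outcomes.append(1 if rng.random() < prob else 0)

alpha_hat = fit_logistic(scores, outcomes)

# Apply the fitted weights, unchanged, to individuals from a different group.
new_individuals = [(0.5, -0.1), (-1.2, 0.8)]
new_scores = [combined_prs(alpha_hat, p1, p2) for p1, p2 in new_individuals]
print([round(a, 2) for a in alpha_hat], [round(s, 2) for s in new_scores])
```

Because the weights are held fixed when the score is transferred, any difference in performance between groups reflects the score itself rather than refitting.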
Figure 3. Absolute breast cancer risk by percentiles of PRS and PRS distribution by ancestry.
(A) Lifetime and (B) 10-year absolute risk of developing breast cancer for Chinese women, calculated using Singaporean incidence and mortality data and the odds ratio per SD of PRS46 + PRS287_EB in Chinese women (1.56, as reported in Supplemental Table 9). The gray dashed lines in (A) and (B) represent the average lifetime risk and absolute 10-year risk, respectively, for Singaporean Chinese women. The red horizontal dashed line (2.3%) in (B) represents the 10-year absolute risk for a 50-year-old EUR woman, the point at which screening is recommended. (C) The distribution of PRS46 + PRS287_EB in Chinese-ancestry, Indian-ancestry, and Malay-ancestry women, generated using the ethnic-specific mean and SD of controls as reported in Supplemental Table 9, and the corresponding cumulative breast cancer risk by age 80, generated using calendar-specific breast cancer incidence and mortality rates for Chinese, Malay, and Indian women in Singapore. Areas under the curves represent the percentiles of PRS287_EB. The right vertical dashed line represents the 90th percentile cutoff for the PRS distribution in Chinese-ancestry women; eg, the 95th percentile in Indian women (lifetime risk = 11%) corresponds approximately to the 90th percentile in the Chinese population. If the Chinese PRS distribution were used as the reference, these Indian women would be categorized as being at the 90th percentile and hence would be told that their lifetime risk was 9% instead of 11%. (D) The distribution of the EUR PRS (PRS287_EUR) for women of EUR, Chinese, Malay, or Indian ancestry. The right vertical dashed line represents the 90th percentile cutoff for the PRS distribution in EUR-ancestry women. EB, Empirical Bayes; EUR, European; PRS, polygenic risk score.
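The percentile shift described for panel (C) can be sketched numerically. The example below assumes that the PRS is normally distributed in controls of each ancestry group and uses hypothetical means and SDs in standardized units, chosen only so that the shift is visible; they are not the values in Supplemental Table 9. The function name `percentile_in_reference` is likewise hypothetical.

```python
from statistics import NormalDist


def percentile_in_reference(percentile, own, reference):
    """Map a percentile in one group's PRS distribution to the percentile
    that the same raw score would occupy in a reference distribution."""
    raw_score = own.inv_cdf(percentile / 100.0)
    return 100.0 * reference.cdf(raw_score)


# Hypothetical control distributions (illustration only).
chinese = NormalDist(mu=0.0, sigma=1.0)
indian = NormalDist(mu=-0.36, sigma=1.0)

shifted = percentile_in_reference(95, indian, chinese)
print(f"95th percentile in group B sits at percentile {shifted:.1f} of group A")
```

A woman ranked against the wrong reference distribution is assigned the wrong percentile, and therefore the wrong absolute risk, even though her raw score is unchanged.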