Can Hou, Bin Xu, Yu Hao, Daowen Yang, Huan Song, Jiayuan Li.
Abstract
BACKGROUND: Studies of breast cancer polygenic risk scores (PRSs) in Chinese women are scarce. This study aimed to develop and validate PRSs that can stratify risk for overall and subtype-specific breast cancer in Chinese women, and to evaluate the performance of a newly proposed Artificial Neural Network (ANN) based approach to PRS construction.
Keywords: Artificial neural network; Breast cancer; Estrogen receptor-negative breast cancer; Polygenic risk score; Single nucleotide polymorphisms
Year: 2022 PMID: 35395775 PMCID: PMC8991589 DOI: 10.1186/s12885-022-09425-3
Source DB: PubMed Journal: BMC Cancer ISSN: 1471-2407 Impact factor: 4.430
Fig. 1 Flowchart of the quality control process of the genotypic data in SBCGS. Quality control procedures were carried out using PLINK 1.9. HWE: Hardy–Weinberg equilibrium; MAF: minor allele frequency; SBCGS: Shanghai Breast Cancer Genetics Study; SNPs: single nucleotide polymorphisms
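The two per-SNP filters named in the caption (MAF and HWE) can be illustrated with a minimal sketch. It is not the PLINK 1.9 implementation: PLINK uses an exact HWE test, whereas this sketch uses the simpler 1-degree-of-freedom chi-square approximation. The function names, the genotype counts, and the thresholds passed to `passes_qc` are hypothetical; the source does not report the cut-offs used.

```python
import math

def minor_allele_frequency(n_aa, n_ab, n_bb):
    """Minor allele frequency from genotype counts (AA, AB, BB)."""
    n = n_aa + n_ab + n_bb
    freq_a = (2 * n_aa + n_ab) / (2 * n)
    return min(freq_a, 1 - freq_a)

def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """Chi-square (1 df) test of Hardy-Weinberg equilibrium.

    Approximation only; PLINK itself uses an exact test.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1 - p
    if p == 0 or q == 0:
        return 1.0  # monomorphic SNP: no deviation can be measured
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom
    return math.erfc(math.sqrt(chi2 / 2))

def passes_qc(counts, maf_min, hwe_p_min):
    """Keep a SNP only if it clears both (caller-supplied) thresholds."""
    return (minor_allele_frequency(*counts) >= maf_min
            and hwe_chi2_pvalue(*counts) >= hwe_p_min)

# Hypothetical counts: the first SNP is in perfect equilibrium,
# the second has no heterozygotes at all.
balanced = (360, 480, 160)
no_heterozygotes = (500, 0, 500)
```

A SNP such as `no_heterozygotes` has a high MAF but fails the HWE filter, which is why the two checks are applied together.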
Fig. 2 Hyperparameter tuning results for the Artificial Neural Network model (A: number of iterations; B: number of hidden layers and dropout rate)
Characteristics of the participants in the Sichuan Breast Cancer Case-Control Study
| Characteristics | Control (n = 374) | Case (n = 427) | P-value* |
|---|---|---|---|
| Age (years) | 48.00 (42.00–53.00) | 50.00 (44.00–57.00) | 0.01 |
| BMI (kg/m²) | 23.37 (21.46–25.10) | 22.94 (21.23–25.24) | 0.18 |
| Age at menarche (years) | 14.00 (13.00–15.00) | 14.00 (13.00–15.00) | 0.29 |
| Number of live births (N) | 1.00 (1.00–1.00) | 1.00 (1.00–2.00) | < 0.001 |
| Gail-2 model 5-year risk (%) | 0.54 (0.42–0.67) | 0.54 (0.46–0.67) | 0.07 |
| PRSRLR | 0.44 (0.05–0.84) | 0.62 (0.23–1.05) | < 0.001 |
| PRSLRR | 0.02 (−0.18–0.23) | 0.14 (−0.06–0.32) | < 0.001 |
| PRSANN | −0.17 (−0.33–0.09) | 0.01 (−0.24–0.13) | < 0.001 |
| Menopausal status | | | 0.33 |
| | 223 (59.63%) | 239 (55.97%) | |
| | 151 (40.37%) | 188 (44.03%) | |
| Family history of breast cancer | | | 0.83 |
| | 7 (1.87%) | 10 (2.34%) | |
| | 367 (98.13%) | 417 (97.66%) | |
*P-value from Mann-Whitney U test (continuous variables) or chi-square test (categorical variables)
BMI body mass index, IQR interquartile range, PRS polygenic risk score, RLR repeated logistic regression, LRR logistic ridge regression, ANN Artificial Neural Network
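The chi-square test mentioned in the footnote can be checked against the counts in the table. The sketch below is a plain 2×2 Pearson chi-square with Yates continuity correction, written with the standard library only. Whether the authors applied the continuity correction is an assumption; the source does not say. With the correction, the computed P-values round to the reported 0.33 (menopausal status) and 0.83 (family history). The function name is hypothetical.

```python
import math

def chi2_yates_2x2(a, b, c, d):
    """Pearson chi-square with Yates correction for the table [[a, b], [c, d]].

    Returns (chi2, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    # Yates correction: shrink |ad - bc| by n/2, never below zero
    corrected = max(0.0, abs(a * d - b * c) - n / 2)
    chi2 = n * corrected ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of a chi-square variable with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Counts copied from the table: each row is (controls, cases)
menopausal = chi2_yates_2x2(223, 239, 151, 188)
family_history = chi2_yates_2x2(7, 10, 367, 417)

print(f"menopausal status: P = {menopausal[1]:.2f}")
print(f"family history:    P = {family_history[1]:.2f}")
```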
Fig. 3 Spearman’s rank correlation coefficient matrix for breast cancer risk factors, PRSs and Gail-2 model 5-year risk. BMI: body mass index; PRS: polygenic risk score; RLR: repeated logistic regression; LRR: logistic ridge regression; ANN: Artificial Neural Network
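Each cell of such a matrix is a Spearman coefficient: the Pearson correlation of the two variables after each has been replaced by its ranks, with tied values sharing their average rank. A minimal standard-library sketch follows; the function names and the toy inputs are hypothetical and are not taken from the study data.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Toy values: any strictly increasing relationship gives rho = 1,
# however non-linear it is.
rho_monotone = spearman_rho([1, 2, 3, 4], [10, 20, 25, 100])
```

Because only ranks enter the calculation, the coefficient is insensitive to the very different scales of the PRSs, the Gail-2 risk, and the classical risk factors.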
Predictive performance of the primary PRSs for overall and ER+/ER− breast cancer
| Outcome | PRS | N (Controls/Cases) | Cases (Median, IQR) | Controls (Median, IQR) | Q4th vs Q1st OR (95% CI) | IQ-OR (95% CI) | O/E OR (95% CI) | AUC (95% CI) | Adjustmentᵃ |
|---|---|---|---|---|---|---|---|---|---|
| Overall | PRSRLR | 374/427 | 0.62 (0.23–1.05) | 0.44 (0.05–0.84) | 2.09 (1.40–3.11) | 1.49 (1.23–1.81) | 1.10 (0.71–1.48) | 0.586 (0.547–0.625) | None |
| Overall | PRSRLR | 374/427 | 0.15 (−0.23–0.58) | −0.01 (−0.40–0.36) | 2.02 (1.36–2.99) | 1.43 (1.19–1.72) | 1.08 (0.51–1.64) | 0.582 (0.543–0.621) | B |
| Overall | PRSLRR | 374/427 | 0.14 (−0.06–0.32) | 0.02 (−0.18–0.23) | 2.59 (1.71–3.91) | 1.58 (1.29–1.92) | 1.08 (0.62–1.55) | 0.598 (0.559–0.637) | None |
| Overall | PRSLRR | 374/427 | 0.11 (−0.09–0.28) | −0.01 (−0.22–0.21) | 2.47 (1.63–3.73) | 1.57 (1.28–1.93) | 1.08 (0.81–1.35) | 0.595 (0.556–0.634) | B |
| Overall | PRSANN | 374/427 | 0.01 (−0.24–0.13) | −0.17 (−0.33–0.09) | 2.61 (1.72–3.95) | 1.76 (1.39–2.24) | 1.09 (0.77–1.41) | 0.601 (0.562–0.640) | None |
| Overall | PRSANN | 374/427 | 0.15 (−0.10–0.26) | −0.02 (−0.19–0.22) | 2.51 (1.67–3.79) | 1.68 (1.34–2.12) | 1.17 (0.62–1.72) | 0.596 (0.557–0.635) | B |
| ER+ | PRSRLR | 374/290 | 0.65 (0.24–1.06) | 0.44 (0.05–0.84) | 2.24 (1.44–3.48) | 1.56 (1.26–1.93) | 1.09 (0.61–1.57) | 0.597 (0.553–0.641) | None |
| ER+ | PRSRLR | 374/290 | 0.20 (−0.22–0.60) | −0.01 (−0.40–0.36) | 2.18 (1.41–3.37) | 1.49 (1.22–1.83) | 1.06 (0.53–1.59) | 0.592 (0.548–0.636) | B |
| ER+ | PRSLRR | 374/290 | 0.15 (−0.05–0.33) | 0.02 (−0.18–0.23) | 2.94 (1.85–4.69) | 1.67 (1.34–2.08) | 1.12 (0.74–1.51) | 0.613 (0.570–0.656) | None |
| ER+ | PRSLRR | 374/290 | 0.12 (−0.07–0.30) | −0.01 (−0.22–0.21) | 2.72 (1.71–4.32) | 1.67 (1.33–2.10) | 1.10 (0.75–1.45) | 0.608 (0.565–0.651) | B |
| ER+ | PRSANN | 374/290 | 0.04 (−0.22–0.15) | −0.17 (−0.33–0.09) | 3.00 (1.87–4.78) | 1.96 (1.50–2.55) | 1.09 (0.80–1.38) | 0.620 (0.577–0.663) | None |
| ER+ | PRSANN | 374/290 | 0.17 (−0.09–0.28) | −0.02 (−0.19–0.22) | 2.89 (1.82–4.58) | 1.85 (1.43–2.39) | 1.12 (0.65–1.59) | 0.614 (0.571–0.657) | B |
| ER− | PRSRLR | 374/124 | 0.57 (0.19–0.92) | 0.44 (0.05–0.84) | 1.63 (0.91–2.95) | 1.27 (0.96–1.69) | 1.08 (0.42–1.74) | 0.554 (0.495–0.613) | None |
| ER− | PRSRLR | 374/124 | 0.10 (−0.27–0.49) | −0.01 (−0.40–0.36) | 1.53 (0.86–2.73) | 1.24 (0.94–1.62) | 1.17 (0.35–1.99) | 0.549 (0.490–0.608) | B |
| ER− | PRSLRR | 374/124 | 0.08 (−0.11–0.28) | 0.02 (−0.18–0.23) | 1.79 (0.98–3.28) | 1.29 (0.97–1.72) | 1.23 (0.18–2.28) | 0.555 (0.496–0.614) | None |
| ER− | PRSLRR | 374/124 | 0.05 (−0.12–0.25) | −0.01 (−0.22–0.21) | 1.87 (1.00–3.50) | 1.30 (0.96–1.75) | 1.15 (0.41–1.89) | 0.555 (0.496–0.614) | B |
| ER− | PRSANN | 374/124 | −0.13 (−0.25–0.11) | −0.17 (−0.33–0.09) | 1.78 (0.96–3.30) | 1.32 (0.93–1.87) | 1.37 (−0.62–3.35) | 0.550 (0.491–0.609) | None |
| ER− | PRSANN | 374/124 | −0.01 (−0.11–0.24) | −0.02 (−0.19–0.22) | 1.70 (0.92–3.14) | 1.30 (0.93–1.82) | 1.21 (−0.18–2.60) | 0.548 (0.489–0.607) | B |
ᵃ Adjustment B: adjusted for classical breast cancer risk factors. Q4th vs Q1st OR: participants in the fourth quartile of the PRS vs those in the first quartile
ER+ estrogen receptor positive, ER− estrogen receptor negative, OR odds ratio, Q4th fourth quartile, Q1st first quartile, IQR interquartile range, PRS polygenic risk score, RLR repeated logistic regression, LRR logistic ridge regression, ANN Artificial Neural Network, IQ-OR OR per IQR increase of the PRS in controls, O/E OR observed to expected OR, AUC area under the receiver operator characteristic curve
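The width of the AUC intervals in this table depends mainly on the AUC itself and on the numbers of cases and controls. As an illustration, the sketch below applies the Hanley and McNeil (1982) standard-error approximation. The source does not state how the intervals were computed, so treating them as Hanley and McNeil intervals is an assumption; for the two unadjusted PRSRLR rows checked here (overall and ER−), the sketch gives the same bounds as the table to three decimals. The function name is hypothetical.

```python
import math

def hanley_mcneil_ci(auc, n_cases, n_controls, z=1.96):
    """Approximate 95% CI for an AUC (Hanley and McNeil, 1982)."""
    q1 = auc / (2 - auc)
    q2 = 2 * auc ** 2 / (1 + auc)
    variance = (
        auc * (1 - auc)
        + (n_cases - 1) * (q1 - auc ** 2)
        + (n_controls - 1) * (q2 - auc ** 2)
    ) / (n_cases * n_controls)
    half_width = z * math.sqrt(variance)
    return auc - half_width, auc + half_width

# AUCs and sample sizes copied from the unadjusted PRSRLR rows
overall = hanley_mcneil_ci(0.586, n_cases=427, n_controls=374)
er_negative = hanley_mcneil_ci(0.554, n_cases=124, n_controls=374)

print("overall: %.3f to %.3f" % overall)
print("ER-:     %.3f to %.3f" % er_negative)
```

The ER− interval is roughly 50% wider than the overall one, which follows from the smaller number of cases (124 vs 427) rather than from any difference in method.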
Fig. 4 Receiver operator characteristic curves for the primary unadjusted A: PRSRLR, B: PRSLRR, C: PRSANN, and calibration plots for the primary unadjusted D: PRSRLR, E: PRSLRR, F: PRSANN. In the calibration plots, each circle represents the predicted OR and observed OR by decile. O/E OR is the ratio of the observed to expected OR, corresponding to the coefficients of the log scale linear regression. OR: odds ratio; PRS: polygenic risk score; RLR: repeated logistic regression; LRR: logistic ridge regression; ANN: Artificial Neural Network; AUC: area under the receiver operator characteristic curve
Predictive performance of the subtype-specific PRSs for ER+/ER− breast cancer
| PRS | N (Controls/Cases) | Cases (Median, IQR) | Controls (Median, IQR) | Q4th vs Q1st OR (95% CI) | IQ-OR (95% CI) | O/E OR (95% CI) | AUC (95% CI) | Adjustmentᵃ |
|---|---|---|---|---|---|---|---|---|
| ER+ PRSRLR | 374/290 | 0.65 (0.19–1.08) | 0.41 (0.02–0.85) | 2.23 (1.43–3.46) | 1.55 (1.25–1.91) | 1.05 (0.73–1.37) | 0.596 (0.552–0.640) | None |
| ER+ PRSRLR | 374/290 | 0.22 (−0.24–0.65) | −0.02 (−0.42–0.42) | 2.22 (1.43–3.46) | 1.55 (1.25–1.92) | 1.07 (0.75–1.39) | 0.595 (0.551–0.639) | A |
| ER+ PRSRLR | 374/290 | 0.20 (−0.26–0.62) | −0.02 (−0.40–0.40) | 1.94 (1.26–2.99) | 1.48 (1.21–1.81) | 1.05 (0.53–1.56) | 0.591 (0.547–0.635) | B |
| ER+ PRSLRR | 374/290 | 0.11 (−0.09–0.29) | −0.03 (−0.23–0.18) | 2.92 (1.86–4.59) | 1.70 (1.37–2.10) | 1.09 (0.76–1.42) | 0.618 (0.575–0.661) | None |
| ER+ PRSLRR | 374/290 | 0.12 (−0.06–0.31) | −0.01 (−0.21–0.20) | 2.93 (1.86–4.63) | 1.69 (1.36–2.09) | 1.07 (0.69–1.46) | 0.617 (0.574–0.660) | A |
| ER+ PRSLRR | 374/290 | 0.13 (−0.08–0.31) | −0.01 (−0.20–0.21) | 2.49 (1.59–3.89) | 1.65 (1.33–2.04) | 1.06 (0.71–1.41) | 0.613 (0.570–0.656) | B |
| ER+ PRSANN | 374/290 | −0.11 (−0.26–0.00) | −0.21 (−0.40–0.07) | 2.81 (1.79–4.42) | 1.60 (1.28–2.01) | 1.16 (0.56–1.75) | 0.612 (0.569–0.655) | None |
| ER+ PRSANN | 374/290 | 0.14 (−0.01–0.26) | 0.05 (−0.13–0.19) | 2.51 (1.62–3.90) | 1.57 (1.26–1.95) | 1.18 (0.53–1.84) | 0.611 (0.568–0.654) | A |
| ER+ PRSANN | 374/290 | 0.13 (−0.02–0.24) | 0.05 (−0.15–0.18) | 2.44 (1.57–3.81) | 1.47 (1.18–1.83) | 1.19 (0.72–1.66) | 0.596 (0.552–0.640) | B |
| ER− PRSRLR | 374/124 | 0.68 (0.31–0.97) | 0.52 (0.15–0.83) | 1.84 (1.02–3.33) | 1.41 (1.04–1.90) | 1.26 (0.52–2.01) | 0.574 (0.515–0.633) | None |
| ER− PRSRLR | 374/124 | 0.15 (−0.22–0.44) | 0.00 (−0.37–0.31) | 1.91 (1.05–3.48) | 1.41 (1.04–1.90) | 1.28 (0.42–2.15) | 0.573 (0.514–0.632) | A |
| ER− PRSRLR | 374/124 | 0.12 (−0.23–0.45) | 0.01 (−0.35–0.33) | 1.68 (0.94–3.02) | 1.38 (1.02–1.86) | 1.29 (0.43–2.15) | 0.570 (0.511–0.629) | B |
| ER− PRSLRR | 374/124 | 0.23 (0.10–0.44) | 0.16 (−0.05–0.37) | 2.55 (1.34–4.88) | 1.52 (1.10–2.10) | 1.13 (0.04–2.21) | 0.582 (0.523–0.641) | None |
| ER− PRSLRR | 374/124 | 0.08 (−0.06–0.29) | 0.00 (−0.21–0.21) | 2.55 (1.31–4.93) | 1.52 (1.10–2.09) | 1.21 (0.08–2.33) | 0.582 (0.523–0.641) | A |
| ER− PRSLRR | 374/124 | 0.07 (−0.07–0.28) | 0.00 (−0.20–0.20) | 2.26 (1.18–4.33) | 1.47 (1.08–1.99) | 1.13 (0.26–2.00) | 0.578 (0.519–0.637) | B |
| ER− PRSANN | 374/124 | −0.12 (−0.25–0.02) | −0.16 (−0.37–0.03) | 2.10 (1.09–4.07) | 1.52 (1.09–2.12) | 1.29 (0.74–1.84) | 0.562 (0.503–0.621) | None |
| ER− PRSANN | 374/124 | 0.10 (−0.02–0.21) | 0.06 (−0.14–0.20) | 2.22 (1.15–4.30) | 1.52 (1.09–2.12) | 1.28 (0.61–1.95) | 0.562 (0.503–0.621) | A |
| ER− PRSANN | 374/124 | 0.09 (−0.05–0.2) | 0.06 (−0.15–0.20) | 1.90 (0.99–3.62) | 1.38 (0.99–1.93) | 1.16 (0.34–1.98) | 0.547 (0.488–0.606) | B |
ᵃ Adjustment A: adjusted for Gail-2 model 5-year absolute risk; adjustment B: adjusted for classical breast cancer risk factors
ER+ estrogen receptor positive, ER− estrogen receptor negative, OR odds ratio, Q4th fourth quartile, Q1st first quartile, IQR interquartile range, PRS polygenic risk score, RLR repeated logistic regression, LRR logistic ridge regression, ANN Artificial Neural Network, IQ-OR OR per IQR increase of the PRS in controls, O/E OR observed to expected OR, AUC area under the receiver operator characteristic curve
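The IQ-OR defined in the footnote rescales a logistic-regression effect to one interquartile range of the PRS among controls. A minimal sketch, assuming the PRS enters the logistic model as a single linear term with log-odds coefficient `beta_per_unit` (an assumption; the source does not give the model specification): the IQ-OR is then `exp(beta_per_unit × IQR)`. The function names are hypothetical, and the per-unit coefficient below is back-calculated from the table for illustration only; it is not a reported result.

```python
import math

def iq_or(beta_per_unit, q1, q3):
    """OR for an increase of one control-group IQR (q3 - q1) in the PRS."""
    return math.exp(beta_per_unit * (q3 - q1))

def beta_from_iq_or(iq_odds_ratio, q1, q3):
    """Inverse of iq_or: log-odds change per one unit of the PRS."""
    return math.log(iq_odds_ratio) / (q3 - q1)

# Unadjusted ER+ PRSRLR row: controls' IQR is 0.02 to 0.85, IQ-OR is 1.55
control_q1, control_q3 = 0.02, 0.85
implied_beta = beta_from_iq_or(1.55, control_q1, control_q3)

# Round trip: rescaling the implied coefficient recovers the tabulated IQ-OR
recovered = iq_or(implied_beta, control_q1, control_q3)
```

Scaling by the control IQR makes effect sizes comparable across PRSRLR, PRSLRR, and PRSANN, whose raw scores have different spreads.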