Pierre O Chappuis, Maria C Katapodi, Chang Ming, Valeria Viassolo, Nicole Probst-Hensch, Ivo D Dinov.
Abstract
BACKGROUND: The clinical utility of machine-learning (ML) algorithms for breast cancer risk prediction and screening practices is unknown. We compared classification of lifetime breast cancer risk based on ML and the BOADICEA model. We explored the differences in risk classification and their clinical impact on screening practices.
Year: 2020 PMID: 32565540 PMCID: PMC7463251 DOI: 10.1038/s41416-020-0937-0
Source DB: PubMed Journal: Br J Cancer ISSN: 0007-0920 Impact factor: 7.640
Fig. 1 CONSORT flow diagram of the whole cohort with breast cancer risk-based classification. ML machine learning, BOADICEA Breast and Ovarian Analysis of Disease Incidence and Carrier Estimation Algorithm, AU-ROC Area Under the Receiver Operating Characteristic curve.
Performance by area under the receiver operating characteristic curve (AU-ROC) of the machine-learning (ML) algorithms predicting breast cancer lifetime risk derived from 10-fold cross-validations compared with the BOADICEA model.
| Algorithms | AU-ROC | Standard deviation | 95% CI, LCL | 95% CI, UCL | Absolute change from BOADICEA |
|---|---|---|---|---|---|
| BOADICEA | 0.639 | – | – | – | – |
| ML-ADA | 0.889 | 0.005 | 0.885 | 0.903 | +25.0% |
| ML-MCMC GLMM | 0.851 | 0.006 | 0.847 | 0.856 | +21.2% |
| ML-RF | 0.843 | 0.008 | 0.838 | 0.849 | +20.4% |
MCMC GLMM Markov Chain Monte Carlo generalised linear mixed model, ADA adaptive boosting, RF random forest, LCL lower confidence limit, UCL upper confidence limit. BOADICEA Breast and Ovarian Analysis of Disease Incidence and Carrier Estimation Algorithm.
N = 45,110 female individuals.
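The "Absolute change from BOADICEA" column can be reproduced from the AU-ROC values in the table. The sketch below assumes the column is the difference between each ML AU-ROC and the BOADICEA AU-ROC, expressed in percentage points; the function name `absolute_change_points` is illustrative and does not come from the source.

```python
# AU-ROC values copied from the table above.
boadicea_auc = 0.639
ml_auc = {
    "ML-ADA": 0.889,
    "ML-MCMC GLMM": 0.851,
    "ML-RF": 0.843,
}


def absolute_change_points(ml, reference):
    """Difference between two AU-ROC values, in percentage points.

    Assumption: the table's "+25.0%" style entries are absolute
    differences (percentage points), not relative improvements.
    """
    return round((ml - reference) * 100, 1)


changes = {
    name: absolute_change_points(auc, boadicea_auc)
    for name, auc in ml_auc.items()
}

for name, delta in changes.items():
    print(f"{name}: +{delta:.1f} percentage points")
```

Under that assumption the computed differences match the tabulated +25.0%, +21.2% and +20.4%.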
Fig. 2 Receiver operating characteristic (ROC) curves of the ML-adaptive boosting (ML-ADA) and BOADICEA models predicting breast cancer lifetime risk, N = 45,110 female individuals. ML machine learning, BOADICEA Breast and Ovarian Analysis of Disease Incidence and Carrier Estimation Algorithm, CI confidence interval.
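For readers unfamiliar with the metric, the area under a ROC curve equals the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen non-case, with ties counted as one half. The sketch below illustrates that general definition on made-up toy scores; it is not the authors' evaluation code, and the function name `au_roc` is hypothetical.

```python
def au_roc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: predicted risks (higher means more likely to be a case)
    labels: 1 for cases, 0 for non-cases
    Runs in O(n_cases * n_noncases) time, which is fine for illustration.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    noncases = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not noncases:
        raise ValueError("need at least one case and one non-case")

    wins = 0.0
    for c in cases:
        for n in noncases:
            if c > n:
                wins += 1.0   # case ranked above non-case
            elif c == n:
                wins += 0.5   # ties count as half
    return wins / (len(cases) * len(noncases))


# Toy data (illustrative only, not from the study).
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_labels = [0, 0, 1, 1]
print(au_roc(toy_scores, toy_labels))  # 0.75
```

An AU-ROC of 0.5 corresponds to ranking cases and non-cases at random, and 1.0 to perfect separation.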
Characteristics of the breast cancer-free female cohort.
| Demographics and clinical characteristics | Breast cancer-free female cohort |
|---|---|
| Age (years) | 51.09 ± 15.35 |
| Age at menarche (years) | 12.82 ± 1.51 |
| Age at first live birth (nulliparous excluded, years) | 24.10 ± 5.03 |
| Parity (nulliparous excluded) | 1.92 ± 1.32 |
| Age at menopause (premenopausal women excluded, years) | 47.94 ± 6.68 |
| Ashkenazi Jewish ancestry | 239 (0.66%) |
| Ethnicity (Black) | 828 (2.29%) |
| BRCA1 or BRCA2 germline pathogenic variants | 115 (462 tested) |
| Cancer diagnosis (all types) | 2617 (7.24%) |
| Age at cancer onset (years) | 57.44 ± 15.96 |
| Colorectal cancer | 574 (1.59%) |
| Age at colorectal cancer onset (years) | 61.63 ± 17.19 |
| Lung/bronchus cancer | 153 (0.42%) |
| Age at lung/bronchus cancer onset (years) | 62.01 ± 26.18 |
| Pancreatic cancer | 136 (0.38%) |
| Age at pancreatic cancer onset (years) | 66.85 ± 22.94 |
| Ovarian cancer | 508 (1.40%) |
| Age at ovarian cancer onset (years) | 55.96 ± 22.84 |
N = 36,146 individuals.
Comparison of lifetime risk classification between the ML-adaptive boosting (ML-ADA) algorithm and the BOADICEA model (reference standard) for the breast cancer-free cohort.
| Risk age | Near-population risk (BOADICEA < 17%): ML-ADA < 17% | Near-population risk (BOADICEA < 17%): 17% ≤ ML-ADA < 30% | Near-population risk (BOADICEA < 17%): ML-ADA ≥ 30% | Moderate risk (17% ≤ BOADICEA < 30%): ML-ADA < 17% | Moderate risk (17% ≤ BOADICEA < 30%): 17% ≤ ML-ADA < 30% | Moderate risk (17% ≤ BOADICEA < 30%): ML-ADA ≥ 30% | High risk (BOADICEA ≥ 30%): ML-ADA < 17% | High risk (BOADICEA ≥ 30%): 17% ≤ ML-ADA < 30% | High risk (BOADICEA ≥ 30%): ML-ADA ≥ 30% |
|---|---|---|---|---|---|---|---|---|---|
| 20–29 | 2181 | 430 | 215 | 372 | 1050 | 233 | 17 | 41 | 420 |
| 30–39 | 2069 | 645 | 430 | 407 | 989 | 256 | 18 | 34 | 429 |
| 40–49 | 2466 | 832 | 625 | 442 | 1191 | 326 | 20 | 44 | 464 |
| 50–59 | 2681 | 899 | 751 | 535 | 1243 | 337 | 25 | 49 | 505 |
| 60–69 | 2037 | 745 | 849 | 570 | 1326 | 349 | 23 | 43 | 494 |
| 70–80 | 2116 | 871 | 441 | 465 | 1233 | 361 | 21 | 48 | 483 |
| Total | 13,550 | 4422 | 3311 | 2791 | 7032 | 1862 | 124 | 259 | 2795 |
| Concordance | 63.67% | – | – | – | 60.18% | – | – | – | 87.95% |
| Reclassification | – | 20.78% | 15.56% | 23.89% | – | 15.93% | 3.90% | 8.15% | – |
– Does not apply.
N = 36,146. ML machine learning, BOADICEA Breast and Ovarian Analysis of Disease Incidence and Carrier Estimation Algorithm, ADA adaptive boosting.
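The concordance and reclassification percentages can be recomputed from the "Total" row of the table. The sketch below assumes each percentage is a cell count divided by the number of women in the same BOADICEA risk category; the dictionary keys and the function name `row_percentages` are illustrative and not taken from the source.

```python
# "Total" row counts, grouped by BOADICEA category.
# Each tuple is (ML-ADA < 17%, 17% <= ML-ADA < 30%, ML-ADA >= 30%).
totals = {
    "near_population": (13550, 4422, 3311),   # BOADICEA < 17%
    "moderate": (2791, 7032, 1862),           # 17% <= BOADICEA < 30%
    "high": (124, 259, 2795),                 # BOADICEA >= 30%
}


def row_percentages(counts):
    """Share of each ML-ADA class within one BOADICEA category, in %."""
    n = sum(counts)
    return [round(100 * c / n, 2) for c in counts]


percentages = {group: row_percentages(c) for group, c in totals.items()}
cohort_size = sum(sum(c) for c in totals.values())

# Concordance is the cell where both models agree (the diagonal);
# the other two cells in each group are reclassifications.
for index, (group, pct) in enumerate(percentages.items()):
    print(group, "concordance:", pct[index], "all cells:", pct)
print("cohort size:", cohort_size)
```

With these counts the diagonal cells give 63.67%, 60.18% and 87.95%, the off-diagonal cells give the six reclassification percentages, and the three groups sum to the stated N = 36,146.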
Clinical impact on mammography screening based on Swiss Surveillance Protocol.
ML machine learning, BOADICEA Breast and Ovarian Analysis of Disease Incidence and Carrier Estimation Algorithm, ADA adaptive boosting.