Carlo Boeri, Corrado Chiappa, Federica Galli, Valentina De Berardinis, Laura Bardelli, Giulio Carcano, Francesca Rovera.
Abstract
More than 750 000 women in Italy are living after a diagnosis of breast cancer. A large body of literature identifies the characteristics that most affect their prognosis. However, predicting the course of each patient's disease, and then establishing a tailored therapeutic plan and follow-up, remains very complicated. To address this issue, a multidisciplinary approach has become widely accepted, while Multigene Signature Panels and the Nottingham Prognostic Index are still debated options. Current technological resources make it possible to gather a large amount of data for each patient. Machine Learning (ML) allows us to draw on these data, to discover their mutual relations, and to estimate the prognosis for new instances. This study provides an initial evaluation of the application of ML to predicting breast cancer prognosis. We analyzed 1021 patients who underwent surgery for breast cancer at our Institute and included 610 of them. Three outcomes were chosen: loco-regional recurrence, systemic recurrence, and death from the disease, each within 32 months. We developed two types of ML models for every outcome (Artificial Neural Network and Support Vector Machine). Each ML algorithm was evaluated for accuracy (95.29%-96.86%), sensitivity (0.35-0.64), specificity (0.97-0.99), and AUC (0.804-0.916). These models might become an additional resource for evaluating the prognosis of breast cancer patients in our daily clinical practice. Before that, their sensitivity should be increased, as the literature suggests, by considering a larger population sample with a longer follow-up period. However, the specificity, accuracy, minimal additional costs, and reproducibility are already encouraging.
Keywords: Artificial Neural Network (ANN); Support Vector Machine (SVM); algorithm; breast cancer; predictive models
Year: 2020 PMID: 32154669 PMCID: PMC7196042 DOI: 10.1002/cam4.2811
Source DB: PubMed Journal: Cancer Med ISSN: 2045-7634 Impact factor: 4.452
Study population
| Characteristic | Mean ± SD or % (No.) | Characteristic | Mean ± SD or % (No.) |
|---|---|---|---|
| Gender | F = 100% (610) | Age (y) | 59.711 ± 12.886 |
| Menopause | 70.82% (432) | Menopause age (y) | 49.611 ± 4.870 |
| Arterial hypertension | 35.08% (214) | Diabetes mellitus | 7.70% (47) |
| Coronary heart disease | 4.26% (26) | Previous ovarian cancer | 1.15% (7) |
| BMI (kg/m²) | 25.765 ± 6.019 | Family history | 27.87% (170) |
| BRCA mutation | BRCA1 = 1.64% (10) | Chest wall/skin invasion | 1.48% (9) |
| | BRCA2 = 0.49% (3) | | |
| cT | x = 2.95% (18) | pT | 0 = 1.64% (10) |
| | 0 = 0.49% (3) | | is = 0.82% (5) |
| | is = 1.8% (11) | | 1 = 62.3% (380) |
| | 1 = 59.18% (361) | | 2 = 31.15% (190) |
| | 2 = 29.67% (181) | | 3 = 2.62% (16) |
| | 3 = 2.79% (17) | | 4 = 1.48% (9) |
| | 4 = 3.11% (19) | | |
| cN | x = 0.49% (3) | pN | x = 1.64% (10) |
| | 0 = 82.46% (503) | | 0 = 63.11% (385) |
| | 1 = 14.75% (90) | | 0(i+) = 1.64% (10) |
| | 2 = 0.82% (5) | | 1mi = 6.07% (37) |
| | 3 = 1.48% (9) | | 1 = 17.05% (104) |
| | | | 2 = 6.89% (42) |
| | | | 3 = 5.08% (31) |
| M | 0 = 98.03% (598) | Inflammatory breast cancer | 0.98% (6) |
| | 1 = 1.97% (12) | | |
| Clinical stage | 0 = 1.97% (12) | Pathologic stage | 0 = 1.31% (8) |
| | IA = 54.1% (330) | | IA = 47.54% (290) |
| | IIA = 24.92% (152) | | IB = 4.26% (26) |
| | IIB = 7.54% (46) | | IIA = 26.39% (161) |
| | IIIA = 1.8% (11) | | IIB = 7.70% (47) |
| | IIIB = 2.3% (14) | | IIIA = 6.72% (41) |
| | IIIC = 1.31% (8) | | IIIB = 0.98% (6) |
| | IV = 1.8% (11) | | IIIC = 4.59% (28) |
| | | | IV = 1.97% (12) |
| Neoadjuvant chemotherapy | 7.7% (47) | Pathologic response after neoadjuvant chemotherapy | None 1.48% (9) |
| | | | Partial 4.75% (29) |
| | | | Complete 1.31% (8) |
| Tumor size (pathologic size of the major nodule) (cm) | 2.016 ± 1.466 | Focality | Unifocal 80.66% (492) |
| | | | Multifocal 9.02% (55) |
| | | | Multicentric 8.85% (54) |
| Histological type | DCI NST 77.7% (474) | Type of surgery | Breast‐conserving 61.64% (376) |
| | DCI special type 7.7% (47) | | Mastectomy 38.36% (234) |
| | LCI 10.33% (63) | | |
| | Mixed types 2.46% (15) | | |
| SLNB | 79.67% (486) | ALND | 35.41% (216) |
| No. of removed LNs | 8.138 ± 8.917 | No. of metastatic LNs | 1.495 ± 3.814 |
| Lymphovascular invasion | 18.36% (112) | Neuroinvasion | 4.92% (30) |
| Extranodal extension | 6.72% (41) | Grade | G1 = 4.75% (29) |
| | | | G2 = 63.11% (385) |
| | | | G3 = 30.82% (188) |
| ER (%) | 85.742 ± 31.865 | PgR (%) | 57.324 ± 39.878 |
| Ki67 (%) | 25.358 ± 17.465 | p53 (%) | 12.995 ± 26.729 |
| HER2 | 12.13% (74) | Adjuvant chemotherapy | 41.48% (253) |
| Adjuvant radiotherapy | 71.64% (437) | Adjuvant hormonal therapy | 83.61% (510) |
| Follow‐up (mos) | 61.302 ± 22.757 | New contralateral breast cancer | 0.32% (2) |
| Loco‐regional recurrence within 32 mo | 2.95% (18) | Systemic recurrence within 32 mo | 4.1% (25) |
| Death from breast cancer within 32 mo | 3.44% (21) | | |
Figure 1. Cases of recurrence and death from breast cancer in 100 mo
Figure 2. Length of the follow‐up
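The percentages in the study population table appear to be counts divided by the 610 included patients. The short sketch below checks this for the three outcomes; the helper name `pct` and the dictionary layout are illustrative assumptions, while the counts are taken from the table.

```python
# Counts come from the study population table; 610 patients were included.
N_INCLUDED = 610


def pct(count: int, total: int = N_INCLUDED) -> float:
    """Return count as a percentage of total, rounded to two decimals."""
    return round(100 * count / total, 2)


# Events observed within 32 months, as reported in the table.
outcomes_within_32_months = {
    "loco-regional recurrence": 18,
    "systemic recurrence": 25,
    "death from breast cancer": 21,
}

for name, count in outcomes_within_32_months.items():
    print(f"{name}: {count}/{N_INCLUDED} = {pct(count)}%")
```

Each outcome affects fewer than 5% of the included patients, which is worth keeping in mind when reading the accuracy figures reported for the models.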
Models’ inputs
| Loco‐regional recurrence | Systemic recurrence | Death from disease |
|---|---|---|
| Tumor size (cm) | Tumor size (cm) | Tumor size (cm) |
| No. of metastatic axillary lymph nodes (0/1/2/3/…) | No. of metastatic axillary lymph nodes (0/1/2/3/…) | No. of metastatic axillary lymph nodes (0/1/2/3/…) |
| Estrogen receptor expression (ER%) | Estrogen receptor expression (ER%) | Estrogen receptor expression (ER%) |
| Grading (G1/G2/G3) | Ki67 expression (%) | Metastatic disease (M = 1/M = 0) |
| Multicentricity | Multicentricity | Grading (G1/G2/G3) |
| Skin or chest wall invasion | Pathologic response after neoadjuvant chemotherapy (complete/partial/none) | Pathologic response after neoadjuvant chemotherapy (complete/partial/none) |
| Adjuvant hormonal therapy | — | — |
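The source lists the model inputs but does not describe how they were encoded in SPSS Modeler. As a purely illustrative sketch, the seven loco-regional inputs could be held in a small record and turned into a numeric vector as follows. The class name, field names, the ordinal coding of grade, and the rescaling of ER% are all assumptions, not the authors' method.

```python
from dataclasses import dataclass


@dataclass
class LocoRegionalInputs:
    """Hypothetical container for the seven loco-regional model inputs."""

    tumor_size_cm: float
    metastatic_axillary_nodes: int
    er_percent: float
    grade: int  # assumed ordinal coding: 1, 2, 3 for G1, G2, G3
    multicentric: bool
    skin_or_chest_wall_invasion: bool
    adjuvant_hormonal_therapy: bool

    def to_vector(self) -> list[float]:
        """Encode the record as a flat list of floats (assumed encoding)."""
        if self.grade not in (1, 2, 3):
            raise ValueError("grade must be 1, 2 or 3")
        return [
            self.tumor_size_cm,
            float(self.metastatic_axillary_nodes),
            self.er_percent / 100.0,  # rescale ER% to the range [0, 1]
            float(self.grade),
            float(self.multicentric),
            float(self.skin_or_chest_wall_invasion),
            float(self.adjuvant_hormonal_therapy),
        ]


# Made-up patient for illustration only; not taken from the study data.
example = LocoRegionalInputs(2.0, 1, 90.0, 2, False, False, True)
print(example.to_vector())
```

Both SVMs and neural networks are sensitive to feature scale, so a real pipeline would normally standardize the continuous inputs as well; that step is omitted here.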
Loco‐regional recurrence
| Models | Accuracy | Sensitivity | Specificity | AUC |
|---|---|---|---|---|
| Artificial Neural Network (ANN) | 96.17% (552/574) | 0.35 (6/17) | 0.98 (546/557) | 0.916 |
| Support Vector Machine (SVM) | 96.86% (556/574) | 0.41 (7/17) | 0.99 (549/557) | 0.896 |
Figure 3. ROC curves for loco‐regional recurrence. Blue line = SVM; red line = Artificial Neural Network. Source: SPSS Modeler.
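The fractions in the loco-regional table are enough to rebuild each confusion matrix and recompute the reported metrics. The sketch below does this; the function name `confusion_metrics` is an assumption, and the false-negative and false-positive counts are derived from the table's denominators (17 events, 557 non-events).

```python
def confusion_metrics(tp: int, fn: int, tn: int, fp: int) -> dict[str, float]:
    """Compute accuracy, sensitivity and specificity from confusion counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# Counts recovered from the table: 17 recurrences, 557 non-recurrences.
ann = confusion_metrics(tp=6, fn=17 - 6, tn=546, fp=557 - 546)
svm = confusion_metrics(tp=7, fn=17 - 7, tn=549, fp=557 - 549)

# Accuracy of a trivial rule that always predicts "no recurrence".
baseline_accuracy = 557 / 574

for name, m in (("ANN", ann), ("SVM", svm)):
    print(name, {k: round(v, 4) for k, v in m.items()})
print("always-negative baseline accuracy:", round(baseline_accuracy, 4))
```

Because only 17 of 574 cases are recurrences, the always-negative baseline reaches about 97.04% accuracy on the same cases. Accuracy alone therefore says little here, and sensitivity and AUC are the more informative columns.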
Systemic recurrence
| Models | Accuracy | Sensitivity | Specificity | AUC |
|---|---|---|---|---|
| Artificial Neural Network (ANN) | 95.29% (546/573) | 0.64 (16/25) | 0.97 (530/548) | 0.914 |
| Support Vector Machine (SVM) | 95.64% (548/573) | 0.56 (14/25) | 0.97 (534/548) | 0.903 |
Figure 4. ROC curves for systemic recurrence. Blue line = SVM; red line = Artificial Neural Network. Source: SPSS Modeler.
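The source does not report predictive values, but they follow from the same counts. As a hedged illustration, the sketch below derives the positive and negative predictive values for the systemic recurrence models; the function name `predictive_values` is an assumption, and the results apply only to this cohort's event rate.

```python
def predictive_values(tp: int, fn: int, tn: int, fp: int) -> tuple[float, float]:
    """Return (PPV, NPV) computed from confusion counts."""
    ppv = tp / (tp + fp)  # share of predicted recurrences that were real
    npv = tn / (tn + fn)  # share of predicted non-recurrences that were real
    return ppv, npv


# Counts recovered from the table: 25 recurrences, 548 non-recurrences.
ann_ppv, ann_npv = predictive_values(tp=16, fn=25 - 16, tn=530, fp=548 - 530)
svm_ppv, svm_npv = predictive_values(tp=14, fn=25 - 14, tn=534, fp=548 - 534)

print(f"ANN: PPV = {ann_ppv:.2f}, NPV = {ann_npv:.2f}")
print(f"SVM: PPV = {svm_ppv:.2f}, NPV = {svm_npv:.2f}")
```

On these counts, roughly half of the flagged patients actually had a systemic recurrence, while a negative prediction was correct in about 98% of cases.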
Death from breast cancer
| Models | Accuracy | Sensitivity | Specificity | AUC |
|---|---|---|---|---|
| Artificial Neural Network (ANN) | 96.40% (563/584) | 0.48 (10/21) | 0.98 (553/563) | 0.804 |
| Support Vector Machine (SVM) | 95.72% (559/584) | 0.48 (10/21) | 0.98 (549/563) | 0.849 |
Figure 5. ROC curves for death from breast cancer. Blue line = SVM; red line = Artificial Neural Network. Source: SPSS Modeler.
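In the table above, both models have the same sensitivity (10/21) but different AUCs (0.804 and 0.849). This is possible because sensitivity describes one decision threshold, whereas AUC summarizes how well the model ranks cases across all thresholds: it equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The sketch below computes AUC this way; the function name and the scores are made up for illustration and are not study data.

```python
def auc_from_scores(pos_scores: list[float], neg_scores: list[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical model scores (higher means "more likely to have the event").
pos = [0.9, 0.6, 0.4]       # cases that had the event
neg = [0.7, 0.3, 0.2, 0.1]  # cases that did not

print(round(auc_from_scores(pos, neg), 4))
```

The pairwise loop is quadratic in the number of cases, which is fine for a cohort of a few hundred patients; rank-based formulas give the same value more efficiently on larger data.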