Shi-Jer Lou, Ming-Feng Hou, Hong-Tai Chang, Hao-Hsien Lee, Chong-Chi Chiu, Shu-Chuan Jennifer Yeh, Hon-Yi Shi.
Abstract
Machine learning algorithms have proven effective for predicting survival after surgery, but their use for predicting 10-year survival after breast cancer surgery has not yet been discussed. This study compares the accuracy of five models in predicting 10-year survival after breast cancer surgery: a deep neural network (DNN), K nearest neighbor (KNN), support vector machine (SVM), naive Bayes classifier (NBC), and Cox regression (COX). It also aims to optimize the weighting of significant predictors. The subjects were breast cancer patients who had received breast cancer surgery (ICD-9-CM 174-174.9) at one of three medical centers in southern Taiwan during the 3-year period from June 2007 to June 2010. The registry data for the patients were randomly allocated to three datasets: one for training (n = 824), one for testing (n = 177), and one for validation (n = 177). Prediction performance comparisons revealed that all performance indices for the DNN model were significantly (p < 0.001) higher than those of the other forecasting models. Notably, the best predictor of 10-year survival after breast cancer surgery was the preoperative Physical Component Summary score on the SF-36. The next best predictors were the preoperative Mental Component Summary score on the SF-36, postoperative recurrence, and tumor stage. The deep-learning DNN model is the most clinically useful method to predict and to identify risk factors for 10-year survival after breast cancer surgery. Future research should explore designs for two-level or multi-level models that provide information on the contextual effects of the risk factors on breast cancer survival.
Keywords: 10-year survival; breast cancer surgery; deep neural network; machine learning; performance
Year: 2021 PMID: 35053045 PMCID: PMC8773427 DOI: 10.3390/biology11010047
Source DB: PubMed Journal: Biology (Basel) ISSN: 2079-7737
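The random allocation described in the abstract (824 training, 177 testing, and 177 validation records out of 1178) can be illustrated with a minimal sketch. The function name `split_registry`, the use of integer record IDs, and the fixed seed are assumptions made for illustration; the abstract does not say how the authors performed the randomization.

```python
import random


def split_registry(record_ids, n_train=824, n_test=177, n_valid=177, seed=0):
    """Randomly allocate records to training, testing, and validation sets.

    The default sizes are the ones reported in the abstract; the seed is a
    hypothetical choice used only to make the sketch reproducible.
    """
    ids = list(record_ids)
    if len(ids) != n_train + n_test + n_valid:
        raise ValueError("split sizes must sum to the number of records")
    rng = random.Random(seed)
    rng.shuffle(ids)  # shuffle once, then cut into three disjoint slices
    train = ids[:n_train]
    test = ids[n_train:n_train + n_test]
    valid = ids[n_train + n_test:]
    return train, test, valid


if __name__ == "__main__":
    train, test, valid = split_registry(range(1178))
    print(len(train), len(test), len(valid))
```

The three sizes correspond to roughly 70%, 15%, and 15% of the 1178 patients.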
Related work summary.
| Authors (Years) | Study Sample (Data) | Forecasting Models |
|---|---|---|
| Moncada-Torres | 36,658 non-metastatic breast cancer patients from the Netherlands Cancer Registry (NCR) dataset | Random survival forests (RF), survival support vector machines (SVM), extreme gradient boosting (XGBoost), and Cox proportional hazards (CPH) |
| Kuruc et al. | RNA-seq data from the Cancer Genome Atlas (TCGA) | Deep neural networks (DNN), Cox proportional hazards (CPH) |
| Wang et al. | 1137 patients with IB-IIA stage non-small cell lung cancer (China) and compared generalization performance on the Surveillance, Epidemiology, and End Results Program (SEER) dataset | Deep neural networks (DNN), Cox proportional hazards (CPH), SurvNet |
| Bhambhvani | 277 patients with genitourinary rhabdomyosarcoma from the Surveillance, Epidemiology, and End Results Program (SEER) dataset | Deep neural networks (DNN), Cox proportional hazards (CPH) |
| Hou et al. | 7127 breast cancer cases and 7127 matched healthy controls (China) | Extreme gradient boosting (XGBoost), random forest (RF), deep neural network (DNN), logistic regression (LR) |
Figure 1. Flowchart of the study procedure.
Parameters for classification models.
| Parameters | Deep Neural Networks |
|---|---|
| No. of hidden layers | 4 |
| No. of neurons in each hidden layer | (64, 64, 128, 256) |
| Activation functions in each layer | Rectified linear unit (ReLU) in hidden layers and sigmoid on output layer |
| Loss function | Binary cross entropy |
| Optimizer | Adaptive moments estimation (Adam) with 0.001 learning rate |
| No. of Epochs | 100 |
| Dropout layers for regularization | 20% dropout layer after second hidden layer and 10% after third hidden layer |
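To make the parameter table concrete, the following is a minimal sketch in plain Python of the building blocks it names: the layer widths, the ReLU and sigmoid activations, and the binary cross-entropy loss. It is not the authors' implementation. The helper names and the `n_features` argument are assumptions, since the number of input variables fed to the network is not stated in the table.

```python
import math

# Values copied from the parameter table.
HIDDEN_UNITS = (64, 64, 128, 256)
DROPOUT_AFTER_LAYER = {2: 0.20, 3: 0.10}  # 1-based hidden layer -> dropout rate
LEARNING_RATE = 0.001                     # Adam learning rate
EPOCHS = 100


def relu(z):
    """Rectified linear unit, used in the hidden layers."""
    return max(0.0, z)


def sigmoid(z):
    """Numerically stable logistic function, used on the output layer."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def binary_cross_entropy(y_true, p_pred, eps=1e-12):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)


def count_trainable_parameters(n_features, hidden=HIDDEN_UNITS):
    """Weights plus biases of a fully connected net with one output unit.

    n_features is a hypothetical input width, not a value from the source.
    """
    sizes = (n_features, *hidden, 1)
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))
```

With a hypothetical 10 input features, for example, `count_trainable_parameters(10)` gives 46465 weights and biases. Dropout adds no trainable parameters, so the rates above do not affect that count.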
Characteristics of patients who had received breast cancer surgery at selected institutions (N = 1178).
| Variables | N (%) | Mean ± SD |
|---|---|---|
| Age, years | | 52.2 ± 11.1 |
| Education, years | | 10.2 ± 3.8 |
| Current residence with family member(s) | 1127 (95.7%) | |
| Married | 1038 (88.1%) | |
| Body mass index, kg/m² | | 24.5 ± 4.6 |
| Charlson Comorbidity Index, score | | 1.0 ± 1.4 |
| Tumor size, cm | | 2.4 ± 1.8 |
| Tumor stage | | |
| 0 | 80 (6.8%) | |
| I | 354 (30.1%) | |
| II | 441 (37.4%) | |
| III | 303 (25.7%) | |
| Smoker | 55 (4.7%) | |
| Drinker | 29 (2.5%) | |
| Breast cancer history | 150 (12.7%) | |
| Surgery | | |
| BCS | 154 (13.1%) | |
| MRM | 297 (25.2%) | |
| Mastectomy with reconstruction | 727 (61.7%) | |
| ASA score | | 2.0 ± 0.4 |
| Chemotherapy | 788 (66.9%) | |
| Radiotherapy | 675 (57.3%) | |
| Hormonal therapy | 717 (60.9%) | |
| Postoperative length of stay, days | | 2.9 ± 4.7 |
| Readmission in 30 days | 283 (24.0%) | |
| Recurrence | 219 (18.6%) | |
| Survival | 881 (74.8%) | |
| Reconstruction | 125 (10.6%) | |
| Preoperative SF36 PCS score | | 56.0 ± 7.6 |
| Preoperative SF36 MCS score | | 48.8 ± 16.2 |
Abbreviations: BCS, breast conserving surgery; MRM, modified radical mastectomy; ASA, American Society of Anesthesiologists; PCS, physical component summary; MCS, mental component summary.
Univariate Cox regression analysis of associations between demographic/clinical characteristics of breast cancer patients and survival 10 years after surgery (N = 1178).
| Variables | HR | p value |
|---|---|---|
| Age, years | 0.98 | <0.001 |
| Education, years | 0.90 | <0.001 |
| Current residence with family member(s) (no vs. yes) | 0.33 | <0.001 |
| Marital status (unmarried vs. married) | 0.57 | <0.001 |
| Body mass index, kg/m² | 0.96 | <0.001 |
| Charlson Comorbidity Index, score | 0.81 | 0.001 |
| Tumor size, cm | 0.83 | <0.001 |
| Tumor stage | ||
| I vs. 0 | 0.04 | 0.001 |
| II vs. 0 | 0.17 | <0.001 |
| ≥III vs. 0 | 0.22 | <0.001 |
| Smoker (no vs. yes) | 1.36 | 0.043 |
| Drinker (no vs. yes) | 2.09 | 0.037 |
| Breast cancer history (no vs. yes) | 2.70 | 0.001 |
| Surgery type | ||
| MRM vs. BCS | 0.49 | 0.001 |
| Mastectomy with reconstruction vs. BCS | 0.35 | <0.001 |
| ASA score | 0.35 | <0.001 |
| Chemotherapy (no vs. yes) | 0.46 | <0.001 |
| Radiotherapy (no vs. yes) | 0.39 | <0.001 |
| Hormonal therapy (no vs. yes) | 0.29 | <0.001 |
| Postoperative length of stay, days | 0.71 | <0.001 |
| Readmission in 30 days (no vs. yes) | 3.26 | <0.001 |
| Recurrence (no vs. yes) | 2.17 | 0.002 |
| Postoperative reconstruction (no vs. yes) | 0.39 | 0.005 |
| Preoperative SF36 PCS score | 1.02 | <0.001 |
| Preoperative SF36 MCS score | 1.03 | <0.001 |
Abbreviations: HR, hazard ratio; BCS, breast conserving surgery; MRM, modified radical mastectomy; ASA, American Society of Anesthesiologists; PCS, physical component summary; MCS, mental component summary.
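For the continuous predictors in the table above, the reported hazard ratio applies to a one-unit change. Under the usual Cox assumption that the log-hazard is linear in the predictor, a per-unit ratio can be rescaled to a larger change by exponentiation. The sketch below only rescales reported estimates and does not reinterpret their direction; the function name `hazard_ratio_for_change` is hypothetical.

```python
import math


def hazard_ratio_for_change(hr_per_unit, delta):
    """Hazard ratio for a `delta`-unit change in a continuous predictor.

    Assumes a log-linear effect: HR(delta) = exp(delta * ln(HR per unit)).
    """
    if hr_per_unit <= 0:
        raise ValueError("a hazard ratio must be positive")
    return math.exp(delta * math.log(hr_per_unit))


if __name__ == "__main__":
    # Per-unit estimates reported for the SF-36 summary scores.
    print(round(hazard_ratio_for_change(1.02, 10), 3))  # PCS, 10-point change
    print(round(hazard_ratio_for_change(1.03, 10), 3))  # MCS, 10-point change
```

A per-point ratio of 1.02 thus corresponds to roughly 1.22 over a 10-point difference, and 1.03 to roughly 1.34, which is why small per-unit ratios on a wide scale can still matter.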
Demographic and clinical characteristics of breast cancer surgery patients in training dataset versus testing dataset.
| Variables | Training Dataset (n = 824) | Testing Dataset (n = 177) | p value |
|---|---|---|---|
| Age, years | 52.7 ± 10.6 | 52.5 ± 13.1 | 0.148 |
| Education, years | 10.1 ± 3.8 | 10.6 ± 4.0 | 0.174 |
| Current residence with family member(s) | 787 (95.5%) | 168 (94.9%) | 0.900 |
| Married | 722 (87.6%) | 159 (89.8%) | 0.598 |
| Body mass index, kg/m² | 24.7 ± 5.1 | 24.0 ± 3.8 | 0.481 |
| Charlson Comorbidity Index, score | 1.0 ± 1.4 | 1.0 ± 1.3 | 0.570 |
| Tumor size, cm | 2.4 ± 1.9 | 2.4 ± 1.4 | 0.344 |
| Tumor stage | 0.690 | ||
| 0 | 67 (8.1%) | 7 (3.9%) | |
| I | 251 (30.5%) | 57 (32.5%) | |
| II | 305 (37.0%) | 67 (37.7%) | |
| ≥III | 201 (24.4%) | 46 (25.9%) | |
| Smoker | 35 (4.2%) | 12 (6.5%) | 0.425 |
| Drinker | 18 (2.2%) | 5 (2.6%) | 0.711 |
| Breast cancer history | 74 (9.0%) | 18 (10.4%) | 0.755 |
| Surgery type | 0.492 | ||
| BCS | 118 (14.3%) | 21 (11.7%) | |
| MRM | 199 (24.1%) | 55 (31.2%) | |
| Mastectomy with reconstruction | 507 (61.6%) | 101 (57.1%) | |
| ASA score | 2.0 ± 0.4 | 2.0 ± 0.3 | 0.676 |
| Chemotherapy | 550 (66.7%) | 124 (70.1%) | 0.572 |
| Radiotherapy | 473 (57.4%) | 108 (61.0%) | 0.565 |
| Hormonal therapy | 496 (60.2%) | 113 (63.6%) | 0.582 |
| Postoperative length of stay, days | 2.7 ± 1.9 | 2.9 ± 1.5 | 0.711 |
| Readmission in 30 days | 185 (22.4%) | 48 (27.2%) | 0.357 |
| Recurrence | 138 (16.8%) | 48 (27.2%) | 0.067 |
| Postoperative reconstruction | 74 (9.0%) | 18 (10.4%) | 0.564 |
| Preoperative SF36 PCS score | 56.0 ± 7.6 | 54.1 ± 6.6 | 0.758 |
| Preoperative SF36 MCS score | 48.4 ± 18.5 | 49.6 ± 4.2 | 0.863 |
Abbreviations: BCS, breast conserving surgery; MRM, modified radical mastectomy; ASA, American Society of Anesthesiologists; PCS, physical component summary; MCS, mental component summary.
Figure 2. Machine learning model comparison in terms of performance indices with 95% confidence intervals for predicting 10-year survival after breast cancer surgery. (A) Dataset for training. All p values < 0.001. (B) Dataset for testing. All p values < 0.001. (C) Dataset for validating. All p values < 0.001. Abbreviations: DNN, deep neural networks; KNN, k-nearest neighbor; SVM, support vector machine; NBC, naïve Bayesian classifier; PPV, positive predictive value; NPV, negative predictive value; AUROC, area under the receiver-operating characteristic curve.
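The indices abbreviated in the Figure 2 caption (PPV, NPV, AUROC) are standard binary-classification measures. The sketch below shows how they, together with sensitivity, specificity, and accuracy, can be computed from labels and predicted scores. The figure's full set of indices is not recoverable from the caption, so the choice of measures here, the function names, and the toy data are assumptions.

```python
def confusion_metrics(y_true, y_pred):
    """Sensitivity, specificity, PPV, NPV, and accuracy for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        "accuracy": ratio(tp + tn, len(pairs)),
    }


def auroc(y_true, scores):
    """AUROC as the probability that a positive case outscores a negative one.

    Ties count as half a win (the Mann-Whitney formulation).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data, not taken from the study.
    y_true = [1, 1, 1, 0, 0]
    scores = [0.9, 0.8, 0.4, 0.5, 0.2]
    y_pred = [1 if s >= 0.5 else 0 for s in scores]
    print(confusion_metrics(y_true, y_pred))
    print(auroc(y_true, scores))
```

The pairwise AUROC is quadratic in the sample size, which is acceptable for datasets of a few hundred patients such as the testing and validation sets here.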
Figure 3. Global sensitivity analysis of deep neural network model in predicting 10-year survival after breast cancer surgery (N = 824). Abbreviations: VSR, variable sensitivity ratio; ASA, American Society of Anesthesiologists; PCS, physical component summary; MCS, mental component summary.
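The Figure 3 caption names a variable sensitivity ratio (VSR) but does not define it. One common formulation, assumed here and not confirmed by this text, is the ratio of the model error when a given input is made uninformative to the error with all inputs available, so that larger ratios mark more influential predictors. The sketch below follows that assumed definition, replacing each input in turn with its mean; the function names and the toy model are hypothetical.

```python
def sum_squared_error(predict, rows, targets):
    """Sum of squared residuals of `predict` over the dataset."""
    return sum((predict(row) - y) ** 2 for row, y in zip(rows, targets))


def variable_sensitivity_ratios(predict, rows, targets):
    """Assumed VSR: error with feature j replaced by its mean / baseline error."""
    baseline = sum_squared_error(predict, rows, targets)
    if baseline == 0:
        raise ValueError("baseline error is zero; the ratio is undefined")
    ratios = []
    for j in range(len(rows[0])):
        mean_j = sum(row[j] for row in rows) / len(rows)
        masked = []
        for row in rows:
            copy = list(row)
            copy[j] = mean_j  # remove the information carried by feature j
            masked.append(copy)
        ratios.append(sum_squared_error(predict, masked, targets) / baseline)
    return ratios


if __name__ == "__main__":
    # Toy model that uses only the first feature; data are illustrative.
    def toy_predict(row):
        return 2.0 * row[0]

    rows = [[0.0, 5.0], [1.0, 3.0], [2.0, 4.0], [3.0, 1.0]]
    targets = [0.1, 1.9, 4.2, 5.8]
    print(variable_sensitivity_ratios(toy_predict, rows, targets))
```

In the toy case the first feature yields a ratio far above 1 while the unused second feature yields exactly 1, which is the pattern a ranking such as the one in Figure 3 would rely on under this definition.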
Mortality risk factors after breast cancer surgery: factors reported in selected studies.
| Authors (Country) | No. of Subjects | Measures | Findings |
|---|---|---|---|
| Chiu et al., 2019 | 369 patients with hepatocellular carcinoma | Functional Assessment of Cancer Therapy-Hepatobiliary (FACT-Hep) and the SF-36 | 1. Overall postoperative survival was significantly associated with preoperative SF-36 physical component summary score (hazard ratio, HR = 1.05, |
| Quinten et al., 2014 | 7417 cancer patients across 11 cancer sites, pooled from 30 EORTC randomized controlled trials | European Organisation for Research and Treatment of Cancer 30-Item Core Quality of Life Questionnaire (EORTC-QLQ-C30) | Overall postoperative survival was significantly associated with preoperative EORTC-QLQ-C30 physical functioning (HR = 0.86, |
| Heijl et al., 2010 | 220 patients with potentially curable esophageal adenocarcinoma | Medical Outcome Study Short Form-20 (SF-20) and | 1. Overall postoperative survival was significantly associated with preoperative SF-20 physical symptom scale (HR = 0.67, |
| Chen et al., 2020 | 149 patients with gastric cancer | | Overall postoperative survival was significantly associated with early recurrence during the study period ( |
| Huh et al., 2013 | 1159 patients with colorectal cancer | | Overall postoperative survival was significantly associated with early postoperative recurrence (HR = 2.42, |
| Knight et al. | 15,958 patients with colorectal and gastric cancer from 428 hospitals in 82 countries | | Overall postoperative survival was significantly associated with cancer stage (odds ratio = 1.80, |
| Chou et al., 2016 | 8425 patients over 70 years old with solid cancer | | 3-month postoperative survival was significantly associated with tumor stage (II, III, IV vs. I) (HR = 1.66~4.23, |