Chengmao Zhou1, Ying Wang2, Mu-Huo Ji2, Jianhua Tong2, Jian-Jun Yang1,2, Hongping Xia1,3.
Abstract
OBJECTIVE: To evaluate how well 5 machine learning algorithms predict peritoneal metastasis of gastric cancer.
Keywords: gastric cancer; machine learning; peritoneal metastasis; predictive modeling
Year: 2020 PMID: 33115287 PMCID: PMC7791448 DOI: 10.1177/1073274820968900
Source DB: PubMed Journal: Cancer Control ISSN: 1073-2748 Impact factor: 3.302
Baseline Data.
| Peritoneal metastasis | NO | YES | P-value |
|---|---|---|---|
| N | 979 | 101 | |
| Age (years) | 63.8 ± 11.2 | 63.6 ± 12.1 | 0.958 |
| Height (cm) | 165.2 ± 8.0 | 164.4 ± 9.1 | 0.584 |
| Weight (kg) | 59.4 ± 10.5 | 57.0 ± 11.0 | 0.035 |
| BMI (kg/m2) | 21.7 ± 3.0 | 21.1 ± 3.2 | 0.037 |
| Tumor size (cm) | 3.9 ± 2.1 | 5.0 ± 2.2 | <0.001 |
| PLR | 148.8 ± 73.7 | 170.3 ± 77.7 | <0.001 |
| NLR | 2.6 ± 1.6 | 2.9 ± 1.6 | 0.011 |
| Preoperative hemoglobin | 118.4 ± 24.0 | 112.4 ± 22.6 | 0.003 |
| Platelet count | 225.3 ± 69.6 | 241.9 ± 72.4 | 0.035 |
| Albumin (g/L) | 39.7 ± 5.2 | 38.6 ± 5.1 | 0.016 |
| Neutrophil count | 4.1 ± 3.9 | 4.0 ± 1.3 | 0.151 |
| Lymphocyte count | 1.8 ± 2.0 | 1.6 ± 0.5 | 0.034 |
| Monocyte count | 0.5 ± 0.4 | 0.4 ± 0.2 | 0.144 |
| WBC count | 6.1 ± 1.6 | 6.2 ± 1.5 | 0.735 |
| Sex | | | 0.893 |
| Male | 760 (77.6%) | 79 (78.2%) | |
| Female | 219 (22.4%) | 22 (21.8%) | |
| ASA | | | 0.905 |
| 1 | 61 (6.2%) | 5 (5.0%) | |
| 2 | 828 (84.6%) | 86 (85.1%) | |
| 3 | 90 (9.2%) | 10 (9.9%) | |
| TNM | | | 0.958 |
| I | 236 (24.1%) | 23 (22.8%) | |
| II | 67 (6.8%) | 8 (7.9%) | |
| III | 513 (52.4%) | 52 (51.5%) | |
| IV | 163 (16.6%) | 18 (17.8%) | |
| Borrmann types | | | 0.041 |
| 1 | 55 (5.6%) | 8 (7.9%) | |
| 2 | 859 (87.7%) | 80 (79.2%) | |
| 3 | 65 (6.6%) | 13 (12.9%) | |
| Pathological type [n (%)] | | | 0.015 |
| Ulcerative | 120 (12.3%) | 21 (20.8%) | |
| Nonulcerative | 859 (87.7%) | 80 (79.2%) | |
| Depth of invasion [n (%)] | | | <0.001 |
| T1/T2 | 341 (34.8%) | 6 (5.9%) | |
| T3/T4 | 638 (65.2%) | 95 (94.1%) |
Note: NLR, neutrophil-to-lymphocyte ratio; PLR, platelet-to-lymphocyte ratio; WBC, white blood cell.
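The two ratios defined in the note are simple quotients of blood counts. The sketch below shows the arithmetic for a single hypothetical patient; the function names and the input values are illustrative assumptions and are not taken from the study, and all counts are assumed to share the same units.

```python
def neutrophil_to_lymphocyte_ratio(neutrophils: float, lymphocytes: float) -> float:
    """NLR = neutrophil count / lymphocyte count (both in the same units)."""
    if lymphocytes <= 0:
        raise ValueError("lymphocyte count must be positive")
    return neutrophils / lymphocytes


def platelet_to_lymphocyte_ratio(platelets: float, lymphocytes: float) -> float:
    """PLR = platelet count / lymphocyte count (both in the same units)."""
    if lymphocytes <= 0:
        raise ValueError("lymphocyte count must be positive")
    return platelets / lymphocytes


# Hypothetical patient values, chosen only to illustrate the calculation.
nlr = neutrophil_to_lymphocyte_ratio(neutrophils=4.0, lymphocytes=2.0)
plr = platelet_to_lymphocyte_ratio(platelets=240.0, lymphocytes=2.0)
print(nlr, plr)
```

Because the ratios are computed per patient, dividing the group means in the table will not in general reproduce the reported mean NLR or PLR.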
Figure 1. Correlation between variables.
Figure 2. Variable importance of the features included in the machine learning algorithms for predicting peritoneal metastasis.
Figure 3. Prediction of peritoneal metastasis by the different machine learning algorithms in the training group (A) and test group (B).
Note: gbm: Light Gradient Boosting Machine.
Prediction Results for the Training and Testing Groups.
| Algorithm | Training Accuracy | Training AUC | Training MSE | Testing Accuracy | Testing AUC | Testing MSE |
|---|---|---|---|---|---|---|
| Logistic | 0.906 | 0.741 | 0.094 | 0.904 | 0.680 | 0.096 |
| DecisionTree | 0.906 | 0.712 | 0.094 | 0.907 | 0.657 | 0.093 |
| forest | 0.906 | 0.796 | 0.094 | 0.907 | 0.696 | 0.093 |
| GradientBoosting | 0.909 | 0.861 | 0.091 | 0.904 | 0.725 | 0.096 |
| gbm | 0.909 | 0.938 | 0.091 | 0.907 | 0.745 | 0.093 |
Note: gbm: Light Gradient Boosting Machine.
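In every row of the table, accuracy and MSE sum to 1, which is consistent with MSE having been computed on hard 0/1 class predictions (this is an inference from the numbers, not something the excerpt states). The minimal sketch below, using only the standard library, defines the three metrics and applies them to a hypothetical classifier that always predicts "no metastasis" for the whole cohort of the baseline table (979 negatives, 101 positives). It does not reproduce the study's models or its actual training and testing partitions.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true 0/1 labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true, y_pred):
    """Mean of squared differences; on 0/1 labels this is the error rate."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def auc(y_true, scores):
    """Probability that a random positive outscores a random negative.

    Ties count as one half. Pairwise comparison, fine for small inputs.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Whole-cohort counts from the baseline table.
y_true = [0] * 979 + [1] * 101
# Hypothetical majority-class classifier: always predicts "no metastasis".
y_pred = [0] * len(y_true)

baseline_accuracy = accuracy(y_true, y_pred)
baseline_mse = mean_squared_error(y_true, y_pred)
baseline_auc = auc(y_true, [0.0] * len(y_true))  # constant scores carry no ranking

print(round(baseline_accuracy, 3), round(baseline_mse, 3), baseline_auc)
```

With roughly 9% of patients in the positive class, a classifier that never predicts metastasis already reaches an accuracy close to the values in the table, so AUC is the more informative column for comparing the five algorithms.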
Functions, Packages, and Tuning Parameters in the Anaconda Software Used for Each Machine Learning Algorithm.
| Algorithm | Classifier | Package | Tuning Parameters |
|---|---|---|---|
| Logistic regression | LogisticRegression | from sklearn.linear_model import LogisticRegression | penalty = "l2", tol = 0.0001, C = 1, intercept_scaling = 1, max_iter = 100 |
| DecisionTree | DecisionTreeClassifier | from sklearn.tree import DecisionTreeClassifier | splitter = "best", max_depth = 2, min_samples_split = 20, min_samples_leaf = 5, min_weight_fraction_leaf = 0.1 |
| forest | RandomForestClassifier | from sklearn.ensemble import RandomForestClassifier | n_estimators = 10, max_depth = 3, min_samples_split = 70, min_samples_leaf = 6, random_state = 41 |
| GradientBoosting | GradientBoostingClassifier | from sklearn.ensemble import GradientBoostingClassifier | learning_rate = 0.06, n_estimators = 50, max_depth = 2, random_state = 41 |
| gbm | lgb.LGBMClassifier | lightgbm 2.2.0 | learning_rate = 0.1, n_estimators = 30, max_depth = 3 |
Note: gbm: Light Gradient Boosting Machine.
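As a rough illustration, the tuning parameters in the table could be passed to the named classifiers as follows. This is a sketch under assumptions: the dictionary name `models` and its keys are hypothetical, any parameter not listed in the table is left at the library default, and the excerpt does not show how the authors actually wired the models together. The LightGBM model is created only if the `lightgbm` package is installed.

```python
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

# Keys follow the labels used in the results table; values use the
# tuning parameters listed above, with all other settings at defaults.
models = {
    "Logistic": LogisticRegression(
        penalty="l2", tol=0.0001, C=1.0, intercept_scaling=1, max_iter=100
    ),
    "DecisionTree": DecisionTreeClassifier(
        splitter="best",
        max_depth=2,
        min_samples_split=20,
        min_samples_leaf=5,
        min_weight_fraction_leaf=0.1,
    ),
    "forest": RandomForestClassifier(
        n_estimators=10,
        max_depth=3,
        min_samples_split=70,
        min_samples_leaf=6,
        random_state=41,
    ),
    "GradientBoosting": GradientBoostingClassifier(
        learning_rate=0.06, n_estimators=50, max_depth=2, random_state=41
    ),
}

# LightGBM is an optional dependency in this sketch.
try:
    from lightgbm import LGBMClassifier

    models["gbm"] = LGBMClassifier(learning_rate=0.1, n_estimators=30, max_depth=3)
except (ImportError, OSError):
    pass

for name, model in models.items():
    print(name, type(model).__name__)
```

Each entry exposes the usual `fit` and `predict_proba` methods, so the same training and evaluation loop can be applied to all of them.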
Basic Characteristics of Training Group and Test Group.
| | Training | Test | P-value |
|---|---|---|---|
| Number | 756 | 324 | |
| Age (years) | 63.9 ± 11.1 | 63.3 ± 11.6 | 0.409 |
| Height (cm) | 165.0 ± 8.2 | 165.5 ± 7.7 | 0.359 |
| Weight (kg) | 58.7 ± 10.4 | 60.1 ± 10.7 | 0.051 |
| BMI (kg/m2) | 21.6 ± 3.0 | 21.9 ± 3.1 | 0.061 |
| Tumor size (cm) | 4.1 ± 2.2 | 3.8 ± 2.0 | 0.048 |
| PLR | 151.0 ± 71.5 | 150.4 ± 80.6 | 0.890 |
| NLR | 2.6 ± 1.5 | 2.6 ± 1.7 | 0.936 |
| Preoperative hemoglobin | 117.5 ± 23.7 | 118.5 ± 24.4 | 0.539 |
| Platelet count | 229.0 ± 70.2 | 221.9 ± 69.4 | 0.128 |
| Albumin (g/L) | 39.5 ± 5.2 | 40.0 ± 5.2 | 0.136 |
| Neutrophil count | 4.0 ± 3.0 | 4.2 ± 5.1 | 0.627 |
| Lymphocyte count | 1.8 ± 1.9 | 1.8 ± 1.9 | 0.865 |
| Monocyte count | 0.4 ± 0.3 | 0.4 ± 0.5 | 0.838 |
| WBC count | 6.2 ± 1.6 | 6.0 ± 1.5 | 0.022 |
| Sex | | | 0.034 |
| Male | 574 (75.9%) | 265 (81.8%) | |
| Female | 182 (24.1%) | 59 (18.2%) | |
| ASA | | | 0.885 |
| 1 | 47 (6.2%) | 19 (5.9%) | |
| 2 | 641 (84.8%) | 273 (84.3%) | |
| 3 | 68 (9.0%) | 32 (9.9%) | |
| TNM | | | 0.314 |
| I | 183 (24.2%) | 76 (23.5%) | |
| II | 56 (7.4%) | 19 (5.9%) | |
| III | 383 (50.7%) | 182 (56.2%) | |
| IV | 134 (17.7%) | 47 (14.5%) | |
| Borrmann types | | | 0.382 |
| 1 | 44 (5.8%) | 19 (5.9%) | |
| 2 | 652 (86.2%) | 287 (88.6%) | |
| 3 | 60 (7.9%) | 18 (5.6%) | |
| Pathological type [n (%)] | | | 0.650 |
| Ulcerative | 101 (13.4%) | 40 (12.3%) | |
| Nonulcerative | 655 (86.6%) | 284 (87.7%) | |
| Depth of invasion [n (%)] | | | 0.401 |
| T1/T2 | 237 (31.3%) | 110 (34.0%) | |
| T3/T4 | 519 (68.7%) | 214 (66.0%) |
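The group sizes in this table (756 and 324) correspond to 70% and 30% of the 1,080 patients. The excerpt does not describe how the partition was made, so the following is only a sketch of one plain random 70/30 split using the standard library; the function name `split_indices` and the seed are hypothetical.

```python
import random


def split_indices(n_samples: int, test_fraction: float, seed: int):
    """Shuffle row indices and split them into (train, test) lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


# 1,080 patients in total (979 without and 101 with peritoneal metastasis).
train_idx, test_idx = split_indices(n_samples=1080, test_fraction=0.3, seed=0)
print(len(train_idx), len(test_idx))
```

A purely random split like this does not balance covariates between the groups, which is why a comparison table such as the one above is a useful check after splitting.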