Yanfeng Wang1, Wenhao Zhang1, Junwei Sun1, Lidong Wang2, Xin Song2, Xueke Zhao2.
Abstract
Esophageal squamous cell carcinoma (ESCC) is among the cancers with the highest incidence and mortality worldwide. An effective survival prediction model can help improve patients' quality of survival. In this study, ten indicators related to the survival of patients with ESCC are identified using genetic algorithm feature selection. A prognostic index (PI) for ESCC is established using binary logistic regression. The PI is divided into four stages, each of which reasonably reflects the survival status of different patients. The critical age threshold is determined from the ROC curve, and patients are divided into a high-age group and a low-age group. With the PI and the ten survival-related indicators as independent variables, a survival prediction model for patients with ESCC is established based on bald eagle search (BES) and the least-squares support vector machine (LSSVM). The results show that the five-year survival of patients is well predicted by the bald eagle search-least-squares support vector machine (BES-LSSVM). BES-LSSVM achieves higher prediction accuracy than the existing particle swarm optimization-least-squares support vector machine (PSO-LSSVM), grasshopper optimization algorithm-least-squares support vector machine (GOA-LSSVM), differential evolution-least-squares support vector machine (DE-LSSVM), sparrow search algorithm-least-squares support vector machine (SSA-LSSVM), bald eagle search-back propagation neural network (BES-BPNN), and bald eagle search-extreme learning machine (BES-ELM).
Year: 2022 PMID: 35845893 PMCID: PMC9279059 DOI: 10.1155/2022/3895590
Source DB: PubMed Journal: Comput Intell Neurosci
Population proportion information of the dataset.
| Project | Category | Number of population | Percentage of population (%) |
|---|---|---|---|
| Genders | Male | 222 | 61.7 |
| | Female | 138 | 38.3 |
| Ages | ≤61.5 | 230 | 63.9 |
| | >61.5 | 130 | 36.1 |
| T stages | T1 | 54 | 15 |
| | T2 | 99 | 27.5 |
| | T3 | 205 | 56.9 |
| | T4 | 2 | 0.1 |
| N stages | N0 | 191 | 53.1 |
| | N1 | 103 | 28.6 |
| | N2 | 48 | 13.3 |
| | N3 | 18 | 5 |
| TNM stages | I | 47 | 13.1 |
| | II | 156 | 43.3 |
| | III | 137 | 38.1 |
| | IV | 20 | 5.6 |
Basic information about seventeen blood indicators.
| Variable | Mean | Median (range) | Variance | Standard deviation |
|---|---|---|---|---|
| WBC | 6.633 | 6.2 (2.5–13.6) | 4.427 | 2.104 |
| LYMPH | 1.869 | 1.9 (0–4) | 0.401 | 0.633 |
| GLOB | 29.306 | 29 (17–45) | 27.160 | 5.212 |
| PT | 10.327 | 10.3 (7–16.6) | 2.690 | 1.640 |
| ALB | 42.011 | 42 (24–56) | 27.259 | 5.212 |
| RBC | 4.430 | 4.45 (2.6–6.04) | 0.234 | 0.483 |
| TT | 15.304 | 15.5 (1.3–21.3) | 3.583 | 1.893 |
| BASO | 0.042 | 0 (0–1) | 0.007 | 0.082 |
| EO | 0.137 | 0.1 (0–3) | 0.044 | 0.209 |
| INR | 0.795 | 0.79 (0.45–1.64) | 0.033 | 0.181 |
| NEUT | 4.033 | 3.7 (0.3–17) | 3.491 | 1.868 |
| TP | 71.428 | 71 (50–92) | 53.064 | 7.285 |
| MONO | 0.405 | 0.4 (0–1.3) | 0.069 | 0.263 |
| FIB | 379.431 | 367.85 (189.5–774.43) | 924.038 | 30.398 |
| HGB | 138.311 | 139 (63–189) | 218.705 | 14.789 |
| PLT | 239.781 | 232.5 (51–576) | 52.606 | 7.253 |
| APTT | 36.112 | 35.25 (15.4–78.5) | 60.110 | 7.753 |
WBC, LYMPH, GLOB, ALB, RBC, BASO, EO, NEUT, TP, HGB, and PLT are reported in g/L. PT, TT, and APTT are reported in seconds. FIB is reported in mg/L.
Figure 1. Principle of crossover operator.
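As a rough illustration of what a crossover operator does in genetic algorithm feature selection, the sketch below applies single-point crossover to two binary feature masks. The source does not state which crossover variant is used, so single-point crossover, the function name `single_point_crossover`, and the example masks are all assumptions made for illustration.

```python
import random


def single_point_crossover(parent_a, parent_b, rng=random):
    """Swap the tails of two equal-length binary masks at a random cut point.

    Assumption: single-point crossover; the source does not name the variant.
    """
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same length")
    if len(parent_a) < 2:
        return parent_a[:], parent_b[:]
    # Cut strictly inside the chromosome so both parents contribute genes.
    cut = rng.randint(1, len(parent_a) - 1)
    child_a = parent_a[:cut] + parent_b[cut:]
    child_b = parent_b[:cut] + parent_a[cut:]
    return child_a, child_b


# Hypothetical masks over seventeen candidate indicators (1 = selected).
rng = random.Random(0)
parent_a = [1] * 17
parent_b = [0] * 17
child_a, child_b = single_point_crossover(parent_a, parent_b, rng)
print(child_a)
print(child_b)
```

Each child inherits the head of one parent and the tail of the other, so the pair of children together still carries every gene of both parents.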
Figure 2. Survival function at the mean of the covariates. Survival time in years is the time variable, and the ten indicators obtained from genetic algorithm feature selection are used as covariates.
Figure 3. ROC analysis of PI and TNM. (a) ROC analysis of PI. (b) Comparative analysis of ROC for PI and TNM. The horizontal axis is “1-specificity,” and the vertical axis is “sensitivity.” A larger area under the curve indicates stronger discriminative ability.
Results of ROC analysis for PI and TNM.
| Project | Sensitivity | Specificity | AUC | Significance level |
|---|---|---|---|---|
| PI | 0.796 | 0.440 | | < |
| TNM | 0.515 | 0.679 | 0.639 | <0.0001 |
Results of ROC curve analysis for PI critical threshold.
| Project | ROC for all PI samples | ROC for low PI samples | ROC for high PI samples |
|---|---|---|---|
| Area under the ROC curve (AUC) | | | |
| Standard error | 0.030 | 0.056 | 0.033 |
| 95% confidence interval | 0.600 to 0.719 | 0.505 to 0.723 | 0.511 to 0.642 |
| Significance level | < | | |
| Youden index | 0.237 | 0.284 | 0.158 |
| Associated criterion | | | |
| Sensitivity | 0.796 | 0.742 | 0.309 |
| Specificity | 0.440 | 0.542 | 0.848 |
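The Youden index in the table above is conventionally defined as J = sensitivity + specificity - 1, and the associated criterion is the cutoff at which J is largest. The sketch below recomputes J from the tabulated sensitivity and specificity, then picks a cutoff on a small toy dataset. The helper names `youden_index` and `best_cutoff`, the toy scores and labels, and the convention that a score above the cutoff counts as positive are all assumptions made for illustration, not details taken from the source.

```python
def youden_index(sensitivity, specificity):
    """Youden's J statistic: J = sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0


def best_cutoff(scores, labels):
    """Return (cutoff, J) maximizing J, treating score > cutoff as positive."""
    positives = sum(labels)
    negatives = len(labels) - positives
    best = (None, -1.0)
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, l in zip(scores, labels) if s > cutoff and l == 1)
        tn = sum(1 for s, l in zip(scores, labels) if s <= cutoff and l == 0)
        j = youden_index(tp / positives, tn / negatives)
        if j > best[1]:
            best = (cutoff, j)
    return best


# Sensitivity and specificity as reported in the table above.
reported = {
    "all PI samples": (0.796, 0.440),
    "low PI samples": (0.742, 0.542),
    "high PI samples": (0.309, 0.848),
}
for name, (sens, spec) in reported.items():
    print(f"{name}: J = {youden_index(sens, spec):.3f}")

# Hypothetical toy data, not from the study.
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
toy_labels = [0, 0, 0, 1, 1, 0, 1, 1]
print(best_cutoff(toy_scores, toy_labels))
```

The recomputed values (0.236, 0.284, 0.157) agree with the tabulated Youden indices to within 0.001, which is consistent with the reported sensitivity and specificity having been rounded to three decimals.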
Figure 4. ROC analysis for dividing PI staging. (a) ROC for high PI samples. (b) ROC for low PI samples. The horizontal axis is “1-specificity,” and the vertical axis is “sensitivity.” A larger area under the curve indicates stronger discriminative ability.
Figure 5. Kaplan–Meier survival analysis of PI stages.
Figure 6. ROC analysis of age.
Figure 7. Kaplan–Meier survival analysis of age.
Figure 8. Framework of the overall implementation of the survival prediction model for patients with ESCC.
Comparison of different algorithms for predicting five-year survival of patients with esophageal squamous cell carcinoma.
| Group | Algorithm | 10-fold cross-validation accuracy (%) | Sensitivity (%) | Specificity (%) | Running time (s) |
|---|---|---|---|---|---|
| High-age group | BES-LSSVM | 86.538 | 88.032 | 86.437 | 1.661 |
| | GOA-LSSVM | 85.769 | 86.971 | 85.101 | 3.464 |
| | DE-LSSVM | 85.384 | 86.626 | 84.668 | 8.123 |
| | PSO-LSSVM | 84.615 | 85.397 | 83.537 | 3.641 |
| | SSA-LSSVM | 86.154 | 87.329 | 85.553 | 2.875 |
| | BES-BPNN | 83.902 | 85.673 | 83.393 | 10.615 |
| | BES-ELM | 83.477 | 84.419 | 82.907 | 6.171 |
| Low-age group | BES-LSSVM | 86.495 | 88.327 | 85.991 | 1.846 |
| | GOA-LSSVM | 85.435 | 87.229 | 84.915 | 4.254 |
| | DE-LSSVM | 85.217 | 86.802 | 84.474 | 9.950 |
| | PSO-LSSVM | 84.782 | 86.595 | 84.245 | 3.846 |
| | SSA-LSSVM | 85.843 | 87.675 | 85.338 | 3.412 |
| | BES-BPNN | 83.479 | 85.271 | 82.959 | 11.743 |
| | BES-ELM | 83.913 | 85.706 | 83.393 | 7.036 |
Optimal LSSVM model parameters under different optimization algorithms.
| Algorithm | Penalty factor (high-age group) | Kernel function parameter (high-age group) | Penalty factor (low-age group) | Kernel function parameter (low-age group) |
|---|---|---|---|---|
| BES-LSSVM | 77.946 | 2.090 | 60.290 | 2.493 |
| GOA-LSSVM | 54.429 | 1.225 | 22.895 | 0.106 |
| DE-LSSVM | 66.155 | 1.044 | 50.816 | 0.735 |
| PSO-LSSVM | 61.902 | 1.086 | 46.111 | 0.459 |
| SSA-LSSVM | 77.217 | 10.192 | 75.204 | 5.991 |
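To show where a penalty factor and a kernel function parameter enter an LSSVM classifier, the following minimal sketch trains the standard LSSVM by solving its linear system in pure Python. It is not the authors' implementation. The RBF kernel form exp(-||a-b||²/(2σ²)), the function names, and the toy data are assumptions, and the BES search over the two parameters is not shown. The values 77.946 and 2.090 are taken from the BES-LSSVM high-age row purely for illustration.

```python
import math


def rbf_kernel(a, b, sigma):
    """Gaussian kernel; sigma stands in for the kernel function parameter (assumed form)."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def solve_linear_system(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def lssvm_fit(X, y, penalty, sigma):
    """Solve [[0, y^T], [y, Omega + I/penalty]] [b, alpha]^T = [0, 1, ..., 1]^T."""
    n = len(X)
    A = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(n):
        A[0][i + 1] = y[i]
        A[i + 1][0] = y[i]
        for j in range(n):
            A[i + 1][j + 1] = y[i] * y[j] * rbf_kernel(X[i], X[j], sigma)
        A[i + 1][i + 1] += 1.0 / penalty  # the penalty factor regularizes the diagonal
    solution = solve_linear_system(A, [0.0] + [1.0] * n)
    return solution[0], solution[1:]  # bias b, multipliers alpha


def lssvm_predict(x, X, y, b, alpha, sigma):
    """Classify x by the sign of the kernel expansion plus bias."""
    score = b + sum(a * yi * rbf_kernel(x, xi, sigma)
                    for a, yi, xi in zip(alpha, y, X))
    return 1 if score >= 0 else -1


# Hypothetical toy data: two well-separated classes labeled -1 and +1.
X_train = [[0.0, 0.0], [0.0, 1.0], [3.0, 0.0], [3.0, 1.0]]
y_train = [-1, -1, 1, 1]
b, alpha = lssvm_fit(X_train, y_train, penalty=77.946, sigma=2.090)
print(lssvm_predict([0.2, 0.5], X_train, y_train, b, alpha, 2.090))
print(lssvm_predict([2.8, 0.5], X_train, y_train, b, alpha, 2.090))
```

Because LSSVM training reduces to a single linear solve, an outer optimizer only needs to propose a (penalty, sigma) pair and score the resulting classifier, for example by cross-validation accuracy.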
Comparison of the results of different algorithms.
| Algorithm | 10-fold cross-validation accuracy (%) | Sensitivity (%) | Specificity (%) | Running time (s) |
|---|---|---|---|---|
| BES-LSSVM | 97.01 | 98.19 | 95.09 | 2.793 |
| GOA-LSSVM | 96.28 | 97.75 | 93.90 | 7.263 |
| DE-LSSVM | 96.10 | 97.64 | 93.62 | 6.824 |
| PSO-LSSVM | 96.27 | 97.75 | 93.89 | 5.772 |
| SSA-LSSVM | 96.65 | 97.97 | 94.54 | 3.818 |
| BES-BP | 95.26 | 97.11 | 92.34 | 12.749 |
| BES-ELM | 95.61 | 97.33 | 92.88 | 7.837 |