Yilin Li, Fengjiao Xie, Qin Xiong, Honglin Lei, Peimin Feng.
Abstract
Objective: To evaluate the diagnostic performance of machine learning (ML) in predicting lymph node metastasis (LNM) in patients with gastric cancer (GC) and to identify predictors applicable to the models.
Keywords: Machine learning; gastric cancer; lymph node metastasis; meta-analysis; systematic review
Year: 2022 PMID: 36059703 PMCID: PMC9433672 DOI: 10.3389/fonc.2022.946038
Source DB: PubMed Journal: Front Oncol ISSN: 2234-943X Impact factor: 5.738
Figure 1. The PRISMA flow diagram for study selection.
Figure 2. Distribution of studies by the year of publication.
Characteristics of included studies.
| Study | Country | Study design | Stage | No. patients in the train set | No. patients in the test set | Technique used for feature selection | Types of machine learning | Data source |
|---|---|---|---|---|---|---|---|---|
| Xiao-Peng Zhang (2011) | China | Retro | Early GC | 175 | NA | LR | SVM | Single institution |
| Song Liu (2021) | China | Retro | Stages I-IV | 122 | 41 | LASSO | SVM, LR | Single institution |
| C Jin (2021) | China | Retro | Stages I-IV | 1172 | 527 | LR, RF | DL | Multiple institution |
| Xiaoxiao Wang (2021) | China | Retro | T1-2 | 80 | 79 | LR | LR | Single institution |
| Xiao-Yi Yin (2020) | China | Retro | T1a, T1b | 596 | 227 | LR | LR | Single institution |
| Bang Wool Eom (2016) | Korea | Retro | T1a, T1b | 336 | NA | LR | LR | Single institution |
| Zhixue Zheng (2015) | China | Retro | T1a, T1b | 262 | NA | LR | LR | Single institution |
| Jing Li (2020) | China | Retro | Borrmann I-IV | 136 | 68 | LR | DL | Single institution |
| Zhengbing Wang (2021) | China | Retro | T1a, T1b | 363 | 140 | LR | LR | Single institution |
| HuaKai Tian (2022) | China | Retro | T1a, T1b | 2294 | 227 | LR | GLM, RPART, RF, GBM, SVM, RDA, ANN | Multiple institution |
| Zhixue Zheng (2016) | China | Retro | T1a, T1b | 597 | NA | LR | LR | Single institution |
| Yu Mei (2021) | China | Pros | T1a, T1b | 794 | 418 | LR | LR | Single institution |
| Jing Li (2018) | China | Retro | Borrmann I-IV | 140 | 70 | LR | LR | Single institution |
| Su Mi Kim (2020) | Korea | Pros | T1a, T1b | 10579 | 2100 | LR | LR | Single institution |
| Miaoquan Zhang (2021) | China | Retro | T1a, T1b | 285 | NA | LR | LR | Single institution |
| Yuming Jiang (2019) | China | Retro | Stages I-IV | 312 | 1377 | LR | LR | Multiple institution |
| Qiu-Xia Feng (2019) | China | Retro | Stages I-IV | 326 | 164 | SVM | SVM | Single institution |
| Jianfeng Mu (2019) | China | Retro | T1a, T1b | 746 | 126 | LR | LR | Single institution |
| Shilong Li (2021) | China | Retro | Stages I-IV | 144 | 151 | LR | LR | Single institution |
| Yue Wang (2020) | China | Retro | NA | 197 | 50 | LR | RF | Single institution |
| Chun Guang Guo (2016) | China | Retro | T1a, T1b | 256 | 1273 | LR | LR | Multiple institution |
| Cheng-Mao Zhou (2021) | China | Pros | T1a, T1b | 818 | 351 | GBDT | GBDT, XGB, RF, LR, XGB+LR, RF+LR, GBDT+LR | Single institution |
| Xujie Gao (2021) | China | Retro | T1a, T1b | 308 | 155 | LR | LR | Single institution |
| Xujie Gao (2020) | China | Retro | Stages I-IV | 486 | 240 | LR | LR | Single institution |
| Xu Wang (2021) | China | Retro | T1-4 | 250 | 99 | LR | LR | Single institution |
| Siwei Pan (2021) | China | Retro | T1a, T1b | 1274 | 637 | LR | LR | Multiple institution |
| Wujie Chen (2019) | China | Retro | T2-4 | 71 | 75 | LR | LR | Single institution |
| Bong-Il Song (2020) | Korea | Retro | T1-4 | 377 | 189 | LR | LR | Single institution |
| Chao Huang (2020) | China | Retro | NA | 466 | NA | RF | DT | Single institution |
| Lili Wang (2021) | China | Retro | T2-4 | 340 | 175 | LR | LR | Single institution |
| Seokhwi Kim (2021) | Korea | Retro | T1a | 28 | 108 | LR | Bayesian | Multiple institution |
| Qiufang Liu (2021) | China | Retro | NA | 185 | NA | RF | DL | Single institution |
| Wannian Sui (2021) | China | Retro | T1a, T1b | 1496 | 246 | LR | LR | Multiple institution |
| Dexin Chen (2019) | China | Retro | T1a, T1b | 232 | 143 | LR | LR | Multiple institution |
| Lingwei Meng (2021) | China | Retro | T1-4 | 377 | 162 | LASSO | LR | Multiple institution |
| D Dong (2020) | China | Retro | T2-4 | 225 | 505 | Multivariable linear regression analysis, SVM | DL | Multiple institution |
| Zepang Sun (2021) | China | Pros | T1-4 | 531 | 1087 | LR | LR | Multiple institution |
| Ji-Eun Na (2022) | Korea | Pros | T1a, T1b | 10332 | 4428 | LR, SVM, RF | LR, SVM, RF | Single institution |
| Haixing Zhu (2022) | China | Retro | T1a, T1b | 1878 | 470 | DT, GBM, LR | DT, GBM, LR, ANN, RF, XGBOOST | Multiple institution |
| Elfriede H Bollschweiler (2004) | Germany | Retro | Stages I-IV | 135 | NA | ANN | ANN | Single institution |
| Yinan Zhang (2018) | China | Pros | T1a, T1b | 272 | 81 | LR | LR | Single institution |
ANN, artificial neural network; DL, deep learning; DT, decision tree; GBM, gradient boosting machine; GC, gastric cancer; GLM, generalized linear model; LASSO, Least Absolute Shrinkage and Selection Operator; LR, logistic regression; NA, not available; No., number; Pros, prospective; RDA, regularized dual averaging; Retro, retrospective; RF, random forest; SVM, support vector machine; XGBOOST, extreme gradient boosting.
Figure 3. The 15 most frequently used predictors in 61 prediction models for gastric cancer patients.
Risk of bias and applicability assessment by PROBAST criteria.
| Author | Year | Risk of bias: Participants | Risk of bias: Predictors | Risk of bias: Outcome | Risk of bias: Analysis | Overall applicability rating |
|---|---|---|---|---|---|---|
| Xiaoxiao Wang | 2021 | low | unclear | low | high | high |
| Xiao-Yi Yin | 2020 | low | unclear | low | low | unclear |
| Bang Wool Eom | 2016 | low | unclear | low | high | high |
| Zhixue Zheng | 2015 | low | unclear | low | high | high |
| Zhengbing Wang | 2021 | low | unclear | low | low | unclear |
| Zhixue Zheng | 2016 | low | unclear | low | high | high |
| Yu Mei | 2021 | low | low | low | low | low |
| Jing Li | 2018 | low | unclear | low | high | high |
| Su Mi Kim | 2020 | low | low | low | low | low |
| Miaoquan Zhang | 2021 | low | unclear | low | high | high |
| Yuming Jiang | 2019 | low | unclear | low | low | unclear |
| Jianfeng Mu | 2019 | low | low | low | low | low |
| Shilong Li | 2021 | low | low | low | low | low |
| Chun Guang Guo | 2016 | low | unclear | low | low | unclear |
| Xujie Gao | 2021 | low | unclear | low | low | unclear |
| Xujie Gao | 2020 | low | low | low | low | low |
| Xu Wang | 2021 | low | unclear | low | high | high |
| Siwei Pan | 2021 | low | low | low | low | low |
| Wujie Chen | 2019 | low | unclear | low | high | high |
| Bong-Il Song | 2020 | low | low | low | low | low |
| Lili Wang | 2021 | low | unclear | low | low | unclear |
| Wannian Sui | 2021 | low | unclear | low | low | unclear |
| Dexin Chen | 2019 | low | unclear | low | low | unclear |
| Zepang Sun | 2021 | low | low | low | low | low |
| Xiao-Peng Zhang | 2011 | low | unclear | low | high | high |
| Song Liu | 2021 | low | unclear | low | high | high |
| C Jin | 2021 | low | low | low | low | low |
| Jing Li | 2020 | low | low | low | high | high |
| HuaKai Tian | 2022 | low | low | low | low | low |
| Qiu-Xia Feng | 2019 | low | unclear | low | low | unclear |
| Yue Wang | 2020 | low | unclear | low | high | high |
| Cheng-Mao Zhou | 2021 | low | low | low | low | low |
| Chao Huang | 2020 | low | low | low | high | high |
| Seokhwi Kim | 2021 | low | unclear | low | low | unclear |
| Qiufang Liu | 2021 | low | unclear | low | high | high |
| Lingwei Meng | 2021 | low | low | low | low | low |
| D Dong | 2020 | low | low | low | low | low |
| Ji-Eun Na | 2022 | low | low | low | low | low |
| Haixing Zhu | 2022 | low | low | low | low | low |
| Elfriede H Bollschweiler | 2004 | low | low | low | high | high |
| Yinan Zhang | 2018 | low | low | low | high | high |
c-index for prediction models in gastric cancer patients.
| Model | Train: No. model | Train: c-index | Train: 95% CI | Test: No. model | Test: c-index | Test: 95% CI |
|---|---|---|---|---|---|---|
| LR | 26 | 0.838 | 0.812-0.865 | 21 | 0.824 | 0.791-0.858 |
| Non-LR | 13 | 0.830 | 0.786-0.877 | 13 | 0.789 | 0.747-0.833 |
| DL | 3 | 0.866 | 0.799-0.938 | 3 | 0.835 | 0.780-0.895 |
| GBDT | 1 | 0.798 | 0.714-0.892 | 1 | 0.788 | 0.688-0.902 |
| GBDT+LR | 1 | 0.626 | 0.529-0.740 | 1 | 0.650 | 0.557-0.759 |
| RF | 3 | 0.893 | 0.817-0.977 | 3 | 0.848 | 0.829-0.868 |
| RF+LR | 1 | 0.691 | 0.594-0.804 | 1 | 0.678 | 0.584-0.787 |
| SVM | 2 | 0.847 | 0.804-0.894 | 2 | 0.817 | 0.728-0.917 |
| XGB | 1 | 0.881 | 0.786-0.987 | 1 | 0.762 | 0.673-0.863 |
| XGB+LR | 1 | 0.739 | 0.648-0.842 | 1 | 0.619 | 0.521-0.736 |
| Overall | 39 | 0.837 | 0.814-0.859 | 34 | 0.811 | 0.785-0.838 |
No. model indicates the number of prediction models. DL, deep learning; GBDT, gradient boosting decision tree; LR, logistic regression; No., number; Non-LR, non-logistic regression; RF, random forest; SVM, support vector machine; XGB, extreme gradient boosting.
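The intervals in the c-index table are not symmetric around the point estimate on the raw scale, but they are close to symmetric on the log scale. The following minimal sketch checks this for one row. It assumes (this is not stated in the source) that pooling was done on the log-transformed c-index; the function name `log_scale_summary` is hypothetical and used only for illustration.

```python
from math import exp, log

Z_95 = 1.959964  # two-sided 95% normal quantile


def log_scale_summary(lower, upper, z=Z_95):
    """Return the log-scale midpoint (back-transformed) and the implied
    standard error on the log scale for a confidence interval.

    Assumption: the interval was built as exp(log(est) +/- z * se).
    """
    midpoint = exp((log(lower) + log(upper)) / 2)
    se_log = (log(upper) - log(lower)) / (2 * z)
    return midpoint, se_log


# RF row, train set: c-index 0.893, 95% CI 0.817-0.977 (values from the table)
midpoint, se_log = log_scale_summary(0.817, 0.977)
print(f"back-transformed midpoint: {midpoint:.3f}, log-scale SE: {se_log:.3f}")
```

If the back-transformed midpoint agrees with the reported estimate, the row is consistent with a log-scale interval. It does not prove which method the authors used.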
Results of meta-analyses of accuracy for prediction models for gastric cancer patients.
| Model | Train: No. model | Train: accuracy | Train: 95% CI | Test: No. model | Test: accuracy | Test: 95% CI |
|---|---|---|---|---|---|---|
| LR | 28 | 0.792 | 0.761-0.820 | 23 | 0.787 | 0.745-0.824 |
| Non-LR | 22 | 0.768 | 0.725-0.805 | 18 | 0.707 | 0.665-0.746 |
| ANN | 2 | 0.707 | 0.574-0.812 | 1 | 0.634 | 0.589-0.678 |
| DL | 2 | 0.818 | 0.755-0.868 | 1 | 0.765 | 0.646-0.859 |
| DT | 1 | 0.794 | 0.754-0.830 | 1 | 0.632 | 0.587-0.676 |
| GBDT | 1 | 0.835 | 0.808-0.860 | 1 | 0.815 | 0.770-0.854 |
| GBDT+LR | 1 | 0.903 | 0.881-0.923 | 1 | 0.573 | 0.519-0.625 |
| GBM | 1 | 0.618 | 0.597-0.638 | 1 | 0.687 | 0.643-0.729 |
| GLM | 1 | 0.667 | 0.647-0.686 | NA | NA | NA |
| RDA | 1 | 0.668 | 0.649-0.688 | 1 | 0.700 | 0.636-0.759 |
| RF | 4 | 0.793 | 0.710-0.858 | 4 | 0.723 | 0.678-0.764 |
| RF+LR | 1 | 0.644 | 0.610-0.677 | 1 | 0.578 | 0.525-0.631 |
| RPART | 1 | 0.625 | 0.604-0.645 | NA | NA | NA |
| SVM | 4 | 0.765 | 0.678-0.835 | 2 | 0.789 | 0.693-0.861 |
| XGB | 1 | 0.863 | 0.838-0.886 | 1 | 0.678 | 0.626-0.727 |
| XGB+LR | 1 | 0.806 | 0.777-0.832 | 1 | 0.581 | 0.528-0.633 |
| Bayesian | NA | NA | NA | 1 | 0.824 | 0.739-0.891 |
| XGBOOST | NA | NA | NA | 1 | 0.691 | 0.648-0.733 |
| Overall | 50 | 0.781 | 0.756-0.805 | 41 | 0.753 | 0.721-0.783 |
No. model indicates the number of prediction models. ANN, artificial neural network; DL, deep learning; DT, decision tree; GBDT, gradient boosting decision tree; GBM, gradient boosting machine; GLM, generalized linear model; LR, logistic regression; NA, not available; No., number; Non-LR, non-logistic regression; RDA, regularized dual averaging; RF, random forest; SVM, support vector machine; XGB, extreme gradient boosting; XGBOOST, extreme gradient boosting.
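For rows with a single model, the interval width mostly reflects the size of that study's data set. As a rough illustration, the sketch below computes a Wald (normal approximation) binomial interval for the DT test-set accuracy. It assumes that this row comes from the only DT study in the characteristics table that reports a test set (470 patients). The source does not say which interval method was used, so the Wald formula here is a stand-in, not the authors' method.

```python
from math import sqrt

Z_95 = 1.959964  # two-sided 95% normal quantile


def wald_interval(p_hat, n, z=Z_95):
    """Normal-approximation confidence interval for a proportion."""
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)


# DT row, test set: accuracy 0.632, reported 95% CI 0.587-0.676.
# n = 470 is an assumption taken from the study characteristics table.
lower, upper = wald_interval(0.632, 470)
print(f"approximate 95% CI: {lower:.3f}-{upper:.3f}")
```

The approximate bounds land within a few thousandths of the reported interval, which suggests the reported CI is driven by the test-set size rather than by between-study variation.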
Subgroup analysis for early-gastric cancer and advanced gastric cancer.
| Stage | Train: No. model | Train: c-index (95% CI) | Test: No. model | Test: c-index (95% CI) | Train: No. model | Train: accuracy (95% CI) | Test: No. model | Test: accuracy (95% CI) |
|---|---|---|---|---|---|---|---|---|
| EGC | 24 | 0.832 | 19 | 0.795 | 31 | 0.765 | 26 | 0.731 |
| Advanced GC | 3 | 0.849 | 3 | 0.804 | 2 | 0.821 | 2 | 0.844 |
No. model indicates the number of prediction models. EGC, early-gastric cancer; GC, gastric cancer.
Subgroup analysis for predictors.
| Model | Indicator | CP: n | CP: mean(sd) | RP: n | RP: mean(sd) | CP+RP: n | CP+RP: mean(sd) | F | P |
|---|---|---|---|---|---|---|---|---|---|
| Train | c-index | 25 | 0.822(0.079) | 8 | 0.852(0.072) | 6 | 0.847(0.063) | 0.604 | 0.552 |
| Train | accuracy | 34 | 0.75(0.087) | 9 | 0.811(0.066) | 7 | 0.822(0.073) | 3.546 | 0.037 |
| Test | c-index | 20 | 0.792(0.092) | 8 | 0.830(0.07) | 6 | 0.817(0.043) | 0.664 | 0.522 |
| Test | accuracy | 28 | 0.722(0.098) | 8 | 0.799(0.075) | 5 | 0.795(0.04) | 3.224 | 0.051 |
n indicates the number of prediction models. CP, Clinical Predictors; RP, Radiomics Predictors.
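The F column can be approximately reproduced from the group sizes, means, and standard deviations in the table. The sketch below assumes the comparison is a one-way ANOVA across the three predictor groups (CP, RP, CP+RP); the source does not name the test, and the function name `anova_f_from_summary` is hypothetical.

```python
from math import fsum


def anova_f_from_summary(groups):
    """One-way ANOVA F statistic from per-group (n, mean, sd) tuples.

    Returns (F, df_between, df_within).
    """
    total_n = sum(n for n, _, _ in groups)
    grand_mean = fsum(n * mean for n, mean, _ in groups) / total_n
    # Between-group sum of squares: spread of group means around the grand mean
    ss_between = fsum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups)
    # Within-group sum of squares, recovered from each group's sample SD
    ss_within = fsum((n - 1) * sd ** 2 for n, _, sd in groups)
    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Train-set c-index row: CP, RP, CP+RP as (n, mean, sd), values from the table
train_c_index = [(25, 0.822, 0.079), (8, 0.852, 0.072), (6, 0.847, 0.063)]
f_stat, df_b, df_w = anova_f_from_summary(train_c_index)
print(f"F({df_b}, {df_w}) = {f_stat:.3f}")
```

The result is close to the reported 0.604 for this row. Small differences are expected because the table values are rounded to three decimals.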