HuaKai Tian1,2, ZhiKun Ning3, Zhen Zong2, Jiang Liu2, CeGui Hu2, HouQun Ying4, Hui Li5.
Abstract
OBJECTIVE: This study aimed to use machine learning (ML) to establish the best-performing prediction model for lymph node metastasis (LNM) in early gastric cancer, to better guide clinical diagnosis and treatment decisions.
Keywords: early gastric cancer; lymph node metastasis; machine learning; predictive model; regularized dual averaging (RDA)
Year: 2022 PMID: 35118083 PMCID: PMC8806156 DOI: 10.3389/fmed.2021.759013
Source DB: PubMed Journal: Front Med (Lausanne) ISSN: 2296-858X
Figure 1. Flow chart of data screening and statistical analysis [(A) statistical analysis; (B) data screening].
Clinical and pathological characteristics of the SEER data and the validation set.
| Characteristic | SEER: LNM negative, n (%) | SEER: LNM positive, n (%) | Validation: LNM negative, n (%) | Validation: LNM positive, n (%) |
|---|---|---|---|---|
| Age (years) | | | | |
| <50 | 153 (80.1%) | 38 (19.9%) | 32 (56.1%) | 25 (43.9%) |
| Race | | | | |
| White | 1,257 (87.2%) | 184 (12.8%) | | |
| Tumor size (cm) | | | | |
| <2 | 908 (89.8%) | 103 (10.2%) | 44 (84.6%) | 8 (15.4%) |
| Grade | | | | |
| I | 335 (95.4%) | 16 (4.6%) | 2 (100.0%) | 0 (0.0%) |
| Organization type | | | | |
| SRC | 288 (82.1%) | 63 (17.9%) | | |
| Depth | | | | |
| T1a | 1,083 (94.5%) | 62 (5.5%) | 57 (68.7%) | 26 (31.3%) |
| Sex | | | | |
| Female | 1,196 (85.3%) | 205 (14.7%) | 36 (56.2%) | 28 (43.8%) |
| Primary site | | | | |
| Pylorus | 47 (73.4%) | 17 (26.6%) | | |
General characteristics and lymph node metastasis in the SEER database.
| Characteristic | Total, n (%) | LNM negative, n (%) | LNM positive, n (%) | P value |
|---|---|---|---|---|
| Age (years) | | | | 0.063 |
| <50 | 191 (8.3%) | 153 (80.1%) | 38 (19.9%) | |
| Race | | | | 0.004 |
| White | 1,441 (62.8%) | 1,257 (87.2%) | 184 (12.8%) | |
| Black | 279 (12.1%) | 224 (80.2%) | 55 (19.8%) | |
| Others | 574 (25.1%) | 480 (83.6%) | 94 (16.4%) | |
| Sex | | | | 0.843 |
| Male | 1,401 (61.0%) | 1,196 (85.3%) | 205 (14.7%) | |
| Tumor size (cm) | | | | <0.001 |
| <2 | 1,011 (44.0%) | 908 (89.8%) | 103 (10.2%) | |
| Grade | | | | <0.001 |
| I | 351 (15.3%) | 335 (95.4%) | 16 (4.6%) | |
| Organization type | | | | 0.047 |
| SRC | 351 (15.3%) | 288 (82.1%) | 63 (17.9%) | |
| Depth | | | | <0.001 |
| T1a | 1,145 (56.3%) | 1,083 (94.5%) | 62 (5.5%) | |
| Primary site | | | | 0.017 |
| Cardia | 706 (30.8%) | 627 (88.8%) | 79 (11.2%) | |
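The group-level P values in this table can be sanity-checked from the reported counts. The sketch below assumes that the two percentage columns are LNM-negative and LNM-positive counts and that a Pearson chi-square test was used; the excerpt does not name the test, so treat that as an assumption. Race is used because it is the only characteristic whose rows all appear here. The function name `pearson_chi_square` is illustrative only.

```python
import math

# Counts copied from the Race rows of the SEER table:
# (assumed LNM negative, assumed LNM positive).
race_counts = {
    "White": (1257, 184),
    "Black": (224, 55),
    "Others": (480, 94),
}


def pearson_chi_square(table):
    """Return (statistic, degrees of freedom) for rows of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for row, row_total in zip(table, row_totals):
        for observed, col_total in zip(row, col_totals):
            # Expected count under independence of row and column.
            expected = row_total * col_total / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, dof


chi2, dof = pearson_chi_square(list(race_counts.values()))

# With exactly 2 degrees of freedom, the chi-square survival
# function has the closed form exp(-x / 2).
assert dof == 2
p_value = math.exp(-chi2 / 2)

print(f"chi2 = {chi2:.2f}, dof = {dof}, P = {p_value:.3f}")
```

Under these assumptions the result rounds to P = 0.004, in line with the value reported for Race.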
Multivariate analysis of the risk of LNM in the SEER database.
| Characteristic | Total, n (%) | OR (95% CI) | P value |
|---|---|---|---|
| Age (years) | | | 0.055 |
| <50 | 191 (8.3%) | 2.199 (1.286–3.760) | 0.004 |
| Race | | | 0.084 |
| White | 1,441 (62.8%) | 1 (Reference) | − |
| Tumor size (cm) | | | <0.001 |
| <2 | 1,011 (44.0%) | 1 (Reference) | − |
| Grade | | | <0.001 |
| I | 351 (15.3%) | 1 (Reference) | − |
| Depth | | | |
| Organization type | | | |
| Primary site | | | 0.386 |
| Cardia | 706 (30.8%) | 1 (Reference) | − |
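To show how an estimate, its interval, and its P value relate in a table like this one, the sketch below back-calculates the coefficient and standard error for the age <50 row. It assumes that 2.199 (1.286–3.760) is an odds ratio with a Wald 95% confidence interval from a logistic regression, which this excerpt does not state explicitly. The helper name `wald_from_odds_ratio` is hypothetical.

```python
import math
from statistics import NormalDist


def wald_from_odds_ratio(odds_ratio, ci_low, ci_high, level=0.95):
    """Recover (beta, standard error, z, two-sided P) from an OR and its Wald CI."""
    # Critical value of the standard normal, about 1.96 for a 95% interval.
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    beta = math.log(odds_ratio)
    # A Wald CI is exp(beta +/- z_crit * SE), so its log-width is 2 * z_crit * SE.
    std_err = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / std_err
    # Two-sided tail probability of the standard normal.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return beta, std_err, z, p_value


beta, std_err, z, p_value = wald_from_odds_ratio(2.199, 1.286, 3.760)
print(f"beta = {beta:.3f}, SE = {std_err:.3f}, z = {z:.2f}, P = {p_value:.3f}")
```

Under these assumptions the recovered P value rounds to 0.004, matching the value reported in that row. Because the interval bounds are rounded, other rows may agree only approximately.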
General characteristics and lymph node metastasis in the external validation set.
| Characteristic | Total, n (%) | LNM negative, n (%) | LNM positive, n (%) | P value |
|---|---|---|---|---|
| Age (years) | | | | 0.005 |
| Tumor size (cm) | | | | <0.001 |
| Grade | | | | 0.036 |
| Depth | | | 26 (31.3%) | <0.001 |
| Sex | | | | 0.165 |
Figure 2. Receiver operating characteristic (ROC) curve of the training set prediction model.
Figure 3. Receiver operating characteristic (ROC) curves of the testing set and validation set prediction models [(A) testing set; (B) validation set].
Comparison of the LNM prediction performance of different models.
| Model | | | |
|---|---|---|---|
| GBM | 0.731 | 0.607 | 0.682 |
| GLM | 0.771 | 0.656 | 0.727 |
| NNET | 0.729 | 0.592 | 0.818 |
| RF | 0.763 | 0.648 | 0.727 |
| SVM | 0.748 | 0.633 | 0.652 |
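Models like these are typically compared by the area under the ROC curve (AUC), the summary statistic of the ROC curves in Figures 2 and 3. As a minimal sketch, the AUC can be computed as the probability that a randomly chosen LNM-positive case receives a higher predicted score than a randomly chosen LNM-negative case. The data below are toy values, not results from the study, and `roc_auc` is an illustrative helper rather than the authors' code.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for pos_score in positives:
        for neg_score in negatives:
            if pos_score > neg_score:
                wins += 1.0
            elif pos_score == neg_score:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical example: 1 = LNM positive, 0 = LNM negative.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80]
print(roc_auc(toy_labels, toy_scores))  # 3 of 4 positive/negative pairs are ranked correctly: 0.75
```

The pairwise form is quadratic in the number of cases, which is fine for illustration. For cohorts the size of the SEER set, a rank-based or library implementation would be the more practical choice.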