Shuli Hu1, Man Luo2, Yaling Li1.
Abstract
PURPOSE: Delineation of the lymph node gross target volume (GTV) in patients with non-small cell lung cancer (NSCLC) is crucial for prognosis. This study aimed to develop a predictive model to differentiate between lymph node micrometastasis (LNM) and non-lymph node micrometastasis (non-LNM). PATIENTS AND METHODS: Data from 1524 patients diagnosed with NSCLC at the First Hospital of Wuhan between January 1, 2017, and April 1, 2020, were collected retrospectively. Duplicated and uninformative variables were excluded, and 16 candidate variables were selected for further analysis. The random forest (RF) algorithm and the generalized linear (GL) algorithm were each used to screen for the variables that most strongly affected LNM prediction. The area under the curve (AUC) was compared between the RF model and the GL model.
Keywords: gross target volume; lymph nodes micrometastasis; machine learning; non-small cell lung cancer; prediction model; random forest
Year: 2021 PMID: 34168500 PMCID: PMC8217594 DOI: 10.2147/CMAR.S313941
Source DB: PubMed Journal: Cancer Manag Res ISSN: 1179-1322 Impact factor: 3.989
Figure 1. Correlation matrix of candidate features. Each value in the matrix is the correlation coefficient between the corresponding pair of variables.
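A correlation matrix like the one in Figure 1 can be sketched in a few lines of Python. The record does not say which correlation coefficient was used, so the Pearson coefficient is an assumption here, and the column names and toy values are hypothetical, not data from the study.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def correlation_matrix(columns):
    """Return {name_i: {name_j: r}} for every pair of named columns."""
    return {a: {b: pearson(columns[a], columns[b]) for b in columns}
            for a in columns}

# Hypothetical toy values for illustration only (not study data).
toy = {
    "diameter_cm": [1.0, 2.0, 3.0, 4.0, 5.0],
    "suvmax": [2.0, 4.0, 6.0, 8.0, 10.0],
    "age_y": [60.0, 50.0, 55.0, 40.0, 45.0],
}
matrix = correlation_matrix(toy)
```

Pairs of features with a coefficient close to 1 or -1 carry nearly the same information, which is one common reason for excluding duplicated variables before model fitting.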
Patients’ Demographics and Clinicopathological Characteristics
| Variables | No. (%) (N=1524) |
|---|---|
| Mean | 46.8 |
| SD | 12.1 |
| Male | 1014 (66.5%) |
| Female | 510 (33.5%) |
| Yes | 983 (64.5%) |
| No | 541 (35.5%) |
| Central type | 633 (41.5%) |
| Peripheral type | 891 (58.5%) |
| >27 | 407 (26.7%) |
| ≤27 | 1117 (73.3%) |
| Poorly differentiated | 180 (11.8%) |
| Moderately differentiated | 839 (55.1%) |
| Highly differentiated | 505 (33.1%) |
| Squamous carcinoma | 946 (62.1%) |
| Adenocarcinoma | 539 (35.4%) |
| Others | 39 (2.6%) |
| ≤8 | 1175 (77.1%) |
| >8 | 349 (22.9%) |
| T1 | 278 (18.2%) |
| T2 | 1121 (73.6%) |
| T3 | 87 (5.7%) |
| T4 | 38 (2.5%) |
| >3 | 481 (31.6%) |
| ≤3 | 1043 (68.4%) |
| Yes | 876 (57.5%) |
| No | 648 (42.5%) |
| Yes | 433 (28.4%) |
| No | 1091 (71.6%) |
| Yes | 893 (58.6%) |
| No | 631 (41.4%) |
| >0.6 | 255 (16.7%) |
| ≤0.6 | 1269 (83.3%) |
| Yes | 176 (11.5%) |
| No | 1348 (88.5%) |
| Yes | 420 (27.6%) |
| No | 1104 (72.4%) |
Abbreviations: SD, standard deviation; BMI, body mass index; SUVmax, maximum standardized uptake value; T stage, tumor stage per the tumor-node-metastasis classification (AJCC 7th edition); AJCC, American Joint Committee on Cancer.
Figure 2. Random forest model. (A) Candidate factors associated with lymph node micrometastasis, ordered by mean decrease in Gini index. (B) Prediction error as a function of the number of decision trees. (C) Performance of the prediction model with increasing numbers of features under tenfold cross-validation.
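The ranking in panel (A) rests on the Gini impurity decrease produced when a tree node is split on a feature. The sketch below computes that decrease for a single binary split. In a random forest, these decreases are accumulated per feature over all splits and averaged across trees, a step that is omitted here. The function names and toy labels are hypothetical and are not taken from the study.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 class labels: 2 * p * (1 - p)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)

def gini_decrease(parent, left, right):
    """Impurity of the parent node minus the size-weighted impurity of its children."""
    n = len(parent)
    weighted_children = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(parent) - weighted_children

# Hypothetical node: 1 = LNM, 0 = non-LNM (not study data).
parent = [1, 1, 1, 0, 0, 0, 0, 0]
left = [1, 1, 1, 0]    # e.g. cases above some feature threshold
right = [0, 0, 0, 0]   # e.g. cases at or below it
decrease = gini_decrease(parent, left, right)
```

A feature whose splits tend to produce purer child nodes accumulates a larger total decrease and therefore ranks higher.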
The Predictive Performances of Different Models Associated with Micrometastasis of Lymph Nodes
| GL Model | Multivariate analysis: OR (95% CI) | Multivariate analysis: P-value | Discrimination: Brier | Discrimination: R² | Discrimination: C-index | Discrimination: AIC |
|---|---|---|---|---|---|---|
| Sex | 0.94 (0.87–1.02) | <0.01 | ||||
| Age*, y | 1.11 (0.99–1.24) | 0.11 | ||||
| Diameter, cm | 1.11 (0.99–1.24) | 0.06 | ||||
| Vascular invasion | 0.31 (0.24–0.39) | <0.01 | ||||
| Differentiation | 0.97 (0.85–1.10) | <0.01 | 0.05 | 0.06 | 0.681 | 328.60 |
| (Poorly vs High) | 1.55 (1.39–1.73) | <0.01 | ||||
| Pulmonary membrane invasion | 2.67 (1.95–3.67) | <0.01 | ||||
| EGFR | 1.01 (0.89–1.16) | 0.83 | ||||
| SUVmax | 2.59 (2.25–2.98) | <0.01 | ||||
| Vascular invasion | 0.31 (0.25–0.40) | <0.01 | ||||
| Differentiation | 0.97 (0.85–1.10) | <0.01 | ||||
| (Poorly vs High) | 1.54 (1.39–1.72) | <0.01 | ||||
| Tumor site | 2.69 (1.96–3.69) | <0.01 | 0.05 | 0.06 | 0.686 | 327.50 |
| EGFR | 1.07 (0.94–1.22) | 0.30 | ||||
| SUVmax | 2.49 (2.18–2.84) | <0.01 | ||||
| Differentiation | 0.92 (0.96–0.99) | <0.01 | ||||
| (Poorly vs High) | 0.31 (0.24–0.39) | 0.03 | ||||
| Vascular invasion | 0.97 (0.85–1.10) | <0.01 | ||||
| Pulmonary membrane invasion | 1.55 (1.39–1.72) | <0.01 | 0.04 | 0.07 | 0.685 | 330.60 |
| SUVmax | 2.68 (1.96–3.68) | <0.01 | ||||
| Diameter, cm | 2.62 (2.28–3.02) | <0.01 | ||||
Note: *Continuous variable.
Abbreviations: AIC, Akaike information criterion; GL model, generalized linear model; OR, odds ratio; 95% CI, 95% confidence interval; SUVmax, maximum standardized uptake value.
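For readers unfamiliar with the quantities in this table, the sketch below shows how an odds ratio with a Wald 95% confidence interval, the Brier score, and the AIC are conventionally computed for a logistic-type model. It illustrates the standard definitions and is not the authors' code. The function names and numeric inputs are hypothetical.

```python
from math import exp, log

def odds_ratio_ci(beta, se, z=1.96):
    """Odds ratio and Wald confidence interval from a log-odds coefficient and its standard error."""
    return exp(beta), exp(beta - z * se), exp(beta + z * se)

def brier_score(y_true, p_pred):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)

def log_likelihood(y_true, p_pred):
    """Bernoulli log-likelihood of 0/1 outcomes under predicted probabilities."""
    return sum(y * log(p) + (1 - y) * log(1 - p) for y, p in zip(y_true, p_pred))

def aic(loglik, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * loglik

# Hypothetical inputs for illustration only (not study data).
or_, lower, upper = odds_ratio_ci(beta=0.95, se=0.07)
brier = brier_score([1, 0], [0.8, 0.2])
model_aic = aic(log_likelihood([1, 0], [0.8, 0.2]), n_params=2)
```

A lower Brier score and a lower AIC both indicate a better model. An interval computed this way always contains its point estimate, which is a quick consistency check for any reported OR (95% CI) pair.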
Figure 3. Validation and comparison of the predictive model. Performance of the RF model for lymph node micrometastasis on the training set (A) and testing set (B). ROC curves of the RF model (C) and GL model (D).
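The AUC used to compare the two models equals the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case. For a binary outcome this is the same quantity as the C-index. A minimal sketch of that pairwise definition follows. The labels and scores are hypothetical and merely stand in for the outputs of two competing models.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical labels (1 = LNM) and predicted scores, not study data.
labels = [1, 1, 0, 0, 1, 0]
scores_model_a = [0.9, 0.8, 0.3, 0.2, 0.6, 0.4]
scores_model_b = [0.9, 0.4, 0.3, 0.6, 0.7, 0.2]
auc_a = roc_auc(labels, scores_model_a)
auc_b = roc_auc(labels, scores_model_b)
```

The pairwise loop is quadratic in the number of cases. For large cohorts a rank-based formulation gives the same value more efficiently.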