
Construction and Evaluation of a Preoperative Prediction Model for Lymph Node Metastasis of cIA Lung Adenocarcinoma Using Random Forest.

Chuhan Zhang1, Shun Xu1, Youhong Jiang2, Changrui Jiang1, Shangxin Li1, Zhitong Wang1, Yan Dong3, Feng Jin3, Dan Zhao4, Yating Zhao5.   

Abstract

Background: Lymph node metastasis (LNM) is the main route of metastasis in lung adenocarcinoma (LA), and preoperative prediction of LNM in early LA is key for accurate medical treatment. We aimed to establish a preoperative prediction model of LNM of early LA through clinical data mining to reduce unnecessary lymph node dissection, reduce surgical injury, and shorten the operation time.
Methods: We retrospectively collected imaging data and clinical features of 1121 patients with early LA who underwent video-assisted thoracic surgery at the First Hospital of China Medical University from 2004 to 2021. Logistic regression analysis was used to select variables and establish the preoperative diagnosis model using random forest classifier (RFC). The prediction results from the test set were used to evaluate the prediction performance of the model.
Results: Combining the results of logistic analysis and practical clinical application experience, nine clinical features were included. In the random forest classifier model, when the number of nodes was three and the n-tree value was 500, we obtained the best prediction model (accuracy = 0.9769), with a positive prediction rate of 90% and a negative prediction rate of 98.69%.
Conclusion: We established a preoperative prediction model for LNM of early LA using a machine learning random forest method combined with clinical and imaging features. More excellent predictors may be obtained by refining imaging features.
Copyright © 2022 Chuhan Zhang et al.

Year:  2022        PMID: 36199801      PMCID: PMC9527416          DOI: 10.1155/2022/4008113

Source DB:  PubMed          Journal:  J Oncol        ISSN: 1687-8450            Impact factor:   4.501


1. Introduction

Lung cancer is a malignant tumor with high morbidity and mortality rates. The latest global cancer data released by the International Agency for Research on Cancer (IARC) of the World Health Organization shows that the incidence of lung cancer ranks second and mortality ranks first among all cancers, and the morbidity and mortality rates rank first among cancers in China. Non-small-cell lung cancer (NSCLC) is the most common pathological type of lung cancer, accounting for 80% of all lung cancers [1]. Lymph node metastasis (LNM) is an important route of metastasis in lung cancer and the main factor affecting staging and prognosis. In recent years, with improvements in radiological techniques and increased frequency of regular physical examinations, the proportion of patients identified with early-stage NSCLC has increased. Additionally, because of the COVID-19 pandemic, the use of lung CT has increased [2, 3]. Increased application of lung CT improves the detection rate of early lung cancer. While mediastinoscopy or PET is the gold standard for examining LNM in lung cancer [4, 5], these examinations are invasive and cause an economic burden to patients [5, 6]. Additionally, the diagnostic effect of PET on LNM of early NSCLC is not ideal [5, 7]. However, performing deep lymph node dissection for all patients with early lung adenocarcinoma (LA) is invasive and not needed for all patients. Therefore, preoperative prediction of LNM in early LA is critical to identify patients that require surgery. Previous studies have used logistic regression to construct prediction models for LNM, but the results of the models tended to explain only the importance and application of risk factors [8-11]. The predictive ability of the models for LNM is not clear, and the prediction results of LNM-positive cases remain unsatisfactory. With the increasing applications of artificial intelligence, machine learning (ML) has gradually become a hot research area for building prediction models.
Research has shown that the prediction efficiency of the ML model is better than that of the traditional linear regression model [12]. Therefore, the purpose of this study was to establish a suitable preoperative prediction model of LNM in early LA by summarizing imaging findings and clinical features of early NSCLC, combined with statistical methods and ML. This model will help reduce unnecessary lymph node dissection and surgical injury and shorten surgical time.

2. Materials and Methods

2.1. Selection of Cases

We retrospectively reviewed 13272 patients with lung tumors in the Department of Thoracic Surgery of the First Hospital of China Medical University between January 2004 and October 2021. We preliminarily selected 3097 patients who underwent video-assisted thoracic surgery (VATS) and were diagnosed with NSCLC. Patients with incomplete data and non-cIA or multiple tumors were excluded, and 9 of the remaining 1130 patients were excluded because intraoperative frozen sections were later confirmed as nonadenocarcinoma. Figure 1 shows the patient selection process. This study was approved by the Institutional Ethics Committee of the First Hospital of China Medical University (2021-440).
Figure 1

Flowchart of patient selection and exclusion.

2.2. Clinicopathological Variables

All 1121 enrolled patients with early solitary LA (≤3 cm) underwent VATS resection and lymph node dissection at the First Hospital of China Medical University. All clinicopathological information was collected in the hospital information system (HIS), with CT images (thin layer, 1.25 mm and under) and pathological results. All cases had received lung CT results within one month before operation. Two thoracic surgeons reread the CT images of the patient group to measure nodule characteristics and restage the lung cancer following the eighth edition of TNM staging of lung cancer. In cases of disagreement, a radiologist determined the final conclusion. The average number of lymph node dissections was 9. A total of 64 cases were confirmed with LNM by postoperative pathology, including 20 cases with masses of 2 cm or smaller and 44 cases with masses larger than 2 cm. There were 17 cases of N1a, 13 cases of N1b, 21 cases of N2a, and 13 cases of N2b. Following the results of previous similar studies [8-12] and our clinical experience, we preliminarily selected 20 features for this study, including tumor location (lobe/zone), vascular shadow, pleural indentation, lobulated and spiculated sign, average long- and short-axis diameters of the solid components (mediastinal window/lung window), solid area of the tumor (mediastinal window/lung window), maximum area of the tumor (lung window), consolidation tumor ratio (mediastinal window/lung window), maximum tumor diameter (lung window), age, sex, enlargement of lymph nodes (ELN), PaO2, PaCO2, CEA, and NSE.

2.3. Univariate Analysis

Univariate analysis was performed using IBM SPSS (version 25.0; SPSS, Inc., Chicago, IL, USA) to screen the influencing factors. Univariate logistic regression was used, with postoperative lymph node pathology as the dependent variable. A P value < 0.05 was considered statistically significant. Continuous variables are expressed as mean ± standard deviation (SD), and categorical variables are described with frequencies. For similar variables, we used ROC curves (from all 1121 cases) to compare their discriminative performance and identify the most suitable variables.
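The AUC used to compare similar variables can be read as the probability that a randomly chosen LNM-positive case has a larger value than a randomly chosen LNM-negative case, with ties counted as one half. The following is a minimal sketch of that calculation in Python, not the SPSS procedure used in the study; the function name `auc` and the measurements are hypothetical and serve only as an illustration.

```python
def auc(pos_values, neg_values):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    wins = 0.0
    for p in pos_values:
        for n in neg_values:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_values) * len(neg_values))

# Hypothetical solid-component areas, NOT data from the study.
lnm_positive = [1.8, 2.4, 0.9, 3.1]
lnm_negative = [0.2, 0.0, 1.1, 0.5, 0.9]

print(auc(lnm_positive, lnm_negative))
```

A variable whose AUC is closer to 1 separates the two groups better, which is the criterion applied when choosing between the mediastinal-window and lung-window measurements.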

2.4. Construction of RFC Model

The random forest algorithm was used to build the prediction model using the R programming language (version 4.1.2). We included 173 cases (including 20 LNM-positive cases) from the total sample into the test set, and the remaining 948 cases (including 44 LNM-positive cases) were included in the training set. The prediction ability of the model was verified with real cases, and the verification results were subject to postoperative pathology.
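For readers unfamiliar with the method, the sketch below shows the three ingredients of a random forest: bootstrap resampling, a random subset of candidate variables at each split, and a positive probability taken as the share of trees voting positive. It is a toy pure-Python illustration, not the authors' R code. It assumes that the "number of nodes" reported in this paper corresponds to the number of candidate variables per split (called `mtry` here), and all feature values and function names are hypothetical.

```python
import random

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def best_split(rows, labels, features):
    """Return (feature, threshold) with the lowest weighted Gini, or None."""
    best, best_score = None, None
    for f in features:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best_score is None or score < best_score:
                best, best_score = (f, t), score
    return best

def grow_tree(rows, labels, mtry, rng, depth=0, max_depth=5):
    """A leaf is an int class; an inner node is (feature, threshold, left, right)."""
    majority = int(2 * sum(labels) >= len(labels))
    if len(set(labels)) == 1 or depth == max_depth:
        return majority
    # Only a random subset of mtry variables is considered at each split.
    split = best_split(rows, labels, rng.sample(range(len(rows[0])), mtry))
    if split is None:
        return majority
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            grow_tree([r for r, _ in left], [y for _, y in left], mtry, rng, depth + 1, max_depth),
            grow_tree([r for r, _ in right], [y for _, y in right], mtry, rng, depth + 1, max_depth))

def predict_tree(node, row):
    """Walk from the root to a leaf and return the leaf class."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def fit_forest(rows, labels, n_trees=500, mtry=3, seed=0):
    """Grow n_trees trees, each on a bootstrap sample of the training rows."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = rng.choices(range(len(rows)), k=len(rows))
        forest.append(grow_tree([rows[i] for i in idx], [labels[i] for i in idx], mtry, rng))
    return forest

def positive_probability(forest, row):
    """Share of trees that vote for the positive class."""
    return sum(predict_tree(tree, row) for tree in forest) / len(forest)

# Hypothetical rows: (solid area, CTR, tumor diameter, solid diameter). NOT study data.
train_rows = [(0.0, 0.00, 0.8, 0.0), (0.2, 0.10, 1.2, 0.3), (0.4, 0.15, 1.5, 0.5),
              (0.1, 0.05, 1.0, 0.2), (0.5, 0.20, 1.6, 0.6), (0.3, 0.12, 1.4, 0.4),
              (1.8, 0.60, 2.2, 1.3), (2.5, 0.80, 2.6, 1.6), (3.0, 0.90, 2.8, 1.8),
              (2.0, 0.70, 2.4, 1.4), (4.1, 1.00, 3.0, 2.0), (1.6, 0.55, 2.0, 1.2)]
train_labels = [0] * 6 + [1] * 6

forest = fit_forest(train_rows, train_labels, n_trees=500, mtry=3)
print(positive_probability(forest, (2.8, 0.85, 2.7, 1.7)))
print(positive_probability(forest, (0.0, 0.00, 0.6, 0.0)))
```

Because the probability is a vote share, the cut-off applied to it can be set independently of the forest itself, which is how the 20% cut-off described later in the paper can be used.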

3. Results

Table 1 shows the variables included in the study and univariate logistic analysis results. As shown in Figure 2 and Table 2, we used the mediastinal window consolidation tumor ratio (AUC = 0.873) as the final CTR. We also chose the solid area with a mediastinal window (AUC = 0.896) as the model variable to reduce the impact of cases with the same CTR but different tumor sizes.
Table 1

Univariate analysis results of patients' variables (n = 1121).

Variable | Total | P value
Enlargement of lymph nodes (ELN) | | <0.001
 Yes | 188 |
 No | 933 |
Lobe location (LL) | | 0.773
 Left upper lobe | 264 |
 Left lower lobe | 167 |
 Right upper lobe | 370 |
 Right middle lobe | 90 |
 Right lower lobe | 230 |
Tumor location (TL, zone) | | <0.001
 Central zone | 75 |
 Middle zone | 202 |
 Peripheral zone | 844 |
Vascular shadow (VS) | | <0.001
 Yes | 525 |
 No | 596 |
Pleural indentation (PI) | | <0.001
 Yes | 638 |
 No | 483 |
Lobulated and spiculated sign (LASS) | | <0.001
 Yes | 553 |
 No | 568 |
Sex | | 0.718
 Male | 406 |
 Female | 715 |
Tumor diameter (TD-max) | 1.685 ± 0.626 (0.30-3.00) | <0.001
Tumor area (TA-max) | 2.573 ± 1.789 (0.18-9.00) | <0.001
The average diameter of the solid components (mediastinal window) | 0.727 ± 0.756 (0.00-2.75) | <0.001
The average diameter of the solid components (lung window) | 0.869 ± 0.849 (0.00-3.00) | <0.001
The solid component area (SCA, mediastinal window) | 1.072 ± 1.527 (0.00-7.56) | <0.001
The solid component area (SCA, lung window) | 1.446 ± 1.877 (0.00-9.00) | <0.001
CTR (mediastinal window) | 0.315 ± 0.349 (0.00-1.00) | <0.001
CTR (lung window) | 0.448 ± 0.451 (0.00-1.00) | <0.001
Age | 56.820 ± 9.976 (23.00-84.00) | 0.267
CEA | 2.724 ± 4.996 (0.12-85.00) | <0.001
NSE | 19.016 ± 7.462 (1.35-58.20) | 0.079
PaO2 | 89.327 ± 9.836 (42.30-134.00) | 0.741
PaCO2 | 40.862 ± 3.551 (22.50-52.80) | 0.022
Figure 2

ROC curve of similar variables.

Table 2

AUC results (n = 1121).

Variable | AUC | 95% CI
Tumor diameter (max) | 0.797 | 0.752-0.841
Tumor area (max) | 0.801 | 0.756-0.846
The average diameter of the solid components (mediastinal window) | 0.895 | 0.868-0.923
The average diameter of the solid components (lung window) | 0.889 | 0.861-0.917
The solid component area (mediastinal window) | 0.896 | 0.869-0.923
The solid component area (lung window) | 0.888 | 0.860-0.916
CTR (mediastinal window) | 0.873 | 0.842-0.905
CTR (lung window) | 0.824 | 0.791-0.856
In the variable selection, we unexpectedly found that the PaCO2 was significantly associated with LNM (P < 0.05). However, we had no way to confirm a relationship between this variable and LNM, and therefore, it was not included in the final ML model. On the basis of our clinical experience and the univariate logistic analysis results, nine variables were selected for inclusion in the final ML steps. When n‐tree = 500 and the number of classification nodes was three, the model achieved the best performance. On this basis, we compared the probability given by the model and adjusted its cut-off point. The test results showed that the positive prediction rate of the model was 90%, the negative prediction rate was 98.69%, and the accuracy rate was 97.69%. Table 3 presents the evaluation indices of the RFC model, and Figure 3 shows the importance and stability of these variables.
Table 3

Test results and evaluation indexes of the RFC model (n = 173).

Index | Result
True-positive (TP) cases | 18
True-negative (TN) cases | 151
False-positive (FP) cases | 2
False-negative (FN) cases | 2
Accuracy | 0.9769
Sensitivity = recall | 0.9000
Specificity | 0.9869
Precision | 0.9000
95% CI | 0.9419-0.9937
P value | <0.001
F1 | 0.9000
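The indexes in Table 3 follow directly from the four confusion-matrix counts, as the short Python sketch below shows. The paper does not state how the 95% CI was obtained. The sketch assumes an exact (Clopper-Pearson) binomial interval on the accuracy, which should come out close to the printed interval; that choice and the helper names `upper_tail`, `bisect_root`, and `exact_ci` are assumptions made for illustration.

```python
from math import comb

def upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def bisect_root(f, target, lo=0.0, hi=1.0, iters=60):
    """Solve f(p) = target by bisection for a function f that increases in p."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def exact_ci(x, n, alpha=0.05):
    """Clopper-Pearson interval for x successes out of n (assumed CI method)."""
    lower = 0.0 if x == 0 else bisect_root(lambda p: upper_tail(x, n, p), alpha / 2)
    upper = 1.0 if x == n else bisect_root(lambda p: upper_tail(x + 1, n, p), 1 - alpha / 2)
    return lower, upper

tp, tn, fp, fn = 18, 151, 2, 2          # counts from Table 3
n = tp + tn + fp + fn

accuracy = (tp + tn) / n
sensitivity = tp / (tp + fn)            # recall
specificity = tn / (tn + fp)
precision = tp / (tp + fp)
f1 = 2 * precision * sensitivity / (precision + sensitivity)
ci_low, ci_high = exact_ci(tp + tn, n)

print(round(accuracy, 4), round(sensitivity, 4), round(specificity, 4),
      round(precision, 4), round(f1, 4))
print(round(ci_low, 4), round(ci_high, 4))
```

Because the numbers of false-positive and false-negative cases are equal here, sensitivity equals precision and specificity equals the negative predictive value (151/153).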
Figure 3

Importance and stability of each variable.

Using this model, we monitored 100 patients with solitary LA (≤3 cm) in the First Hospital of China Medical University from February to May 2022. During this period, five patients with isolated 2–3 cm LA had a probability for LNM positivity of more than 10%, and two of these patients were diagnosed as LNM-positive by the model. Mediastinal and intrapulmonary lymph nodes were carefully examined after operation. The results indicated that two cases with positive predicted results showed N1 and N2 metastasis. Among the three cases with negative predicted results but a positive probability over 10%, two cases were N1 and one case had no metastasis. The results were similar to those of our tests. We calculated the cut-off values of the first two continuous variables in the ranking given by the model. Using Youden's index as the standard, the solid area measured with mediastinal window greater than 1.55 (59/64, 297/1121) and CTR higher than 45.2% (61/64, 406/1121) would significantly increase the probability of LNM. These may provide some data basis for further clinical study of lymph node metastasis.
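Youden's index is J = sensitivity + specificity - 1, and the cut-off is the value that maximizes J. The sketch below shows a simple scan over candidate cut-offs on hypothetical values, and then evaluates J from the counts quoted above. It assumes that "(59/64, 297/1121)" means 59 of the 64 LNM-positive cases and 297 of all 1121 cases lie above the cut-off; the paper does not spell this out, so that reading and the function names are assumptions.

```python
def youden_index(sensitivity, specificity):
    """Youden's J = sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0

def best_cutoff(values, labels):
    """Try each observed value as cut-off (value > cut-off means positive); return (J, cut-off)."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for c in sorted(set(values)):
        tp = sum(1 for v, y in zip(values, labels) if y == 1 and v > c)
        tn = sum(1 for v, y in zip(values, labels) if y == 0 and v <= c)
        j = youden_index(tp / n_pos, tn / n_neg)
        if best is None or j > best[0]:
            best = (j, c)
    return best

def j_from_counts(pos_above, pos_total, all_above, all_total):
    """Sensitivity, specificity, and J from counts above the cut-off (assumed reading)."""
    neg_total = all_total - pos_total
    false_pos = all_above - pos_above
    sens = pos_above / pos_total
    spec = (neg_total - false_pos) / neg_total
    return sens, spec, youden_index(sens, spec)

# Hypothetical measurements, NOT study data.
toy_values = [1.8, 2.4, 0.9, 3.1, 0.2, 0.0, 1.1, 0.5, 0.9]
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0, 0]
print(best_cutoff(toy_values, toy_labels))

# Counts quoted in the text for the two top-ranked continuous variables.
area = j_from_counts(59, 64, 297, 1121)   # solid area > 1.55
ctr = j_from_counts(61, 64, 406, 1121)    # CTR > 45.2%
print([round(x, 3) for x in area])
print([round(x, 3) for x in ctr])
```

Under this reading, both cut-offs capture more than 90% of the LNM-positive cases at the cost of a moderate number of LNM-negative cases above the threshold.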

4. Discussion

In recent years, with improvements in radiological techniques and increased frequency of regular physical examinations, the proportion of patients with early-stage NSCLC has increased. Owing to the low metastasis rate and small tumor size, the methods of early-stage NSCLC resection and lymph node dissection are constantly being updated and improved by surgeons worldwide, to promote the development of surgical precision medicine. The treatment of NSCLC, especially LA, has been a major focus of research. Many studies have explored the identification of meaningful prognostic factors and new treatments [13-16]. However, improving methods for early detection of LNM not only helps determine whether patients should undergo further examination but also has great guiding significance for lymph node dissection during surgery. At present, tumor size, CTR, tumor markers, and imaging features have been repeatedly confirmed as preoperative predictors of LNM in lung cancer [17-28]. In clinical treatment, biopsy is the gold standard to determine the status of LNM. However, biopsy is an invasive examination and therefore, establishing a prediction model of LNM for prebiopsy use is important. Some studies have indicated that PET can be used for preoperative observation, with 3.3 identified as the cut-off value of SUVmax [29]. However, other studies have reported that PET has no significant effect on observing LNM of early NSCLC [30-35]. Therefore, it is not advisable to use PET in clinical treatment to observe the presence of LNM in early small nodules. In our database in this study, patients with early pulmonary nodules who underwent preoperative examination with PET accounted for 6.8% of the total patient group, and only 11.7% of all metastatic cases underwent PET before surgery, which indicates that a large number of patients with LNM requiring PET examination have not been accurately identified, even including some patients whose tumor size was less than 2 cm. 
Therefore, establishing a predictive model that can accurately predict LNM before surgery is an important and challenging task. To establish an accurate clinical prediction model, we first performed strict selection and measurement of variables. Several studies have confirmed that CTR, the ratio of the solid component diameter to the maximum tumor diameter, is closely related to LNM [24-26]. While some software can measure the tumor volume ratio, it is not widely used, and thus, CTR remains the first choice for many clinicians. However, most studies on the tumor consolidation rate only measured the ratio of the two long-axis diameters, without considering the short-axis diameter. In 2017, the Fleischner Society published guidelines on CT imaging identification of pulmonary nodules [36], which proposed that the measurement of solid components should include both long and short axes. As shown in Figure 4, when the maximum diameter of the solid component is close to the maximum diameter of the tumor, the key point of CTR is the ratio of the width, especially for tumors in which the width of the solid component is much smaller than the maximum width of the tumor. Even if the length ratio is 1, it does not mean that these tumors are pure solid tumors. Therefore, we changed the CTR from the length ratio to the area ratio to avoid the influence of large differences between the length and width of the tumor. This is one of the main differences between our model and the previous prediction models that included CTR.
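The difference between the two definitions can be made concrete with a small calculation. The sketch below assumes that each area is approximated as the product of the long-axis and short-axis diameters; the paper does not give the area formula, although the ranges in Table 1 (a maximum area of 9.00 for a maximum diameter of 3.00) are consistent with it. The measurements and function names are hypothetical.

```python
def ctr_length(solid_long, tumor_long):
    """Conventional CTR: ratio of the long-axis diameters."""
    return solid_long / tumor_long

def ctr_area(solid_long, solid_short, tumor_long, tumor_short):
    """Area-based CTR, with each area approximated as long axis x short axis (assumption)."""
    return (solid_long * solid_short) / (tumor_long * tumor_short)

# Hypothetical part-solid nodule (cm): the solid part spans the full length
# of the tumor but only a narrow strip of its width.
tumor_long, tumor_short = 2.0, 1.5
solid_long, solid_short = 2.0, 0.3

print(ctr_length(solid_long, tumor_long))
print(round(ctr_area(solid_long, solid_short, tumor_long, tumor_short), 3))
```

In this example the length ratio is 1, which would suggest a pure solid tumor, whereas the area ratio is about 0.2 and reflects the predominantly ground-glass composition.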
Figure 4

Tumor location and CTR calculation.

We also propose a new classification method for tumor location in CT images using the location relationship between the tumor and the segmental bronchus. When there was an observable segmental bronchial shadow around the tumor, we designated the tumor location in the middle zone; tumors located above the segmental bronchus, with unclear boundaries from the mediastinum or lobar bronchus, were considered to be located in the inner zone, and tumors located below the segmental bronchus, without bronchial shadow, were located in the outer zone. This approach describes the location of the tumor more precisely than other descriptions of central and peripheral types. We also compared the same continuous variable with different windows. The AUC results showed that the measurement results of solid components with a mediastinal window are more suitable for the calculation of CTR. Our study also showed that variables that include both long and short diameter of the tumor are better than those only including the long diameter. However, there was no significant difference between the area of solid components and average long- and short-axis diameters of the solid components in the ROC results. To facilitate the application and calculation of clinical treatments, we believe that the average diameter of the solid components can be used directly. Several models have been reported for predicting LNM of NSCLC, and most of these are logistic regression models [8-11]. However, most models showed high specificity and low sensitivity, which indicates that they are unable to distinguish between true-negative cases and true-positive cases. With the increasing application of artificial intelligence (AI), ML has gradually become a widely used option for building prediction models. Wu et al. [12] summarized the commonly used prediction models and compared their prediction ability. 
The results showed that the prediction efficiency of the ML model is significantly better than that of the traditional multifactor model, and the RFC model performed better in the prediction of preoperative LNM. In this study, we first attempted to build a prediction model using a traditional logistic multifactor regression analysis. The results were similar to those of many previous studies [8-11], and the negative predictive value of the model was very high. However, it was difficult to achieve the desired positive predictive value; even when we changed the cut-off point to 0.1, the sensitivity did not reach 0.8 and more than 100 negative patients were predicted to be positive cases. Therefore, logistic multifactor regression analysis may not be a suitable method for preoperative prediction of LNM. We then used ML and R programming to build the RFC model. We compared the computational probability of positive and negative cases in the internal test set constructed by RFC from the training set data. After comparison, we found a group difference in the probability of positive and negative cases calculated by the model. Most of the positive cases had a positive prediction probability of more than 20%, whereas the negative cases had a positive prediction probability of markedly less than 10% or even lower than 1%. Therefore, we set the cut-off point of the model; when a case was calculated to have a 20% probability of metastasis, the model classified it as a positive case. The results of the new model are satisfactory. After retesting, the accuracy of the model was 97.69%, the positive prediction rate was 90%, and the negative prediction rate was 98.69%. Even if the cut-off value is low (20%), the false-positive rate of the model is still less than 2%, which shows that the model is very effective for the classification of test set cases. We were able to completely screen out all true-negative cases and accurately identify the few positive cases. 
The meta-analysis by Birim et al. [37] showed that the overall sensitivity and specificity rates of PET in the detection of mediastinal LNM were estimated to be 83% and 92%, respectively. Compared with the existing prediction model research and the results of PET, our model performed better. Compared with previous prediction models of the same type, our model has a larger data volume and a more refined data collection in that it included tumors in all locations rather than only tumors in the peripheral location [12, 38, 39]. Similar to Wu et al. [12], we did not exclude pure GGO and GGO-dominant part-solid tumors in this model construction, because the CTR in this study was different from CTR in other studies. In our data, five of the patients with CTR < 0.5 had lymph node metastasis, which was the main reason why we did not exclude pure GGO and GGO-dominant part-solid tumors. Additionally, more sufficient and complete data means a more efficient model. We also generated statistics on the prediction results of the LNM at N1 station. Among the six cases of N1 metastasis in the test set, four cases were accurately predicted and two cases showed false-negative results, which affected the positive predictive value. Because there were still false-negative cases and such errors cannot be easily ignored in clinical treatment, we compared the probability of cases in the test set given by the model (Table 4). The results were consistent with what we saw in our internal test set.
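Table 4

Prediction probability of the test set (n = 173).

Case | Negative probability | Positive probability | Prediction result
1 | 0.880 | 0.120 | FN
2 | 0.864 | 0.136 | FN
3 | 0.994 | 0.006 | TN
4 | 0.998 | 0.002 | TN
5 | 0.996 | 0.004 | TN
6 | 0.980 | 0.020 | TN
7 | 1.000 | 0.000 | TN
8 | 1.000 | 0.000 | TN
9 | 1.000 | 0.000 | TN
10 | 1.000 | 0.000 | TN
11 | 1.000 | 0.000 | TN
12 | 0.932 | 0.068 | TN
13 | 1.000 | 0.000 | TN
14 | 1.000 | 0.000 | TN
15 | 1.000 | 0.000 | TN
16 | 1.000 | 0.000 | TN
17 | 1.000 | 0.000 | TN
18 | 1.000 | 0.000 | TN
19 | 1.000 | 0.000 | TN
20 | 0.998 | 0.002 | TN
21 | 1.000 | 0.000 | TN
22 | 0.708 | 0.292 | TP
23 | 0.720 | 0.280 | TP
24 | 0.476 | 0.524 | TP
25 | 0.598 | 0.402 | TP
26 | 0.634 | 0.366 | TP
27 | 0.646 | 0.354 | TP
28 | 0.508 | 0.492 | TP
29 | 0.662 | 0.338 | TP
30 | 0.546 | 0.454 | TP
31 | 0.614 | 0.386 | TP
32 | 0.578 | 0.422 | TP
33 | 0.702 | 0.298 | TP
34 | 0.556 | 0.444 | TP
35 | 0.728 | 0.272 | TP
36 | 0.712 | 0.288 | TP
37 | 0.664 | 0.336 | TP
38 | 0.674 | 0.326 | TP
39 | 0.730 | 0.270 | TP
40 | 0.998 | 0.002 | TN
41 | 1.000 | 0.000 | TN
42 | 1.000 | 0.000 | TN
43 | 1.000 | 0.000 | TN
44 | 0.988 | 0.012 | TN
45 | 1.000 | 0.000 | TN
46 | 1.000 | 0.000 | TN
47 | 0.808 | 0.192 | TN
48 | 0.970 | 0.030 | TN
49 | 0.992 | 0.008 | TN
50 | 0.982 | 0.018 | TN
51 | 0.982 | 0.018 | TN
52 | 1.000 | 0.000 | TN
53 | 0.976 | 0.024 | TN
54 | 1.000 | 0.000 | TN
55 | 0.970 | 0.030 | TN
56 | 1.000 | 0.000 | TN
57 | 1.000 | 0.000 | TN
58 | 0.992 | 0.008 | TN
59 | 0.996 | 0.004 | TN
60 | 0.988 | 0.012 | TN
61 | 0.996 | 0.004 | TN
62 | 0.846 | 0.154 | TN
63 | 1.000 | 0.000 | TN
64 | 1.000 | 0.000 | TN
65 | 1.000 | 0.000 | TN
66 | 0.926 | 0.074 | TN
67 | 0.994 | 0.006 | TN
68 | 0.960 | 0.040 | TN
69 | 0.982 | 0.018 | TN
70 | 1.000 | 0.000 | TN
71 | 0.982 | 0.018 | TN
72 | 1.000 | 0.000 | TN
73 | 1.000 | 0.000 | TN
74 | 1.000 | 0.000 | TN
75 | 0.992 | 0.008 | TN
76 | 1.000 | 0.000 | TN
77 | 1.000 | 0.000 | TN
78 | 0.950 | 0.050 | TN
79 | 0.898 | 0.102 | TN
80 | 0.988 | 0.012 | TN
81 | 1.000 | 0.000 | TN
82 | 0.998 | 0.002 | TN
83 | 0.916 | 0.084 | TN
84 | 1.000 | 0.000 | TN
85 | 1.000 | 0.000 | TN
86 | 0.982 | 0.018 | TN
87 | 0.998 | 0.002 | TN
88 | 1.000 | 0.000 | TN
89 | 1.000 | 0.000 | TN
90 | 1.000 | 0.000 | TN
91 | 1.000 | 0.000 | TN
92 | 1.000 | 0.000 | TN
93 | 0.928 | 0.072 | TN
94 | 1.000 | 0.000 | TN
95 | 0.992 | 0.008 | TN
96 | 1.000 | 0.000 | TN
97 | 0.976 | 0.024 | TN
98 | 0.834 | 0.166 | TN
99 | 1.000 | 0.000 | TN
100 | 1.000 | 0.000 | TN
101 | 0.996 | 0.004 | TN
102 | 0.936 | 0.064 | TN
103 | 0.952 | 0.048 | TN
104 | 1.000 | 0.000 | TN
105 | 0.970 | 0.030 | TN
106 | 0.992 | 0.008 | TN
107 | 1.000 | 0.000 | TN
108 | 0.996 | 0.004 | TN
109 | 0.998 | 0.002 | TN
110 | 0.984 | 0.016 | TN
111 | 1.000 | 0.000 | TN
112 | 0.996 | 0.004 | TN
113 | 0.994 | 0.006 | TN
114 | 1.000 | 0.000 | TN
115 | 0.998 | 0.002 | TN
116 | 0.904 | 0.096 | TN
117 | 0.958 | 0.042 | TN
118 | 0.998 | 0.002 | TN
119 | 1.000 | 0.000 | TN
120 | 1.000 | 0.000 | TN
121 | 0.852 | 0.148 | TN
122 | 1.000 | 0.000 | TN
123 | 1.000 | 0.000 | TN
124 | 0.860 | 0.140 | TN
125 | 1.000 | 0.000 | TN
126 | 1.000 | 0.000 | TN
127 | 0.950 | 0.050 | TN
128 | 1.000 | 0.000 | TN
129 | 0.996 | 0.004 | TN
130 | 0.824 | 0.176 | TN
131 | 0.962 | 0.038 | TN
132 | 0.810 | 0.190 | TN
133 | 0.984 | 0.016 | TN
134 | 0.926 | 0.074 | TN
135 | 0.896 | 0.104 | TN
136 | 0.960 | 0.040 | TN
137 | 1.000 | 0.000 | TN
138 | 0.996 | 0.004 | TN
139 | 0.956 | 0.044 | TN
140 | 1.000 | 0.000 | TN
141 | 0.920 | 0.080 | TN
142 | 0.794 | 0.206 | FP
143 | 0.994 | 0.006 | TN
144 | 0.996 | 0.004 | TN
145 | 1.000 | 0.000 | TN
146 | 1.000 | 0.000 | TN
147 | 0.936 | 0.064 | TN
148 | 1.000 | 0.000 | TN
149 | 0.998 | 0.002 | TN
150 | 0.748 | 0.252 | FP
151 | 0.996 | 0.004 | TN
152 | 0.920 | 0.080 | TN
153 | 1.000 | 0.000 | TN
154 | 1.000 | 0.000 | TN
155 | 0.996 | 0.004 | TN
156 | 1.000 | 0.000 | TN
157 | 0.996 | 0.004 | TN
158 | 1.000 | 0.000 | TN
159 | 1.000 | 0.000 | TN
160 | 1.000 | 0.000 | TN
161 | 0.998 | 0.002 | TN
162 | 1.000 | 0.000 | TN
163 | 0.996 | 0.004 | TN
164 | 0.974 | 0.026 | TN
165 | 1.000 | 0.000 | TN
166 | 0.904 | 0.096 | TN
167 | 1.000 | 0.000 | TN
168 | 1.000 | 0.000 | TN
169 | 1.000 | 0.000 | TN
170 | 0.990 | 0.010 | TN
171 | 1.000 | 0.000 | TN
172 | 1.000 | 0.000 | TN
173 | 1.000 | 0.000 | TN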
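The prediction results in Table 4 can be reproduced by applying the 20% cut-off to the positive probability. The Python sketch below does this for four rows taken from the table. It also converts each probability into a number of tree votes, on the assumption that the probability is the share of the 500 trees voting positive; the paper does not state this, but every probability in Table 4 is a multiple of 0.002, which is consistent with it. The function name `classify` is hypothetical.

```python
N_TREES = 500   # n-tree value reported in the paper
CUTOFF = 0.20   # cut-off on the positive probability reported in the paper

# (case number, positive probability, result printed in Table 4)
rows = [(1, 0.120, "FN"), (22, 0.292, "TP"), (47, 0.192, "TN"), (142, 0.206, "FP")]

def classify(prob, actual_positive, cutoff=CUTOFF):
    """Label a case as TP, FP, FN, or TN given its probability and pathology status."""
    predicted_positive = prob >= cutoff
    if predicted_positive:
        return "TP" if actual_positive else "FP"
    return "FN" if actual_positive else "TN"

for case, prob, printed in rows:
    actual_positive = printed in ("TP", "FN")   # pathology status implied by the printed label
    votes = round(prob * N_TREES)               # assumed: probability = share of tree votes
    print(case, votes, classify(prob, actual_positive), printed)
```

The two false-negative cases (positive probabilities of 0.120 and 0.136) fall in the 10%-20% band, which is why that band receives separate attention in the proposed dissection strategy.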
We also reviewed the data for false-positive cases in detail. The tumors were solid-dominant part-solid nodules; both of their maximum tumor diameters were over 1.8 cm, with pleural indentation and spiculated signs, and analysis of the intraoperative frozen sections revealed adenocarcinoma. Our clinicians performed complete systematic lymph node dissection during the operation, and no evidence of LNM was found. All characteristics of the case are in line with our current criteria for systematic lymph node dissection, and the prediction probability of the model was consistent with our actual treatment of the case. We believe that this may be a special case, but this case also confirms the homogeneity of the ML model and clinician thinking, to some extent. The results of the model were consistent with the actual treatment of the case. The RFC model also gave the order of importance and stability of the variables introduced by the model (Figure 3). Variables related to tumor solid components and tumor size ranked very high, and the tumor solid area (mediastinal window) and CTR were the most prominent. However, ELN, as in our commonly used clinical observation, was not a key variable in this model. Only 28 of the 188 patients with mediastinal ELN had LNM, which accounted for only 43.8% (28/64) of all metastasis-positive cases. More than half of the patients with LNM did not show enlarged lymph nodes. We speculate that in these early-stage patients, the enlarged lymph nodes without metastasis are more likely caused by inflammation or hyperplasia. The ML model is entirely based on the training set data, which may be one of the main reasons for the poor performance of this variable. Moreover, the sensitivity of tumor markers such as CEA and NSE may not be very high in early NSCLC. From our data review results, only 39% (25/64) of the total cases of metastasis had CEA greater than 4.30, whereas more than 93% of the cases with NSE greater than 16.30 had no LNM.
These two indicators do not show much advantage in early prediction; therefore, they rank lower in importance descriptions. This may be because in patients with early LA, the effect of the tumor on the body is small, and the commonly used cut-off values of tumor markers are not applicable to this group. For patients with early LA, lower cut-off values may be more effective in identifying cases with high risk of metastasis. The LASS shows no advantage in the rank of importance, which is consistent with the results of previous studies [12]. We propose the following method for lymph node dissection in patients with isolated LA before surgery. When the prediction result of the model determines that a patient has metastasis, we choose systematic lymph node dissection; when the case is determined to be without metastasis, for patients with a metastasis probability of 10%–20%, more samples and more detailed pathological examination of the pulmonary lymph nodes are required. For patients with a positive probability of less than 10%, lobe-specific lymph node dissection and segmental pneumonectomy or wedge resection may be options. Patients with a positive probability of less than 1% can choose to undergo lymph node sampling and wedge resection. This strategy needs to be confirmed in clinical practice. Using this approach, we randomly monitored 100 patients and 5 patients had a positive probability of more than 10%, including two patients who were determined to have LNM. Mediastinal and intrapulmonary lymph node examination showed that two cases with positive predicted results had N1 and N2 metastasis. Among three cases with negative predicted results but positive probability over 10%, two cases were N1 and one case had no metastasis. The calculated results of the other cases were less than 10%, and the pathological results also suggested that there was no LNM. The results were similar to those of our tests. 
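The proposed strategy amounts to a mapping from the model's positive probability to a dissection plan. The Python sketch below writes that mapping out as a function. It is an illustration of the text above, which the authors note still needs to be confirmed in clinical practice; the handling of values exactly at 1%, 10%, and 20% and the function name `dissection_plan` are assumptions.

```python
def dissection_plan(positive_probability):
    """Map the model's positive probability to the strategy proposed in the text."""
    if positive_probability >= 0.20:
        # Classified as LNM-positive by the model (20% cut-off).
        return "systematic lymph node dissection"
    if positive_probability >= 0.10:
        return "more samples and more detailed pathological examination of pulmonary lymph nodes"
    if positive_probability >= 0.01:
        return "lobe-specific lymph node dissection with segmental or wedge resection"
    return "lymph node sampling and wedge resection"

# Positive probabilities taken from Table 4 (cases 22, 1, 12, and 7).
for p in (0.292, 0.120, 0.068, 0.000):
    print(p, "->", dissection_plan(p))
```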
The test results show that the model can help clinicians predict the probability of LNM in patients with early lung adenocarcinoma before operation and further guide the scope of lymph node dissection during the operation. Intraoperative pathological results should be combined with clinical experience; even if various indicators point to a high risk of metastasis, some solid nodules are tuberculosis or benign hamartoma. Pathological typing is not included as a model variable, and thus, the application of this model is not limited to the choice of intraoperative methods but it also helps determine which patients need more attention before operation, such as those with a positive probability of 10%–20%, who are more likely to have N1 metastasis rather than N2 metastasis, because this is the main range in which the model cannot classify cases accurately. Our findings indicate that the group with LNM among patients with early isolated LA showed certain characteristics. In our study, the group with a positive probability of more than 20% was likely to have LNM. Mediastinal LNM is not common in patients with a positive probability of 10%–20%; most metastasis is N1 stage LNM. Although there is little difference in the predicted probability of LNM in these cases, we were able to distinguish them from patients without LNM (positive probability less than 1%). This may be a special advantage of ML models in producing better classification results by comparing subtle differences in data. This study had several limitations. First, this was a single-center retrospective study and we used the same database for training and testing; however, we used some new variables as predictive variables, such as CTR from area ratio, and these variables cannot be found in the public database, which makes it impossible for us to test the model through external verification.
Second, because of the clinical characteristics of early LA, there was a small proportion of cases with metastasis, leading to a lack of positive materials for ML. This was one of the main reasons for the difficulty in improving the sensitivity of the predictive model. Additionally, to build the model, we did not distinguish between N1 and N2 metastases; N1 probability was markedly lower than that of N2 (gap of approximately 10%–20%), which makes it necessary to adjust the cut-off point to obtain better results.

5. Conclusion

Our study was aimed at constructing a prediction model for preoperative LNM through ML to provide a strategy for reducing unnecessary surgical trauma and shortening the operation time. Using the random forest algorithm, we successfully built a prediction model; in the 173 patients in the test set, the model correctly predicted 18 cases of patients with LNM and 151 negative cases. From the specific probability calculated by the model, we were able to further distinguish the mispredicted cases from true-negative results. This was confirmed in subsequent verification of real cases. The tumor solid component area and CTR were identified as the main predictive factors, whereas CEA and NSE were not sensitive to the prediction of early LA metastasis. Our RFC model reflected this phenomenon. In addition, in the measurement and calculation of the solid components, the variables including both the long diameter and short diameter performed better than those with only the long diameter, and the results obtained under the mediastinal window performed better. From these variables, our ML model also shows great potential for development, which could help clinicians make lymph node dissection plans. This study is a good test for the preoperative prediction of LNM; it can provide more sufficient clinical basis for future research in this field.
  39 in total

Review 1.  PET-CT for assessing mediastinal lymph node involvement in patients with suspected resectable non-small cell lung cancer.

Authors:  Mia Schmidt-Hansen; David R Baldwin; Elise Hasler; Javier Zamora; Víctor Abraira; Marta Roqué I Figuls
Journal:  Cochrane Database Syst Rev       Date:  2014-11-13

Review 2.  Meta-analysis of positron emission tomographic and computed tomographic imaging in detecting mediastinal lymph node metastases in nonsmall cell lung cancer.

Authors:  Ozcan Birim; A Pieter Kappetein; Theo Stijnen; Ad J J C Bogers
Journal:  Ann Thorac Surg       Date:  2005-01       Impact factor: 4.330

3.  Lymph node involvement influenced by lung adenocarcinoma subtypes in tumor size ≤3 cm disease: A study of 2268 cases.

Authors:  Y Yu; H Jian; L Shen; L Zhu; S Lu
Journal:  Eur J Surg Oncol       Date:  2016-03-08       Impact factor: 4.424

4.  Is mediastinoscopy still the gold standard to evaluate mediastinal lymph nodes in patients with non-small cell lung carcinoma?

Authors:  C M Sivrikoz; I Ak; F S Simsek; E Döner; E Dündar
Journal:  Thorac Cardiovasc Surg       Date:  2011-06-20       Impact factor: 1.827

5.  Revised ESTS guidelines for preoperative mediastinal lymph node staging for non-small-cell lung cancer.

Authors:  Paul De Leyn; Christophe Dooms; Jaroslaw Kuzdzal; Didier Lardinois; Bernward Passlick; Ramon Rami-Porta; Akif Turna; Paul Van Schil; Frederico Venuta; David Waller; Walter Weder; Marcin Zielinski
Journal:  Eur J Cardiothorac Surg       Date:  2014-02-26       Impact factor: 4.191

6.  Segmentectomy for clinical stage IA lung adenocarcinoma showing solid dominance on radiology.

Authors:  Yasuhiro Tsutani; Yoshihiro Miyata; Haruhiko Nakayama; Sakae Okumura; Shuji Adachi; Masahiro Yoshimura; Morihito Okada
Journal:  Eur J Cardiothorac Surg       Date:  2014-01-28       Impact factor: 4.191

7.  Development and Validation of a Combined Model for Preoperative Prediction of Lymph Node Metastasis in Peripheral Lung Adenocarcinoma.

Authors:  Qi Li; Xiao-Qun He; Xiao Fan; Chao-Nan Zhu; Jun-Wei Lv; Tian-You Luo
Journal:  Front Oncol       Date:  2021-05-24       Impact factor: 6.244

8.  Deep Learning Analysis Using 18F-FDG PET/CT to Predict Occult Lymph Node Metastasis in Patients With Clinical N0 Lung Adenocarcinoma.

Authors:  Ming-Li Ouyang; Rui-Xuan Zheng; Yi-Ran Wang; Zi-Yi Zuo; Liu-Dan Gu; Yu-Qian Tian; Yu-Guo Wei; Xiao-Ying Huang; Kun Tang; Liang-Xing Wang
Journal:  Front Oncol       Date:  2022-07-07       Impact factor: 5.738

Review 9.  Role of computed tomography in COVID-19.

Authors:  Gianluca Pontone; Stefano Scafuri; Maria Elisabetta Mancini; Cecilia Agalbato; Marco Guglielmo; Andrea Baggiano; Giuseppe Muscogiuri; Laura Fusini; Daniele Andreini; Saima Mushtaq; Edoardo Conte; Andrea Annoni; Alberto Formenti; Antonio Giulio Gennari; Andrea I Guaricci; Mark R Rabbat; Giulio Pompilio; Mauro Pepi; Alexia Rossi
Journal:  J Cardiovasc Comput Tomogr       Date:  2020-09-04