
Prediction of Lymph Node Metastasis in Superficial Esophageal Cancer Using a Pattern Recognition Neural Network.

Han Chen^1,2, Xiaoying Zhou^1,2, Xinyu Tang^2,3, Shuo Li^1,2, Guoxin Zhang^1,2.

Abstract

BACKGROUND OR PURPOSE: It is important to predict nodal metastases in patients with early esophageal cancer in order to stratify them for endoscopic resection or esophagectomy. This study aimed to establish a novel artificial neural network (ANN) and to assess its performance against a traditional logistic regression (LR) model for predicting lymph node (LN) metastasis in patients with superficial esophageal squamous cell carcinoma (SESCC).
METHODS: A primary cohort was established, composed of 733 patients who underwent esophagectomy for SESCC from December 2012 to December 2019. The following steps were applied: (i) predictor selection; (ii) development of an ANN and a LR model, respectively; (iii) cross-validation; and (iv) evaluation of performance between the two models. The diagnostic assessment was performed with sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), accuracy, C-index, net reclassification improvement (NRI), and integrated discrimination improvement (IDI).
RESULTS: The established ANN model had 6 significant predictors: a past habit of alcohol taking, tumor size, submucosal invasion, histologic grade, lymph-vessel invasion, and preoperative CT result. The ANN model performed better than the LR model in specificity (91.20% vs 72.59%, p=0.006), PPV (56.49% vs 39.78%, p=0.020), accuracy (90.72% vs 74.49%, p<0.0001), C-index (91.5% vs 86.8%, p<0.001), and IDI (improved by 23.3%, p<0.001). There were no differences between these two models in sensitivity (87.06% vs 83.21%, p=0.764), NPV (98.17% vs 95.21%, p=0.627), and NRI (improved by -1.1%, p=0.824).
CONCLUSION: This ANN model is superior to the LR model and may become a valuable tool for the prediction of LN metastasis in patients with SESCC.


Keywords: lymph node metastasis; machine learning; neural network; superficial esophageal squamous cell carcinoma

Year: 2020 PMID: 33273861 PMCID: PMC7707435 DOI: 10.2147/CMAR.S270316

Source DB: PubMed Journal: Cancer Manag Res ISSN: 1179-1322 Impact factor: 3.989

Introduction

Esophageal cancer is the third most common cancer and the fourth leading cause of cancer-related mortality according to Cancer Statistics in China.1 Although there has been a global reversal in the ratio of squamous cell cancers to adenocarcinomas,2 esophageal squamous cell cancers still account for the great majority of cases in China (~90%).3 Owing to the widespread use of upper gastrointestinal endoscopy screening, a large proportion of esophageal cancers tend to be detected at an early stage.4 This shift in stage distribution has led to a change in treatment strategy in China. Traditionally, esophagectomy has been considered the curative treatment for superficial esophageal squamous cell carcinoma (SESCC), but it is usually associated with considerable postoperative morbidity and mortality.5 Endoscopic submucosal dissection (ESD) has recently become an alternative to esophagectomy in China, especially for T1a stage SESCC without lymph node (LN) metastasis. Recent studies show that the 3- and 5-year overall survival rates still exceed 90% even among patients with T1b cancer (invading the SM2 layer) who underwent endoscopic resection.6,7 Thus, the application of ESD for esophageal cancer has been gradually expanding in recent years.8 However, the main limitation of ESD is that it is curative only in patients without LN metastasis, because endoscopic treatment cannot achieve lymphadenectomy. Therefore, clinicians need a model for predicting the risk of LN metastasis, especially in Asia. Neural networks have been widely used in a broad range of areas such as business, data mining, drug discovery, and biology. In medicine, they have also been applied successfully to the detection of disease, the evaluation of new drugs, and the estimation of treatment cost.
In several previous papers, the discriminatory power of logistic regression (LR) and ANN models was compared.9–11 Both models performed equally well in most cases, whereas the more flexible ANN model generally outperformed the LR model in the remaining cases. Among the published prediction models for LN metastasis, there are already three LR-based nomograms.12–14 However, there are still no established ANN models. The aim of this study was to establish an ANN model and to assess its performance against a traditional LR model for predicting LN metastasis in patients with SESCC.

Method

Study Design

We identified 3926 consecutive patients from a clinical electronic database. All patients were pathologically diagnosed with esophageal cancer by endoscopic biopsy between December 2012 and December 2019 at the First Affiliated Hospital of Nanjing Medical University. A retrospective cohort was established, composed of 733 consecutive patients who underwent primary surgical resection and lymphadenectomy for SESCC. The study was approved by the Institutional Ethics Committee of the First Affiliated Hospital of Nanjing Medical University. All participants gave informed consent for the review of their medical records in the study.

Data Collecting Procedure

Inclusion criteria were: (1) histopathological diagnosis of esophageal squamous carcinoma on surgical specimens; (2) pT1 stage carcinoma (no tumor invasion beyond the submucosa); (3) primary surgical resection with at least two-field lymphadenectomy; (4) no history of previous malignancies or anticancer therapies. Exclusion criteria were: (1) esophageal adenocarcinoma or other types of esophageal cancer; (2) mixed types of esophageal cancer; (3) synchronous multiple lesions in the esophagus; (4) tumor with undefined pathological origin or metastatic esophageal cancer; (5) esophagectomy after endoscopic resection; (6) age under 18 years; (7) perioperative mortality. Before surgery, all participants were histopathologically assessed to define esophageal cancer after endoscopic biopsies. Positive LN metastasis on CT was defined as at least one enlarged lymph node with a short-axis dimension of ≥1cm. The 8th edition AJCC/UICC staging system of esophageal cancer was applied.15 Tumor size was determined as the maximum diameter in two dimensions, measured with Vernier calipers. Location (L) was defined as the position of the epicenter of the tumor. If no statement of the epicenter was provided, the following measurements were applied: (1) upper: 15–24cm from incisors; (2) middle: 25–29cm from incisors; (3) lower: 30–40/45cm from incisors. Histologic grade (G) was categorized as well-differentiated (G1), moderately differentiated (G2) and poorly differentiated (G3). Macroscopic tumor type was classified using the 2016 Japanese Classification of Esophageal Cancer, 11th Edition.16 General clinical features were documented.
A past habit of alcohol taking was defined as an intake of at least 60g of ethanol per day for men and at least 40g for women within the five years before cancer diagnosis, following WHO and the European Medicines Agency.17 The pathologic diagnosis of esophageal squamous cell carcinoma was made by two experienced pathologists who assessed the surgically resected specimens independently. Pathologic features were recorded, including tumor size, invasion depth, microscopic type, histologic grade, lymph-vessel invasion (LVI), and LN status (LN metastasis and the location of the metastasis). Invasion depth was divided into four categories: epithelium (EP)/lamina propria mucosa (LPM), muscularis mucosa (MM), submucosal (SM)1, and SM2 or deeper. Invasion depth and LVI were further confirmed by immunohistochemical staining.

Statistical Analysis

Significant predictors for establishing the ANN were identified using IBM SPSS Statistics for Windows, Version 23.0 (SPSS, Chicago, IL). The Pearson chi-square test or Fisher’s exact test was applied to dichotomous variables to identify risk factors for LN metastasis in SESCC. Variables significantly associated with LN metastasis (p ≤ 0.05) were entered as candidates into multivariate logistic regression. The optimal cutoff value was determined by the Youden index on the receiver operating characteristic (ROC) curve. The area under the ROC curve (AUC) was calculated to evaluate diagnostic accuracy and was compared using DeLong’s test. The net reclassification improvement (NRI) and the integrated discrimination improvement (IDI) were calculated to quantify the refinement in predictive accuracy. The lasso regression, NRI, IDI and ROC analyses were conducted with R software (version 3.5.1). The packages used in R are listed in .
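The Youden index step can be illustrated with a short sketch. The authors performed this analysis in R; the Python below is only an illustration on made-up scores, and the function name `youden_cutoff` and the toy data are hypothetical.

```python
import numpy as np

def youden_cutoff(scores, labels):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.

    A case is called positive when score >= threshold.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    best_t, best_j = None, -np.inf
    for t in np.unique(scores):              # candidate thresholds, ascending
        pred = scores >= t
        sens = pred[labels == 1].mean()      # true-positive rate
        spec = (~pred[labels == 0]).mean()   # true-negative rate
        j = sens + spec - 1.0
        if j > best_j:                       # keep the first (lowest) maximizer
            best_t, best_j = float(t), float(j)
    return best_t, best_j

# Toy data (hypothetical, not taken from the study)
toy_scores = [0.2, 0.3, 0.6, 0.7, 0.9]
toy_labels = [0, 0, 1, 0, 1]
cutoff, j_stat = youden_cutoff(toy_scores, toy_labels)
print(cutoff, round(j_stat, 3))
```

The same idea applies to a continuous predictor such as tumor size or to a model's predicted probability: each observed value is tried as a threshold and the one with the largest J is kept.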

Establishment of the Predictive ANN Model

A pattern recognition ANN was established using the “nprtool” for pattern recognition neural networks in Matlab 2019a (MathWorks, USA). This neural network contains three kinds of layers: the input layer, hidden layer, and output layer. A backpropagation Levenberg-Marquardt algorithm (trainlm) from the MATLAB Neural Network Toolbox was applied to estimate the error for the output layer and for each neuron of the trained network, and to correct the neurons’ weights from their current values. The input layer primarily contained the nonlinear neurons and organized them into a feed-forward multi-layer structure. The activation function was set to the hyperbolic tangent sigmoid transfer function. The performance indicator was the mean squared error (MSE). When training was stopped by the validation criterion, the toolbox returned the best-validated epoch and its performance.18 We examined the receiver operating characteristic (ROC) plots and confusion plots to check the best-validated epoch.
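As a minimal sketch of the quantities named above, the snippet below writes out the hyperbolic tangent sigmoid, the mean squared error, and the selection of the best-validated epoch. It is an illustration in Python, not the MATLAB toolbox code; the helper names and the validation curve are hypothetical.

```python
import numpy as np

def tansig(n):
    """Hyperbolic tangent sigmoid, 2 / (1 + exp(-2n)) - 1, which equals tanh(n)."""
    n = np.asarray(n, dtype=float)
    return 2.0 / (1.0 + np.exp(-2.0 * n)) - 1.0

def mse(targets, outputs):
    """Mean squared error between target and predicted values."""
    targets = np.asarray(targets, dtype=float)
    outputs = np.asarray(outputs, dtype=float)
    return float(np.mean((targets - outputs) ** 2))

def best_validated_epoch(val_errors):
    """Index of the epoch with the lowest validation error."""
    return int(np.argmin(val_errors))

# Hypothetical validation-error curve: it falls, then rises again
val_curve = [0.21, 0.12, 0.08, 0.05, 0.06, 0.07]
print(best_validated_epoch(val_curve))   # epoch index with the lowest error
print(mse([1, 0], [0.5, 0.5]))
```

Returning the weights from the epoch with the lowest validation error, rather than from the last epoch, is what limits overfitting to the training group.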

Results

Baseline Characteristics

A total of 9043 lymph nodes were resected from 733 patients by surgery (513 male and 220 female; median age: 62.8; range: 34–83). The overall incidence of lymph node metastases was 18.1% (133/733). The average tumor size was 1.71±1.01cm (range: 0.3–7.0cm). In the T1a stage (242 patients, 33.1% of the cases), the numbers of patients with EP/LPM and MM invasion were 128 (17.4%) and 114 (15.6%), respectively. In the T1b stage (491 patients, 67.0%), the numbers of patients with SM1 and SM2 or deeper invasion were 382 (52.1%) and 109 (14.9%), respectively. The LN metastasis rates of T1a and T1b tumors were 2.48% (6/242) and 25.9% (127/491), respectively. For preoperative CT, the accuracy was 66.4%, with a sensitivity of 51.9% and a specificity of 69.7%.

Candidate Predictors

Table 1 summarizes the characteristics of patients according to lymph node status. Patients’ age, sex, history of smoking, family history of GI tumors, tumor location, and macroscopic type were not significantly associated with LN metastases. The optimal applicable cut-off value of tumor size was defined as 1.85cm by calculating Youden’s index. The area under the ROC curve was 0.62 (95% confidence interval [CI], 0.565–0.679).

Table 1

Characteristics of Patients According to Lymph Node Status

Variables	Univariate Analysis					Multivariate Analysis
Variables	LNM(+)%	LNM(-)%	OR	95% CI	P-value	B	OR	95% CI	P-value
Sex					0.413	–	–	–	–
Male	18.9% (97)	81.1% (416)	1.192	0.783–1.814
Female	16.4% (36)	83.6% (184)	Ref
Age					0.933	–	–	–	–
≥60	18.2% (79)	81.8% (354)	1.017	0.694–1.490
<60	18.0% (54)	82.0% (246)	Ref
Tumor Location					0.196	–	–	–	–
Upper	20.8% (20)	79.2% (76)	Ref
Middle	15.6% (57)	84.4% (309)	0.701	0.397–1.237
Lower	20.7% (56)	79.3% (215)	0.990	0.558–1.757
Alcohol					0.008*				0.019*
Yes	24.4% (48)	75.6% (149)	1.709	1.146–2.548		0.594	1.812	1.105–2.971
No	15.9% (85)	84.1% (451)	Ref
Smoking					0.870	–	–	–	–
Yes	18.4% (56)	81.6% (248)	1.032	0.705–1.510
No	17.9% (77)	82.1% (352)	Ref
Family History of Tumor					0.105	–	–	–	–
Yes	13.0% (16)	87.0% (107)	0.630	0.359–1.106
No	19.2% (117)	80.8% (493)	Ref
Tumor Size					<0.001*				<0.001*
≥1.85cm	30.0% (98)	70.0% (229)	4.536	2.982–6.901		1.327	3.768	2.302–6.166
<1.85cm	8.6% (35)	91.4% (371)	Ref
Histologic grade					<0.001*				<0.001*
Well and Moderately	11.0% (56)	89.0% (451)	Ref
Poorly	34.1% (77)	65.9% (149)	4.162	2.815–6.152		0.956	2.600	1.633–4.141
Invasion Depth			–	–	<0.001*				<0.001*
EP/LPM	0% (0)	100% (128)	Ref
MM	5.4% (6)	94.7% (108)	1.056	1.011–1.102
SM1	15.7% (60)	84.3% (322)	1.186	1.136–1.239		2.061	7.856	3.358–18.381
SM2 or deeper	61.5% (67)	38.5% (42)	2.595	2.047–3.290		2.061	7.856	3.358–18.381
LV Invasion					<0.001*				<0.001*
Positive	86.1% (31)	13.9% (5)	36.17	13.74–95.18		3.216	24.938	8.349–74.495
Negative	14.6% (102)	85.4% (595)	Ref
Pathological type					0.129	–	–	–	–
Protruding	21.7% (43)	78.3% (155)	Ref
Superficial type	16.4% (37)	83.6% (189)	0.706	0.433–1.150
Ulcerative and localized	19.6% (40)	80.4% (164)	0.879	0.542–1.426
Infiltrative	16.0% (12)	84.0% (63)	0.687	0.340–1.388
Diffusely infiltrative	3.3% (1)	96.7% (29)	0.124	0.016–1.039
CT Results					<0.001*				<0.001*
Positive	27.5% (69)	72.5% (182)	2.476	1.690–3.628		1.107	3.026	1.901–4.818
Negative	13.3% (64)	86.7% (418)	Ref

Note: *Statistically significant with a p-value less than 0.05.

Abbreviations: LNM, lymph node metastasis; LV, lymphovascular; EP, epithelium; LPM, lamina propria mucosa; MM, muscularis mucosa; SM, submucosal; Ref, reference.
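As a worked check on how a univariate odds ratio in Table 1 can be obtained from the counts, the sketch below computes the cross-product odds ratio with a Wald 95% confidence interval for the alcohol row. The paper does not state which interval method was used, so the Wald interval is an assumption; the function name is hypothetical. Under that assumption the result comes out close to the tabulated 1.709 (1.146–2.548).

```python
import math

def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: exposed with LNM      b: exposed without LNM
    c: unexposed with LNM    d: unexposed without LNM
    z: normal quantile (1.96 for a 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Alcohol row of Table 1: Yes = 48 LNM(+), 149 LNM(-); No = 85 LNM(+), 451 LNM(-)
alcohol_or, alcohol_lo, alcohol_hi = odds_ratio_wald_ci(48, 149, 85, 451)
print(f"OR = {alcohol_or:.3f}, 95% CI {alcohol_lo:.3f} to {alcohol_hi:.3f}")
```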

The multivariate logistic regression and lasso regression (Figure 1) both identified the same risk factors: (1) a past habit of alcohol taking, (2) tumor size ≥1.85cm, (3) submucosal invasion, (4) poorly differentiated histologic grade, (5) positive lymph-vessel invasion, and (6) positive image results (Table 1).

Figure 1

Selection of candidate predictors by LASSO regression. (A) Identification of the tuning parameter (λ) by 10-fold cross-validation on the basis of the minimum criteria. Binomial deviance was plotted as a function of log(λ) from the cross-validation procedure. The y-axis represents the binomial deviance, and the lower x-axis represents log(λ). The numbers on the upper x-axis indicate the number of selected candidate predictors corresponding to each λ value. The red dots stand for the average deviance of each model at a given λ value, and the vertical bars through the red dots show the upper and lower values of the deviances. The black dotted lines mark the optimal λ values according to the minimum criteria and 1 standard error of the minimum criteria (the 1-SE criteria). The optimal λ value of 0.023 with log(λ) = −3.63 was finally determined. (B) LASSO coefficient profiles of the twelve candidate predictors. The black dotted vertical line was drawn at the value selected using 10-fold cross-validation in Figure 1A. The optimal λ value yielded six candidate features with nonzero coefficients.

The LR Model

The predictor variables of the final LR model were tumor size ≥1.85cm, submucosal invasion, poorly differentiated histologic grade, a past habit of alcohol taking, positive lymph-vessel invasion, and positive CT results. The final LR equation is as follows:

$$P(\text{metastasis}) = \frac{1}{1 + e^{-z}}$$

$$z = 2.061\,x_{\text{depth}} + 3.216\,x_{\text{LVI}} + 0.956\,x_{\text{grade}} + 1.107\,x_{\text{CT}} + 0.594\,x_{\text{alcohol}} + 1.327\,x_{\text{size}} - 5.213$$

where e is the base of the natural logarithm and each predictor is coded as 0 or 1: invasion depth (EP/LPM and MM = 0; SM1 and SM2 or deeper = 1), LVI (negative = 0, positive = 1), histologic grade (G1 and G2 = 0, G3 = 1), CT results (negative = 0, positive = 1), alcohol taking (no = 0, yes = 1), and tumor size (<1.85cm = 0, ≥1.85cm = 1). According to the Youden index, the cut-off point of the model was a value of 3.735.
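The LR equation above can be evaluated directly. The following is a minimal sketch using the published coefficients and intercept; the dictionary keys and the function name `lr_probability` are hypothetical labels chosen for this illustration. The reported cut-off of 3.735 is not applied here, because the text does not specify the scale on which it was computed.

```python
import math

# Coefficients and intercept as reported for the LR model
COEFFS = {
    "invasion_depth": 2.061,    # 0 = EP/LPM or MM, 1 = SM1 or SM2 or deeper
    "lvi": 3.216,               # 0 = negative, 1 = positive
    "histologic_grade": 0.956,  # 0 = G1/G2, 1 = G3
    "ct_result": 1.107,         # 0 = negative, 1 = positive
    "alcohol": 0.594,           # 0 = no, 1 = yes
    "tumor_size": 1.327,        # 0 = <1.85 cm, 1 = >=1.85 cm
}
INTERCEPT = -5.213

def lr_probability(predictors):
    """Predicted probability of LN metastasis from binary predictors."""
    for name in COEFFS:
        if predictors[name] not in (0, 1):
            raise ValueError(f"{name} must be coded as 0 or 1")
    z = INTERCEPT + sum(COEFFS[name] * predictors[name] for name in COEFFS)
    return 1.0 / (1.0 + math.exp(-z))

# Two extreme profiles: no risk factor present, and all six present
p_none = lr_probability({name: 0 for name in COEFFS})
p_all = lr_probability({name: 1 for name in COEFFS})
print(f"no risk factors: {p_none:.4f}, all risk factors: {p_all:.4f}")
```

With no risk factor present the linear predictor equals the intercept, so the predicted probability is well below 1%; with all six present it is close to 1.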

The ANN Model

Figure 2 presents the structure of the established neural network. The input layer consists of six parameters: tumor size, invasion depth, a past habit of alcohol use, histologic grade, lymph-vessel invasion, and image results. The hidden layer consists of 20 neurons. The output layer consists of two neurons, representing positive LN metastasis and negative LN metastasis, respectively. The performance plot shows the best validation performance at epoch 6 with a mean squared error of 0.0432, which indicates an excellent performance of the ANN model. Cross-validation was performed in the primary data set. The default ratios of 0.7/0.15/0.15 were used for the training, validation, and testing groups, which were randomly allocated by the nprtool in Matlab. The validation yielded good discrimination, with all areas under the curve exceeding 0.85 (training set 0.904, validation set 0.938, testing set 0.960, and overall set 0.915; Figure 3).
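To make the architecture and data split concrete, the sketch below builds an untrained 6-20-2 feed-forward network and a random 0.7/0.15/0.15 partition of 733 cases. It is an illustration in Python with NumPy, not the MATLAB nprtool code. The random weights, the seed, the rounding of group sizes, and the softmax output are all assumptions made for this example; the paper states only the layer sizes, the hyperbolic tangent hidden activation, and the split ratios.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed, for reproducibility only

n_patients, n_inputs, n_hidden, n_outputs = 733, 6, 20, 2

# Random 70/15/15 partition of patient indices (group-size rounding is assumed)
perm = rng.permutation(n_patients)
n_train = round(0.70 * n_patients)
n_val = round(0.15 * n_patients)
train_idx, val_idx, test_idx = np.split(perm, [n_train, n_train + n_val])

# Untrained weights; a real model would learn these from the training group
W1 = rng.normal(scale=0.1, size=(n_inputs, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.1, size=(n_hidden, n_outputs))
b2 = np.zeros(n_outputs)

def forward(x):
    """Forward pass: tanh hidden layer, softmax over the two output neurons."""
    hidden = np.tanh(x @ W1 + b1)
    logits = hidden @ W2 + b2
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    expv = np.exp(logits)
    return expv / expv.sum(axis=1, keepdims=True)

# Five hypothetical patients with six binary predictors each
x_demo = rng.integers(0, 2, size=(5, n_inputs)).astype(float)
probs = forward(x_demo)
print(len(train_idx), len(val_idx), len(test_idx), probs.shape)
```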

Figure 2

Establishment of an Artificial Neural Network Model. A pattern recognition ANN model was generated. This non-parametric model consisted of 3 layers. The input layer consists of six parameters: tumor size, invasion depth, a past habit of alcohol use, histologic grade, lymph-vessel invasion, and preoperative CT results. The hidden layer consists of 20 neurons. The output layer consists of two neurons, representing positive LN metastasis and negative LN metastasis, respectively.

Figure 3

ROC curves of the established models. The blue curve and the green dotted curve represent the development data of the ANN model and the LR model, respectively. A–D represent the C-index of the training group, the validation group, the testing group, and the whole group, respectively.

Comparison of the ANN Model with the LR Model

The gold standard was the pathologically confirmed LN metastasis by surgical resection. The sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and accuracy were further compared between ANN and LR models. Table 2 shows the classification distribution of the two models.

Table 2

The Classification of the Established Models

Models	Predicted Results	Pathological Diagnosis
Models	Predicted Results	+	–	Percent Correct
ANN	+	74	11	87.06%
	–	57	591	91.20%
	Percent Correct	56.49%	98.17%	90.72%
LR	+	109	22	83.21%
	–	165	437	72.59%
	Percent Correct	39.78%	95.21%	74.49%

Abbreviations: ANN, artificial neural network; LR, logistic regression.
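The diagnostic indices can be computed from four cell counts. The sketch below assumes that, in the LR block of Table 2, the counts 109, 22, 165, and 437 correspond to true positives, false negatives, false positives, and true negatives, respectively. That reading is an assumption about the table layout; under it, the function reproduces the LR percentages shown in the table. The function name is hypothetical.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Standard diagnostic indices from the four cells of a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),   # positives correctly detected
        "specificity": tn / (tn + fp),   # negatives correctly excluded
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
    }

# LR counts from Table 2, under the cell assignment assumed above
lr_metrics = diagnostic_metrics(tp=109, fn=22, fp=165, tn=437)
for name, value in lr_metrics.items():
    print(f"{name}: {value:.2%}")
```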

According to DeLong’s test, the ANN model was superior to the LR model in specificity (91.20% vs 72.59%, p=0.006), PPV (56.49% vs 39.78%, p=0.020), accuracy (90.72% vs 74.49%, p<0.0001), and C-index (0.915 vs 0.868, p<0.001). There was no difference between the two models in sensitivity (87.06% vs 83.21%, p=0.764) and NPV (98.17% vs 95.21%, p=0.627). To further quantify refinement in predictive accuracy, NRI and IDI were applied. Although NRI was not significantly different between the two models (improved by −1.1%, z=−0.222, p=0.824), IDI indicated that the ANN was significantly improved by 23.3% in prediction performance when compared with the LR model (z=4.338, p<0.001). The comparisons of the diagnostic assessment are listed in Table 3. Table 4 summarizes the model performance measures.
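For readers less familiar with IDI, the standard definition can be written in a few lines: it is the change in mean predicted risk among events minus the change in mean predicted risk among non-events when moving from the old model to the new one. The sketch below uses made-up probabilities; the authors computed IDI in R, and the function name and data here are hypothetical.

```python
import numpy as np

def idi(y, p_old, p_new):
    """Integrated discrimination improvement of a new model over an old one.

    y: 1 for events (LN metastasis), 0 for non-events
    p_old, p_new: predicted probabilities from the old and new models
    """
    y = np.asarray(y, dtype=int)
    p_old = np.asarray(p_old, dtype=float)
    p_new = np.asarray(p_new, dtype=float)
    events = y == 1
    gain_events = p_new[events].mean() - p_old[events].mean()
    gain_nonevents = p_new[~events].mean() - p_old[~events].mean()
    return float(gain_events - gain_nonevents)

# Hypothetical predictions for four patients (two non-events, two events)
y_demo = [0, 0, 1, 1]
p_old_demo = [0.2, 0.4, 0.6, 0.3]
p_new_demo = [0.1, 0.3, 0.8, 0.7]
idi_demo = idi(y_demo, p_old_demo, p_new_demo)
print(round(idi_demo, 2))
```

A positive value means the new model separates events from non-events more widely on the probability scale than the old model does.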

Table 3

Comparison of ANN Model and LR Model for Predicting LN Metastasis

Diagnostic Index	ANN Model (%, 95% CI)	LR Model (%, 95% CI)	p-value
Sensitivity	87.06%(78.02–93.36%)	83.21%(75.69–89.17%)	0.764
Specificity	91.20%(88.75–93.27%)	72.59%(68.84–76.12%)	0.006*
PPV	56.49%(50.00–62.76%)	39.78%(36.22–43.45%)	0.020*
NPV	98.17%(96.87–98.94%)	95.21%(93.12–96.69%)	0.627
Accuracy	90.72%(88.39–92.72%)	74.49%(71.17–77.61%)	<0.001*
AUC	0.915(0.887–0.943)	0.868(0.837–0.900)	<0.001*
NRI	−1.1%, z=−0.222		0.824
IDI	23.3%, z=4.338		<0.001*

Note: *Statistically significant with a p-value less than 0.05.

Abbreviations: ANN, artificial neural network; LR, logistic regression; PPV, positive predictive value; NPV, negative predictive value; NRI, net reclassification improvement; IDI, integrated discrimination improvement.

Table 4

Summary of Model Performance Measures

Aspect	Model Performance Measures	ANN
Diagnostic Test	Accuracy	√
	Sensitivity	Comparable
	Specificity	√
	PPV	√
	NPV	Comparable
Discrimination	C-index	√
Reclassification	IDI	√
	NRI	Comparable

Note: √: perform better.

Discussion

This study presented the first established pattern recognition neural network for predicting LN metastasis in SESCC. To our knowledge, the model was developed with the largest sampling pool to date: the number of patients was about twice that of previously published models,19,20 and a larger sample size generally leads to increased precision of the model. The significance of our established model is to provide indications for additional treatment after endoscopic resection procedures. At present, the necessity of further treatment after ESD is still controversial. Some studies showed that ESD alone could not be considered curative for patients with T1b tumors because of a higher risk of lymph node metastasis.21,22 However, Takeuchi et al have recently reported that diagnostic ESD for cM3-SM2 esophageal cancer was feasible and safe. Approximately 20% of these patients can potentially avoid further esophagectomy after endoscopic treatment.23 Although we collected clinicopathological variables (tumor size, submucosal invasion, histologic grade, and lymph-vessel invasion) from pathological results after surgical resection, they are also available after routine ESD procedures through histologic analysis of the endoscopically resected specimens. Thus, using the established model, endoscopists can determine whether patients need additional treatment after ESD procedures. Another potential application of the model is to predict LN metastasis even before ESD. During routine pre-ESD examinations, we can obtain data on tumor size, submucosal invasion, and histologic grade by endoscopic ultrasound (EUS), magnifying endoscopy with narrow-band imaging (ME-NBI), and endoscopic biopsies, respectively.24,25 Nevertheless, the accuracy of these preoperative examinations is still under investigation. Besides, LV invasion can only be precisely confirmed after ESD by analyzing endoscopically resected specimens.26 Thus, our model may not be applied directly to predict LN metastasis before ESD.
Another model should be developed for identifying preoperative LN metastasis using data from available pre-ESD examinations. The risk assessment for a given disease relies on predictive models that integrate both clinically and statistically significant elements.27 In this study, we identified six parameters as the main independent risk factors for LN metastasis in SESCC. Among these factors, tumor size, invasion depth, histologic grade, and lymph-vessel invasion have already been reported and incorporated in previously published models.12,19,28 Our findings are consistent with the results of previous models. Unlike those reported models, we identified another two independent risk factors for LN metastasis in SESCC. Firstly, we identified preoperative CT as a significant predictor of LN metastasis. CT is one of the imaging modalities for pretreatment assessment of LN metastasis, with a sensitivity of 50% (range, 41–60%) and an accuracy of 63% (range, 53–72%).29,30 In our study, the accuracy of CT was 66.4% with a sensitivity of 51.9%, which is consistent with the previous studies. As an affordable test, CT is also readily available for most patients before treatment. Inaccurate prediction may be due to normal-sized nodes that contain metastatic deposits and to benign nodal enlargement arising from inflammation. For these reasons, LN metastasis cannot be determined from the size evaluated by CT alone. Therefore, we included CT as one of the six candidate predictors. Besides, we identified a past habit of alcohol taking as a significant predictive factor of LN metastasis.
Alcohol has long been considered a risk factor for the development of esophageal cancer.31 Huang et al showed that alcohol taking increased the death hazard of esophageal cancer, and that the hazardous effects increased in a dose-dependent manner.32 We found an increased risk of LN metastasis in alcohol drinkers compared with non-drinkers, and drinkers in our study tended to show more alcohol consumption behavior. Lind et al pointed out that both alcohol consumption behaviour and alcohol dependence status are influenced by a gene called ALDH-1A1.33 This gene is also reported to be highly expressed in human esophageal squamous cell carcinoma, where its expression is significantly associated with lymph node metastasis and poor survival.34 However, there is still no research directly investigating the relationships among alcohol consumption, ALDH-1A1 expression, and lymph node metastasis of esophageal cancer. Further studies are needed to determine whether alcohol consumption leads to a poor prognosis of esophageal cancer by triggering LN metastasis under the influence of ALDH-1A1 expression. This ANN model showed satisfactory performance through internal cross-validation. The methods of assessing diagnostic test accuracy were further used for model comparison. It is widely known that logistic regression is a simple machine learning algorithm used for binary classification tasks. Although kernelized variants exist, the standard LR model is a linear classifier, which is useful for a dataset where the classes are “linearly separable”. Neural networks are somewhat related to logistic regression. If logistic regression belongs to the family of generalized linear models, a neural network can be regarded as a kind of generalized logistic regression.35,36 LR can also be considered a one-layer neural network without hidden neurons. The neural network has advantages over LR models: the hidden layers facilitate the discovery of more complex and non-linear associations among variables.
In this study, we found that the ANN model performed better than the LR model, with significantly higher AUC, specificity, PPV, and accuracy. Higher specificity means that the ANN model produces fewer false-positive predictions than the LR model (a higher true-negative rate for LN metastasis), which helps clinicians broaden the indication for endoscopic resection. IDI is another popular tool for evaluating the capacity of a diagnostic test to classify binary outcomes.37 The principle of the IDI is that a better model leads to increased estimated risks of LN metastasis for cases and decreased estimated risks for controls.38 The ANN model outperformed the LR model by 23.3% when assessed by IDI. The larger the IDI, the greater the improvement in performance of the new model. A potential reason for the better performance of the ANN is that the actual relationship between the predictors and LN status is complex and nonlinear, which can be handled more accurately by an ANN than by LR. The two models had no statistically significant difference in NPV, which is also an important index that tells clinicians how certain the negative results predicted by the model are. Both models attained a high NPV. In other words, both models can help clinicians exclude LN metastasis with high certainty if a patient has a negative prediction. Thus, patients with negative predictions may be indicated for endoscopic resection more accurately. NRI is another statistical index commonly used to assess whether one model provides clinically relevant prediction improvements over the other.39 We calculated an NRI of −1.1% with no statistical significance (p>0.05). This implies that the ANN model offered no significant improvement in correcting patients with or without LN metastasis who were misclassified by the LR model. Thus, the classification abilities of the two models are comparable. The model established in this study is still preliminary.
The ANN model was developed from a single-center retrospective cohort; thus, external validation outside our hospital is needed to test model performance. The current pattern recognition ANN is a shallow neural network. We expect that improved ANN models with multiple hidden layers can be established by more sophisticated deep learning algorithms in the future. Other approaches can be explored to improve model performance, such as adding new candidate predictors or conducting prospective cross-validation to avoid possible bias from other cancer-related or individual factors.

Conclusion

In conclusion, we developed a novel ANN model to predict LN metastasis in patients with SESCC. The ANN model was comparable with the LR model in sensitivity, NPV, and NRI, and it performed better in specificity, PPV, accuracy, C-index, and IDI. Therefore, the ANN model is superior to the LR model and may become a valuable tool, especially for providing indications for additional treatment after ESD procedures.

37 in total

Review 1. Logistic regression and artificial neural network classification models: a methodology review.

Authors: Stephan Dreiseitl; Lucila Ohno-Machado
Journal: J Biomed Inform Date: 2002 Oct-Dec Impact factor: 6.317

2. 8th edition AJCC/UICC staging of cancers of the esophagus and esophagogastric junction: application to clinical practice.

Authors: Thomas W Rice; Deepa T Patil; Eugene H Blackstone
Journal: Ann Cardiothorac Surg Date: 2017-03

3. Clinical outcomes of endoscopic submucosal dissection for superficial esophageal neoplasms: a multicenter retrospective cohort study.

Authors: Yoshiki Tsujii; Tsutomu Nishida; Osamu Nishiyama; Katsumi Yamamoto; Naoki Kawai; Shinjiro Yamaguchi; Takuya Yamada; Toshiyuki Yoshio; Shinji Kitamura; Takeshi Nakamura; Akihiro Nishihara; Hideharu Ogiyama; Masanori Nakahara; Masato Komori; Motohiko Kato; Yoshito Hayashi; Shinichiro Shinzaki; Hideki Iijima; Tomoki Michida; Masahiko Tsujii; Tetsuo Takehara
Journal: Endoscopy Date: 2015-03-31 Impact factor: 10.093

4. Heart murmur detection based on wavelet transformation and a synergy between artificial neural network and modified neighbor annealing methods.

Authors: Gholamhossein Eslamizadeh; Ramin Barati
Journal: Artif Intell Med Date: 2017-05-13 Impact factor: 5.326

Review 5. Oesophageal cancer.

Authors: Jesper Lagergren; Elizabeth Smyth; David Cunningham; Pernilla Lagergren
Journal: Lancet Date: 2017-06-22 Impact factor: 79.321

6. Cancer statistics in China, 2015.

Authors: Wanqing Chen; Rongshou Zheng; Peter D Baade; Siwei Zhang; Hongmei Zeng; Freddie Bray; Ahmedin Jemal; Xue Qin Yu; Jie He
Journal: CA Cancer J Clin Date: 2016-01-25 Impact factor: 508.702

7. Comparison of artificial neural network and logistic regression models for prediction of outcomes in trauma patients: A systematic review and meta-analysis.

Authors: Soheil Hassanipour; Haleh Ghaem; Morteza Arab-Zozani; Mozhgan Seif; Mohammad Fararouei; Elham Abdzadeh; Golnar Sabetian; Shahram Paydar
Journal: Injury Date: 2019-01-11 Impact factor: 2.586

8. Nomogram for prediction of lymph node metastasis in patients with superficial esophageal squamous cell carcinoma.

Authors: Byung-Hoon Min; Jung Wook Yang; Yang Won Min; Sun-Young Baek; Seonwoo Kim; Hong Kwan Kim; Yong Soo Choi; Young Mog Shim; Yoon-La Choi; Jae Ill Zo
Journal: J Gastroenterol Hepatol Date: 2019-12-15 Impact factor: 4.029

Review 9. Meta-analysis shows clinically relevant and long-lasting deterioration in health-related quality of life after esophageal cancer surgery.

Authors: M Jacobs; R C Macefield; R G Elbers; K Sitnikova; I J Korfage; E M A Smets; I Henselmans; M I van Berge Henegouwen; J C J M de Haes; J M Blazeby; M A G Sprangers
Journal: Qual Life Res Date: 2013-10-16 Impact factor: 4.147

10. Impact of alcohol consumption on survival in patients with esophageal carcinoma: a large cohort with long-term follow-up.

Authors: Qingyuan Huang; Kongjia Luo; Hong Yang; Jing Wen; Shuishen Zhang; Jinhui Li; Amos Ela Bella; Qianwen Liu; Fu Yang; Yuzhen Zheng; Ronggui Hu; Junying Chen; Jianhua Fu
Journal: Cancer Sci Date: 2014-11-11 Impact factor: 6.716
