Po-Hsiang Lin1,2, Jer-Guang Hsieh2, Hsien-Chung Yu3,4,5,6, Jyh-Horng Jeng7, Chiao-Lin Hsu4,6, Chien-Hua Chen2,8, Pin-Chieh Wu4,6,9.
Abstract
Determining the target population for the screening of Barrett's esophagus (BE), a precancerous condition of esophageal adenocarcinoma, remains a challenge in Asia. The aim of our study was to develop risk prediction models for BE using logistic regression (LR) and artificial neural network (ANN) methods and to compare their predictive performance. We retrospectively analyzed 9646 adults aged ≥20 years who underwent upper gastrointestinal endoscopy at a health examination center in Taiwan. Evaluated by 10-fold cross-validation, both models exhibited good discriminative power, with comparable areas under the curve (AUC) for the LR and ANN models (both AUCs were 0.702). Our risk prediction models for BE were developed from individuals with or without clinical indications for upper gastrointestinal endoscopy. The models have the potential to serve as a practical tool for identifying individuals at high risk of BE among the general population for endoscopic screening.
Keywords: Barrett’s esophagus; Taiwan; computer; logistic models; neural networks
Year: 2021 PMID: 34067792 PMCID: PMC8157048 DOI: 10.3390/ijerph18105332
Source DB: PubMed Journal: Int J Environ Res Public Health ISSN: 1660-4601 Impact factor: 3.390
Figure 1. The architecture of the proposed artificial neural network model.
Characteristics of the subjects with Barrett’s esophagus and the controls.
| Variables | BE (−) | BE (+) | Grand Mean |
|---|---|---|---|
| Mean age [SD] (years) | 50.3 [11.7] | 54.7 [11.4] | 50.4 [11.8] |
| Gender | |||
| Male | 5020 (53.4%) | 184 (76.0%) | |
| Female | 4384 (46.6%) | 58 (24.0%) | |
| Height [SD] (cm) | 166.0 [8.5] | 168.1 [8.1] | 166.1 [8.5] |
| Weight [SD] (kg) | 65.9 [13.2] | 70.8 [12.0] | 66.0 [13.1] |
| BMI [SD] (kg/m²) | 23.7 [3.6] | 24.9 [3.1] | 23.8 [3.6] |
| Waist circumference [SD] (cm) | 83.8 [9.8] | 87.6 [8.7] | 83.9 [9.8] |
| Hypertension | 1597 (17.0%) | 67 (27.7%) | |
| Diabetes mellitus | 662 (7.0%) | 31 (12.8%) | |
| GERD symptoms | 1605 (17.1%) | 81 (33.5%) | |
| Alcohol intake | |||
| No | 3992 (42.5%) | 86 (35.5%) | |
| Not heavy drinking † | 4973 (52.9%) | 142 (58.7%) | |
| Heavy drinking † | 439 (4.7%) | 14 (5.8%) | |
| Smoking | |||
| Non-smoker | 6711 (71.4%) | 125 (51.7%) | |
| ≤20 pack-years | 1790 (19.0%) | 61 (25.2%) | |
| >20 pack-years | 903 (9.6%) | 56 (23.1%) | |
| Having exercise habits (≥3 times/week and ≥30 min each time) | 2675 (28.4%) | 77 (31.8%) | |
BE: Barrett’s esophagus; BMI: body mass index; GERD: gastroesophageal reflux disease; SD: standard deviation. † Heavy drinking was defined as 8 or more drinks a week for women and 15 or more drinks a week for men.
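The counts in the table above are enough to recompute the study prevalence and crude (unadjusted) odds ratios by hand. The following is a minimal sketch, not the authors' analysis: the function name `crude_odds_ratio` is hypothetical, and the Woolf (log-based) 95% interval is an assumed choice of method. Crude odds ratios are expected to differ from the adjusted values in the multivariate table, which account for the other covariates.

```python
import math

def crude_odds_ratio(exposed_cases, total_cases, exposed_controls, total_controls):
    """Unadjusted odds ratio with a Woolf (log-based) 95% confidence interval."""
    a = exposed_cases                      # BE (+), exposed
    b = total_cases - exposed_cases        # BE (+), unexposed
    c = exposed_controls                   # BE (-), exposed
    d = total_controls - exposed_controls  # BE (-), unexposed
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
    high = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)
    return odds_ratio, low, high

# Group sizes taken from the gender rows of the table.
N_BE_POS = 184 + 58      # subjects with Barrett's esophagus
N_BE_NEG = 5020 + 4384   # controls
prevalence = N_BE_POS / (N_BE_POS + N_BE_NEG)

gerd_or, gerd_low, gerd_high = crude_odds_ratio(81, N_BE_POS, 1605, N_BE_NEG)
male_or, male_low, male_high = crude_odds_ratio(184, N_BE_POS, 5020, N_BE_NEG)

print(f"Prevalence: {prevalence:.2%}")
print(f"GERD symptoms, crude OR: {gerd_or:.2f} ({gerd_low:.2f}-{gerd_high:.2f})")
print(f"Male gender, crude OR: {male_or:.2f} ({male_low:.2f}-{male_high:.2f})")
```

The group sizes sum to the 9646 subjects reported in the abstract, which is a quick consistency check on the table.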
Variables associated with Barrett’s esophagus according to multivariate analysis using generalized linear models.
| Variables | Odds Ratio | 95% CI | p-Value |
|---|---|---|---|
| Age | 1.03 | 1.01–1.04 | <0.001 * |
| Gender (male) | 1.80 | 1.15–2.82 | 0.01 * |
| Height (cm) | 0.99 | 0.95–1.03 | 0.63 |
| Weight (kg) | 1.01 | 0.97–1.06 | 0.49 |
| BMI (kg/m²) | 0.99 | 0.87–1.13 | 0.91 |
| Waist circumference (cm) | 1.01 | 0.98–1.04 | 0.66 |
| Hypertension | 1.11 | 0.80–1.52 | 0.54 |
| Diabetes mellitus | 1.19 | 0.79–1.78 | 0.41 |
| GERD symptoms | 2.14 | 1.63–2.83 | <0.001 * |
| Alcohol intake | 0.92 | 0.73–1.17 | 0.52 |
| Smoking | 1.44 | 1.20–1.72 | <0.001 * |
| Having exercise habits | 0.97 | 0.73–1.30 | 0.86 |
BMI: body mass index; CI: confidence interval; GERD: gastroesophageal reflux disease. * Variables with p < 0.05 were considered for entry into the prediction model.
Mean performances of stratified 10-fold cross-validation for the logistic regression and artificial neural network models using the sampling threshold settings.
| Prevalence = 2.51% | LR Model | | | ANN Model | | |
|---|---|---|---|---|---|---|
| Threshold Setting | Sensitivity | Specificity | Accuracy | Sensitivity | Specificity | Accuracy |
| Sensitivity~90% | 0.90 | 0.31 | 0.32 | 0.90 | 0.20 | 0.22 |
| Specificity~90% | 0.30 | 0.90 | 0.88 | 0.28 | 0.90 | 0.88 |
| The Closest to (0,1) Criteria | 0.65 | 0.68 | 0.68 | 0.63 | 0.65 | 0.65 |
ANN: artificial neural network; LR: logistic regression.
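The accuracy column in the table above follows from sensitivity, specificity, and prevalence, which explains why accuracy tracks specificity so closely when only 2.51% of subjects have BE. A minimal sketch of that arithmetic, with the hypothetical helper name `accuracy_from_rates`:

```python
def accuracy_from_rates(sensitivity, specificity, prevalence):
    """Overall accuracy as the prevalence-weighted mean of sensitivity and specificity."""
    return prevalence * sensitivity + (1.0 - prevalence) * specificity

PREVALENCE = 0.0251  # from the table header

# (threshold setting, model, sensitivity, specificity) as reported in the table
rows = [
    ("Sensitivity~90%", "LR", 0.90, 0.31),
    ("Sensitivity~90%", "ANN", 0.90, 0.20),
    ("Specificity~90%", "LR", 0.30, 0.90),
    ("Specificity~90%", "ANN", 0.28, 0.90),
    ("Closest to (0,1)", "LR", 0.65, 0.68),
    ("Closest to (0,1)", "ANN", 0.63, 0.65),
]

for setting, model, sens, spec in rows:
    acc = accuracy_from_rates(sens, spec, PREVALENCE)
    print(f"{setting:18s} {model:3s} accuracy = {acc:.2f}")
```

Because the controls make up about 97.5% of the sample, a threshold tuned for 90% sensitivity yields low accuracy even though it detects most cases.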
Figure 2. The receiver operating characteristic (ROC) curves for the logistic regression (LR) and artificial neural network (ANN) models. The green lines show the mean ROC curves, and the gray areas represent the performance within two SD around the mean ROC. (a) The ROC curve of the LR model (AUC = 0.702, SD = 0.040); (b) The ROC curve of the ANN model (AUC = 0.702, SD = 0.035). AUC: area under the curve; SD: standard deviation.
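The "closest to (0,1)" threshold setting and the AUC can both be computed directly from predicted scores. The sketch below uses a small made-up set of scores and labels (not study data) and assumes a subject is classified as positive when the score is at or above the threshold; the function names are hypothetical.

```python
import math

def sens_spec(scores, labels, threshold):
    """Sensitivity and specificity when score >= threshold is called positive."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

def closest_to_01(scores, labels):
    """Threshold whose ROC point (1 - specificity, sensitivity) is nearest to (0, 1)."""
    best = None
    for t in sorted(set(scores)):
        sens, spec = sens_spec(scores, labels, t)
        dist = math.hypot(1.0 - spec, 1.0 - sens)
        if best is None or dist < best[0]:
            best = (dist, t, sens, spec)
    return best[1], best[2], best[3]

def auc(scores, labels):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Illustrative toy data only.
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
toy_labels = [0, 0, 1, 0, 0, 1, 1, 1]

threshold, sens, spec = closest_to_01(toy_scores, toy_labels)
print(f"threshold={threshold}, sensitivity={sens:.2f}, specificity={spec:.2f}")
print(f"AUC={auc(toy_scores, toy_labels):.3f}")
```

The same routine could be applied to each cross-validation fold's predicted probabilities; the fixed-sensitivity and fixed-specificity settings would instead pick the threshold whose rate is nearest the target value.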
Coefficients of the final mean logistic regression model and its performance using the sampling threshold settings.
| Coefficients and Adjusted OR | | Performances of Whole Data Input in Final Mean LR Model | | | | |
|---|---|---|---|---|---|---|
| Variables | Adjusted OR | Threshold Setting | Cutoff Point | Sensitivity | Specificity | Accuracy |
| Age [SD] | 1.43 | Specificity~90% | 0.67 | 0.30 | 0.90 | 0.88 |
| Gender(male) [SD] | 2.01 | Specificity~80% | 0.58 | 0.46 | 0.80 | 0.80 |
| GERD [SD] | 2.05 | Sensitivity~90% | 0.33 | 0.90 | 0.32 | 0.33 |
| Smoking | | Sensitivity~80% | 0.46 | 0.80 | 0.46 | 0.40 |
| Non-smokers | 1 | The Closest to (0,1) Criteria | 0.52 | 0.65 | 0.69 | 0.70 |
| ≤20 pack-years [SD] | 1.34 | |||||
| >20 pack-years [SD] | 2.28 | |||||
| Intercept | ||||||
LR: logistic regression; OR: odds ratio; SD: standard deviation; GERD: gastroesophageal reflux disease.