Meng Hsuen Hsieh, Li-Min Sun, Cheng-Li Lin, Meng-Ju Hsieh, Chung-Y Hsu, Chia-Hung Kao.
Abstract
OBJECTIVES: Patients with type 2 diabetes (T2DM) are thought to have a higher risk of developing pancreatic cancer. We used two models, logistic regression (LR) and an artificial neural network (ANN), to predict pancreatic cancer risk among patients with T2DM.
Keywords: artificial neural network; logistic regression; pancreatic cancer; type 2 diabetes
Year: 2018 PMID: 30568493 PMCID: PMC6267763 DOI: 10.2147/CMAR.S180791
Source DB: PubMed Journal: Cancer Manag Res ISSN: 1179-1322 Impact factor: 3.989
Distribution of training and test sets
| All patients | Training set | Test set |
|---|---|---|
| 1,358,634 | 1,324,669 | 33,965 |
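The counts above imply that roughly 2.5% of the cohort was held out as the test set. A minimal sketch checking that arithmetic from the table values (the variable names are illustrative and do not come from the source):

```python
# Counts copied from the table above.
total = 1_358_634
train = 1_324_669
test = 33_965

# The two subsets should partition the full cohort exactly.
assert train + test == total

train_fraction = train / total
test_fraction = test / total
print(f"training share: {train_fraction:.3%}")
print(f"test share:     {test_fraction:.3%}")
```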
Baseline characteristics of T2DM patients with and without pancreatic cancer
| Variable | No pancreatic cancer (N=1,355,542): n | % | Pancreatic cancer (N=3,092): n | % | P-value |
|---|---|---|---|---|---|
| Age group (years) | <0.001 | ||||
| ≤49 | 422,146 | 31.1 | 389 | 12.6 | |
| 50–64 | 523,033 | 38.6 | 1,197 | 38.7 | |
| 65+ | 410,363 | 30.3 | 1,506 | 48.7 | |
| Mean (SD) (years) | 57.3 | 14.2 | 63.8 | 11.4 | <0.001 |
| Gender | <0.001 | ||||
| Women | 642,176 | 47.4 | 1,341 | 43.4 | |
| Men | 713,366 | 52.6 | 1,751 | 56.6 | |
| Underlying disease | |||||
| Acute pancreatitis | 40,578 | 2.99 | 331 | 10.7 | <0.001 |
| Chronic pancreatitis | 13,124 | 0.97 | 182 | 5.89 | <0.001 |
| Alcohol-related illness | 143,856 | 10.6 | 307 | 9.93 | 0.22 |
| Gallstone | 147,231 | 10.9 | 596 | 19.3 | <0.001 |
| Cholecystectomy | 53,533 | 3.95 | 179 | 5.79 | <0.001 |
| Cirrhosis | 632,546 | 46.7 | 1,681 | 54.4 | <0.001 |
| COPD | 383,509 | 28.3 | 894 | 28.9 | 0.25 |
| | 21,838 | 1.61 | 57 | 1.84 | 0.31 |
| Hepatitis B | 129,275 | 9.54 | 290 | 9.38 | 0.77 |
| Hepatitis C | 74,671 | 5.51 | 162 | 5.24 | 0.51 |
| Hypertension | 1,001,683 | 73.9 | 2,279 | 73.7 | 0.81 |
| Hyperlipidemia | 912,371 | 67.3 | 1,784 | 57.7 | <0.001 |
| Nephropathy | 26,796 | 1.98 | 40 | 1.29 | 0.006 |
| Obesity | 71,808 | 5.30 | 86 | 2.78 | <0.001 |
| CCI score | <0.001 | ||||
| 0 | 845,298 | 62.4 | 1,774 | 57.4 | |
| 1 | 245,041 | 18.1 | 595 | 19.2 | |
| 2 | 129,974 | 9.59 | 378 | 12.2 | |
| 3 or more | 135,231 | 9.98 | 345 | 11.2 | |
| Diabetes complication (components of the aDCSI) | |||||
| Retinopathy | 279,890 | 20.7 | 487 | 15.8 | <0.001 |
| Nephropathy | 489,087 | 36.1 | 881 | 28.5 | <0.001 |
| Neuropathy | 405,625 | 29.9 | 785 | 25.4 | <0.001 |
| Cerebrovascular | 248,489 | 18.3 | 523 | 16.9 | <0.001 |
| Cardiovascular | 686,634 | 50.7 | 1,622 | 52.5 | 0.045 |
| Peripheral vascular disease | 371,646 | 27.4 | 658 | 21.3 | <0.001 |
| Metabolic | 61,492 | 4.54 | 125 | 4.04 | 0.19 |
| Change in aDCSI score per year | <0.001 | ||||
| 0–0.1 | 691,408 | 51.1 | 1,831 | 59.2 | |
| 0.1–0.3 | 373,385 | 27.6 | 535 | 17.3 | |
| >0.3 | 290,749 | 21.5 | 726 | 23.5 | |
| Mean aDCSI score (SD) | |||||
| Onset | 1.44 | 1.70 | 1.42 | 1.62 | 0.60 |
| End of follow-up | 2.62 | 2.18 | 2.23 | 1.99 | <0.001 |
| Medications | |||||
| Statin | 716,701 | 52.9 | 1,183 | 38.3 | <0.001 |
| Insulin | 449,011 | 33.1 | 1,044 | 33.7 | 0.45 |
| Sulfonylureas | 782,389 | 57.7 | 1,970 | 63.7 | <0.001 |
| Metformin | 868,824 | 64.1 | 1,985 | 64.2 | 0.90 |
| Other antidiabetic drugs | 371,333 | 27.4 | 786 | 25.4 | <0.001 |
| TZD | 226,441 | 16.7 | 471 | 15.2 | <0.001 |
Notes: P-values are from the chi-squared test for categorical variables and the t-test for means (SD), comparing subjects with and without pancreatic cancer.
Abbreviations: aDCSI, adapted Diabetes Complication Severity Index; CCI, Charlson comorbidity index; T2DM, type 2 diabetes; TZD, thiazolidinediones.
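As an illustration of the chi-squared comparison named in the notes, the sketch below recomputes the test for the Gender rows. It assumes a plain Pearson statistic without continuity correction; the source does not state which variant was used, and the function name `chi2_2x2` is hypothetical.

```python
import math

def chi2_2x2(a, b, c, d):
    """Pearson chi-squared statistic and p-value (1 df) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    observed = ((a, b), (c, d))
    stat = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count under independence of row and column.
            expected = rows[i] * cols[j] / n
            stat += (observed[i][j] - expected) ** 2 / expected
    # For 1 degree of freedom, the chi-squared survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Gender rows: (women without cancer, women with cancer, men without cancer, men with cancer).
stat, p = chi2_2x2(642_176, 1_341, 713_366, 1_751)
print(f"chi2 = {stat:.2f}, p = {p:.2e}")
```

The resulting p-value falls below 0.001, in line with the value reported for Gender.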
Accuracy analysis of LR and ANN models across all data sets
| Data set | Model | F1 score | Precision | Recall | AUROC | SE of AUC | 95% CI of AUC |
|---|---|---|---|---|---|---|---|
| All data (n=1,358,634) | LR | 0.997 | 0.995 | 0.998 | 0.727 | 0.004 | 0.718–0.735 |
| All data (n=1,358,634) | ANN | 0.871 | 0.996 | 0.775 | 0.605 | 0.005 | 0.595–0.615 |
| Training set (n=1,324,669) | LR | 0.997 | 0.995 | 0.998 | 0.726 | 0.004 | 0.718–0.735 |
| Training set (n=1,324,669) | ANN | 0.932 | 0.996 | 0.876 | 0.606 | 0.005 | 0.596–0.617 |
| Test set (n=33,965) | LR | 0.996 | 0.995 | 0.998 | 0.707 | 0.029 | 0.650–0.765 |
| Test set (n=33,965) | ANN | 0.930 | 0.995 | 0.873 | 0.642 | 0.034 | 0.576–0.708 |
Abbreviations: ANN, artificial neural network; AUROC, area under the receiver operating characteristic curve; LR, logistic regression; SE, standard error; AUC, area under the curve.
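The values in each row are arithmetically linked. The first reported value agrees, to within rounding, with the harmonic mean (F1) of the second and third values, and the reported 95% CI agrees with AUC ± 1.96 × SE. A minimal sketch checking both relations for the all-data rows, assuming a normal-approximation interval (the source does not say how the interval was computed; the helper names are hypothetical):

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

def normal_ci(estimate, se, z=1.96):
    """Normal-approximation confidence interval: estimate +/- z * se."""
    return estimate - z * se, estimate + z * se

# ANN, all data: second and third reported values are 0.996 and 0.775.
f1_ann = f1_score(0.996, 0.775)

# LR, all data: AUC 0.727 with SE 0.004.
ci_low, ci_high = normal_ci(0.727, 0.004)

print(f"F1 (ANN, all data): {f1_ann:.3f}")
print(f"95% CI (LR, all data): {ci_low:.3f} to {ci_high:.3f}")
```

Small differences in the third decimal place are expected because the inputs are themselves rounded to three decimals.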
Figure 1. The ROC curve of the LR model.
Note: The AUC across all data for the LR model is 0.727.
Abbreviations: AUC, area under the ROC curve; LR, logistic regression; ROC, receiver operating characteristic.
Figure 2. The ROC curve of the ANN model.
Note: The AUC across all data for the ANN model is 0.605.
Abbreviations: ANN, artificial neural network; AUC, area under the ROC curve; ROC, receiver operating characteristic.