Soo Beom Choi, Won Jae Kim, Tae Keun Yoo, Jee Soo Park, Jai Won Chung, Yong-ho Lee, Eun Seok Kang, Deok Won Kim.
Abstract
The global prevalence of diabetes is rapidly increasing. Studies support the necessity of screening and interventions for prediabetes, which can lead to serious complications and diabetes. This study aimed to develop an intelligence-based screening model for prediabetes. Data from the Korean National Health and Nutrition Examination Survey (KNHANES) were used, excluding subjects with diabetes. The KNHANES 2010 data (n = 4685) were used for training and internal validation, while the KNHANES 2011 data (n = 4566) were used for external validation. We developed two models to screen for prediabetes, one using an artificial neural network (ANN) and one using a support vector machine (SVM), and systematically evaluated both using internal and external validation. We compared their performance with that of a previously developed screening score model for prediabetes based on logistic regression analysis. In the external validation set, the SVM model achieved an area under the curve of 0.731, higher than that of the ANN model (0.729) and the screening score model (0.712). The prescreening methods developed in this study performed better than the previously developed screening score model and may be a more effective method for prediabetes screening.
Year: 2014 PMID: 25165484 PMCID: PMC4140121 DOI: 10.1155/2014/618976
Source DB: PubMed Journal: Comput Math Methods Med ISSN: 1748-670X Impact factor: 2.238
Figure 1. Flow chart of excluding subjects for the KNHANES 2010.
Figure 2. Chart depicting the flow of data from the Korean National Health and Nutrition Examination Survey (KNHANES) 2010 and 2011 to develop and validate a prediabetes model. KNHANES: Korean National Health and Nutrition Examination Survey; ANN: artificial neural network; SVM: support vector machine.
The weighted characteristics of the data from the Korean National Health and Nutrition Examination Survey (KNHANES) 2010.
| Characteristic | Normal | Prediabetes | P* |
|---|---|---|---|
| Age (years) | 41.9 ± 0.5 | 52.5 ± 0.6 | <0.001 |
| Gender (% men) | 46.9 (0.9) | 58.8 (1.9) | <0.001 |
| Family history of diabetes (%) | 18.3 (0.9) | 22.9 (1.7) | 0.007 |
| Current smoker (%) | 27.5 (1.0) | 26.9 (1.9) | 0.799 |
| Alcohol intake (drinks/day) | 0.8 ± 0.0 | 1.0 ± 0.1 | <0.001 |
| Physically active (%) | 50.6 (1.1) | 52.1 (2.1) | 0.535 |
| BMI (kg/m2) | 23.2 ± 0.1 | 25.1 ± 0.1 | <0.001 |
| Waist circumference (cm) | 79.1 ± 0.2 | 85.8 ± 0.4 | <0.001 |
| FPG (mg/dL) | 89.0 ± 0.1 | 107.4 ± 0.3 | <0.001 |
| Systolic blood pressure (mmHg) | 116.8 ± 0.4 | 127.7 ± 0.7 | <0.001 |
| Diastolic blood pressure (mmHg) | 76.4 ± 0.3 | 81.5 ± 0.5 | <0.001 |
| Hypertension (%) | 16.4 (0.8) | 41.1 (2.2) | <0.001 |
BMI: body mass index; FPG: fasting plasma glucose.
Table values are given as mean ± standard error or % (standard error) [95% confidence interval] unless otherwise indicated. P values (P*) were obtained by t-test or chi-square test.
Impaired fasting glucose was defined as an FPG value ≥100 mg/dL and <126 mg/dL.
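The glucose range in the footnote above can be written as a one-line check. This is a minimal sketch: the function name `is_impaired_fasting_glucose` is hypothetical, and only the two bounds (100 and 126 mg/dL) come from the text.

```python
def is_impaired_fasting_glucose(fpg_mg_dl: float) -> bool:
    """Return True when FPG is >= 100 mg/dL and < 126 mg/dL (bounds from the footnote)."""
    return 100.0 <= fpg_mg_dl < 126.0


# The group means reported in the table fall on either side of the lower bound.
print(is_impaired_fasting_glucose(89.0))   # mean FPG of the normal group -> False
print(is_impaired_fasting_glucose(107.4))  # mean FPG of the prediabetes group -> True
```

The lower bound is inclusive and the upper bound exclusive, matching the "≥100 and <126" wording.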
Performance of the ANN, SVM, and screening score (Lee et al. [8]) models using the internal and external validation sets for predicting prediabetes.
| Validation set | Model | AUC | Accuracy (%) | Sensitivity (%) | Specificity (%) |
|---|---|---|---|---|---|
| Internal validation set | ANN∗ | 0.768 | 69.0 | 74.1 | 67.5 |
| | SVM† | 0.761 | 64.9 | 78.9 | 61.2 |
| | Screening score‡ | 0.734 | 63.4 | 76.1 | 60.0 |
| External validation set | ANN∗ | 0.729 | 60.7 | 77.2 | 56.7 |
| | SVM† | 0.731 | 66.1 | 69.4 | 65.3 |
| | Screening score‡ | 0.712 | 59.9 | 74.3 | 56.4 |
AUC: area under the curve; ANN: artificial neural network; SVM: support vector machine.
The internal validation set comprised data from the Korean National Health and Nutrition Examination Survey (KNHANES) 2010, and the external validation set comprised data from KNHANES 2011. ∗The chosen model was a multilayer perceptron with 1 hidden layer, batch training, and momentum learning (MLP-1-B-M), using a feedforward backpropagation algorithm. †The optimal model used a Gaussian kernel function with a penalty parameter (C) of 10 and a scaling factor (σ) of 10. ‡Performance was calculated by applying the screening score model for prediabetes, based on that of Lee et al. [8], to the data from KNHANES 2010 and 2011.
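To make the SVM footnote concrete, the sketch below evaluates a Gaussian kernel with σ = 10. The footnote does not state how σ enters the kernel, so the common form K(x, y) = exp(−‖x − y‖² / (2σ²)) is an assumption here, as are the function name `gaussian_kernel` and the conversion to a `gamma` value.

```python
import math
from typing import Sequence


def gaussian_kernel(x: Sequence[float], y: Sequence[float], sigma: float = 10.0) -> float:
    """Gaussian kernel, assuming the form exp(-||x - y||^2 / (2 * sigma^2))."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


SIGMA = 10.0  # scaling factor reported in the footnote
C = 10.0      # penalty parameter reported in the footnote; used by the SVM solver, not the kernel

# Under the assumed form, libraries that define the kernel as
# exp(-gamma * ||x - y||^2) would need gamma = 1 / (2 * sigma^2).
gamma = 1.0 / (2.0 * SIGMA ** 2)
print(gamma)  # 0.005
```

If the original software defined the kernel differently (for example without the factor of 2), the equivalent `gamma` would change accordingly.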
Figure 3. Receiver operating characteristic (ROC) curves of the artificial neural network (ANN), support vector machine (SVM), and screening score models in predicting prediabetes for the internal validation set (a) and the external validation set (b).
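The area under an ROC curve equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes it that way on made-up toy data; the function name `roc_auc` and the toy labels and scores are illustrative assumptions, not values from the study.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data (hypothetical): 1 = prediabetes, 0 = normal.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(toy_labels, toy_scores))  # 0.75
```

The pairwise loop is quadratic in the number of subjects, which is fine for illustration; a rank-based formulation gives the same value faster on datasets of several thousand subjects.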
Performance of the screening score model (Lee et al. [8]) in predicting prediabetes and undiagnosed diabetes using the data from the Korean National Health and Nutrition Examination Survey (KNHANES) 2010 and 2011.
| Outcome | Dataset | AUC | Accuracy (%) | Sensitivity (%) | Specificity (%) |
|---|---|---|---|---|---|
| Prediabetes | KNHANES 2010∗ (internal validation) | 0.734 | 63.4 | 76.1 | 60.0 |
| | KNHANES 2011∗ (external validation) | 0.712 | 59.9 | 74.3 | 56.4 |
| Undiagnosed diabetes | KNHANES 2010† (internal validation) | 0.772 | 66.6 | 76.5 | 66.4 |
| | KNHANES 2011† (external validation) | 0.751 | 64.6 | 74.4 | 64.3 |
AUC: area under the curve; KNHANES: Korean National Health and Nutrition Examination Survey.
Prediabetes was defined as a fasting plasma glucose value ≥100 mg/dL and <126 mg/dL. ∗Internal and external validation sets used to evaluate the screening score for prediabetes (n = 1,551 for KNHANES 2010 and n = 4,566 for KNHANES 2011). †Internal and external validation sets used to evaluate the screening score for undiagnosed diabetes (n = 1,585 for KNHANES 2010 and n = 4,683 for KNHANES 2011).
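Accuracy, sensitivity, and specificity in the tables all derive from the same confusion matrix, and accuracy is a prevalence-weighted average of the other two. The sketch below shows the relationship; the function name `screening_metrics` and the counts are hypothetical and are not taken from the study.

```python
def screening_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute screening metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "accuracy": (tp + tn) / total,
        "prevalence": (tp + fn) / total,
    }


# Hypothetical counts for 1,000 screened subjects (not study data).
m = screening_metrics(tp=76, fp=360, tn=540, fn=24)

# Accuracy equals sensitivity * prevalence + specificity * (1 - prevalence).
weighted = m["sensitivity"] * m["prevalence"] + m["specificity"] * (1 - m["prevalence"])
print(m)
print(abs(weighted - m["accuracy"]) < 1e-12)  # True
```

Because the positive class is the minority in a screening population, accuracy sits close to specificity, which is the pattern visible in the rows above.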