Shiva Borzouei, Ali Reza Soltanian.
Abstract
OBJECTIVES: To identify the most important demographic risk factors for a diagnosis of type 2 diabetes mellitus (T2DM) using a neural network model.
Keywords: Epidemiology; Glycated hemoglobin A; Iran; Statistical model
Year: 2018 PMID: 29529860 PMCID: PMC5968209 DOI: 10.4178/epih.e2018007
Source DB: PubMed Journal: Epidemiol Health ISSN: 2092-7193
Input and output variables for the neural network model
| Status | Attributes | Levels | Code | Descriptions |
|---|---|---|---|---|
| Output | Diagnosis of T2DM (HbA1c) | <5.7%: normal | 0 | Dichotomous (%) |
| | | ≥5.7%: diabetic | 1 | |
| Input | Sex | Male | 0 | Dichotomous |
| | | Female | 1 | |
| Input | Age | - | - | Numeric (yr) |
| Input | BMI | - | - | Numeric (kg/m²) |
| Input | Hypertension | Yes | 1 | Dichotomous |
| | | No | 0 | |
| Input | Walking | <30 | 0 | Dichotomous (min/d) |
| | | ≥30 | 1 | |
| Input | Sedentary time at workplace or home | Sometimes | 0 | Dichotomous |
| | | Often | 1 | |
| Input | Stress | - | - | Numeric (0-10) |
| Input | Fruit consumption | Sometimes | 0 | Dichotomous |
| Input | Vegetable consumption | Often | 1 | Dichotomous |
| Input | Family history of diabetes | Yes | 1 | Dichotomous |
| | | No | 0 | |
| Input | Smoking (cigarettes, hookah) | Never | 0 | Categorical |
| | | Former or current | 1 | |
| Input | Waist circumference | - | - | Numeric (cm) |
T2DM, type 2 diabetes mellitus; HbA1c, hemoglobin A1c; BMI, body mass index.
BMI was calculated as weight (kg) divided by height squared (m²).
Participants were considered to have hypertension if they took blood pressure medication.
Walking was collected as a dichotomous variable: walking less than 30 min/d was denoted as "0" and walking for more than 30 min/d was denoted as "1."
Sedentary time was defined as the amount of time (hours) a person spent sitting at the office or at home; sedentary time of less than 5 hours was denoted as “sometimes,” and sedentary time of more than 5 hours was denoted as “always.”
Consumption of 0-1 servings of fruit per day was denoted as "sometimes," and consumption of ≥2 servings of fruit per day was denoted as "always."
Consumption of 0-1 cup of green vegetables per day was denoted as "sometimes," and consumption of ≥2 cups per day was denoted as "always."
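The coding scheme in the table and footnotes above can be sketched as a small encoding function. This is only an illustration: the dictionary keys, the function names, and the treatment of boundary values (exactly 30 min/d of walking, exactly 5 hours of sitting) are assumptions, and the 0/1 coding of fruit and vegetable consumption is inferred from the "sometimes" versus "often/always" labels rather than stated explicitly in the source.

```python
def bmi(weight_kg, height_m):
    """Body mass index: weight (kg) divided by height squared (m^2)."""
    return weight_kg / height_m ** 2


def encode_participant(p):
    """Map a raw record (dict with hypothetical keys) to the 0/1 and numeric codes."""
    return {
        "t2dm": 1 if p["hba1c_pct"] >= 5.7 else 0,           # output variable
        "sex": 1 if p["sex"] == "female" else 0,
        "age": p["age_yr"],
        "bmi": bmi(p["weight_kg"], p["height_m"]),
        "hypertension": 1 if p["takes_bp_medication"] else 0,
        "walking": 1 if p["walking_min_per_day"] >= 30 else 0,
        "sedentary": 1 if p["sitting_hours"] > 5 else 0,      # boundary is an assumption
        "stress": p["stress_0_10"],
        "fruit": 1 if p["fruit_servings"] >= 2 else 0,        # assumed coding
        "vegetables": 1 if p["vegetable_cups"] >= 2 else 0,   # assumed coding
        "family_history": 1 if p["family_history"] else 0,
        "smoking": 0 if p["smoking"] == "never" else 1,
        "waist": p["waist_cm"],
    }


# Hypothetical participant, not taken from the study data.
sample = {
    "hba1c_pct": 6.1, "sex": "female", "age_yr": 52,
    "weight_kg": 81.0, "height_m": 1.65, "takes_bp_medication": True,
    "walking_min_per_day": 20, "sitting_hours": 7, "stress_0_10": 6,
    "fruit_servings": 1, "vegetable_cups": 2, "family_history": False,
    "smoking": "former", "waist_cm": 101,
}
encoded = encode_participant(sample)
print(encoded)
```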
Risk factors used for univariate logistic regression
| Variables | Normal (n=83) | T2DM (n=151) | OR (95% CI) |
|---|---|---|---|
| Sex | | | 0.64 (0.20, 2.04) |
| Male | 13 (18.1) | 59 (81.9) | |
| Female | 70 (43.2) | 92 (56.8) | |
| Age (yr) | 36.54±10.70 | 53.25±11.20 | 1.24 (1.02, 1.53)* |
| BMI (kg/m²) | 23.10±3.59 | 28.57±4.10 | 1.18 (0.98, 1.42) |
| Waist circumference (cm) | 78.07±18.11 | 102.39±10.05 | 1.08 (1.01, 1.15)* |
| Stress (0-10) | 5.55±2.25 | 5.44±2.69 | 1.42 (1.13, 1.79)* |
| Hypertension | | | 4.52 (1.01, 12.27)* |
| No | 80 (50.3) | 79 (49.7) | |
| Yes | 3 (4.0) | 72 (96.0) | |
| Walking (min/d) | | | 1.28 (0.41, 3.96) |
| <30 | 36 (27.3) | 96 (72.7) | |
| ≥30 | 47 (46.1) | 55 (53.9) | |
| Sedentary time at workplace or home | | | 6.06 (2.04, 8.04)* |
| Sometimes | 50 (66.7) | 25 (33.3) | |
| Often | 33 (20.8) | 126 (79.2) | |
| Fruit consumption | | | 0.84 (0.164, 4.31) |
| Sometimes | 9 (33.3) | 18 (66.7) | |
| Often + always | 74 (35.7) | 133 (64.3) | |
| Vegetable consumption | | | 0.07 (0.01, 0.44)* |
| Sometimes | 4 (6.5) | 58 (93.5) | |
| Often + always | 79 (45.9) | 93 (54.1) | |
| Family history of diabetes | | | 2.94 (1.08, 7.83)* |
| No | 73 (50.0) | 73 (50.0) | |
| Yes | 10 (11.4) | 78 (88.6) | |
| Smoking (cigarettes, hookah) | | | 4.26 (2.29, 7.93)* |
| Never | 66 (47.8) | 72 (52.2) | |
| Former + current | 17 (17.7) | 79 (82.3) | |

Values are presented as number (%) or mean±standard deviation.
T2DM, type 2 diabetes mellitus; OR, odds ratio; CI, confidence interval; BMI, body mass index.
*ORs and 95% CIs were obtained by univariate logistic regression; the asterisk marks significant (p<0.2) risk factors.
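The percentages for the categorical variables appear to be row percentages, that is, each count divided by the total for that category across the normal and T2DM groups. A minimal sketch of that arithmetic follows; the function name is hypothetical, and the example uses the counts from the "Male" row.

```python
def row_percentages(normal, t2dm):
    """Return each count as a percentage of the row total, rounded to 1 decimal."""
    total = normal + t2dm
    return round(100 * normal / total, 1), round(100 * t2dm / total, 1)


# Counts from the "Male" row: 13 normal, 59 T2DM.
male_pct = row_percentages(13, 59)
print(male_pct)  # (18.1, 81.9)
```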
Figure 1. Artificial neural network scheme of the predictors of T2DM at the first step, with 20 inputs, 6 hidden layers (H1, ..., H6), and dichotomous output neurons. The encoded variables are presented in Table 1. BMI, Waist, Hyper_, Walk_, Sedent_, Veget_, and T2D_Histo denote body mass index, waist circumference, hypertension status, walking time, sedentary status, vegetable consumption, and family history of type 2 diabetes mellitus, respectively.
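To make the scheme in Figure 1 concrete, the following is a rough sketch of a forward pass through a multilayer perceptron with 20 inputs and a single probability output. It reads H1, ..., H6 as six hidden nodes, uses sigmoid activations, and collapses the output to one neuron; all of these are assumptions, since the caption does not specify them. The weights are random placeholders, not fitted values from the study.

```python
import math
import random


def sigmoid(z):
    """Logistic activation, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Inputs -> hidden nodes -> one output interpreted as P(T2DM)."""
    hidden = [
        sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
        for weights, bias in zip(w_hidden, b_hidden)
    ]
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)


random.seed(0)
n_inputs, n_hidden = 20, 6  # sizes taken from the figure caption

# Placeholder weights; a real model would learn these from training data.
w_hidden = [[random.uniform(-0.5, 0.5) for _ in range(n_inputs)] for _ in range(n_hidden)]
b_hidden = [0.0] * n_hidden
w_out = [random.uniform(-0.5, 0.5) for _ in range(n_hidden)]
b_out = 0.0

x = [random.random() for _ in range(n_inputs)]  # hypothetical scaled input vector
p_t2dm = forward(x, w_hidden, b_hidden, w_out, b_out)
predicted_class = 1 if p_t2dm >= 0.5 else 0
print(p_t2dm, predicted_class)
```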
Results of multilayer perceptron neural network modeling
| Models | Risk factors | Data set | Sensitivity (%) | Specificity (%) | AUC | Accuracy (%) |
|---|---|---|---|---|---|---|
| 1 | Age, hypertension, waist circumference, BMI, sedentary lifestyle, smoking, vegetable consumption, family history of T2DM, stress, walking, fruit consumption, and sex | Training | 96.2 | 76.7 | 0.947 | 89.2 |
| | | Test | 93.3 | 82.5 | 0.942 | 89.7 |
| 2 | Age, hypertension, waist circumference, BMI, sedentary lifestyle, smoking, vegetable consumption, family history of T2DM, stress, walking, and fruit consumption | Training | 94.0 | 79.6 | 0.920 | 90.9 |
| | | Test | 92.2 | 75.9 | 0.931 | 86.3 |
| 3 | Age, hypertension, waist circumference, BMI, sedentary lifestyle, smoking, vegetable consumption, family history of T2DM, stress, and walking | Training | 93.2 | 79.3 | 0.911 | 88.6 |
| | | Test | 95.1 | 80.0 | 0.920 | 89.8 |
| 4 | Age, hypertension, waist circumference, BMI, sedentary lifestyle, smoking, vegetable consumption, family history of T2DM, and stress | Training | 95.0 | 78.7 | 0.943 | 91.3 |
| | | Test | 96.1 | 63.6 | 0.945 | 86.3 |
| 5 | Age, hypertension, waist circumference, BMI, smoking, vegetable consumption, family history of T2DM, and stress | Training | 94.1 | 79.6 | 0.953 | 92.9 |
| | | Test | 95.2 | 82.5 | 0.963 | 96.9 |
| 6 | Age, hypertension, waist circumference, BMI, smoking, family history of T2DM, and stress | Training | 93.6 | 66.1 | 0.946 | 84.2 |
| | | Test | 95.2 | 88.9 | 0.953 | 92.8 |
AUC, area under the receiver operating characteristic curve; BMI, body mass index; T2DM, type 2 diabetes mellitus.
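For readers less familiar with the columns above, sensitivity, specificity, and accuracy can all be derived from a 2×2 confusion matrix. The sketch below uses the standard definitions; the counts are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fn, tn, fp):
    """Standard metrics (in %) from true/false positive and negative counts."""
    sensitivity = 100 * tp / (tp + fn)  # diabetic subjects correctly identified
    specificity = 100 * tn / (tn + fp)  # normal subjects correctly identified
    accuracy = 100 * (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical test set: 50 diabetic and 25 normal subjects.
sens, spec, acc = classification_metrics(tp=45, fn=5, tn=20, fp=5)
print(round(sens, 1), round(spec, 1), round(acc, 1))  # 90.0 80.0 86.7
```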
Figure 2. The area under the receiver operating characteristic curve for non-diabetic and diabetic subjects in the test and training groups based on the sixth model (final stage), containing waist circumference, age, body mass index, hypertension, stress, smoking, and family history of type 2 diabetes mellitus.
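The area under the ROC curve can be computed without plotting, as the probability that a randomly chosen diabetic subject receives a higher predicted score than a randomly chosen non-diabetic subject, with ties counted as one half. The sketch below illustrates this pairwise formulation on hypothetical labels and scores; it is not necessarily the procedure the authors' software used.

```python
def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical predictions: 1 = diabetic, 0 = non-diabetic.
labels = [1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.2]
example_auc = auc(labels, scores)
print(round(example_auc, 3))  # 0.833
```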