Dongmei Pei, Yang Gong, Hong Kang, Chengpu Zhang, Qiyong Guo.
Abstract
BACKGROUND: Prediction or early diagnosis of diabetes is crucial for populations at high risk of diabetes.
Keywords: Data mining; Diabetes; Screening
Year: 2019 PMID: 30866905 PMCID: PMC6416888 DOI: 10.1186/s12911-019-0790-3
Source DB: PubMed Journal: BMC Med Inform Decis Mak ISSN: 1472-6947 Impact factor: 2.796
Fig. 1 Flow chart of records that were excluded from the physical examination database of Shengjing Hospital of China Medical University (January–April, 2017)
Characteristics of variables in diabetes and normal groups
| Variable | Possible values | Diabetes, n (%) | Normal, n (%) | P value (χ2 test) | χ2 statistic |
|---|---|---|---|---|---|
| Age | 20–34 years old | 36 (5.1%) | 1718 (49.1%) | < 0.001 | 269.33 |
| | 35–49 years old | 207 (29.2%) | 1246 (35.7%) | | |
| | 50–65 years old | 466 (65.7%) | 532 (15.2%) | | |
| Gender | Male | 348 (49.1%) | 1123 (32.1%) | < 0.001 | 16.25 |
| | Female | 361 (50.9%) | 2373 (67.9%) | | |
| Body mass index | < 25 | 250 (35.3%) | 2806 (80.3%) | < 0.001 | 18.87 |
| | ≥ 25 | 459 (64.7%) | 690 (19.7%) | | |
| Hypertension | Yes | 221 (31.2%) | 755 (21.6%) | < 0.001 | 15.22 |
| | Non-hypertension | 488 (68.8%) | 2741 (78.4%) | | |
| Salty food preference | No | 384 (54.2%) | 2598 (74.3%) | < 0.001 | 9.33 |
| | Yes | 325 (45.8%) | 898 (25.7%) | | |
| History of cardiovascular disease or stroke | No | 627 (88.4%) | 3190 (91.2%) | 0.018 | 122.25 |
| | Yes | 82 (11.6%) | 306 (8.8%) | | |
| Family history of diabetes | No | 335 (47.2%) | 3133 (89.6%) | < 0.001 | 154.21 |
| | Yes | 374 (52.8%) | 363 (10.4%) | | |
| Physical activity | Less | 542 (76.4%) | 2043 (58.4%) | < 0.001 | 33.68 |
| | More | 167 (23.6%) | 1453 (41.6%) | | |
| Work stress | Low | 129 (18.2%) | 1054 (30.2%) | < 0.001 | 81.54 |
| | Moderate | 353 (49.8%) | 1993 (57.0%) | | |
| | High | 227 (32.0%) | 449 (12.8%) | | |
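
As a rough illustration of how a χ2 test on one of these variables can be computed, the sketch below applies a plain Pearson chi-square (no continuity correction) to the gender counts from the table. The function name `pearson_chi_square` is hypothetical, and because the exact test procedure used in the study is not stated here, the result is not expected to reproduce the statistic reported in the table.

```python
def pearson_chi_square(table):
    """Return (statistic, degrees of freedom) for an r x c table of counts.

    Plain Pearson chi-square without continuity correction; this is an
    assumed procedure, not necessarily the one used in the study.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of row and column variables.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Gender counts from the table: rows are Male/Female, columns are Diabetes/Normal.
gender_counts = [[348, 1123], [361, 2373]]
chi2, dof = pearson_chi_square(gender_counts)
print(f"chi-square = {chi2:.2f}, degrees of freedom = {dof}")
```

The same function accepts tables with more than two rows, such as the three age groups or the three work stress levels.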
The results of classification algorithms
| Model | Accuracy | Precision | Recall | F-measure | AUC |
|---|---|---|---|---|---|
| AdaBoostM1 | 0.9127 | 0.908 | 0.913 | 0.906 | 0.933 |
| J48 | 0.9503 | 0.950 | 0.950 | 0.948 | 0.964 |
| SMO | 0.9078 | 0.903 | 0.908 | 0.900 | 0.763 |
| Naïve Bayes | 0.8934 | 0.886 | 0.893 | 0.888 | 0.922 |
| Bayes Net | 0.8878 | 0.881 | 0.888 | 0.883 | 0.924 |
AUC: area under the receiver operating characteristic (ROC) curve
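
For readers unfamiliar with these metrics, the following minimal sketch shows how accuracy, precision, recall, and AUC are commonly defined for a binary classifier. All counts, labels, and scores are made up for illustration, the function names are hypothetical, and the averaging scheme used to produce the values in the table is not stated here, so this is not a reproduction of the reported results.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, and recall from binary confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)   # share of predicted positives that are correct
    recall = tp / (tp + fn)      # share of actual positives that are found
    return accuracy, precision, recall


def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Hypothetical confusion-matrix counts (not taken from the study).
accuracy, precision, recall = binary_metrics(tp=40, fp=10, fn=20, tn=130)

# Hypothetical labels (1 = diabetic, 0 = normal) and classifier scores.
auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1])

print(accuracy, precision, round(recall, 3), auc)
```

The pairwise formulation of AUC is quadratic in the number of records; it is used here only because it makes the definition explicit.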
Fig. 2 ROC curve of all algorithms
Fig. 3 Decision tree of diabetes classifiers. The sample size is given as the number in parentheses at each node
Nineteen if-then rules extracted from the decision tree in Fig. 3
| Rule 1: IF age ≤ 49, without a family history of diabetes, BMI ≤ 25, THEN patient is normal (1457/1466 or 99%) | |
| Rule 2: IF age ≤ 34, without a family history of diabetes, BMI > 25, prefers salty food, THEN patient is normal (136/143 or 95%) | |
| Rule 3: IF 35 < age ≤ 49, without a family history of diabetes, BMI > 25, prefers salty food, without physical activity, THEN patient is diabetic (40/44 or 91%) | |
| Rule 4: IF 35 < age ≤ 49, without a family history of diabetes, BMI > 25, prefers salty food, with physical activity, without history of cardiovascular disease or stroke, THEN patient is normal (41/44 or 93%) | |
| Rule 5: IF 35 < age ≤ 49, without a family history of diabetes, BMI > 25, prefers salty food, with physical activity, with history of cardiovascular disease or stroke, THEN patient is diabetic (3/4 or 75%) | |
| Rule 6: IF age ≤ 49, without a family history of diabetes, BMI > 25, without preference for salty food, THEN patient is normal (265/272 or 97%) | |
| Rule 7: IF age ≤ 49, with a family history of diabetes, BMI ≤ 25, THEN patient is normal (215/231 or 93%) | |
| Rule 8: IF age ≤ 49, with a family history of diabetes, BMI > 25, THEN patient is diabetic (98/101 or 97%) | |
| Rule 9: IF age > 49, with work stress high, without a family history of diabetes, BMI ≤ 25, THEN patient is normal (15/16 or 94%) | |
| Rule 10: IF age > 49, with work stress high, without a family history of diabetes, BMI > 25, THEN patient is diabetic (9/11 or 82%) | |
| Rule 11: IF age > 49, with work stress high, with a family history of diabetes, THEN patient is diabetic (85/85 or 100%) | |
| Rule 12: IF age > 49, with work stress low or moderate, BMI > 25, without a family history of diabetes, prefers salty food, THEN patient is diabetic (45/53 or 85%) | |
| Rule 13: IF age > 49, with work stress low or moderate, BMI > 25, without a family history of diabetes, without preference for salty food, THEN patient is normal (72/88 or 82%) | |
| Rule 14: IF age > 49, without work stress high, BMI > 25, with a family history of diabetes, THEN patient is diabetic (51/51 or 100%) | |
| Rule 15: IF age > 49, with work stress low or moderate, BMI ≤ 25, prefers salty food, with hypertension, with work stress, THEN patient is diabetic (47/59 or 80%) | |
| Rule 16: IF age > 49, with work stress low or moderate, BMI ≤ 25, prefers salty food, without hypertension, gender male, with history of cardiovascular disease or stroke, THEN patient is diabetic (23/28 or 82%) | |
| Rule 17: IF age > 49, with work stress low or moderate, BMI ≤ 25, prefers salty food, without hypertension, gender male, without history of cardiovascular disease or stroke, THEN patient is normal (6/7 or 86%) | |
| Rule 18: IF age > 49, with work stress low or moderate, BMI ≤ 25, prefers salty food, without hypertension, gender female, THEN patient is normal (78/103 or 76%) | |
| Rule 19: IF age > 49, with work stress low or moderate, BMI ≤ 25, without preference for salty food, THEN patient is normal (216/246 or 88%) |
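
To show how such if-then rules can be applied to a record, the sketch below encodes Rules 1 to 8 (the age ≤ 49 branch) as a small Python function. The `Record` fields and the function name are hypothetical, the age groups are taken from the characteristics table, and BMI is reduced to a single boolean because the rules do not say how a value of exactly 25 should be handled in practice. Rules 9 to 19 are deliberately left out to keep the example short.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    # Field names are illustrative assumptions, not from the source.
    age_group: str            # "20-34", "35-49" or "50-65"
    family_history: bool      # family history of diabetes
    bmi_over_25: bool
    prefers_salty: bool
    physically_active: bool
    cvd_or_stroke: bool       # history of cardiovascular disease or stroke


def classify_under_50(r: Record) -> Optional[str]:
    """Apply Rules 1-8; return None for records outside the age <= 49 branch."""
    if r.age_group not in ("20-34", "35-49"):
        return None  # Rules 9-19 are not encoded in this sketch
    if r.family_history:
        return "diabetic" if r.bmi_over_25 else "normal"   # Rules 8 and 7
    if not r.bmi_over_25:
        return "normal"                                     # Rule 1
    if not r.prefers_salty:
        return "normal"                                     # Rule 6
    if r.age_group == "20-34":
        return "normal"                                     # Rule 2
    if not r.physically_active:
        return "diabetic"                                   # Rule 3
    return "diabetic" if r.cvd_or_stroke else "normal"      # Rules 5 and 4


example = Record("35-49", False, True, True, False, False)
print(classify_under_50(example))  # matches Rule 3
```

Ordering the checks from the most general rule to the most specific mirrors the nesting of the decision tree, so each record matches exactly one rule.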