Musa Uba Muhammad, Ren Jiadong, Noman Sohail Muhammad, Bilal Nawaz.
Abstract
Accurate classification of diabetes mellitus (DBM) allows for adequate treatment and management of the disease, particularly in developing countries like Nigeria. This study proposes data mining techniques for classifying diagnosed diabetes cases and identifying their prevalence, stratified by age, gender, diabetic condition and residential area, in the northwestern states of Nigeria, based on real-life data from government-owned hospitals in the region. A K-means assessment was used to cluster the instances; after 12 iterations, the 3022 instances were classified as 2662 (88.09%) non-insulin dependent (NID), 176 (5.82%) insulin-dependent (IND) and 184 (6.09%) gestational diabetes (GTD). Of the 3022 diagnosed diabetes cases, 1380 were males (45.66%) and 1642 were females (54.33%). Prevalence was higher in females than in males, and higher in cities and towns than in villages (36.5%, 34.2%, and 29.3%, respectively). Among the age groups, prevalence was highest in the 50-69 years group, which constituted 43.9% of the total diagnosed cases. Furthermore, the NID condition had the highest prevalence of cases (88.09%). These are the first findings on stratified prevalence in the region, and the figures are of utmost significance to healthcare authorities, policymakers, clinicians, and non-governmental organizations for the proper planning and management of diabetes mellitus.
Keywords: K-means; Nigeria; age; classification; diabetes mellitus; diagnosed; gender; prevalence; real-life data
Year: 2019 PMID: 31652912 PMCID: PMC6928643 DOI: 10.3390/ijerph16214089
Source DB: PubMed Journal: Int J Environ Res Public Health ISSN: 1660-4601 Impact factor: 3.390
Figure 1. Data collection flow.
Figure 2. Study analytical platform. Legend: DS1QN: Data Source One Questionnaire, DS2VI: Data Source Two Verbal Interview, DS3HR: Data Source Three Hospital Records, EXTR: Extraction, INTG: Integration, STATS: Statistics, R: R-Programming Software, WEKA: Waikato Environment for Knowledge Analysis, CLUST: Clustering, CLSFN: Classification, EVLTN: Evaluation, ASMT: Assessment, CL-IMP: Clinical-Implications.
Diabetes conditions and the number of patients classified in the assessment.
| Diabetic Condition | Patients (N) | Ratio | Cluster by Diabetic Condition |
|---|---|---|---|
| NID | 2662 | 88.09% | 2662 |
| IND | 176 | 5.82% | 176 |
| GTD | 184 | 6.09% | 184 |

Age: 12 ≤ x ≤ 85. Missing values: 0.
N = number of diabetes patients, x = patient's age, NID = non-insulin dependent, IND = insulin dependent, GTD = gestational diabetes.
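The ratios in the table follow directly from the patient counts. As a quick check, the sketch below recomputes each condition's share of all cases; the variable names are illustrative and not taken from the study.

```python
# Patient counts per diabetic condition, as reported in the table.
counts = {"NID": 2662, "IND": 176, "GTD": 184}

total = sum(counts.values())

# Share of each condition, as a percentage rounded to two decimals.
ratios = {cond: round(100 * n / total, 2) for cond, n in counts.items()}

for cond, pct in ratios.items():
    print(f"{cond}: {counts[cond]} of {total} cases ({pct}%)")
```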
Figure 3. K-means cluster assessments for the patients' diabetic conditions.
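To illustrate the kind of clustering the study reports, here is a minimal sketch of the K-means (Lloyd's) iteration in plain Python. It is not the authors' pipeline: the one-dimensional `ages` values, the starting centroids, and the function name `kmeans_1d` are all hypothetical, chosen only to show how the assignment and update steps alternate until the centroids stop moving.

```python
def kmeans_1d(points, centroids, max_iter=100):
    """Lloyd's algorithm on 1-D data; returns (centroids, labels, iterations)."""
    centroids = list(centroids)
    labels = []
    iteration = 0
    for iteration in range(1, max_iter + 1):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: abs(p - centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(sum(members) / len(members))
            else:
                new_centroids.append(centroids[k])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return centroids, labels, iteration


# Hypothetical ages (NOT study data) and hypothetical starting centroids.
ages = [14, 16, 19, 41, 45, 48, 52, 72, 74, 78]
centroids, labels, n_iter = kmeans_1d(ages, centroids=[10.0, 50.0, 90.0])
print(centroids, labels, n_iter)
```

The number of iterations needed depends on the data and the starting centroids, so the toy run here says nothing about the 12 iterations reported for the real dataset.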
Figure 4. Bar flow for the patients' diabetic conditions.
Figure 5. Patients' age distribution based on diabetic conditions.
Diagnosed diabetes cases stratified by age, gender, residential area and diabetic conditions.
| Age (Years) | Residential Area | M | M % | F | F % | NID | NID % | IND | IND % | GTD | GTD % |
|---|---|---|---|---|---|---|---|---|---|---|---|
| <20 | City | 8 | 0.26 | 4 | 0.13 | 13 | 0.43 | 3 | 0.10 | 0 | 0 |
| <20 | Town | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| <20 | Village | 4 | 0.13 | 4 | 0.13 | 3 | 0.10 | 1 | 0.03 | 0 | 0 |
| <20 | Total | 12 | 0.39 | 8 | 0.26 | 16 | 0.53 | 4 | 0.13 | 0 | 0 |
| 20–39 | City | 101 | 3.34 | 205 | 6.78 | 295 | 9.76 | 1 | 0.03 | 10 | 0.33 |
| 20–39 | Town | 73 | 2.42 | 155 | 5.13 | 220 | 7.28 | 0 | 0 | 8 | 0.26 |
| 20–39 | Village | 36 | 1.19 | 106 | 3.51 | 128 | 4.24 | 2 | 0.07 | 12 | 0.39 |
| 20–39 | Total | 210 | 6.95 | 466 | 15.42 | 643 | 21.28 | 3 | 0.10 | 30 | 0.99 |
| 40–59 | City | 185 | 6.12 | 339 | 11.22 | 465 | 15.39 | 20 | 0.66 | 39 | 1.29 |
| 40–59 | Town | 188 | 6.22 | 229 | 7.58 | 358 | 11.85 | 40 | 1.32 | 19 | 0.62 |
| 40–59 | Village | 200 | 6.62 | 187 | 6.19 | 327 | 10.82 | 19 | 0.62 | 41 | 1.36 |
| 40–59 | Total | 573 | 18.96 | 755 | 24.98 | 1150 | 38.05 | 79 | 2.61 | 99 | 3.28 |
| 60–79 | City | 165 | 5.46 | 95 | 3.14 | 257 | 8.50 | 1 | 0.03 | 2 | 0.07 |
| 60–79 | Town | 195 | 6.45 | 156 | 5.16 | 287 | 9.50 | 45 | 1.49 | 19 | 0.62 |
| 60–79 | Village | 190 | 6.29 | 133 | 4.40 | 249 | 8.24 | 44 | 1.46 | 30 | 0.99 |
| 60–79 | Total | 550 | 18.20 | 384 | 12.71 | 793 | 26.24 | 90 | 2.98 | 51 | 1.69 |
| 80+ | City | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 80+ | Town | 18 | 0.60 | 20 | 0.66 | 38 | 1.26 | 0 | 0 | 0 | 0 |
| 80+ | Village | 17 | 0.56 | 9 | 0.30 | 22 | 0.73 | 4 | 0.13 | 0 | 0 |
| 80+ | Total | 35 | 1.2 | 29 | 0.96 | 60 | 1.99 | 4 | 0.13 | 0 | 0 |

M = male, F = female; all percentages are shares of the 3022 diagnosed cases.
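Adding the male and female totals of each age group gives that group's share of all diagnosed cases. The sketch below does this arithmetic from the "Total" rows of the table; the dictionary layout and variable names are illustrative only.

```python
# (male, female) totals per age group, copied from the "Total" rows.
# Age ranges use ASCII hyphens here purely for convenience.
totals = {
    "<20": (12, 8),
    "20-39": (210, 466),
    "40-59": (573, 755),
    "60-79": (550, 384),
    "80+": (35, 29),
}

grand_total = sum(m + f for m, f in totals.values())

# Each age group's share of all diagnosed cases, in percent.
shares = {
    age: round(100 * (m + f) / grand_total, 2)
    for age, (m, f) in totals.items()
}

largest = max(shares, key=shares.get)
print(f"Largest group: {largest} ({shares[largest]}% of {grand_total} cases)")
```

On these figures the 40–59 group is the largest, at about 43.9% of cases, which is the share the abstract quotes for its highest-prevalence age group.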
The summary of the diabetes cases by age, gender, residing place and diabetic conditions.
| Age (Year) | Residing Place | Total (M and F) | Total % | NID | NID % | IND | IND % | GTD | GTD % |
|---|---|---|---|---|---|---|---|---|---|
| All | City | 1102 | 36.47 | 1030 | 34.08 | 23 | 0.76 | 53 | 1.75 |
| All | Town | 1034 | 34.22 | 903 | 29.88 | 84 | 2.78 | 47 | 1.56 |
| All | Village | 886 | 29.32 | 729 | 24.12 | 69 | 2.28 | 84 | 2.78 |
| All | Total | 3022 | 100.01 | 2662 | 88.09 | 176 | 5.82 | 184 | 6.09 |
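The 100.01% total is what one gets when each residence share is rounded to two decimals before summing. A short sketch of that arithmetic, with illustrative variable names, is:

```python
# Diagnosed cases per residing place, from the summary table.
cases = {"City": 1102, "Town": 1034, "Village": 886}

total = sum(cases.values())

# Round each share to two decimals first, as the table does.
pct = {place: round(100 * n / total, 2) for place, n in cases.items()}

# Summing the already-rounded shares can overshoot 100 slightly.
pct_sum = round(sum(pct.values()), 2)

print(pct, pct_sum)
```

Summing the unrounded shares instead would give 100%, so the extra 0.01 reflects rounding rather than a counting discrepancy.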
| a | b | Classified as: |
|---|---|---|
| 512 | 2 | a = correctly |
| 20 | 2510 | b = incorrectly |
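A two-class matrix like this one is often summarised by an overall accuracy figure. The sketch below assumes the common layout in which rows are actual classes and columns are predicted classes, so the diagonal holds the agreeing instances; the source does not state this layout, and the result is therefore illustrative rather than a reported finding.

```python
# Matrix values as printed above; the row/column interpretation is an ASSUMPTION:
# rows = actual class (a, b), columns = predicted class (a, b).
matrix = [
    [512, 2],
    [20, 2510],
]

# Instances on the diagonal are those where actual and predicted classes agree.
correct = sum(matrix[i][i] for i in range(len(matrix)))
total = sum(sum(row) for row in matrix)

accuracy = correct / total
print(f"{correct} of {total} instances on the diagonal ({accuracy:.2%})")
```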