Satish Kumar David, Mohamed Rafiullah, Khalid Siddiqui.
Abstract
Background: Diabetic kidney disease (DKD), a complication of diabetes, leads to progressive loss of kidney function. Timely intervention is known to improve outcomes, so screening patients to identify high-risk populations is important. Machine learning classification techniques can be applied to patient datasets to build predictive models that identify high-risk patients. Objective: This study aims to identify a suitable classification technique for predicting DKD by applying different classification techniques to a DKD dataset and comparing their performance using the WEKA machine learning software.
Year: 2022 PMID: 35399848 PMCID: PMC8993553 DOI: 10.1155/2022/7378307
Source DB: PubMed Journal: J Healthc Eng ISSN: 2040-2295 Impact factor: 2.682
Figure 1. Risk factors affecting diabetic kidney disease.
Figure 2. Block diagram of the proposed research.
Figure 3. Schematic illustration of the methodology used for identifying the best performing classification technique.
Figure 4. WEKA-Explorer window.
Figure 5. Classifier IBK result.
Figure 6. Classifier random tree result.
Figure 7. Classifier random forest result.
Figure 8. Classifier AdaBoostM1 result.
Comparison of different classifiers applied to the DKD dataset.
| Classifier | Execution time (seconds) | Accuracy (%) | Correctly classified instances | Incorrectly classified instances |
|---|---|---|---|---|
| IBK | 0 | 93.6585 | 384 | 26 |
| Random tree | 0.01 | 93.6585 | 384 | 26 |
| Random forest | 0.28 | 93.4146 | 383 | 27 |
| Multilayer perceptron | 8.3 | 93.1707 | 382 | 28 |
| J48 | 0.13 | 89.7561 | 368 | 42 |
| Hoeffding tree | 0.04 | 86.0976 | 353 | 57 |
| REP tree | 0.08 | 85.122 | 349 | 61 |
| Naïve Bayes | 0.01 | 80.9756 | 332 | 78 |
| AdaBoostM1 | 0.11 | 79.0244 | 324 | 86 |
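The accuracy column can be recomputed from the two instance-count columns as correct / (correct + incorrect). The following is a minimal sketch of that arithmetic, not part of the original study; the function name `accuracy_percent` is a hypothetical helper, and the counts are copied from the table above.

```python
# (classifier, correctly classified, incorrectly classified), copied from the table above.
results = [
    ("IBK", 384, 26),
    ("Random tree", 384, 26),
    ("Random forest", 383, 27),
    ("Multilayer perceptron", 382, 28),
    ("J48", 368, 42),
    ("Hoeffding tree", 353, 57),
    ("REP tree", 349, 61),
    ("Naive Bayes", 332, 78),
    ("AdaBoostM1", 324, 86),
]


def accuracy_percent(correct, incorrect):
    """Hypothetical helper: percentage of instances classified correctly."""
    return 100.0 * correct / (correct + incorrect)


for name, correct, incorrect in results:
    print(f"{name}: {accuracy_percent(correct, incorrect):.4f}%")
```

Every row sums to 410 instances, which matches the dataset size reported for the study.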
Classification results from WEKA.
| Classifier | Kappa statistics (K) | Mean absolute error (MAE) | Root mean squared error (RMSE) |
|---|---|---|---|
| IBK | 0.8731 | 0.1096 | 0.2496 |
| Random tree | 0.8731 | 0.1093 | 0.2497 |
| Random forest | 0.8681 | 0.1267 | 0.2542 |
| Multilayer perceptron | 0.8633 | 0.1117 | 0.2513 |
| J48 | 0.7947 | 0.1595 | 0.3074 |
| Hoeffding tree | 0.7223 | 0.1389 | 0.3696 |
| REP tree | 0.7025 | 0.2194 | 0.3565 |
| Naïve Bayes | 0.6199 | 0.1899 | 0.4261 |
| AdaBoostM1 | 0.5827 | 0.3246 | 0.4009 |
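The kappa statistic measures agreement between predicted and actual classes beyond what chance alone would give: K = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected from the row and column totals. The sketch below is an illustration rather than WEKA's own code; `cohen_kappa` is a hypothetical helper, and the IBK counts are taken from the confusion matrix reported in this record, assuming rows are actual classes and columns are predicted classes.

```python
def cohen_kappa(matrix):
    """Cohen's kappa for a square confusion matrix.

    matrix[i][j] is the number of instances of actual class i
    that were predicted as class j.
    """
    n = sum(sum(row) for row in matrix)
    # Observed agreement: fraction of instances on the diagonal.
    observed = sum(matrix[i][i] for i in range(len(matrix))) / n
    # Chance agreement: product of matching row and column totals.
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    return (observed - expected) / (1 - expected)


# IBK counts from the confusion matrix in this record:
# actual DKD -> (186 predicted DKD, 11 predicted not DKD)
# actual not DKD -> (15 predicted DKD, 198 predicted not DKD)
ibk = [[186, 11], [15, 198]]
print(round(cohen_kappa(ibk), 4))  # 0.8731
```

The result agrees with the kappa value listed for IBK in the table above.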
Confusion matrix of different classifiers.
| Classifier | Actual state (clinical definition; 197 DKD and 213 not DKD) | Predicted DKD | Predicted not DKD |
|---|---|---|---|
| IBK | DKD | 186 | 11 |
| | Not DKD | 15 | 198 |
| Random tree | DKD | 186 | 11 |
| | Not DKD | 15 | 198 |
| Random forest | DKD | 184 | 13 |
| | Not DKD | 14 | 199 |
| Multilayer perceptron | DKD | 184 | 13 |
| | Not DKD | 15 | 198 |
| J48 | DKD | 174 | 23 |
| | Not DKD | 19 | 194 |
| Hoeffding tree | DKD | 36 | 177 |
| | Not DKD | 81 | 116 |
| REP tree | DKD | 171 | 26 |
| | Not DKD | 35 | 178 |
| Naïve Bayes | DKD | 165 | 32 |
| | Not DKD | 46 | 167 |
| AdaBoostM1 | DKD | 172 | 25 |
| | Not DKD | 61 | 152 |
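For a screening task, overall accuracy hides how errors split between missed DKD cases and false alarms. The sketch below derives sensitivity and specificity from the confusion matrix counts. These two metrics are not reported in this record, so the printed values are derived here for illustration only. It assumes each classifier's first row is actual DKD, its second row is actual not DKD, and the two numeric columns are predicted DKD and predicted not DKD; `sensitivity_specificity` is a hypothetical helper.

```python
def sensitivity_specificity(tp, fn, fp, tn):
    """Hypothetical helper.

    tp: actual DKD predicted as DKD
    fn: actual DKD predicted as not DKD
    fp: actual not DKD predicted as DKD
    tn: actual not DKD predicted as not DKD
    """
    sensitivity = tp / (tp + fn)  # share of DKD patients detected
    specificity = tn / (tn + fp)  # share of non-DKD patients cleared
    return sensitivity, specificity


# (tp, fn, fp, tn) read from the confusion matrix above.
matrices = {
    "IBK": (186, 11, 15, 198),
    "Random forest": (184, 13, 14, 199),
}

for name, counts in matrices.items():
    sens, spec = sensitivity_specificity(*counts)
    print(f"{name}: sensitivity={sens:.4f}, specificity={spec:.4f}")
```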
Comparison of recent works of predictive models for diabetic kidney disease or diabetic nephropathy.
| Source | Dataset | Model | Complication | Accuracy (%) |
|---|---|---|---|---|
| Sobrinho et al., 2020 | 114 instances and 8 attributes | J48 decision tree | DKD | 95 |
| Senan et al., 2021 | 400 instances and 24 attributes | Recursive feature elimination to choose attributes followed by random forest classification | DKD | 100 |
| Almansour et al., 2019 | 400 instances and 24 attributes | Artificial neural network | CKD | 99.7 |
| Khanam and Foo, 2021 | 768 instances and 9 attributes | Neural network | Diabetes | 88.6 |
| Our study | 410 instances and 18 attributes | IBK and random tree | DKD | 93.6585 |