Muhammad Noman Sohail, Ren Jiadong, Musa Muhammad Uba, Muhammad Irshad, Wasim Iqbal, Jehangir Arshad, Antony Verghese John.
Abstract
The rising prevalence of diabetes poses a risk worldwide, so diagnosis is important in populations at extreme risk of the disease. In this study, a decision-making classifier (J48) is applied on a data-mining platform (Weka) to measure accuracy, and linear regression is applied to the classification results to forecast the cost/benefit ratio in diabetes mellitus patients, along with prevalence. In total, 108 invasive and non-invasive medical features from 251 patients are considered for assessment; the real-time data were gathered in Pakistan between June 2017 and April 2018. The results indicate that the J48 classifier achieved the best accuracy (99.28%), with an error rate of 0.08%; Kappa statistic, PRC, and MCC of 0.98%; and precision, recall, and F-measure of 0.99%. In addition, the true positive rate is 0.99% and the false positive rate is 0.08%. The regression forecast indicates that blood pressure and glucose level are key features for diabetes. The cost/benefit matrix indicates two predictions for a positive test, with accuracies of 66.68% and 30.60%, and key attributes with a total gain of 118.13%. The study confirms that the proposed prediction is practical for screening diabetes mellitus patients at the initial stage without invasive medical tests and is effective for the early diagnosis of diabetes.
Year: 2019 PMID: 31300715 PMCID: PMC6626127 DOI: 10.1038/s41598-019-46631-9
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Accuracy of machine-learning algorithms applied to diabetes classification.
| Acc % | DMAT | Year | Data | Ref | Author |
|---|---|---|---|---|---|
| 90 | Bayesian NN | 2014 | Artificial data utilized from UCI repositories | | H. Das |
| 71 | PCA with NN | 2013 | | | Jabbar |
| 94 | KNN | 2016 | | | Mangathayaru |
| 89 | ANN | 2015 | | | A. Kumar |
| 76 | Naive Bayes | 2015 | | | A. Lyer |
| 86 | C 4.5 | 2014 | | | F. Sambo |
| 98 | J48 | 2017 | | | O. Barrios |
| 94 | FCM-WEKA | 2017 | | | Poongothai |
| 75 | BLR-Tangra | 2014 | | | S. Bashirᵃ |

ᵃ Acc = accuracy; % = percentage; DMAT = data mining algorithms and techniques; Ref = references.
Figure 1. K-means clustering assessment of positive and negative clusters, divided by the separation line.
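The two-cluster split named in the Figure 1 caption can be illustrated with plain K-means (Lloyd's algorithm). The following is a minimal Python sketch, not the Weka procedure used in the study: the `kmeans` function, the initialisation from the first `k` points, and the `(glucose, blood pressure)` readings are all hypothetical choices made for illustration.

```python
from math import dist

def kmeans(points, k=2, iterations=20):
    """Plain Lloyd's algorithm; centroids start at the first k points."""
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members)
                                     for axis in zip(*members))
    return labels, centroids

# Hypothetical (glucose, blood pressure) readings, NOT the study's data.
readings = [(85, 70), (150, 95), (90, 72), (160, 98), (88, 68), (155, 92)]
labels, centroids = kmeans(readings)
print(labels)     # cluster index assigned to each reading
print(centroids)  # one centroid per cluster
```

With two well-separated groups like these, the low readings and the high readings end up in different clusters; real patient data would need feature scaling and a less naive initialisation.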
Figure 2. Classification assessment of the J48 machine-learning classifier, with the confusion matrix of tested-positive and tested-negative instances.
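The summary statistics quoted in the abstract (accuracy, precision, recall, F-measure, Kappa, MCC) are all functions of a confusion matrix like the one Figure 2 refers to. The sketch below shows the standard formulas for a binary positive/negative split. It is an illustration only: `binary_metrics` is a hypothetical helper, not a Weka function, and the counts are invented rather than taken from the study.

```python
from math import sqrt

def binary_metrics(tp, fn, fp, tn):
    """Standard metrics for the positive class of a 2x2 confusion matrix."""
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)          # also the true positive rate
    f_measure = 2 * precision * recall / (precision + recall)
    fp_rate = fp / (fp + tn)
    # Cohen's kappa: observed agreement corrected for chance agreement.
    expected = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / total ** 2
    kappa = (accuracy - expected) / (1 - expected)
    # Matthews correlation coefficient.
    mcc = (tp * tn - fp * fn) / sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": accuracy, "error_rate": 1 - accuracy,
        "precision": precision, "recall": recall,
        "f_measure": f_measure, "fp_rate": fp_rate,
        "kappa": kappa, "mcc": mcc,
    }

# Hypothetical counts, NOT the study's confusion matrix.
metrics = binary_metrics(tp=90, fn=10, fp=5, tn=95)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

All of these values lie on a 0 to 1 scale (Kappa and MCC can also be negative), so they are usually reported as proportions or multiplied by 100 to give percentages.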
Classified-instance results and J48 pruned tree assessment.
Classified instances

| Clusters | N = 281 | Percent | Classified ratio: Diabetes Patient | Classified ratio: Not Diabetes Patient |
|---|---|---|---|---|
| 0 | 138 | 49 | 227 | 54 |
| 1 | 143 | 51 | | |

J48 pruned tree

| | | | | | |
|---|---|---|---|---|---|
| Positive | >101 | No counts | 191 | 7 | 4ᵇ |
| Negative | ≤101 | ≤100 | 85 | | |
| Positive | >72 | >100 | 3 | | |
| Negative | ≤72 | >100 | 2 | | |

ᵇ N = total number of patients; GLU = glucose level; BDPR = blood pressure level.
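The counts in this table can be cross-checked with a few lines of arithmetic. The Python sketch below uses only numbers that appear in the table; the variable names are illustrative, and grouping the four per-row counts under their Positive/Negative labels is an assumption about how the rows should be read.

```python
# Cluster sizes from the "Classified instances" part of the table.
cluster_sizes = {0: 138, 1: 143}
total = sum(cluster_sizes.values())

# Percentage share of each cluster, rounded to whole percent.
shares = {c: round(100 * n / total) for c, n in cluster_sizes.items()}

# Classified ratio columns.
classified = {"Diabetes Patient": 227, "Not Diabetes Patient": 54}

# Per-row instance counts from the pruned-tree rows (assumed grouping).
leaf_counts = {"Positive": [191, 3], "Negative": [85, 2]}
leaf_total = sum(sum(counts) for counts in leaf_counts.values())

print(total, shares, sum(classified.values()), leaf_total)
```

The cluster sizes, the classified ratio, and the pruned-tree row counts each add up to the same N = 281, and the cluster shares round to the 49 and 51 percent shown.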
Figure 3. Forecast prediction assessment performed on the attributes identified by classification (blood pressure and glucose level).
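A linear-regression forecast of the kind the Figure 3 caption describes can be sketched with ordinary least squares. This is a generic Python illustration, not the study's Weka configuration: `fit_line` is a hypothetical helper, and the period index and glucose values are made-up inputs chosen so the arithmetic is easy to follow.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical series: period index vs. mean glucose reading (NOT study data).
periods = [1, 2, 3, 4, 5]
glucose = [100, 104, 108, 112, 116]

slope, intercept = fit_line(periods, glucose)
forecast = intercept + slope * 6   # extrapolate one period ahead
print(slope, intercept, forecast)
```

The same fit could be repeated with blood pressure as the response; which variables the study actually regressed on is not stated in this record.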
Figure 4. Cost/benefit assessment achieved by the J48 decision-tree classifier, divided into four parts as explained in the text.
Classification results of four decision-making classifiers, comparing the classifier used in this study with a recently published result.
| Algorithms | Acc % | Error rate | Kappa stats | Time taken | Data used | Year | Ref/worked by |
|---|---|---|---|---|---|---|---|
| J48 | 98.38 | | | | Artificial dataset | 2017 | O. Barrios |
| J48 | 99.28 | 0.01 | 0.98 | 0.08 | Real-life | 2018 | This paperᶜ |
| J48 consolidation | 98.93 | 0.01 | 0.97 | 0.15 | | | |
| J48 graft | 98.93 | 0.01 | 0.97 | 0.02 | | | |
| Hoeffding Tree | 91.10 | 0.13 | 0.78 | 0.11 | | | |

ᶜ Acc = accuracy; % = percentage; Ref = reference.
Figure 5. Assessment methodology framework, in six steps as explained in the main text.
Figure 6. Diabetes data collection flow, in two parts as described in the text.