Abstract
This study discusses the feasibility of using data mining to support antiretroviral therapy (ART) decisions for HIV-positive patients at the University of Gondar Specialized Teaching Hospital, Ethiopia. The Knowledge Discovery in Databases (KDD) methodology, an iterative process in which evaluation measures are progressively improved, was used to prepare the data set. The classifier is a J48 decision tree, the WEKA project's implementation of the C4.5 algorithm, which is the successor to ID3 (Iterative Dichotomiser 3). J48 models were built with and without pruning and evaluated under two test modes: 10-fold cross validation and percentage split. The pruned J48 decision tree with 10-fold cross validation achieved 80.5% prediction accuracy for ART-starter prognosis in clinical settings.
Keywords: ART Starter; Antiretroviral therapy (ART); J48 pruned decision tree; KDD methodology; WEKA 3.7
Year: 2019 PMID: 31902978 PMCID: PMC6936664 DOI: 10.6026/97320630015790
Source DB: PubMed Journal: Bioinformation ISSN: 0973-2063
Summary of Decision tree J48 pruned tree with test mode 10-fold cross validation
| Instances | Number classified | Accuracy in % | Confusion matrix | YES | NO |
| Correctly classified | 807 | 80.54% | YES | 544 | 68 |
| Incorrectly classified | 195 | 19.46% | NO | 127 | 263 |
| Total instances | 1002 | | | | |
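The total of 1002 in the table above follows from how 10-fold cross validation works: the data set is partitioned into ten folds, each fold is held out once as the test set, and every instance is therefore classified exactly once. The sketch below illustrates only that partitioning. The function name `k_fold_indices` is hypothetical, and the sketch uses plain contiguous folds, whereas WEKA also shuffles and stratifies the data first.

```python
def k_fold_indices(n_instances, k=10):
    """Partition indices 0..n_instances-1 into k contiguous folds of near-equal size."""
    base, extra = divmod(n_instances, k)
    folds, start = [], 0
    for i in range(k):
        # The first `extra` folds each take one additional instance.
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


folds = k_fold_indices(1002, k=10)
fold_sizes = [len(fold) for fold in folds]
tested = sorted(i for fold in folds for i in fold)

print(fold_sizes)   # [101, 101, 100, 100, 100, 100, 100, 100, 100, 100]
print(len(tested))  # 1002: each instance is tested exactly once
```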
Template of Confusion Matrix Analysis Method
| Actual ART determinant | Predicted YES | Predicted NO | Total |
| YES | TP | FN | TP + FN |
| NO | FP | TN | FP + TN |
| Total | TP + FP | FN + TN | TP + FN + FP + TN |
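As a minimal sketch of how such counts turn into evaluation measures, the function below computes accuracy, precision, recall, and F-measure for one positive class. It is applied to the counts of the pruned J48 tree under 10-fold cross validation, taking YES as the positive class. The function name `binary_metrics` is hypothetical, and the code assumes that rows of the reported confusion matrices are actual classes and columns are predicted classes, as in WEKA's standard output. Here a false negative means an actual-YES case predicted NO, and a false positive means an actual-NO case predicted YES.

```python
def binary_metrics(tp, fn, fp, tn):
    """Return accuracy, precision, recall and F-measure for the positive class."""
    total = tp + fn + fp + tn
    precision = tp / (tp + fp)  # share of predicted positives that are correct
    recall = tp / (tp + fn)     # share of actual positives that are found
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f_measure": 2 * precision * recall / (precision + recall),
    }


# Pruned J48, 10-fold cross validation, YES as the positive class
# (assumed layout: rows = actual class, columns = predicted class).
metrics_yes = binary_metrics(tp=544, fn=68, fp=127, tn=263)

# Accuracy is (544 + 263) / 1002, the 80.54% reported for this configuration.
print(f"accuracy  = {metrics_yes['accuracy']:.2%}")
print(f"precision = {metrics_yes['precision']:.2%}")
print(f"recall    = {metrics_yes['recall']:.2%}")
print(f"f_measure = {metrics_yes['f_measure']:.2%}")
```

Accuracy and F-measure are unchanged if the matrix is transposed; only precision and recall swap, so the orientation assumption matters for those two values alone.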
Summary of Decision tree J48 pruned tree with test mode Percentage split to 50%
| Instances | Number classified | Accuracy in % | Confusion matrix | YES | NO |
| Correctly classified | 396 | 79.04% | YES | 267 | 37 |
| Incorrectly classified | 105 | 20.96% | NO | 68 | 129 |
| Total instances | 501 | | | | |
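The test-set sizes reported for the percentage-split runs (501, 200, and 341 instances) are consistent with holding out the remainder of a 1002-instance data set after taking 50%, 80%, and 66% for training. The sketch below reproduces that arithmetic. The function name `split_sizes` is hypothetical, and both the rounding rule (training share rounded to the nearest integer) and the use of the same 1002 instances for every split are assumptions.

```python
def split_sizes(n_instances, train_percent):
    """Return (train, test) sizes, rounding the training share to the nearest integer."""
    n_train = int(n_instances * train_percent / 100 + 0.5)
    return n_train, n_instances - n_train


for train_percent in (50, 80, 66):
    n_train, n_test = split_sizes(1002, train_percent)
    print(f"{train_percent}% train: {n_train} training, {n_test} test instances")
# 50% train: 501 training, 501 test instances
# 80% train: 802 training, 200 test instances
# 66% train: 661 training, 341 test instances
```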
Summary of Decision tree J48 unpruned tree with test mode 10-fold cross validation
| Instances | Number classified | Accuracy in % | Confusion matrix | YES | NO |
| Correctly classified | 795 | 79.34% | YES | 522 | 90 |
| Incorrectly classified | 207 | 20.66% | NO | 117 | 273 |
| Total instances | 1002 | | | | |
Summary of Decision tree J48 unpruned tree with test mode Percentage split to 50%
| Instances | Number classified | Accuracy in % | Confusion matrix | YES | NO |
| Correctly classified | 386 | 77.05% | YES | 250 | 54 |
| Incorrectly classified | 115 | 22.95% | NO | 61 | 136 |
| Total instances | 501 | | | | |
Summary of Decision tree J48 pruned tree with test mode Percentage split by 80% to 20%
| Instances | Number classified | Accuracy in % | Confusion matrix | YES | NO |
| Correctly classified | 156 | 78.00% | YES | 103 | 15 |
| Incorrectly classified | 44 | 22.00% | NO | 29 | 53 |
| Total instances | 200 | | | | |
Summary of Decision tree J48 unpruned tree with test mode Percentage split by 80% to 20%
| Instances | Number classified | Accuracy in % | Confusion matrix | YES | NO |
| Correctly classified | 155 | 77.50% | YES | 100 | 18 |
| Incorrectly classified | 45 | 22.50% | NO | 27 | 55 |
| Total instances | 200 | | | | |
Summary result of J48 decision tree algorithm
| Type | Number of leaves | Size of tree | Time taken | Accuracy | F-measure for YES | F-measure for NO |
| Type#1 | 59 | 91 | 0.2 | 80.50% | 84.60% | 73.00% |
| Type#2 | 59 | 91 | 0.03 | 79.00% | 83.60% | 71.10% |
| Type#3 | 166 | 237 | 0.02 | 79.30% | 83.50% | 72.50% |
| Type#4 | 166 | 237 | 0.03 | 77.00% | 81.30% | 70.30% |
| Type#5 | 59 | 91 | 0.48 | 78.00% | 82.40% | 70.70% |
| Type#6 | 166 | 337 | 0.06 | 77.50% | 81.60% | 71.00% |
Summary of JRip rules with test mode 10-fold cross validation
| Instances | Number classified | Accuracy in % | Confusion matrix | YES | NO |
| Correctly classified | 791 | 78.94% | YES | 539 | 73 |
| Incorrectly classified | 211 | 21.06% | NO | 138 | 252 |
| Total instances | 1002 | | | | |
Summary of JRip rules with test mode Percentage split by 66% to 34%
| Instances | Number classified | Accuracy in % | Confusion matrix | YES | NO |
| Correctly classified | 271 | 79.47% | YES | 175 | 26 |
| Incorrectly classified | 70 | 20.53% | NO | 44 | 96 |
| Total instances | 341 | | | | |