Xiaohua Li, Jusheng Zhang, Fatemeh Safara.
Abstract
Artificial intelligence is a promising and valuable tool for early disease recognition and for supporting patient condition monitoring. It can increase the reliability of treatment and decision making through the development of useful systems and algorithms. Healthcare workers, especially nurses and physicians, have been overworked because of a massive and unexpected increase in the number of patients during the coronavirus pandemic. In such situations, artificial intelligence techniques could be used to diagnose patients with life-threatening illnesses. In particular, diseases that increase the risk of hospitalization and death in coronavirus patients, such as high blood pressure, heart disease and diabetes, should be diagnosed at an early stage. This article focuses on diagnosing diabetes through data mining techniques. If diabetes can be diagnosed in its early stages, patients can be directed to stay home and care for their health, which would reduce their risk of being infected with the coronavirus. The proposed method has three steps: preprocessing, feature selection and classification. Several combinations of the harmony search algorithm, the genetic algorithm, and the particle swarm optimization algorithm are examined with K-means for feature selection. These combinations have not been examined before for diabetes diagnosis applications. K-nearest neighbor is used for classification of the diabetes dataset. Sensitivity, specificity, and accuracy are measured to evaluate the results. The results indicate that the proposed method, with an accuracy of 91.65%, outperformed the earlier methods examined in this article.
Keywords: Artificial intelligence; Coronavirus disease pandemic; Diabetes diagnosis application; Genetic algorithm; Harmony search algorithm; K-means; Particle swarm optimization
Year: 2021 PMID: 33814965 PMCID: PMC7997791 DOI: 10.1007/s11063-021-10491-0
Source DB: PubMed Journal: Neural Process Lett ISSN: 1370-4621 Impact factor: 2.565
Definitions of the mathematical symbols used in Eqs. (1) to (9)
| Symbol | Description |
|---|---|
| $r_{accept}$ | Accepting rate |
| $P_{pitch}$ | Pitch adjustment |
| $b_{range}$ | Pitch bandwidth |
| $r_{pa}$ | Adjusting rate |
| $x_{old}$ | Current pitch in the harmony search algorithm |
| $x_{new}$ | Pitch after adjusting |
| $\varepsilon$ | A random number between 0 and 1 |
| $P_{random}$ | The third component, used to produce more variations of a solution |
| $x_{id}$ | The current position of a particle |
| $P_{id}$ | Personal best position (pbest) of a particle |
| $v_{id}$ | Velocity of a particle |
| $P_{gd}$ | Global best position (gbest) of the group |
| $W$ | Inertia factor |
| $c_1$ | Relative influence of the cognitive component |
| $c_2$ | Relative influence of the social component |
| $r_1, r_2$ | Random numbers, uniformly distributed between 0 and 1, used to keep the population spread out |
| $w_{max}$ | Initial weight |
| $w_{min}$ | Final weight |
| $iter_{max}$ | Maximum iteration number |
| $Iter$ | Current iteration number |
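
The symbols above can be tied together with a short sketch. Eqs. (1) to (9) themselves are not reproduced in this record, so the code below assumes the widely used textbook forms of the particle swarm velocity update, the linearly decreasing inertia weight, and the harmony search pitch adjustment; the authors' exact formulas may differ. The function names, the default values of `c1` and `c2`, and the mapping of the random number from [0, 1) to a symmetric step are illustrative assumptions, not taken from the paper.

```python
import random


def inertia_weight(w_max, w_min, iteration, iter_max):
    """Linearly decrease the inertia factor from w_max (first iteration)
    to w_min (last iteration). Assumed textbook schedule."""
    return w_max - (w_max - w_min) * iteration / iter_max


def pso_step(x, v, p_best, g_best, w, c1=2.0, c2=2.0, rng=random):
    """One velocity and position update for a single particle.

    x, v     : current position and velocity (lists of floats)
    p_best   : the particle's own best position (pbest)
    g_best   : the best position found by the group (gbest)
    w        : inertia factor
    c1, c2   : cognitive and social coefficients (assumed defaults)
    """
    new_x, new_v = [], []
    for x_id, v_id, p_id, p_gd in zip(x, v, p_best, g_best):
        r1, r2 = rng.random(), rng.random()  # fresh random numbers in [0, 1)
        v_next = w * v_id + c1 * r1 * (p_id - x_id) + c2 * r2 * (p_gd - x_id)
        new_v.append(v_next)
        new_x.append(x_id + v_next)
    return new_x, new_v


def pitch_adjust(x_old, b_range, rng=random):
    """Harmony search pitch adjustment.

    Assumption: the random number in [0, 1) is mapped to [-1, 1) so the
    pitch can move in either direction within the bandwidth b_range.
    """
    eps = rng.random()
    return x_old + b_range * (2.0 * eps - 1.0)


if __name__ == "__main__":
    w = inertia_weight(w_max=0.9, w_min=0.4, iteration=10, iter_max=100)
    x, v = pso_step([0.2, 0.8], [0.0, 0.0], [0.5, 0.5], [1.0, 0.0], w)
    print("inertia:", w, "position:", x, "velocity:", v)
    print("adjusted pitch:", pitch_adjust(0.5, 0.1))
```

A particle that already sits on both its personal best and the global best with zero velocity does not move under this update, which is a quick sanity check on the signs of the two attraction terms.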
Fig. 1 Decreasing the fitness function by the GA-Kmeans hybrid
Fig. 2 Decreasing the fitness function by the GA-PSO-Kmeans hybrid
Fig. 3 Decreasing the fitness function by the HR-Kmeans hybrid
Description of PIMA Indian diabetes records [11]
| No. | Feature name | Feature description |
|---|---|---|
| 1 | Pregnancies | Number of times pregnant |
| 2 | Glucose | Plasma glucose concentration at 2 h in an oral glucose tolerance test (mg/dl) |
| 3 | BloodPressure | Diastolic blood pressure (mm Hg) |
| 4 | SkinThickness | Triceps skin fold thickness (mm) |
| 5 | Insulin | 2-h serum insulin (μU/ml) |
| 6 | BMI | Body mass index (weight in kg/(height in m)^2) |
| 7 | DiabetesPedigreeFunction | Diabetes pedigree function |
| 8 | Age | Age (years) |
| 9 | Class label | Class variable (0 or 1) |
The results of the proposed hybrid methods and three standard classification methods on the PIMA Indian diabetes dataset
| Feature selection | Classifier | Sensitivity (%) | Specificity (%) | Accuracy (%) |
|---|---|---|---|---|
| – | SVM | 76.60 | 42.36 | 82.85 |
| – | DT | 81.31 | 75.33 | 84.30 |
| – | KNN | 88.27 | 93.13 | 86.63 |
| GA | KNN | 89.01 | 85.09 | 88.02 |
| PSO | KNN | 87.22 | 85.09 | 87.22 |
| HR | KNN | 90.15 | 88.02 | 90.55 |
| GA-Kmeans | KNN | 83.73 | 50.00 | 88.02 |
| GA-PSO-Kmeans | KNN | 86.65 | 75.33 | 89.64 |
| HR-Kmeans | KNN | 91.11 | 50.00 | 91.65 |
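
The three measures in the table follow the standard confusion-matrix definitions. As a minimal sketch, the function below computes them as percentages from the four counts; the function name and the example counts are made up for illustration and are not results from the paper.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) as percentages.

    tp, fn : diabetic cases classified correctly / incorrectly
    tn, fp : non-diabetic cases classified correctly / incorrectly
    """
    sensitivity = 100.0 * tp / (tp + fn)   # true positive rate
    specificity = 100.0 * tn / (tn + fp)   # true negative rate
    accuracy = 100.0 * (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


if __name__ == "__main__":
    # Hypothetical counts, chosen only to make the arithmetic easy to follow.
    sens, spec, acc = classification_metrics(tp=40, fn=10, tn=45, fp=5)
    print(f"sensitivity={sens:.2f}% specificity={spec:.2f}% accuracy={acc:.2f}%")
```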
Comparison of the accuracy of the standard classifiers examined in this paper with those reported in previous studies
| References | Classifier | Accuracy (%) |
|---|---|---|
| [ | Bagged tree | 73.20 |
| [ | RUSBoosted trees | 73.40 |
| [ | Boosted tree | 75.00 |
| [ | C4.5 | 76.52 |
| [ | Naïve Bayes | 76.96 |
| [ | LR | 78.69 |
| This paper | SVM | 82.85 |
| This paper | DT | 84.30 |
| This paper | KNN | 86.63 |
Comparison of the hybrid algorithms proposed in previous research with the hybrid algorithms proposed in this paper
| References | Feature selection | Classifier | Features Selected | Accuracy (%) |
|---|---|---|---|---|
| [ | PSO | Naïve Bayes | All 8 features | 78.69 |
| [ | PCA | LR | All 8 features | 79.56 |
| [ | BCO | Fuzzy | Age, BMI, Glucose | 84.21 |
| [ | ANT FDCSM | Fuzzy rule miner | NA | 87.7 |
| [ | SOMSwram | DNN | NA | 80 |
| [ | Stacked-autoencoders SAE | DNN | NA | 86.26 |
| This paper | GA-Kmeans | KNN | Glucose, BloodPressure, Insulin, Age | 88.02 |
| This paper | GA-PSO-Kmeans | KNN | Glucose, BloodPressure, Insulin, BMI | 89.64 |
| This paper | HR-Kmeans | KNN | Glucose, BloodPressure, Insulin | 91.65 |
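
To illustrate how a selected feature subset feeds the classifier, the sketch below restricts each record to the three features listed for HR-Kmeans and classifies a new record with a plain K-nearest-neighbor majority vote. It is not the authors' implementation: the helper names, the value of `k`, the Euclidean distance, and the six training rows are all assumptions made for illustration (the rows are invented toy values, not PIMA records), and the preprocessing step is omitted because its details are not given in this record.

```python
import math
from collections import Counter

# Column order follows the dataset description table (class label excluded).
FEATURES = ["Pregnancies", "Glucose", "BloodPressure", "SkinThickness",
            "Insulin", "BMI", "DiabetesPedigreeFunction", "Age"]


def select_columns(rows, names):
    """Keep only the named features, in the order given by `names`."""
    idx = [FEATURES.index(name) for name in names]
    return [[row[i] for i in idx] for row in rows]


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training points closest to `query`
    (Euclidean distance; k=3 is an assumed value)."""
    nearest = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))
    votes = Counter(label for _, label in nearest[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    subset = ["Glucose", "BloodPressure", "Insulin"]
    # Invented toy rows for illustration only.
    rows = [
        [0, 90, 60, 20, 50, 22.0, 0.2, 25],
        [1, 95, 65, 22, 60, 24.0, 0.3, 30],
        [2, 100, 70, 25, 70, 25.0, 0.3, 28],
        [3, 160, 85, 30, 200, 33.0, 0.6, 45],
        [4, 170, 90, 35, 220, 35.0, 0.7, 50],
        [5, 180, 88, 33, 240, 36.0, 0.8, 52],
    ]
    labels = [0, 0, 0, 1, 1, 1]
    train_x = select_columns(rows, subset)
    query = select_columns([[2, 165, 86, 31, 210, 34.0, 0.5, 47]], subset)[0]
    print("predicted class:", knn_predict(train_x, labels, query))
```

Because the features are on different scales (for example, insulin values span a much wider range than blood pressure), a real pipeline would normally normalize them before computing distances.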