N Ghadiri Hedeshi, M Saniee Abadeh.
Abstract
Over the past decades, medical data mining has become a popular research subject, and researchers have proposed a range of tools and methodologies for building effective medical expert systems. Diagnosing heart disease is one of the important topics, and many researchers have tried to develop intelligent medical expert systems to assist physicians. In this paper, we propose using a particle swarm optimization (PSO) algorithm with a boosting approach to extract rules for recognizing the presence or absence of coronary artery disease in a patient. A boosting mechanism reduces the weight of training examples that the new rules classify correctly. Therefore, in the next rule generation cycle, the search focuses on fuzzy rules that account for the currently misclassified or uncovered instances. We evaluated the new classification approach on the coronary artery disease data sets from the University of California, Irvine (UCI). Results show that the proposed method detects coronary artery disease with acceptable accuracy (85.76% in the reported experiments), and the discovered rules are also highly interpretable.
Year: 2014 PMID: 24817883 PMCID: PMC4003794 DOI: 10.1155/2014/783734
Source DB: PubMed Journal: Comput Intell Neurosci
Summary of attributes (UCI heart disease database).
| Attribute | Description | Value description |
|---|---|---|
| age | Age | Numerical |
| sex | Sex | 1: male; 0: female |
| cp | Chest pain type | 1: typical angina; 2: atypical angina; 3: nonanginal pain; 4: asymptomatic |
| trestbps | Resting systolic blood pressure on admission to the hospital (mmHg) | Numerical |
| chol | Serum cholesterol (mg/dL) | Numerical |
| fbs | Fasting blood sugar over 120 mg/dL? | 1: yes; 0: no |
| restecg | Resting electrocardiographic results | 0: normal; 1: having ST-T wave abnormality; 2: having LV hypertrophy |
| thalach | Maximum heart rate achieved | Numerical |
| exang | Exercise induced angina? | 1: yes; 0: no |
| oldpeak | ST depression induced by exercise relative to rest | Numerical |
| slope | The slope of the peak exercise ST segment | 1: upsloping; 2: flat; 3: downsloping |
| ca | Number of major vessels colored by fluoroscopy | Numerical |
| thal | Exercise thallium scintigraphic defects | 3: normal; 6: fixed defect; 7: reversible defect |
| num | Diagnosis of heart disease (angiographic disease status/presence of coronary artery disease (CAD)) | 0: less than 50% diameter narrowing in any major vessel (CAD, no); 1: more than 50% (CAD, yes) |
Figure 1. Four fuzzy partitions used in computer implementations. The superscript of each part shows the granularity of the fuzzy partition.
Figure 2. The basic model of rule-based fuzzy classifiers [17].
Pseudocode 1. The pseudocode of En-PSO2 approach.
Figure 3. Total flowchart of the expert system of coronary artery detection. K is the allowable amount for total weights of training examples.
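The abstract and the Figure 3 caption together suggest an outer loop: learn a rule, shrink the weights of the training examples it classifies correctly, and repeat while the total weight exceeds K. The following is a minimal sketch of that idea only, not the authors' pseudocode. The function names, the multiplicative `shrink` factor, the initial weight of 1.0, the `max_rules` cap, and the toy learner are all hypothetical, and the stopping test against K is an assumption drawn from the caption.

```python
def boosted_rule_learning(examples, learn_rule, K, shrink=0.5, max_rules=100):
    """Learn rules until the total example weight drops to K or below.

    `shrink` and `max_rules` are hypothetical; the source gives no values.
    """
    weights = [1.0] * len(examples)          # assumed uniform starting weights
    rules = []
    while sum(weights) > K and len(rules) < max_rules:
        rule = learn_rule(examples, weights)  # e.g. one PSO search for a rule
        rules.append(rule)
        for i, (features, label) in enumerate(examples):
            if rule(features) == label:       # correctly classified example
                weights[i] *= shrink          # reduce its influence next cycle
    return rules, weights


def toy_learner(examples, weights):
    """Hypothetical stand-in learner: predict the label carrying most weight."""
    totals = {}
    for (_, label), w in zip(examples, weights):
        totals[label] = totals.get(label, 0.0) + w
    majority = max(totals, key=totals.get)
    return lambda features: majority


examples = [((0,), 0), ((1,), 1), ((2,), 1)]
rules, weights = boosted_rule_learning(examples, toy_learner, K=1.5)
print(len(rules), weights)
```

In the toy run, the second rule targets the example the first rule got wrong, which is the behavior the boosting mechanism is meant to produce.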
Parameter setting for the proposed En-PSO2.
| Parameter | Value |
|---|---|
| Swarm size | 25 |
| Number of iterations | Until the global best does not change after 5 iterations |
| | 1/iteration number |
| c1 | Random (between 0 and 1) |
| | Random (between 0 and 1) |
| | 12 |
| | 20 |
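To show how settings of this kind enter a particle update, here is a minimal sketch of one step of the standard PSO velocity and position rule. It is not taken from the paper. It assumes that the unnamed row with value 1/iteration number is the inertia weight and that the unnamed row following c1 is a second acceleration coefficient (called `c2` here); the source confirms neither. The function name `pso_step` is hypothetical.

```python
import random


def pso_step(position, velocity, personal_best, global_best, iteration, rng=random):
    """One standard PSO update for a single particle (lists of floats)."""
    w = 1.0 / iteration   # assumed inertia weight: "1/iteration number"
    c1 = rng.random()     # "Random (between 0 and 1)"
    c2 = rng.random()     # assumed second coefficient with the same setting
    new_velocity = [
        w * v + c1 * (p - x) + c2 * (g - x)
        for x, v, p, g in zip(position, velocity, personal_best, global_best)
    ]
    new_position = [x + nv for x, nv in zip(position, new_velocity)]
    return new_position, new_velocity


# A particle at rest is pulled toward its personal best and the global best.
pos, vel = pso_step([0.2, 0.8], [0.0, 0.0], [0.5, 0.5], [0.6, 0.4], iteration=2)
print(pos, vel)
```

Because the assumed inertia weight decays as 1/iteration, earlier velocity contributes less and less as the search proceeds.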
The obtained confusion matrix.
| Classifier result | Actual result: Patient | Actual result: Healthy |
|---|---|---|
| Healthy | 90 | 419 |
| Patient | 370 | 41 |
Performance measures of En-PSO2 according to the confusion matrix.
| Measure name | Accuracy | Sensitivity | Specificity | | Precision |
|---|---|---|---|---|---|
| Amount (%) | 85.76 | 90.02 | 82.31 | 86.48 | 91.08 |
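As a worked check, the first three measures can be recomputed from the four confusion matrix counts (370, 419, 90, 41). The sketch below uses the standard definitions. The cell assignment is an assumption: taking 370 as true positives, 41 as false negatives, 90 as false positives, and 419 as true negatives reproduces the reported accuracy and sensitivity, and gives a specificity that agrees with the reported 82.31 up to the last digit. Swapping the 41 and 90 cells would instead give a sensitivity of about 80.43. The function name `confusion_metrics` is hypothetical.

```python
def confusion_metrics(tp, fn, fp, tn):
    """Return accuracy, sensitivity, and specificity as percentages."""
    total = tp + fn + fp + tn
    return {
        "accuracy": 100.0 * (tp + tn) / total,
        "sensitivity": 100.0 * tp / (tp + fn),   # true positive rate
        "specificity": 100.0 * tn / (tn + fp),   # true negative rate
    }


# Assumed cell assignment; the source table does not state it unambiguously.
metrics = confusion_metrics(tp=370, fn=41, fp=90, tn=419)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
# accuracy: 85.76
# sensitivity: 90.02
# specificity: 82.32
```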
Comparison between rule-based methods.
| Rule-based methods | Accuracy | Number of rules |
|---|---|---|
| Decision tree | 85.6 | 83 |
| Support based | 84.4 | 27 |
| Pearson | 84.5 | 27 |
| RST | 85.2 | 27 |
| En-PSO | 85.97 | 25.3 |
Figure 4. Time and accuracy changes based on increasing amount of B.
Figure 5. Average accuracy obtained on train and test data sets from adding each rule to the rule set.
Comparison between classification algorithms according to accuracy, sensitivity, specificity, and number of rules.
| Method | Accuracy | Sensitivity | Specificity | Number of rules |
|---|---|---|---|---|
| Decision tree | 78.91 | 72.01 | 84.48 | — |
| LTF-C | 81.2 | — | — | — |
| Bagging | 81.41 | 74.93 | 86.64 | — |
| | 81.5 | — | — | — |
| Bayesian model | 82 | 87 | — | — |
| Decision tree (C4.5) | 82.5 | 87.17 | — | — |
| SVM | 82.5 | 88 | — | — |
| RST | 85.2 | — | — | 27 |
| ANN | 85.53 | — | — | — |
| Decision tree | 85.6 | — | — | 83 |
| NN-Alizadeh | 85.43 | 90.2 | 73.5 | — |
| En-PSO2 (this study) | 85.76 | 90.02 | 82.31 | 21.2 |