Armin Yazdani, Kasturi Dewi Varathan, Yin Kia Chiam, Asad Waqar Malik, Wan Azman Wan Ahmad.
Abstract
BACKGROUND: Cardiovascular disease is the leading cause of death in many countries. Physicians often diagnose cardiovascular disease based on current clinical tests and on previous experience of diagnosing patients with similar symptoms. Patients who suffer from heart disease require quick diagnosis, early treatment, and constant observation. To address these needs, many data mining approaches have been used to diagnose and predict heart disease. Previous research has also focused on identifying the features that contribute significantly to heart disease prediction; however, less attention has been given to identifying the strength of these features.
Keywords: Cardiovascular disease; Heart disease prediction; Weighted associative rule mining; Weighted scores
Year: 2021 PMID: 34154576 PMCID: PMC8215833 DOI: 10.1186/s12911-021-01527-5
Source DB: PubMed Journal: BMC Med Inform Decis Mak ISSN: 1472-6947 Impact factor: 2.796
Studies on Heart Disease Prediction using ARM
| Authors | Technique | No. of Features Used | Evaluation Metric | Score | Dataset |
|---|---|---|---|---|---|
| Akbaş et al. | Associative Rule Mining | 13 | Confidence | 97.8 (Predicting no heart disease) | UCI |
| Vasanthanageswari and Vanitha | Associative Rule Mining | 16 | NA | NA | Congenital Heart Defect Dataset |
| Shuriyaa and Rajendranb | Associative Rule Mining + ANFIS | 13 | Accuracy | 93.2 | UCI |
| Sonet et al. | Associative Rule Mining | 13 | Confidence | 99 | National Institute of Cardiovascular Disease, Dhaka, Bangladesh |
| Thanigaivel and Kumar | Associative Rule Mining | 25 | Confidence | 100 | Hospital (name of the hospital not mentioned) |
| Srinivas et al. | Associative Rule Mining and MLP | 13 | Accuracy | 84.9 | UCI |
| Khare and Gupta | Associative Rule Mining | 13 | Confidence | 94 | UCI |
| Lakshmi and Reddy | Associative Rule Mining | 13 | Accuracy | 96.6 | UCI |
| Said et al. | Associative Rule Mining | 13 | Confidence | 91 | UCI |
| Nahar et al. | Associative Rule Mining | 13 | Confidence | 96 | UCI |
Studies on Heart Disease Prediction using WARM
| Authors | Technique | No. of Features Used | Evaluation Metric | Score | Dataset |
|---|---|---|---|---|---|
| Ibrahim and Sivabalakrishnan | Random Walk Memetic Algo with WARM | 13 | Precision | 92% | UCI |
| Ibrahim and Sivabalakrishnan | WARM | 13 | Confidence | 67% | UCI |
| Kharya et al. | WARM with Bayesian Belief Network | 4 | NA | NA | NA |
| Chauhan et al. | WARM | 13 | Accuracy | 60.4% | UCI |
| Sundar et al. | WARM | 13 | Confidence | 84% | UCI |
| Soni et al. | WARM | 13 | Confidence | 80% | UCI |
| Soni and Vyas | WARM | 13 | Confidence | 79.5% | UCI |
Fig. 1 Methodology
Features description
| No | Features | Description | Data Type |
|---|---|---|---|
| 1 | Age | Age in years | Numeric |
| 2 | Sex | Gender | Nominal |
| 3 | CP | Chest pain type | Nominal |
| 4 | Trestbps | Resting blood pressure | Numeric |
| 5 | Chol | Serum cholesterol | Numeric |
| 6 | Fbs | Fasting blood sugar | Nominal |
| 7 | Restecg | Resting electrocardiographic results | Nominal |
| 8 | Thalach | Maximum heart rate achieved | Numeric |
| 9 | Exang | Exercise-induced angina | Nominal |
| 10 | Oldpeak | ST depression induced by exercise relative to rest | Numeric |
| 11 | Slope | The slope of the peak exercise ST segment | Nominal |
| 12 | CA | Number of major vessels coloured by fluoroscopy | Numeric |
| 13 | Thal | Thallium heart scan | Nominal |
| 14 | Goal | Diagnosis of heart disease | Nominal |
Ranges formed for features
| Feature | Ranges |
|---|---|
| Age | ≤ 40: lessThanForty; 41–64: betweenAge; ≥ 65: greaterThanSixtyFour |
| Sex | 1: Male; 0: Female |
| CP | 1: typicalAngina; 2: atypicalAngina; 3: nonAnginalPain; 4: asymptomatic |
| Trestbps | 90–120: normal; 120–140: unusual; 140–160: high; > 160: very high |
| Cholesterol (chol) | 110–200: normal; 200–240: borderline_high; 240–250: high; > 250: very high |
| Fbs | True; False |
| Restecg | 0: normal; 1: STTWaveAbnormality; 2: showingProbable |
| Thalach | 60–100: Normal; > 100: Tachycardia |
| Exang | Yes; No |
| Oldpeak | Zero; greaterThanZero |
| Slope | 1: Upsloping; 2: Flat; 3: Downsloping |
| CA | Zero; One; Two; Three |
| Thal | 3: Normal; 6: Fixed; 7: Reversible |
| Output | 0: No Heart Disease; 1: Heart Disease |
Source: Khare et al. [24]
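The binning in the ranges table can be sketched in Python as follows. This is a minimal illustration, not the authors' preprocessing code: the `BINS` structure and the `discretize` helper are hypothetical names, and the handling of shared edges (for example, 120 appears in both the "normal" and "unusual" blood pressure ranges) is an assumption, since the table does not say which bin an edge value belongs to.

```python
from bisect import bisect_left

# Upper edges and labels copied from the ranges table.
# Assumption: a value equal to a shared edge (e.g. 120) goes to the lower bin,
# and values below the first listed range fall into the first bin.
BINS = {
    "Age": ([40, 64], ["lessThanForty", "betweenAge", "greaterThanSixtyFour"]),
    "Trestbps": ([120, 140, 160], ["normal", "unusual", "high", "very high"]),
    "Chol": ([200, 240, 250], ["normal", "borderline_high", "high", "very high"]),
    "Thalach": ([100], ["Normal", "Tachycardia"]),
}


def discretize(record):
    """Return nominal labels for the numeric features of one patient record."""
    labels = {}
    for feature, (edges, names) in BINS.items():
        # bisect_left returns the index of the first edge >= value.
        labels[feature] = names[bisect_left(edges, record[feature])]
    labels["Oldpeak"] = "Zero" if record["Oldpeak"] == 0 else "greaterThanZero"
    return labels


# Hypothetical patient, for illustration only.
patient = {"Age": 63, "Trestbps": 145, "Chol": 233, "Thalach": 150, "Oldpeak": 2.3}
print(discretize(patient))
```

Nominal features such as Sex, CP, or Thal only need a code-to-label lookup, so they are left out of the sketch.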
Selecting significant features from the result of the highest performance
| | Age | Sex | CP | Trestbps | Chol | Fbs | Restecg | Thalach | Exang | Oldpeak | Slope | CA | Thal |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Occurrence in Highest Accuracy | 2 | 7 | 7 | 1 | 2 | 5 | 4 | 3 | 4 | 6 | 4 | 7 | 5 |
| Occurrence in Highest F-Measure | 2 | 7 | 7 | 1 | 2 | 5 | 4 | 3 | 4 | 6 | 4 | 7 | 5 |
| Occurrence in Highest Precision | 0 | 6 | 4 | 2 | 1 | 2 | 2 | 2 | 4 | 2 | 4 | 5 | 4 |
| Total Occurrence | 4 | 20 | 18 | 4 | 5 | 12 | 10 | 8 | 12 | 14 | 12 | 19 | 14 |
Source: Amin et al. [8]
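The totals row can be recomputed from the three occurrence rows, and the eight features that receive weights in the next table are exactly those with a total of at least 12. The sketch below illustrates this in Python; the cut-off of 12 is an assumption inferred from the two tables, because the source does not state the selection threshold explicitly, and the variable names are illustrative.

```python
FEATURES = ["Age", "Sex", "CP", "Trestbps", "Chol", "Fbs", "Restecg",
            "Thalach", "Exang", "Oldpeak", "Slope", "CA", "Thal"]

# Occurrence counts copied from the table above.
ACCURACY = [2, 7, 7, 1, 2, 5, 4, 3, 4, 6, 4, 7, 5]
F_MEASURE = [2, 7, 7, 1, 2, 5, 4, 3, 4, 6, 4, 7, 5]
PRECISION = [0, 6, 4, 2, 1, 2, 2, 2, 4, 2, 4, 5, 4]

# Total occurrence per feature is the column sum of the three rows.
totals = {
    feature: acc + f1 + prec
    for feature, acc, f1, prec in zip(FEATURES, ACCURACY, F_MEASURE, PRECISION)
}

# Assumption: 12 is the cut-off; it is inferred, not stated in the source.
THRESHOLD = 12
significant = [feature for feature in FEATURES if totals[feature] >= THRESHOLD]

print(totals)
print(significant)
```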
Weight of the significant features
| Feature | Weight |
|---|---|
| Sex | 0.17 |
| CP | 0.15 |
| Fbs | 0.09 |
| Exang | 0.09 |
| Oldpeak | 0.12 |
| Slope | 0.09 |
| CA | 0.18 |
| Thal | 0.11 |
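To show how such weights can enter rule mining, the sketch below computes a weighted support for an itemset. It uses one common formulation (mean item weight multiplied by ordinary support); the source does not spell out which variant of weighted support it applies, so this is an assumption. The `weighted_support` function and the four toy transactions are hypothetical.

```python
# Feature weights copied from the table above.
WEIGHTS = {"Sex": 0.17, "CP": 0.15, "Fbs": 0.09, "Exang": 0.09,
           "Oldpeak": 0.12, "Slope": 0.09, "CA": 0.18, "Thal": 0.11}


def weighted_support(itemset, transactions, weights):
    """Mean weight of the itemset's features times its ordinary support.

    Assumption: this is one common definition of weighted support,
    not necessarily the one used in the source.
    """
    matches = sum(
        all(t.get(feature) == value for feature, value in itemset.items())
        for t in transactions
    )
    support = matches / len(transactions)
    mean_weight = sum(weights[feature] for feature in itemset) / len(itemset)
    return mean_weight * support


# Toy transactions, invented for illustration only.
transactions = [
    {"CP": "asymptomatic", "Thal": "reversible", "Exang": "Yes"},
    {"CP": "asymptomatic", "Thal": "reversible", "Exang": "No"},
    {"CP": "nonAnginalPain", "Thal": "normal", "Exang": "No"},
    {"CP": "asymptomatic", "Thal": "normal", "Exang": "No"},
]

itemset = {"CP": "asymptomatic", "Thal": "reversible"}
print(weighted_support(itemset, transactions, WEIGHTS))
```

Under this definition, an itemset built from heavily weighted features such as CA or Sex needs fewer matching records to pass a given threshold than one built from lightly weighted features.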
Identify total sub value of each feature
| Total | Male | Female |
|---|---|---|
| 297 | 203 | 94 |
Fig. 2 Comparison of the percentage of male and female patients in the Cleveland heart disease dataset
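The percentages behind this comparison follow directly from the counts in the table above (297 records, 203 male, 94 female). A short Python check, with illustrative variable names:

```python
total, male, female = 297, 203, 94

# The two sub-values should account for every record.
assert male + female == total

male_pct = round(100 * male / total, 1)
female_pct = round(100 * female / total, 1)
print(f"Male: {male_pct}%  Female: {female_pct}%")
```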
Rules generated from all the features using WARM
| No | Rules | Confidence |
|---|---|---|
| 1 | Trestbps = unusual Thalach = Tachycardia Exang = No CA = zero Thal = normal ==> class_HD = No Heart Disease | 0.96 |
| 2 | Trestbps = unusual Fbs = FALSE Thalach = Tachycardia Exang = No CA = zero Thal = normal 52 ==> class_HD = No Heart Disease | 0.96 |
| 3 | Sex = Female Exang = No CA = zero ==> class_HD = No Heart Disease | 0.96 |
| 4 | Sex = Female Thalach = Tachycardia Exang = No CA = zero ==> class_HD = No Heart Disease | 0.96 |
| 5 | Sex = Female Exang = No CA = zero Thal = normal ==> class_HD = No Heart Disease | 0.96 |
| 6 | Age = betweenAge Trestbps = unusual Exang = No CA = zero Thal = normal ==> class_HD = No Heart Disease | 0.96 |
| 7 | CP = asymptomatic Slope = flat Thal = reversible ==> class_HD = Heart Disease | 0.96 |
| 8 | Sex = Female Fbs = FALSE Exang = No CA = zero ==> class_HD = No Heart Disease | 0.96 |
| 9 | Sex = Female Fbs = FALSE Thalach = Tachycardia Exang = No CA = zero ==> class_HD = No Heart Disease | 0.96 |
| 10 | Sex = Female Thalach = Tachycardia Exang = No CA = zero Thal = normal ==> class_HD = No Heart Disease | 0.96 |
| 11 | Age = betweenAge Trestbps = unusual Thalach = Tachycardia Exang = No CA = zero Thal = normal 48 ==> class_HD = No Heart Disease | 0.96 |
| 12 | Trestbps = unusual Exang = No CA = zero Thal = normal ==> class_HD = No Heart Disease | 0.95 |
| 13 | Trestbps = unusual Fbs = FALSE Thalach = Tachycardia CA = zero Thal = normal ==> class_HD = No Heart Disease | 0.95 |
| 14 | Age = betweenAge CP = asymptomatic Oldpeak = greaterThanZero Thal = reversible ==> class_HD = Heart Disease | 0.95 |
| 15 | Restecg = normal Thalach = Tachycardia Exang = No CA = zero Thal = normal ==> class_HD = No Heart Disease | 0.95 |
| 16 | CP = asymptomatic Fbs = FALSE Oldpeak = greaterThanZero Thal = reversible ==> class_HD = Heart Disease | 0.94 |
| 17 | Trestbps = unusual Fbs = FALSE Exang = No CA = zero Thal = normal ==> class_HD = No Heart Disease | 0.94 |
| 18 | CP = asymptomatic Exang = Yes Thal = reversible ==> class_HD = Heart Disease | 0.94 |
| 19 | Sex = Male CP = asymptomatic Exang = Yes Oldpeak = greaterThanZero ==> class_HD = Heart Disease | 0.94 |
| 20 | Age = betweenAge CP = asymptomatic Thalach = Tachycardia Oldpeak = greaterThanZero Thal = reversible ==> class_HD = Heart Disease | 0.94 |
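Rules in this textual form can be turned into a data structure and checked against a discretized patient record. The Python sketch below is illustrative only: `parse_rule` and `rule_fires` are hypothetical helpers, the parser assumes single-token antecedent values (true for every rule in the table), and it accepts the arrow written either with or without spaces.

```python
import re


def parse_rule(text):
    """Split a rule string into (antecedent dict, (class name, class value))."""
    # Normalise a spaced-out arrow to "==>" before splitting.
    text = re.sub(r"=\s*=\s*>", "==>", text)
    lhs, _, rhs = text.partition("==>")
    # Antecedent items are "Name = value" pairs with single-token values;
    # stray tokens such as trailing counts are ignored.
    antecedent = dict(re.findall(r"(\w+)\s*=\s*(\w+)", lhs))
    name, _, value = rhs.partition("=")
    return antecedent, (name.strip(), value.strip())


def rule_fires(antecedent, record):
    """True if the record satisfies every antecedent item (case-insensitive)."""
    return all(
        str(record.get(feature, "")).lower() == value.lower()
        for feature, value in antecedent.items()
    )


# Rule 3 from the table above, and a hypothetical discretized record.
rule = "Sex = Female Exang = No CA = zero = = > class_HD = No Heart Disease"
antecedent, consequent = parse_rule(rule)
record = {"Sex": "Female", "Exang": "No", "CA": "Zero", "Thal": "Normal"}

print(antecedent, consequent)
print(rule_fires(antecedent, record))
```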
Summary of the frequency of each feature contained in the rules that predict heart disease (all features)
| Rule No | CP | Slope | Thal | Age | OldPeak | Fbs | Exang | Sex | Thalach |
|---|---|---|---|---|---|---|---|---|---|
| 7 | √ | √ | √ | | | | | | |
| 14 | √ | | | √ | √ | | | | |
| 16 | √ | | √ | | √ | √ | | | |
| 18 | √ | | √ | | | | √ | | |
| 19 | √ | | | | √ | | √ | √ | |
| 20 | √ | | √ | √ | √ | | | | √ |
| Total (6 rules) | 6 | 1 | 4 | 2 | 4 | 1 | 2 | 1 | 1 |
Rules generated from 8 significant features using weighted associative rule mining
| No | Rules | Confidence |
|---|---|---|
| 1 | Sex = Female CP = nonAnginalPain Thal = normal ==> class_HD = No Heart Disease | 1 |
| 2 | Sex = Female Exang = No Oldpeak = greaterThanZero CA = zero ==> class_HD = No Heart Disease | 1 |
| 3 | CP = asymptomatic Exang = Yes Oldpeak = greaterThanZero Thal = reversible ==> class_HD = Heart Disease | 0.98 |
| 4 | Sex = Male CP = asymptomatic Exang = Yes Oldpeak = greaterThanZero Thal = reversible ==> class_HD = Heart Disease | 0.97 |
| 5 | CP = asymptomatic Fbs = FALSE Exang = Yes Oldpeak = greaterThanZero Thal = reversible ==> class_HD = Heart Disease | 0.97 |
| 6 | Sex = Female CP = nonAnginalPain ==> class_HD = No Heart Disease | 0.97 |
| 7 | Sex = Female Fbs = FALSE Exang = No Oldpeak = greaterThanZero Thal = normal ==> class_HD = No Heart Disease | 0.97 |
| 8 | Sex = Male CP = asymptomatic CA = one ==> class_HD = Heart Disease | 0.97 |
| 9 | Sex = Female CP = nonAnginalPain Exang = No ==> class_HD = No Heart Disease | 0.97 |
| 10 | CP = asymptomatic Exang = Yes Slope = flat Thal = reversible ==> class_HD = Heart Disease | 0.97 |
| 11 | CP = asymptomatic Exang = Yes Oldpeak = greaterThanZero Slope = flat Thal = reversible ==> class_HD = Heart Disease | 0.97 |
| 12 | Sex = Male CP = asymptomatic Fbs = FALSE Exang = Yes Oldpeak = greaterThanZero Thal = reversible ==> class_HD = Heart Disease | 0.97 |
| 13 | Sex = Female Exang = No CA = zero ==> class_HD = No Heart Disease | 0.96 |
| 14 | Sex = Female Exang = No CA = zero Thal = normal ==> class_HD = No Heart Disease | 0.96 |
| 15 | CP = asymptomatic Slope = flat Thal = reversible ==> class_HD = Heart Disease | 0.96 |
| 16 | Sex = Female Fbs = FALSE Exang = No CA = zero ==> class_HD = No Heart Disease | 0.96 |
| 17 | CP = asymptomatic Oldpeak = greaterThanZero Slope = flat Thal = reversible ==> class_HD = Heart Disease | 0.96 |
| 18 | Sex = Female Fbs = FALSE Exang = No CA = zero Thal = normal ==> class_HD = No Heart Disease | 0.96 |
| 19 | CP = asymptomatic Fbs = FALSE Slope = flat Thal = reversible ==> class_HD = Heart Disease | 0.95 |
| 20 | CP = asymptomatic Fbs = FALSE Oldpeak = greaterThanZero Slope = flat Thal = reversible ==> class_HD = Heart Disease | 0.95 |
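The confidence column reports, for each rule, the share of records matching the antecedent that also carry the predicted class. A minimal Python sketch of that standard definition follows; the `confidence` helper and the toy records are hypothetical and are not drawn from the Cleveland dataset.

```python
def confidence(antecedent, consequent, records):
    """confidence(A ==> C) = count(A and C) / count(A)."""
    covered = [
        r for r in records
        if all(r.get(k) == v for k, v in antecedent.items())
    ]
    if not covered:
        return 0.0  # rule never fires, so confidence is undefined; report 0
    hits = sum(
        all(r.get(k) == v for k, v in consequent.items()) for r in covered
    )
    return hits / len(covered)


# Toy records, invented for illustration only.
records = [
    {"Sex": "Female", "CP": "nonAnginalPain", "class_HD": "No Heart Disease"},
    {"Sex": "Female", "CP": "nonAnginalPain", "class_HD": "No Heart Disease"},
    {"Sex": "Female", "CP": "nonAnginalPain", "class_HD": "No Heart Disease"},
    {"Sex": "Female", "CP": "nonAnginalPain", "class_HD": "Heart Disease"},
    {"Sex": "Male", "CP": "asymptomatic", "class_HD": "Heart Disease"},
]

antecedent = {"Sex": "Female", "CP": "nonAnginalPain"}
consequent = {"class_HD": "No Heart Disease"}
print(confidence(antecedent, consequent, records))
```

In this toy set the antecedent covers four records and three of them are labelled "No Heart Disease", so the rule's confidence is 0.75.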
Summary of the frequency of each feature contained in the rules that predict heart disease (8 selected features)
| Rule No | CP | Slope | Thal | OldPeak | Fbs | Exang | Sex | CA |
|---|---|---|---|---|---|---|---|---|
| 3 | √ | | √ | √ | | √ | | |
| 4 | √ | | √ | √ | | √ | √ | |
| 5 | √ | | √ | √ | √ | √ | | |
| 8 | √ | | | | | | √ | √ |
| 10 | √ | √ | √ | | | √ | | |
| 11 | √ | √ | √ | √ | | √ | | |
| 12 | √ | | √ | √ | √ | √ | √ | |
| 15 | √ | √ | √ | | | | | |
| 17 | √ | √ | √ | √ | | | | |
| 19 | √ | √ | | | √ | | | |
| 20 | √ | √ | √ | √ | √ | | | |
| Total (11 rules) | 11 | 6 | 9 | 7 | 4 | 6 | 3 | 1 |
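A frequency summary of this kind is a tally of how often each feature appears in the antecedents of the rules that predict heart disease. The Python sketch below illustrates the tally on three of the rules only (rules 3, 8, and 15 from the 8-feature rule table), so its counts are not the totals of the full table; the variable names are illustrative.

```python
from collections import Counter

# Antecedent features of three heart-disease rules from the 8-feature table.
rule_features = {
    3: ["CP", "Exang", "Oldpeak", "Thal"],
    8: ["Sex", "CP", "CA"],
    15: ["CP", "Slope", "Thal"],
}

# Count, per feature, the number of rules whose antecedent contains it.
frequency = Counter(
    feature for features in rule_features.values() for feature in features
)

for feature, count in frequency.most_common():
    print(feature, count)
```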
Fig. 3 Result comparison on WARM using the UCI Cleveland heart disease dataset
Comparative analysis of weighted associative rule mining (WARM) and associative rule mining (ARM) in predicting heart disease
| Research | Confidence Score (%) | Rules | No. of attributes in highest confidence rule | Technique | Dataset |
|---|---|---|---|---|---|
| Nahar et al. | 96 | Chest_Pain_Type = asympt, Slope = flat, Thal = rev | 3 | ARM | UCI |
| Said et al. | 91 | Chest Pain Type = asymptomic and Thal = reversible defect | 2 | ARM | UCI |
| Khare and Gupta | 94 | Thal = reversible_defect, CP = asymptomatic, Exercise_Induced_Angina = yes | 3 | ARM | UCI |
| Sonet et al. | 97 | Lack-of-Exercise = yes, Stress = yes, BP = high, Smoking = yes, Diabetes = yes | 5 | ARM | Data collected from 4 medical institutions (131 records) |
| Sonet et al. | 99 | Diabetes | 1 | ARM | Data collected from 4 medical institutions (131 records) |
| Soni and Vyas | 79.5 | NA | NA | WARM | UCI |
| Soni et al. | 80 | NA | NA | WARM | UCI |
| Sundar et al. | 84 | NA | NA | WARM | UCI |
| Ibrahim & Sivabalakrishnan | 67 | 70..79 -> yes | 1 | WARM | UCI |
| Our Experiment (all features) | 96 | CP = asymptomatic, Slope = flat, Thal = reversible | 3 | WARM | UCI |
| Our Experiment (8 significant features) | 98 | CP = asymptomatic, Exang = Yes, Oldpeak = greaterThanZero, Thal = reversible | 4 | WARM | UCI |
Healthy rules extractions
| Research | Rules | Confidence Score (%) |
|---|---|---|
| Nahar et al. | Sex = female, Exercise_induced_angina = fal, Number_of_vessels_colored = 0, Thal = nom | 98 |
| Said et al. | Sex = female and Exercise_induced_angina = No and Thal = normal | 89 |
| Khare et al. | Ca = 0, Thal = normal, Exercise_induced_angina = no | 90 |
| Proposed Work (all features) | Trestbps = unusual, Thalach = Tachycardia, Exang = No, CA = zero, Thal = normal | 96 |
| Proposed Work (8 significant features) | Sex = Female, CP = nonAnginalPain, Thal = normal | 100 |
Sick rules extractions
| Research | Rules | Confidence Score (%) |
|---|---|---|
| Nahar et al. | Chest_pain_type = asympt, Slope = flat, Thal = rev | 96 |
| Said et al. | Chest pain type = asymptomic and Thal = reversible defect | 91 |
| Khare et al. | Thal = reversible_defect, CP = asymptomatic, Exercise_induced_angina = yes | 94 |
| Ibrahim and Sivabalakrishnan | 70..79 -> yes | 67 |
| Proposed Work (all features) | CP = asymptomatic, Slope = flat, Thal = reversible | 96 |
| Proposed Work (8 significant features) | CP = asymptomatic, Exang = Yes, Oldpeak = greaterThanZero, Thal = reversible | 98 |