Parastoo Golpour, Majid Ghayour-Mobarhan, Azadeh Saki, Habibollah Esmaily, Ali Taghipour, Mohammad Tajfard, Hamideh Ghazizadeh, Mohsen Moohebati, Gordon A Ferns.
Abstract
(1) Background: Coronary angiography is considered the most reliable method for diagnosing cardiovascular disease. However, it is an invasive procedure that carries a risk of complications, so a method for determining whether angiography is necessary would be preferable. The objective of this study was to compare support vector machine, naïve Bayes and logistic regression models in identifying the diagnostic factors that can predict the need for coronary angiography. These models are machine learning algorithms. Machine learning, a branch of artificial intelligence, aims to design and develop algorithms that allow computers to improve their performance in data analysis and decision making. The process involves the analysis of past experience to find useful regularities and patterns that a human might overlook. (2) Materials and [...]
Keywords: angiography; logistic regression; naïve Bayes; support vector machine
Year: 2020 PMID: 32899733 PMCID: PMC7558963 DOI: 10.3390/ijerph17186449
Source DB: PubMed Journal: Int J Environ Res Public Health ISSN: 1660-4601 Impact factor: 3.390
Comparison of frequency of subjects’ demographics and risk factors based on the result of angiography.
| Variable | n (%) | n (%) |
|---|---|---|
| Age group | | |
| 18–39 | 21 (36.2) | 37 (63.8) |
| 40–59 | 378 (61.0) | 242 (39.0) |
| ≥60 | 351 (75.8) | 112 (24.2) |
| Levels of education | | |
| Elementary | 495 (64.3) | 274 (35.7) |
| Diploma | 113 (71.5) | 45 (28.5) |
| Bachelor | 111 (64.1) | 62 (35.9) |
| MA | 31 (75.6) | 10 (24.4) |
| Gender | | |
| Female | 284 (51.5) | 267 (48.5) |
| Male | 466 (78.9) | 124 (21.1) |
| Smoking habit | | |
| Smoker | 184 (75.4) | 60 (24.6) |
| Used to smoke | 119 (69.5) | 52 (30.5) |
| Non-smoker | 447 (61.5) | 279 (38.5) |
| History of high blood pressure | | |
| Yes | 352 (67.3) | 171 (32.7) |
| No | 398 (64.4) | 220 (35.6) |
| Family history of kidney disease | | |
| Yes | 208 (70.9) | 85 (29.1) |
| No | 542 (63.9) | 306 (36.1) |
| History of cardiovascular disease | | |
| Yes | 337 (64.0) | 189 (36.0) |
| No | 413 (67.1) | 202 (32.9) |
| History of myocardial infarction | | |
| Yes | 165 (81.6) | 37 (18.4) |
| No | 585 (62.3) | 354 (37.7) |
| Family history of hypertension | | |
| Yes | 278 (60.0) | 185 (40.0) |
| No | 472 (69.6) | 206 (30.4) |
| Fasting blood glucose | | |
| Normal | 253 (60.3) | 166 (39.7) |
| Prediabetes | 164 (50.3) | 162 (49.7) |
| Diabetes | 333 (84.0) | 63 (16.0) |
| Serum triglycerides (mg/dL) | | |
| Normal <150 | 487 (62.9) | 287 (37.1) |
| Borderline 150–199 | 148 (75.5) | 48 (24.5) |
| High >200 | 115 (67.2) | 56 (32.8) |
| Serum high-density lipoprotein (mg/dL) | | |
| Normal | 282 (68.4) | 130 (31.6) |
| Risk range | 468 (68.4) | 216 (31.6) |
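The comparison in this table pairs each risk factor with the angiography result. As a minimal sketch of how such a comparison can be checked, the code below computes row percentages and a Pearson chi-square statistic for the Gender rows (284/267 and 466/124). The source does not say which test the authors used, so the chi-square test is an assumption, and the function names `chi_square_2x2` and `row_percentages` are hypothetical.

```python
def chi_square_2x2(table):
    """Pearson chi-square statistic (no continuity correction) for a 2x2 table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of row and column
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def row_percentages(table):
    """Percentage of each row's total that falls in each column."""
    return [[round(100 * cell / sum(row), 1) for cell in row] for row in table]


# Gender rows from the table: [first result column, second result column]
gender = [[284, 267],   # Female
          [466, 124]]   # Male

chi2 = chi_square_2x2(gender)
pcts = row_percentages(gender)
print(f"chi-square = {chi2:.1f} (1 degree of freedom)")
print("row percentages:", pcts)
```

With one degree of freedom, a statistic above 3.84 corresponds to p < 0.05, so a value this large would indicate a strong association between gender and the angiography result under the assumed test.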
The number of variables for each of the models.
| Model | Number of Variables | Variables |
|---|---|---|
| Logistic regression | 7 | Gender–age–Fasting blood glucose–Triglyceride–Family history of kidney disease–History of cardiovascular disease–History of myocardial infarction |
| Naïve Bayes | 3 | Gender–age–FBG |
| Support Vector Machine | 6 | Gender–age–FBG–Family history of kidney disease–History of hypertension–History of myocardial infarction |
Note. FBG, fasting blood glucose.
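To illustrate how a naïve Bayes classifier can work with the three categorical predictors listed for it (gender, age group, FBG category), here is a minimal pure-Python sketch. It is not the authors' implementation: the class name `CategoricalNaiveBayes`, the Laplace smoothing parameter `alpha`, and the eight toy records are all hypothetical and chosen only to make the example runnable.

```python
import math
from collections import Counter


class CategoricalNaiveBayes:
    """Naive Bayes for categorical features with Laplace smoothing (assumed)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, rows, labels):
        self.classes = sorted(set(labels))
        self.class_counts = Counter(labels)
        n_features = len(rows[0])
        # Distinct values seen per feature, needed for the smoothing denominator
        self.values = [set(r[j] for r in rows) for j in range(n_features)]
        # counts[class][feature index][value] -> number of training rows
        self.counts = {c: [Counter() for _ in range(n_features)]
                       for c in self.classes}
        for row, y in zip(rows, labels):
            for j, value in enumerate(row):
                self.counts[y][j][value] += 1
        return self

    def predict_proba(self, row):
        total = sum(self.class_counts.values())
        log_post = {}
        for c in self.classes:
            lp = math.log(self.class_counts[c] / total)  # log prior
            for j, value in enumerate(row):
                num = self.counts[c][j][value] + self.alpha
                den = self.class_counts[c] + self.alpha * len(self.values[j])
                lp += math.log(num / den)  # log conditional likelihood
            log_post[c] = lp
        # Normalize in a numerically stable way
        top = max(log_post.values())
        unnorm = {c: math.exp(lp - top) for c, lp in log_post.items()}
        z = sum(unnorm.values())
        return {c: u / z for c, u in unnorm.items()}


# Hypothetical toy records: (gender, age group, FBG category)
rows = [
    ("male", "60+", "diabetes"), ("male", "40-59", "diabetes"),
    ("male", "60+", "normal"), ("female", "60+", "diabetes"),
    ("female", "18-39", "normal"), ("female", "40-59", "prediabetes"),
    ("male", "18-39", "prediabetes"), ("female", "40-59", "normal"),
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = positive result (hypothetical coding)

model = CategoricalNaiveBayes().fit(rows, labels)
proba = model.predict_proba(("male", "60+", "diabetes"))
print(proba)
```

The "naïve" part is the product of per-feature likelihoods, which treats gender, age group and FBG as conditionally independent given the outcome.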
Comparing the models.
| Criteria | SVM | Naïve Bayes | Logistic Regression |
|---|---|---|---|
| Sensitivity | 0.908 | 0.892 | 0.884 |
| Specificity | 0.401 | 0.428 | 0.442 |
| Positive predictive values | 0.706 | 0.712 | 0.715 |
| Negative predictive values | 0.737 | 0.715 | 0.707 |
| Accuracy | 0.71 | 0.713 | 0.713 |
| AUC (Area Under Curve) | 0.75 | 0.74 | 0.76 |
| 95% CI (AUC) | 0.70–0.80 | 0.70–0.80 | 0.71–0.80 |
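The first five criteria in this table are all derived from the four cells of a confusion matrix. The sketch below shows the standard definitions; the counts passed to the hypothetical helper `diagnostic_metrics` are illustrative only and are not taken from the study.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Standard diagnostic criteria from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
    }


# Hypothetical counts, chosen to mimic a high-sensitivity, low-specificity model
metrics = diagnostic_metrics(tp=90, fn=10, fp=30, tn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

A pattern like the one in the table, with sensitivity near 0.9 and specificity near 0.4, means the models rarely miss a patient who needs angiography but often flag patients who do not.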
Figure 1. Receiver operating characteristic (ROC) curves for each of the models.
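The area under a ROC curve equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The sketch below computes AUC that way; the function name `auc_by_pairs`, the labels and the scores are hypothetical and do not come from the study.

```python
def auc_by_pairs(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical outcomes (1 = positive) and model scores
labels = [1, 1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.6, 0.7, 0.4, 0.3, 0.2]

auc = auc_by_pairs(labels, scores)
print(f"AUC = {auc:.3f}")
```

The pairwise loop is quadratic in the number of cases, which is fine for a study of this size; a rank-based formula gives the same value more efficiently on larger data.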