Alexandros C Dimopoulos, Mara Nikolaidou, Francisco Félix Caballero, Worrawat Engchuan, Albert Sanchez-Niubo, Holger Arndt, José Luis Ayuso-Mateos, Josep Maria Haro, Somnath Chatterji, Ekavi N Georgousopoulou, Christos Pitsavos, Demosthenes B Panagiotakos.
Abstract
BACKGROUND: The use of cardiovascular disease (CVD) risk estimation scores in primary prevention is long established. However, their performance remains a matter of concern. The aim of this study was to explore the potential of machine learning (ML) methodologies for CVD prediction, especially in comparison with an established risk tool, the HellenicSCORE.
Keywords: Cardiovascular disease; Machine learning; Model performance; Risk prediction
Year: 2018 PMID: 30594138 PMCID: PMC6311054 DOI: 10.1186/s12874-018-0644-1
Source DB: PubMed Journal: BMC Med Res Methodol ISSN: 1471-2288 Impact factor: 4.615
Fig. 1 Conceptual methodology applied in the present work to evaluate two approaches for CVD risk classification: a classical, statistically oriented risk prediction tool and machine learning algorithms
Description of the dataset containing the 16 baseline variables that were measured among n = 2020 ATTICA study participants
| Variable used in ML | Male | Female |
|---|---|---|
| Age in years, mean ± SD | 46 ± 13 | 45 ± 14 |
| Smoking status at baseline, % (yes) | 44% | 37% |
| Years of school, mean ± SD | 12.3 ± 3.6 | 12.0 ± 3.8 |
| MedDietScore (range 0–55), mean ± SD | 24 ± 5 | 27 ± 7 |
| Basic metabolic rate as a proxy of energy expenditure, mean ± SD | 1783 ± 228 | 1384 ± 128 |
| Body mass index in kg/m², mean ± SD | 27.3 ± 3.9 | 25.2 ± 4.7 |
| Diastolic blood pressure levels in mmHg, mean ± SD | 82 ± 11 | 76 ± 11 |
| Systolic blood pressure levels in mmHg, mean ± SD | 127 ± 17 | 118 ± 18 |
| History of hypertension (including medication), % | 39% | 24% |
| Glucose levels (in mg/dl), mean ± SD | 95 ± 25 | 90 ± 22 |
| History of diabetes mellitus (including medication), % | 8% | 6% |
| Total cholesterol levels (in mg/dl), mean ± SD | 197 ± 42 | 191 ± 41 |
| Triglycerides (in mg/dl), mean ± SD | 140 ± 102 | 98 ± 56 |
| History of hypercholesterolemia (including medication), % | 46% | 38% |
| Interleukin-6 levels (ng/ml), mean ± SD | 1.5 ± 0.5 | 1.4 ± 0.5 |
Fig. 2 An example of a Decision Tree (DT) derived using the 5-variable dataset (age, sex, systolic blood pressure (SBP), total cholesterol (TC), and smoking). The thresholds for the quantitative variables (age, TC, SBP) were derived by the algorithm used to develop the DT
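To illustrate how a tree-building algorithm can derive a threshold for a quantitative variable such as age, here is a minimal sketch. It assumes a CART-style Gini impurity criterion, which this record does not confirm as the authors' choice; the function names `gini` and `best_threshold` and the toy values are hypothetical and are not ATTICA data.

```python
def gini(labels):
    """Gini impurity of a list of binary labels (0 or 1)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n  # fraction of label 1
    return 2.0 * p * (1.0 - p)


def best_threshold(values, labels):
    """Return (threshold, weighted_impurity) for the best single split.

    Candidate thresholds are midpoints between consecutive distinct
    sorted values; the one with the lowest weighted Gini impurity wins.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, float("inf"))
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no split possible between equal values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2.0
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best


# Toy, made-up ages and outcome labels (illustration only).
toy_ages = [30, 35, 40, 55, 60, 65]
toy_labels = [0, 0, 0, 1, 1, 1]
threshold, impurity = best_threshold(toy_ages, toy_labels)
```

A full tree repeats this search over every variable at every node and recurses into the two resulting subsets; libraries differ in the impurity measure and stopping rules they use.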
Performance of the three ML algorithms using the 16-variable dataset against the predicted 10-year CVD risk through the HellenicSCORE
| Algorithm | Accuracy | Specificity | Sensitivity | PPV | NPV |
|---|---|---|---|---|---|
| k-NN | 0.96 | 0.37 | 0.98 | 0.97 | 0.50 |
| RF | 0.99 | 0.79 | 1.00 | 0.99 | 0.98 |
| DT | 0.99 | 0.87 | 0.99 | 0.99 | 0.89 |
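The five columns reported in these tables all follow from the four cells of a binary confusion matrix. The sketch below shows the standard definitions; the helper names and the toy labels are hypothetical, and which outcome the authors coded as the positive class is not stated in this record, so `positive=1` is an assumption.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    tp: int  # predicted positive, actually positive
    fp: int  # predicted positive, actually negative
    tn: int  # predicted negative, actually negative
    fn: int  # predicted negative, actually positive


def confusion_counts(y_true, y_pred, positive=1):
    """Count the four confusion-matrix cells for a chosen positive label."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return ConfusionCounts(tp, fp, tn, fn)


def classification_metrics(c):
    """Accuracy, sensitivity, specificity, PPV and NPV from the counts."""
    def ratio(num, den):
        return num / den if den else float("nan")

    total = c.tp + c.fp + c.tn + c.fn
    return {
        "accuracy": ratio(c.tp + c.tn, total),
        "sensitivity": ratio(c.tp, c.tp + c.fn),
        "specificity": ratio(c.tn, c.tn + c.fp),
        "ppv": ratio(c.tp, c.tp + c.fp),
        "npv": ratio(c.tn, c.tn + c.fn),
    }


# Toy, made-up labels (illustration only, not study data).
toy_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
toy_pred = [1, 1, 1, 0, 0, 0, 1, 1, 1, 0]
toy_metrics = classification_metrics(confusion_counts(toy_true, toy_pred))
```

Because accuracy weights both classes by their frequency, a classifier on imbalanced data can score high accuracy and sensitivity while specificity stays low, which is the pattern visible in several rows of these tables.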
Performance of the three ML algorithms using the 16-variable dataset and of the HellenicSCORE, against the 10-year CVD (fatal or non-fatal) incidence of ATTICA study participants
| Algorithm | Accuracy | Specificity | Sensitivity | PPV | NPV |
|---|---|---|---|---|---|
| k-NN | 0.83 | 0.24 | 0.94 | 0.87 | 0.47 |
| RF | 0.84 | 0.20 | 0.96 | 0.87 | 0.46 |
| DT | 0.84 | 0.17 | 0.96 | 0.86 | 0.42 |
| HellenicSCORE | 0.85 | 0.20 | 0.97 | 0.87 | 0.58 |
Performance of the three ML algorithms using the 16-variable dataset with bootstrapping and of the HellenicSCORE, against the 10-year CVD (fatal or non-fatal) incidence of ATTICA study participants
| Algorithm | Accuracy | Specificity | Sensitivity | PPV | NPV |
|---|---|---|---|---|---|
| k-NN | 0.65 | 0.56 | 0.67 | 0.89 | 0.24 |
| RF | 0.83 | 0.46 | 0.89 | 0.90 | 0.45 |
| DT | 0.80 | 0.53 | 0.85 | 0.91 | 0.40 |
| HellenicSCORE | 0.85 | 0.20 | 0.97 | 0.87 | 0.58 |
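For readers unfamiliar with the term, bootstrapping in general means resampling rows with replacement. The sketch below shows one generic bootstrap draw and its out-of-bag complement; how the authors applied bootstrapping to the ATTICA data is not described in this record, so this is an illustration of the general technique only, and `bootstrap_split` is a hypothetical helper name.

```python
import random


def bootstrap_split(n_rows, rng):
    """Draw one bootstrap sample of row indices.

    Returns (in_bag, out_of_bag): in_bag has n_rows indices drawn with
    replacement (duplicates are expected); out_of_bag lists the indices
    that were never drawn and can serve as a held-out set.
    """
    in_bag = [rng.randrange(n_rows) for _ in range(n_rows)]
    drawn = set(in_bag)
    out_of_bag = [i for i in range(n_rows) if i not in drawn]
    return in_bag, out_of_bag


# A seeded generator keeps the draw reproducible.
rng = random.Random(0)
in_bag, out_of_bag = bootstrap_split(50, rng)
```

Repeating the draw many times and averaging a metric over the repetitions gives a more stable estimate than a single train/test split, at the cost of extra computation.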
Performance of the three ML algorithms using the 5-variable dataset and of the HellenicSCORE, against the 10-year CVD (fatal or non-fatal) incidence of ATTICA study participants
| Algorithm | Accuracy | Specificity | Sensitivity | PPV | NPV |
|---|---|---|---|---|---|
| k-NN | 0.82 | 0.21 | 0.93 | 0.86 | 0.35 |
| RF | 0.84 | 0.22 | 0.95 | 0.87 | 0.45 |
| DT | 0.84 | 0.14 | 0.97 | 0.86 | 0.49 |
| HellenicSCORE | 0.85 | 0.20 | 0.97 | 0.87 | 0.58 |
Performance of the three ML algorithms using the 5-variable dataset with bootstrapping and of the HellenicSCORE, against the 10-year CVD incidence of ATTICA study participants
| Algorithm | Accuracy | Specificity | Sensitivity | PPV | NPV |
|---|---|---|---|---|---|
| k-NN | 0.66 | 0.62 | 0.67 | 0.91 | 0.26 |
| RF | 0.79 | 0.47 | 0.85 | 0.90 | 0.37 |
| DT | 0.78 | 0.48 | 0.84 | 0.90 | 0.36 |
| HellenicSCORE | 0.85 | 0.20 | 0.97 | 0.87 | 0.58 |