P Unnikrishnan1, D K Kumar1, S Poosapadi Arjunan1, H Kumar2, P Mitchell3, R Kawasaki4.
Abstract
Current methods of cardiovascular risk assessment rely on health factors that are often derived from the Framingham study. However, these methods have significant limitations because of their poor sensitivity and specificity. We compared the parameters from the Framingham equation with linear regression analysis to establish the effect of training the model on a local database. A support vector machine (SVM) was used to determine the effectiveness of a machine learning approach with the Framingham health parameters for risk assessment of cardiovascular disease (CVD). The results show that while the linear model trained on the local database improved on the Framingham model, the SVM-based risk assessment model had high sensitivity and specificity for predicting CVD. This indicates that, using the health parameters identified in the Framingham study, a machine learning approach overcomes the low sensitivity and specificity of the Framingham model.
Year: 2016 PMID: 27594895 PMCID: PMC4993959 DOI: 10.1155/2016/3016245
Source DB: PubMed Journal: Comput Math Methods Med ISSN: 1748-670X Impact factor: 2.238
10-year risk of CVD in the Blue Mountains Eye Study (BMES) for the 10 parameters.
| Factor | Persons who developed CVD in 10-year follow-up | Persons without CVD in 10-year follow-up |
|---|---|---|
| Gender M (F) | 268 (267) | 688 (1183) |
| Current smoker (past smoker or nonsmoker) | 87 (448) | 241 (1630) |
| Total cholesterol (high > 13.2/borderline 11–13.2/normal < 11) (mmol/L) | 218/197/120 | 752/775/344 |
| High-density lipoprotein cholesterol level (high > 3.3/borderline 2.2–3.3/low < 2.2) (mmol/L) | 154/170/211 | 659/665/547 |
| Systolic blood pressure (high > 120/normal 90–120/low < 90) (mmHg) | 345/190/0 | 941/930/0 |
| Diastolic blood pressure (high > 80/normal 60–80/low < 60) (mmHg) | 97/435/3 | 316/1554/1 |
| Body mass index (low < 18.5/normal 18.5–24.9/high > 25) (kg/m2) | 10/222/303 | 33/766/1072 |
| Diabetes (yes/no) | 51/484 | 113/1758 |
| Medication for hypertension (yes/no) | 196/339 | 526/1345 |
Coefficients in the Framingham risk estimation for 10-year general cardiovascular disease risk [1].
| Variable | Beta (men) | P value (men) | Hazard ratio (men) | 95% CI (men) | Beta (women) | P value (women) | Hazard ratio (women) | 95% CI (women) |
|---|---|---|---|---|---|---|---|---|
| Log of age | 3.11296 | <0.0001 | 22.49 | (14.80, 34.16) | 2.72107 | <0.0001 | 15.20 | (8.59, 26.87) |
| Log of body mass index | 0.79277 | <0.0066 | 2.21 | (1.25, 3.91) | 0.51125 | <0.0609 | 1.67 | (0.98, 2.85) |
| Log of SBP if not treated | 1.85508 | <0.0001 | 6.39 | (3.61, 11.33) | 2.81291 | <0.0001 | 16.66 | (8.27, 33.54) |
| Log of SBP if treated | 1.92672 | <0.0001 | 6.87 | (3.90, 12.08) | 2.88267 | <0.0001 | 17.86 | (8.97, 35.57) |
| Smoking | 0.70953 | <0.0001 | 2.03 | (1.75, 2.37) | 0.61868 | <0.0001 | 1.86 | (1.53, 2.25) |
| Diabetes | 0.53160 | <0.0001 | 1.70 | (1.37, 2.11) | 0.77763 | <0.0001 | 2.18 | (1.63, 2.91) |
The 10-year risk for women can be calculated as $1 - 0.94833^{\exp(\sum \beta X \ldots)}$, where $\beta$ is the regression coefficient and $X$ is the level for each risk factor; the risk for men is given as $1 - 0.88431^{\exp(\sum \beta X \ldots)}$.
Estimated regression coefficient.
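The risk formula in the footnote can be sketched in Python as follows. The coefficients and baseline survival values are copied from the table and footnote above. The exponent is truncated in this text, so the sketch assumes the usual Cox-model form in which a centering constant is subtracted from Σ βX; that constant is not given here, and the parameter `centering` is a hypothetical name the caller must supply. The function and dictionary names are likewise illustrative, not taken from the paper.

```python
import math

# Coefficients and baseline survival copied from the table and footnote above.
BETA = {
    "men": {
        "ln_age": 3.11296, "ln_bmi": 0.79277,
        "ln_sbp_untreated": 1.85508, "ln_sbp_treated": 1.92672,
        "smoking": 0.70953, "diabetes": 0.53160,
    },
    "women": {
        "ln_age": 2.72107, "ln_bmi": 0.51125,
        "ln_sbp_untreated": 2.81291, "ln_sbp_treated": 2.88267,
        "smoking": 0.61868, "diabetes": 0.77763,
    },
}
BASELINE_SURVIVAL = {"men": 0.88431, "women": 0.94833}


def linear_predictor(sex, age, bmi, sbp, treated, smoker, diabetic):
    """Return sum(beta * X) for one person (age in years, sbp in mmHg)."""
    b = BETA[sex]
    # A different SBP coefficient applies when blood pressure is treated.
    sbp_beta = b["ln_sbp_treated"] if treated else b["ln_sbp_untreated"]
    return (
        b["ln_age"] * math.log(age)
        + b["ln_bmi"] * math.log(bmi)
        + sbp_beta * math.log(sbp)
        + b["smoking"] * (1.0 if smoker else 0.0)
        + b["diabetes"] * (1.0 if diabetic else 0.0)
    )


def ten_year_risk(sex, lp, centering):
    """1 - S0 ** exp(lp - centering).

    ASSUMPTION: `centering` stands in for the constant that is truncated
    in the text above; it is not defined in this document.
    """
    return 1.0 - BASELINE_SURVIVAL[sex] ** math.exp(lp - centering)
```

With `centering` equal to a person's own linear predictor the exponent is 1, so the function returns one minus the baseline survival (for men, 1 − 0.88431); larger linear predictors give higher risk.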
Potential risk features ranked by weights obtained using support vector machine (SVM) feature selection [20], Blue Mountains Eye Study 10-year follow-up data.
| Rank | Attribute | SVM weight |
|---|---|---|
| 1 | Age (per 1 year) | 3.21660913 |
| 2 | Body mass index (per 1 kg/m2) | 0.15610062 |
| 3 | Current smoker (past/never smoked) | 0.06839195 |
| 4 | Gender (male/female) | 0.05784681 |
| 5 | Total cholesterol (per 1 mmol/L) | 0.04203396 |
| 6 | Systolic blood pressure (per 1 mmHg) | 0.01872727 |
| 7 | High-density lipoprotein cholesterol (per 1 mmol/L) | 0.01231242 |
| 8 | Diabetes (versus no diabetes) | 0.00610169 |
| 9 | Medication for hypertension (versus no medication for hypertension) | 0.00104436 |
| 10 | Retinopathy (yes/no) | 0.00064500 |
| 11 | Diastolic blood pressure (per 1 mmHg) | 0.00023068 |
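As a small illustration of the ranking step, the sketch below orders the attributes by the magnitude of the weights listed in the table. It does not reproduce the feature selection method of [20], which is not described in this text; it only sorts the published weights. The "share" it prints is an illustrative quantity introduced here, not a statistic from the paper.

```python
# SVM weights copied from the table above.
SVM_WEIGHTS = {
    "Age": 3.21660913,
    "Body mass index": 0.15610062,
    "Current smoker": 0.06839195,
    "Gender": 0.05784681,
    "Total cholesterol": 0.04203396,
    "Systolic blood pressure": 0.01872727,
    "High-density lipoprotein cholesterol": 0.01231242,
    "Diabetes": 0.00610169,
    "Medication for hypertension": 0.00104436,
    "Retinopathy": 0.00064500,
    "Diastolic blood pressure": 0.00023068,
}


def rank_features(weights):
    """Sort (name, weight) pairs by absolute weight, largest first."""
    return sorted(weights.items(), key=lambda kv: abs(kv[1]), reverse=True)


ranked = rank_features(SVM_WEIGHTS)
total = sum(abs(w) for _, w in ranked)

for rank, (name, weight) in enumerate(ranked, start=1):
    # Share of the summed absolute weight (illustrative only).
    print(f"{rank:2d}  {name:40s} {weight:.8f}  share={abs(weight) / total:.3f}")
```

Sorted this way, age dominates the list by a wide margin, which matches the ordering in the table.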
Regression coefficients and associated statistics obtained from BMES dataset for male and female subjects.
| Feature | Coefficient (male) | P value (male) | Odds ratio (male) | 95% confidence interval (male) | Coefficient (female) | P value (female) | Odds ratio (female) | 95% confidence interval (female) |
|---|---|---|---|---|---|---|---|---|
| Age (per 1 year) | 0.0144 | <0.00001 | 1.015 | (1.012, 1.017) | 0.0110 | <0.00001 | 1.011 | (1.009, 1.012) |
| Body mass index (per 1 kg/m2) | 0.0084 | 0.0024 | 1.008 | (1.002, 1.013) | 0.0018 | 0.2317 | 1.002 | (0.998, 1.004) |
| Current smoker (versus past or never smoker) | 0.0911 | 0.0005 | 1.095 | (1.042, 1.153) | 0.0749 | 0.0003 | 1.078 | (1.034, 1.122) |
| Systolic blood pressure (per 1 mmHg) | 0.0003 | 0.5700 | 1.000 | (0.999, 1.001) | 0.0003 | 0.3416 | 1.000 | (0.999, 1.0009) |
| Medication for hypertension (versus no medication for hypertension) | −0.0042 | 0.8589 | 0.995 | (0.953, 1.039) | 0.0218 | 0.1521 | 1.022 | (0.992, 1.0219) |
| Diabetes (versus no diabetes) | 0.0460 | 0.1834 | 1.047 | (0.979, 1.119) | 0.0080 | 0.7852 | 1.008 | (0.951, 1.067) |
| Total cholesterol (per 1 mmol/L) | 0.0131 | 0.1615 | 1.013 | (0.995, 1.032) | 0.0016 | 0.8052 | 1.002 | (0.989, 1.014) |
| High-density lipoprotein cholesterol (per 1 mmol/L) | 0.0202 | 0.4524 | 1.020 | (0.969, 1.074) | 0.0046 | 0.7828 | 1.005 | (0.972, 1.037) |
Logistic regression constant β₀ for male subjects = −5.70203; logistic regression constant β₀ for female subjects = −5.30218.
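The odds ratios in the table follow from the coefficients by exponentiation, which the short sketch below checks for a few rows. The helper `logistic_probability` shows the standard logistic link for turning an intercept and coefficients into a probability; whether the paper applied its coefficients in exactly this way is an assumption, and all function names here are illustrative.

```python
import math


def odds_ratio(coefficient):
    """Odds ratio per unit increase of a feature: exp(coefficient)."""
    return math.exp(coefficient)


def logistic_probability(intercept, coefficients, features):
    """Standard logistic model: p = 1 / (1 + exp(-(b0 + sum(b_i * x_i)))).

    ASSUMPTION: this is the textbook logistic link, not a confirmed
    description of how the paper computed its predictions.
    """
    z = intercept + sum(coefficients[k] * features[k] for k in coefficients)
    return 1.0 / (1.0 + math.exp(-z))


# Coefficients for "current smoker" copied from the table above.
print(round(odds_ratio(0.0911), 3))  # male
print(round(odds_ratio(0.0749), 3))  # female
```

Rounded to three decimals, these reproduce the tabulated odds ratios for current smokers (1.095 for men and 1.078 for women).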
Confusion matrix using Framingham equation (FEq).
| | Test negative | Test positive | Not classifiable | Total |
|---|---|---|---|---|
| No cardiovascular disease | 252 | 108 | 46 | 406 |
| Cardiovascular disease | 37 | 40 | 27 | 104 |
Confusion matrix using logistic regression analysis (LRA).
| | Test negative | Test positive | Not classifiable | Total |
|---|---|---|---|---|
| No cardiovascular disease | 338 | 68 | 0 | 406 |
| Cardiovascular disease | 54 | 50 | 0 | 104 |
Confusion matrix using support vector machine (SVM).
| | Test negative | Test positive | Not classifiable | Total |
|---|---|---|---|---|
| No cardiovascular disease | 349 | 57 | 0 | 406 |
| Cardiovascular disease | 33 | 71 | 0 | 104 |
Sensitivity and specificity for SVM, Framingham model, and logistic regression model with diagnostic odds ratio.
| Parameter | SVM model: value | SVM model: 95% CI | Framingham risk model: value | Framingham risk model: 95% CI | LRA model: value | LRA model: 95% CI |
|---|---|---|---|---|---|---|
| Sensitivity | 0.682 | 0.589 to 0.764 | 0.52 | 0.4096 to 0.6275 | 0.48 | 0.3817 to 0.5809 |
| Specificity | 0.859 | 0.8224 to 0.89 | 0.70 | 0.6508 to 0.745 | 0.83 | 0.7926 to 0.8675 |
| Positive likelihood ratio | 4.863 | 3.697 to 6.396 | 1.73 | 1.326 to 2.261 | 2.87 | 2.14 to 3.85 |
| Negative likelihood ratio | 0.369 | 0.278 to 0.491 | 0.69 | 0.539 to 0.871 | 0.62 | 0.52 to 0.75 |
| Diagnostic odds ratio | 13.173 | 7.999 to 21.696 | 2.523 | 1.529 to 4.162 | 4.602 | 2.892 to 7.324 |
| AUC | 0.71 | | 0.57 | | 0.63 | |
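The point estimates in this table can be recomputed from the three confusion matrices using the standard definitions, as in the sketch below. The counts are copied from the confusion matrices above; "not classifiable" cases are left out of the calculation, which is an assumption that happens to match the reported Framingham values. Confidence intervals and AUC are not computed here, and the function name is illustrative.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Standard diagnostic accuracy measures from a 2x2 confusion matrix."""
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
        "dor": lr_pos / lr_neg,           # diagnostic odds ratio
    }


# Counts copied from the confusion matrices above
# (tp/fn: CVD row; fp/tn: no-CVD row; unclassifiable cases excluded).
CONFUSION = {
    "SVM": dict(tp=71, fn=33, fp=57, tn=349),
    "FEq": dict(tp=40, fn=37, fp=108, tn=252),
    "LRA": dict(tp=50, fn=54, fp=68, tn=338),
}

results = {name: diagnostic_metrics(**cm) for name, cm in CONFUSION.items()}
for name, m in results.items():
    print(name, {k: round(v, 3) for k, v in m.items()})
```

The diagnostic odds ratio is computed as the ratio of the two likelihood ratios, which is algebraically the same as (TP × TN) / (FP × FN).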
Figure 1: ROC graph for SVM, LRA, and FEq.