Yen-Chun Huang, Shao-Jung Li, Mingchih Chen, Tian-Shyug Lee, Yu-Ning Chien.
Abstract
Coronary artery bypass grafting (CABG) is a common and effective treatment for patients with coronary artery disease. Although underlying disease and advancing age are known to be related to survival, no research has used factors from the year before surgery together with operation-associated factors as predictors. This research used different machine-learning methods to select features and predict survival in older adults (more than 65 years old). This nationwide population-based cohort study used the National Health Insurance Research Database (NHIRD), the largest and most complete dataset in Taiwan. We extracted data on older patients who met the criteria for a first CABG surgery between January 2008 and December 2009 (n = 3728), and we used five different machine-learning methods to select features and predict survival. The results show that, without variable selection, XGBoost had the best predictive ability. After selecting variables with XGBoost and adding the CHA2DS score, acute pancreatitis, and acute kidney failure for further predictive analysis, MARS had the best prediction performance, and it needed only 10 variables. The strengths of this study are that it is innovative and useful for clinical decision making, and that machine learning could achieve better prediction with fewer variables. If patients' survival risk could be predicted before a CABG operation, early prevention and disease management would be possible.
Keywords: CABG; NHIRD; National Health Insurance Research Database; feature selection; machine learning; older adults; overall survival prediction
Year: 2021 PMID: 34067148 PMCID: PMC8151160 DOI: 10.3390/healthcare9050547
Source DB: PubMed Journal: Healthcare (Basel) ISSN: 2227-9032
Figure 1. Patient selection and further analysis of 3728 older adult patients who had undergone first-time coronary artery bypass grafting (CABG) between 2008 and 2009.
Demographic features of older CABG adults in Taiwan from 2008 to 2009.
| Variables | ≥65 Dead, n | ≥65 Dead, % | ≥65 Alive, n | ≥65 Alive, % | p-value |
|---|---|---|---|---|---|
| Female | 682 | 30.02 | 421 | 28.91 | 0.471 |
| Male | 1590 | 69.98 | 1035 | 71.09 | |
| Age, mean (SD), y | 74.30 (5.60) | | 71.27 (4.78) | | <0.001 |
| Follow up years, mean (SD) | 4.42 (3.14) | | 10.05 (0.57) | | <0.001 |
| Follow up years, median | 4.22 | | 10.02 | | - |
| CHA2DS score, mean (SD) | 4.21 (1.67) | | 3.30 (1.57) | | <0.001 |
| Comorbidities | | | | | |
| DM | 1477 | 65.01 | 739 | 50.76 | <0.0001 |
| Hypertension | 624 | 27.46 | 379 | 26.03 | 0.335 |
| Hyperlipidemia | 1522 | 66.99 | 1056 | 72.53 | <0.001 |
| MI | 1182 | 52.02 | 560 | 38.46 | <0.001 |
| Liver cirrhosis | 50 | 2.2 | 10 | 0.69 | <0.001 |
| CHF | 1385 | 60.96 | 563 | 38.67 | <0.001 |
| CAD | 2222 | 97.8 | 1435 | 98.56 | 0.098 |
| PVD | 541 | 23.81 | 248 | 17.03 | <0.0001 |
| Acute pancreatitis | 43 | 1.89 | 21 | 1.44 | 0.301 |
| Malignant dysrhythmia | 104 | 4.58 | 58 | 3.98 | 0.385 |
| Intracranial bleeding | 53 | 2.33 | 14 | 0.96 | 0.002 |
| AF | 348 | 15.32 | 159 | 10.92 | <0.001 |
| TIA | 951 | 41.86 | 424 | 29.12 | <0.0001 |
| CKD | 572 | 25.18 | 129 | 8.86 | <0.0001 |
| ACS | 1490 | 65.58 | 810 | 55.63 | <0.0001 |
| COPD | 1043 | 45.91 | 558 | 38.32 | <0.0001 |
| Stroke | 947 | 41.68 | 423 | 29.05 | <0.0001 |
| Cancer | 164 | 7.22 | 66 | 4.53 | <0.001 |
| CCIS | | | | | |
| 0 | 75 | 3.3 | 139 | 9.55 | <0.0001 |
| 1 | 269 | 11.84 | 330 | 22.66 | |
| 2 | 383 | 16.86 | 362 | 24.86 | |
| 3 | 424 | 18.66 | 239 | 16.41 | |
| 4 | 341 | 15.01 | 165 | 11.33 | |
| 5 | 275 | 12.1 | 115 | 7.9 | |
| 6+ | 505 | 22.23 | 106 | 7.28 | |
| Mean (SD) | 3.86 (2.40) | | 2.59 (1.93) | | <0.0001 |
| Operation-associated factors | | | | | |
| Anastomosis vessels, mean (SD) | 2.64 (0.72) | | 2.79 (0.77) | | <0.001 |
| Length of stay (LOS), mean (SD) | 25.59 (14.77) | | 18.29 (9.15) | | <0.001 |
| Blood transfusion (bag), mean (SD) | 10.89 (14.68) | | 7.23 (5.31) | | <0.001 |
| Mechanical ventilation (day), mean (SD) | 7.16 (13.90) | | 2.76 (3.09) | | <0.001 |
| Surgical cost | 611,701 (488,753) | | 394,843 (165,389) | | <0.001 |
| One year before surgery | | | | | |
| Outpatient visits, mean (SD) | 37.70 (23.34) | | 32.36 (20.13) | | <0.001 |
| Hospitalization, mean (SD) | 1.91 (1.34) | | 1.45 (0.82) | | <0.001 |
| ED visits, mean (SD) | 58 | 2.55 | 14 | 0.96 | <0.001 |
| Blood transfusion (bag), mean (SD) | 3.83 (3.69) | | 4.09 (4.87) | | 0.636 |
| Mechanical ventilation (day), mean (SD) | 5.55 (13.48) | | 3.93 (4.05) | | 0.373 |
| Medical cost (related cardiology department), mean (SD) (thousand NT$) | 81,957 (107,098) | | 60,969 (80,674) | | <0.0001 |
| Medical cost (thousand NT$) | 155,186 (197,087) | | 91,439 (98,235) | | <0.0001 |
CCIS: Charlson comorbidity index score; SD: standard deviation; ED: emergency department; MI: myocardial infarction; CHF: congestive heart failure; CAD: coronary artery disease; PVD: peripheral vascular disease; AF: atrial fibrillation; TIA: transient ischemic attack; CKD: chronic kidney disease; ACS: acute coronary syndrome; COPD: chronic obstructive pulmonary disease; AKF: acute kidney failure; DM: diabetes mellitus.
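
The source does not say which statistical test produced the p-values in the table above. As an illustration only, the sketch below applies a Pearson chi-square test (without continuity correction, an assumption) to the gender counts from the table. The function name `chi_square_2x2` is hypothetical. On these counts the test gives a p-value of roughly 0.47, which is in line with the reported 0.471, but this does not establish that the authors used the same test.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns the test statistic and its p-value for 1 degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Gender counts from the table: rows are dead / alive, columns are female / male.
chi2, p_value = chi_square_2x2(682, 1590, 421, 1035)
print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}")
```

The same function could be applied to any of the binary comorbidity rows by passing the counts with and without the condition in each group.
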
Ranking of essential variables of older CABG adults.
| Variables | LGR | RF | CART | MARS | XGBoost |
|---|---|---|---|---|---|
| Operation-associated factors | | | | | |
| Blood transfusion, (Bag), mean | 1 | ||||
| Length of stay (LOS), mean | 4 | ||||
| Surgical cost | 3 | 1 | 1 | ||
| One year before surgery | | | | | |
| ED visits, mean | 4 | 6 | |||
| Outpatient visits, mean | 15 | ||||
| Hospitalization, mean | 3 | ||||
| Mechanical ventilation, (Day), mean | 16 | 7 | 7 | ||
| Blood transfusion, (Bag), mean | 1 | ||||
| Medical cost | 8 | 6 | |||
| Demographics and comorbidities | | | | | |
| Age | 11 | 5 | 3 | 2 | |
| CHF | 7 | 4 | 6 | 5 | |
| CKD | 7 | ||||
| ACS | 12 | ||||
| CAD | 2 | ||||
| CCI score | 9 | 2 | 3 | ||
| COPD | 11 | ||||
| PVD | 14 | ||||
| Diabetes mellitus | 5 | 5 | |||
| Renal disease | 1 | 4 | 4 | ||
| Major illness | 8 | ||||
| Ischemic stroke | 3 | ||||
| CHA2DS2 scores | 2 | ||||
| Ulcer disease | 17 | 7 | |||
| Hypertension | 6 | ||||
| Hyperlipidemia | 2 | ||||
| AKF | 13 | ||||
| Acute pancreatitis | 10 | ||||
| Connective tissue disease | 9 | 8 | |||
| Moderate or severe renal disease | 5 | 9 | 6 | ||
| Moderate or severe liver disease | 10 | ||||
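
The ranking table above orders variables by importance within each method. The general idea of "rank the variables, then keep the top few" can be sketched as follows. This is a deliberately simplified stand-in: it ranks by absolute Pearson correlation with the outcome (a univariate filter), not by the model-based importance of LGR, RF, CART, MARS, or XGBoost used in the study. The function names, feature names, and toy data are all hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists (0.0 if either is constant)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def rank_features(columns, outcome):
    """Return feature names sorted from strongest to weakest absolute correlation."""
    scores = {name: abs(pearson(values, outcome)) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)


def select_top_k(columns, outcome, k):
    """Keep only the k highest-ranked feature columns."""
    keep = rank_features(columns, outcome)[:k]
    return {name: columns[name] for name in keep}


# Toy data (hypothetical): 8 patients, outcome 1 = died, 0 = survived.
outcome = [1, 1, 1, 1, 0, 0, 0, 0]
columns = {
    "x_unrelated": [1, 0, 1, 0, 1, 0, 1, 0],
    "x_moderate": [3, 4, 2, 5, 2, 3, 1, 4],
    "x_strong": [5, 6, 7, 8, 1, 2, 3, 4],
}

ranking = rank_features(columns, outcome)
reduced = select_top_k(columns, outcome, k=2)
print(ranking)
print(list(reduced))
```

In a workflow like the one the abstract describes, the reduced column set would then be passed to each prediction model, and the models compared on held-out data.
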
Performance evaluation of prediction models without feature selection and after feature selection.
| Feature selection | Method | Accuracy | Kappa | Sensitivity | Specificity | AUC |
|---|---|---|---|---|---|---|
| Overall | LGR | 0.7198 | 0.4427 | 0.6711 | 0.7939 | 0.7926 |
| | RF | 0.7077 | 0.3965 | 0.7355 | 0.6655 | 0.7784 |
| | MARS | 0.7104 | 0.4294 | 0.6444 | 0.8108 | 0.7890 |
| | CART | 0.6930 | 0.3360 | 0.8111 | 0.5135 | 0.7031 |
| | XGBoost | | 0.4394 | 0.7044 | 0.7500 | 0.7934 |
| LGR selection | LGR | 0.6179 | 0.2752 | 0.4888 | 0.8141 | 0.6981 |
| | RF | | 0.2829 | 0.5177 | 0.7905 | 0.6912 |
| | MARS | 0.6219 | 0.2771 | 0.5088 | 0.7939 | 0.6917 |
| | CART | 0.5911 | 0.2292 | 0.4533 | 0.8006 | 0.6576 |
| | XGBoost | 0.6246 | 0.2845 | 0.5044 | 0.8074 | 0.6977 |
| RF selection | LGR | 0.6876 | 0.3960 | 0.5866 | 0.8412 | 0.7784 |
| | RF | 0.6916 | 0.3937 | 0.6244 | 0.7939 | 0.7637 |
| | MARS | 0.6890 | 0.3817 | 0.6444 | 0.7567 | 0.7675 |
| | CART | 0.6930 | 0.3360 | 0.8111 | 0.5135 | 0.7031 |
| | XGBoost | | 0.4161 | 0.5977 | 0.8513 | 0.7790 |
| CART selection | LGR | 0.7091 | 0.4009 | 0.7311 | 0.6756 | 0.7624 |
| | RF | 0.6554 | 0.3464 | 0.5200 | 0.8614 | 0.7557 |
| | MARS | 0.7091 | 0.3954 | 0.7488 | 0.6486 | 0.7653 |
| | CART | 0.6930 | 0.3360 | 0.8111 | 0.5135 | 0.7031 |
| | XGBoost | | 0.4062 | 0.7444 | 0.6655 | 0.7652 |
| MARS selection | LGR | 0.6876 | 0.3960 | 0.5866 | 0.8412 | 0.7784 |
| | RF | 0.6916 | 0.3937 | 0.6244 | 0.7939 | 0.7637 |
| | MARS | 0.6890 | 0.3817 | 0.6444 | 0.7567 | 0.7675 |
| | CART | 0.6930 | 0.3360 | 0.8111 | 0.5135 | 0.7031 |
| | XGBoost | | 0.4161 | 0.5977 | 0.8513 | 0.7790 |
| XGBoost | LGR | | 0.4186 | 0.7444 | 0.6790 | 0.7739 |
| | RF | 0.6903 | 0.3800 | 0.6600 | 0.7364 | 0.7453 |
| | MARS | 0.7131 | 0.4096 | 0.7333 | 0.6824 | 0.7683 |
| | CART | 0.6930 | 0.3360 | 0.8111 | 0.5135 | 0.7031 |
| | XGBoost | 0.7104 | 0.4212 | 0.6733 | 0.7668 | 0.7763 |
| XGBoost | LGR | 0.6890 | 0.3937 | 0.6044 | 0.8175 | 0.7807 |
| | RF | 0.7037 | 0.4008 | 0.6911 | 0.7229 | 0.7727 |
| | MARS | | 0.4233 | 0.7600 | 0.6665 | 0.7831 |
| | CART | 0.6930 | 0.3360 | 0.8111 | 0.5135 | 0.7031 |
| | XGBoost | 0.6970 | 0.4069 | 0.6200 | 0.8141 | 0.7845 |
| MARS selection | LGR | 0.6916 | 0.3964 | 0.6155 | 0.8074 | 0.7780 |
| | RF | 0.6836 | 0.3806 | 0.6088 | 0.7972 | 0.7629 |
| | MARS | 0.7024 | 0.3998 | 0.6844 | 0.7297 | 0.7722 |
| | CART | 0.6930 | 0.3360 | 0.8111 | 0.5135 | 0.7031 |
| | XGBoost | | 0.4190 | 0.6600 | 0.7804 | 0.7806 |
Abbreviations: LGR: logistic regression; RF: random forest; CART: classification and regression tree; MARS: multivariate adaptive regression splines; AUC: area under the curve; XGBoost: extreme gradient boosting.
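
For readers less familiar with the metrics reported in the performance table, the sketch below shows how accuracy, Cohen's kappa, sensitivity, and specificity follow from a binary confusion matrix, and how AUC can be computed as the fraction of positive/negative pairs that a model ranks correctly. The function names and the example counts are hypothetical and are not taken from the study. Which outcome (death or survival) the source treats as the positive class is not stated, so that choice is left to the caller.

```python
def classification_metrics(tp, fn, fp, tn):
    """Accuracy, Cohen's kappa, sensitivity, and specificity from confusion-matrix counts."""
    n = tp + fn + fp + tn
    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fn) * (tp + fp) + (fn + tn) * (fp + tn)) / n ** 2
    kappa = (accuracy - expected) / (1 - expected)
    return {
        "accuracy": accuracy,
        "kappa": kappa,
        "sensitivity": sensitivity,
        "specificity": specificity,
    }


def auc(scores, labels):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in positives
        for q in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=40, fn=10, fp=20, tn=30)
example_auc = auc([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0])
print(metrics, example_auc)
```

Kappa corrects accuracy for chance agreement, which is why a model with moderate accuracy can still have a low kappa when the classes are imbalanced.
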