Min Kim, Younghyun Kang, Seng Chan You, Hyung-Deuk Park, Sang-Soo Lee, Tae-Hoon Kim, Hee Tae Yu, Eue-Keun Choi, Hyoung-Seob Park, Junbeom Park, Young Soo Lee, Ki-Woon Kang, Jaemin Shim, Jung-Hoon Sung, Il-Young Oh, Jong Sung Park, Boyoung Joung.
Abstract
This study assessed the utility of machine learning (ML) algorithms in predicting clinically relevant atrial high-rate episodes (AHREs), which can be recorded by a pacemaker. We aimed to develop ML-based models that predict clinically relevant AHREs from the clinical parameters of patients with implanted pacemakers and to compare them with logistic regression (LR). We included 721 patients without known atrial fibrillation or atrial flutter from a prospective multicenter registry (11 tertiary hospitals) comprising all geographical regions of Korea from September 2017 to July 2020. Predictive models of clinically relevant AHREs were developed using the random forest (RF), support vector machine (SVM), and extreme gradient boosting (XGB) algorithms. The models were trained on data from seven hospitals, and their performance was evaluated on data from the other four hospitals. During a median follow-up of 18 months, clinically relevant AHREs were noted in 104 patients (14.4%). The three ML-based models discriminated AHREs better than LR (area under the receiver operating characteristic curve: RF 0.742, SVM 0.675, and XGB 0.745 vs. LR 0.669). The XGB model had a greater resolution component of the Brier score than the other models (XGB 0.021 vs. RF 0.008, SVM 0.008, and LR 0.013). The use of ML-based models in patient classification was associated with improved prediction of clinically relevant AHREs after pacemaker implantation.
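The abstract describes a hospital-level split: models are fitted on some hospitals and evaluated on others. A minimal sketch of that idea follows, assuming each patient record carries a site label. The record layout, the site codes `H01` to `H11`, and the choice of which sites are held out are all hypothetical; the source does not say which hospitals were assigned to which set.

```python
def split_by_site(records, validation_sites):
    """Split patient records into training and validation sets by hospital.

    Every patient from a given hospital lands in exactly one set, so the
    validation set contains only hospitals the model never saw in training.
    """
    validation_sites = set(validation_sites)
    train = [r for r in records if r["site"] not in validation_sites]
    valid = [r for r in records if r["site"] in validation_sites]
    return train, valid


def event_rate(records):
    """Fraction of records with a clinically relevant AHRE (ahre == 1)."""
    return sum(r["ahre"] for r in records) / len(records)


# Hypothetical toy registry: 11 sites, 10 patients each, synthetic outcomes.
sites = [f"H{i:02d}" for i in range(1, 12)]
records = [
    {"site": s, "patient": f"{s}-{j}", "ahre": int(j == 0)}
    for s in sites
    for j in range(10)
]

# Assumed assignment: the last four sites are held out for validation.
train, valid = split_by_site(records, validation_sites=sites[7:])

print(len(train), len(valid))     # 70 40
print(round(104 / 721 * 100, 1))  # 14.4, the event rate reported in the abstract
```

Splitting by site rather than by patient gives a stricter check of how a model transfers to hospitals it was not fitted on, at the cost of a smaller validation set.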
Year: 2022 PMID: 34996990 PMCID: PMC8741914 DOI: 10.1038/s41598-021-03914-4
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Study population selection process.
Figure 2. Flow diagram for the modeling process.
Baseline characteristics of the patients with and without clinically relevant AHREs.
| Variables | Total (n = 721) | Clinically relevant AHREs (-) (n = 617) | Clinically relevant AHREs (+) (n = 104) | P value |
|---|---|---|---|---|
| Age, (years) | 73 (65, 79) | 73 (65, 78) | 74 (68, 80) | 0.155 |
| Male, n (%) | 285 (39.5) | 242 (39.2) | 43 (41.3) | 0.763 |
| Body mass index, (kg/m2) | 24.1 (22.2, 26.2) | 24.2 (22.2, 26.2) | 23.9 (22.1, 26.1) | 0.566 |
| Smoking (former/current), n (%) | 90 (12.5) | 78 (12.6) | 12 (11.5) | 0.877 |
| Alcohol (former/current), n (%) | 93 (12.9) | 80 (13.0) | 13 (12.5) | 1.000 |
| Heart failure, n (%) | 26 (3.6) | 23 (3.7) | 3 (2.9) | 1.000 |
| Hypertension, n (%) | 484 (67.1) | 416 (67.4) | 68 (65.4) | 0.767 |
| Diabetes, n (%) | 196 (27.2) | 173 (28.0) | 23 (22.1) | 0.256 |
| Prior stroke/TIA, n (%) | 80 (11.1) | 64 (10.4) | 16 (15.4) | 0.181 |
| Vascular disease, n (%) | 71 (9.8) | 58 (9.4) | 13 (12.5) | 0.422 |
| Dyslipidemia, n (%) | 222 (30.8) | 197 (31.9) | 25 (24.0) | 0.134 |
| Chronic kidney disease, n (%) | 59 (8.2) | 50 (8.1) | 9 (8.7) | 1.000 |
| CHA2DS2-VASc score* | 3 (2, 4) | 3 (2, 4) | 3 (2, 4) | 0.454 |
| CHA2DS2-VASc score category | | | | 0.955 |
| 0, n (%) | 21 (2.9) | 18 (2.9) | 3 (2.9) | |
| 1, n (%) | 104 (14.4) | 90 (14.6) | 14 (13.5) | |
| ≥ 2, n (%) | 596 (82.7) | 509 (82.5) | 87 (83.7) | |
| Pacemaker indication | | | | < 0.001 |
| Sick sinus syndrome, n (%) | 312 (43.3) | 248 (40.2) | 64 (61.5) | |
| AV block, n (%) | 409 (56.7) | 369 (59.8) | 40 (38.5) | |
| Baseline systolic blood pressure, (mmHg) | 135 (121, 148) | 135 (122, 148) | 133 (120, 145) | 0.174 |
| Baseline diastolic blood pressure, (mmHg) | 71 (64, 80) | 71 (64, 80) | 72 (63, 80) | 0.735 |
| Baseline heart rate, (/min) | 60 (50, 72) | 60 (50, 72) | 60 (50, 72) | 0.954 |
| Baseline eGFR, (mL/min/1.73 m2) | 78.0 (62.0, 91.0) | 77.0 (63.0, 92.0) | 80.5 (62.0, 88.3) | 0.894 |
| QRS duration, (ms) | 106 (90, 142) | 108 (90, 144) | 98 (88, 134) | 0.038 |
| QTc interval, (ms) | 455 (422, 488) | 455 (423, 488) | 451 (420, 477) | 0.301 |
| LA diameter, (mm) | 40 (36, 45) | 40 (35, 45) | 42 (37, 45) | 0.094 |
| LVEF, (%) | 65 (60, 70) | 65 (60, 70) | 65 (60, 70) | 0.868 |
| APC > 1% at pre-implantation, n (%) | 53 (7.4) | 38 (6.2) | 15 (14.4) | 0.005 |
| VPC > 1% at pre-implantation, n (%) | 49 (6.8) | 39 (6.3) | 10 (9.6) | 0.306 |
| ARB/ACEi, n (%) | 312 (43.3) | 273 (44.2) | 39 (37.5) | 0.239 |
| Beta adrenergic receptor blocker, n (%) | 112 (15.5) | 84 (13.6) | 28 (26.9) | 0.001 |
| Calcium channel blocker, n (%) | 224 (31.1) | 194 (31.4) | 30 (28.8) | 0.678 |
| Statin, n (%) | 306 (42.4) | 267 (43.3) | 39 (37.5) | 0.320 |
| Diuretics, n (%) | 162 (22.5) | 139 (22.5) | 23 (22.1) | 1.000 |
Data are presented as number (%) or median (interquartile range).
ACEi angiotensin-converting-enzyme inhibitor, AHREs atrial high-rate episodes, APC atrial premature complex, ARB angiotensin receptor blocker, AV atrioventricular, GFR glomerular filtration rate, LA left atrium, LVEF left ventricular ejection fraction, QTc corrected QT, TIA transient ischemic attack, VPC ventricular premature complex.
*The CHA2DS2-VASc score is a measure of the risk of stroke in patients with atrial fibrillation; it ranges from 0 to 9, with higher scores indicating greater risk. Its components are congestive heart failure, hypertension, age 75 years or older (doubled), diabetes, stroke (doubled), vascular disease, age 65 to 74 years, and sex category (female).
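The scoring rule in the footnote above can be written out as a short function. This is a sketch of the standard point weights as the footnote lists them (two points for age 75 or older and for prior stroke, one point for each other component); the function and parameter names are illustrative, and counting a prior TIA together with stroke is an assumption based on the table's "Prior stroke/TIA" row.

```python
def cha2ds2_vasc(age, female, heart_failure, hypertension,
                 diabetes, stroke_tia, vascular_disease):
    """Return the CHA2DS2-VASc score (0 to 9) from its components."""
    score = 0
    score += 1 if heart_failure else 0     # C: congestive heart failure
    score += 1 if hypertension else 0      # H: hypertension
    if age >= 75:                          # A2: age 75 or older (doubled)
        score += 2
    elif age >= 65:                        # A: age 65 to 74
        score += 1
    score += 1 if diabetes else 0          # D: diabetes
    score += 2 if stroke_tia else 0        # S2: prior stroke/TIA (doubled)
    score += 1 if vascular_disease else 0  # V: vascular disease
    score += 1 if female else 0            # Sc: sex category (female)
    return score


# A 70-year-old woman with hypertension and no other risk factors:
# 1 (age 65 to 74) + 1 (female) + 1 (hypertension)
example = cha2ds2_vasc(age=70, female=True, heart_failure=False,
                       hypertension=True, diabetes=False,
                       stroke_tia=False, vascular_disease=False)
print(example)  # 3
```

The two age bands are mutually exclusive, which is why the maximum is 9 rather than 10.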
Performance characteristics of the models in the validation set for predicting clinically relevant AHREs in patients with pacemakers.
| Metric | Logistic regression | RF | SVM | XGB |
|---|---|---|---|---|
| AUROC (95% CI) | 0.669 (0.536–0.803) | 0.742 (0.637–0.835) | 0.675 (0.561–0.789) | 0.745 (0.631–0.847) |
| AUPRC (95% CI) | 0.182 (0.104–0.274) | 0.224 (0.119–0.397) | 0.182 (0.102–0.337) | 0.240 (0.125–0.424) |
| F1 score (95% CI) | 0.853 (0.783–0.881) | 0.888 (0.845–0.925) | 0.865 (0.821–0.905) | 0.896 (0.857–0.931) |
| Accuracy (95% CI) | 0.753 (0.677–0.819) | 0.805 (0.734–0.865) | 0.773 (0.698–0.836) | 0.818 (0.748–0.876) |
| Sensitivity (95% CI) | 0.815 (0.739–0.876) | 0.881 (0.815–0.931) | 0.830 (0.755–0.889) | 0.889 (0.823–0.936) |
| Specificity (95% CI) | 0.316 (0.126–0.566) | 0.263 (0.091–0.512) | 0.368 (0.163–0.616) | 0.316 (0.126–0.566) |
| PPV (95% CI) | 0.194 (0.074–0.375) | 0.238 (0.082–0.472) | 0.233 (0.099–0.423) | 0.286 (0.113–0.522) |
| NPV (95% CI) | 0.894 (0.824–0.943) | 0.895 (0.830–0.941) | 0.903 (0.837–0.949) | 0.902 (0.839–0.947) |
| Brier score, overall | 0.181 | 0.138 | 0.158 | 0.141 |
| Brier score, reliability | 0.086 | 0.038 | 0.058 | 0.054 |
| Brier score, resolution | 0.013 | 0.008 | 0.008 | 0.021 |
| Brier score, uncertainty | 0.108 | 0.108 | 0.108 | 0.108 |
AHREs atrial high-rate episodes, AUPRC area under the precision-recall curve, AUROC area under receiver operating characteristic, CI confidence interval, NPV negative predictive value, PPV positive predictive value, RF random forest, SVM support vector machine, XGB extreme gradient boosting.
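The last four rows of the performance table (overall, reliability, resolution, uncertainty) are consistent with the Murphy decomposition of the Brier score, in which overall = reliability - resolution + uncertainty. The sketch below computes that decomposition for a toy set of forecasts and then checks the identity against the table's rounded values. Grouping by distinct forecast value is an assumption made here so that the identity holds exactly; the source does not state how forecasts were binned.

```python
from collections import defaultdict


def brier_decomposition(probs, outcomes):
    """Murphy decomposition of the Brier score, grouping by forecast value."""
    n = len(probs)
    base_rate = sum(outcomes) / n
    groups = defaultdict(list)
    for p, y in zip(probs, outcomes):
        groups[p].append(y)
    # Reliability: squared gap between each forecast and its observed frequency.
    reliability = sum(
        len(ys) * (p - sum(ys) / len(ys)) ** 2 for p, ys in groups.items()
    ) / n
    # Resolution: how far the observed frequencies spread from the base rate.
    resolution = sum(
        len(ys) * (sum(ys) / len(ys) - base_rate) ** 2 for ys in groups.values()
    ) / n
    uncertainty = base_rate * (1 - base_rate)
    overall = sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / n
    return overall, reliability, resolution, uncertainty


# Toy forecasts (synthetic, not study data).
toy = brier_decomposition(
    [0.1, 0.1, 0.1, 0.1, 0.6, 0.6, 0.6, 0.6],
    [0, 0, 0, 1, 1, 1, 0, 1],
)

# Rounded values from the table: (overall, reliability, resolution, uncertainty).
TABLE = {
    "LR": (0.181, 0.086, 0.013, 0.108),
    "RF": (0.138, 0.038, 0.008, 0.108),
    "SVM": (0.158, 0.058, 0.008, 0.108),
    "XGB": (0.141, 0.054, 0.021, 0.108),
}
residuals = {
    name: overall - (rel - res + unc)
    for name, (overall, rel, res, unc) in TABLE.items()
}
```

Lower reliability and higher resolution are both better. The uncertainty term depends only on the event rate in the validation set, which is why it is identical across models.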
Figure 3. Receiver operating characteristic curve analysis (A) and precision–recall curve analysis (B) for each model.
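The two curve summaries reported for each model, AUROC and AUPRC, can be computed directly from predicted scores and observed labels. The sketch below uses the pairwise (Mann-Whitney) form of the AUROC and average precision as one common estimator of the area under the precision–recall curve. The source does not state which AUPRC estimator was used, so that choice is an assumption, and the scores are synthetic.

```python
def auroc(scores, labels):
    """Probability that a random positive outranks a random negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # Count a tie between a positive and a negative as half a win.
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def average_precision(scores, labels):
    """Mean of the precision values at the rank of each positive case.

    Tied scores are ranked in input order; this sketch does not average
    over tie orderings.
    """
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Synthetic example: two events among six patients.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
labels = [1, 0, 1, 0, 0, 0]

print(auroc(scores, labels))                        # 0.875
print(round(average_precision(scores, labels), 3))  # 0.833
```

For an uninformative model the AUROC is 0.5 regardless of prevalence, whereas the AUPRC falls to roughly the event rate, so AUPRC values should be read against the proportion of events in the validation set.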
Feature importance in the three machine learning-based models.
| Rank | Random Forest: features | Random Forest: importance* | Support Vector Machine: features | Support Vector Machine: importance* | eXtreme Gradient Boosting: features | eXtreme Gradient Boosting: importance* |
|---|---|---|---|---|---|---|
| 1 | Prior stroke/TIA | 0.184 | Beta adrenergic receptor blockers | 0.254 | Beta adrenergic receptor blockers | 0.306 |
| 2 | Beta adrenergic receptor blockers | 0.178 | Prior stroke/TIA | 0.253 | Prior stroke/TIA | 0.158 |
| 3 | > 1% APC on Holter, pre-implantation | 0.145 | > 1% APC on Holter, pre-implantation | 0.220 | LA diameter | 0.132 |
| 4 | Dyslipidemia | 0.111 | LA diameter | 0.088 | Age | 0.105 |
| 5 | LA diameter | 0.104 | Baseline systolic blood pressure | 0.063 | Baseline systolic blood pressure | 0.075 |
| 6 | QRS duration | 0.101 | Dyslipidemia | 0.044 | QRS duration | 0.073 |
| 7 | Baseline systolic blood pressure | 0.085 | Pacemaker indication | 0.040 | > 1% APC on Holter, pre-implantation | 0.071 |
| 8 | Age | 0.057 | QRS duration | 0.030 | Dyslipidemia | 0.030 |
| 9 | Pacemaker indication | 0.030 | Diabetes | 0.005 | Diabetes | 0.028 |
| 10 | Diabetes | 0.006 | Age | 0.003 | Pacemaker indication | 0.023 |
AAD average absolute deviation, APC atrial premature complex, CI confidence interval, TIA transient ischemic attack, LA left atrium.
*Feature importance was measured by sensitivity analysis.
Figure 4. Feature importance index plot for each model.
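The feature importance table states only that importance was measured by sensitivity analysis and lists AAD (average absolute deviation) among its abbreviations. One way such a measure can be built is sketched below, as an assumption and not a description of the authors' procedure: each feature is swept across its observed range while the others are held at their medians, the spread of the model's predictions is summarised by the AAD from their median, and the results are normalised to sum to 1. The function name, the number of levels, and the toy model are hypothetical.

```python
from statistics import median


def sensitivity_importance(predict, rows, levels=7):
    """One-dimensional sensitivity analysis with an AAD response measure.

    predict: callable taking one feature vector and returning a number.
    rows:    list of feature vectors used to find each feature's range.
    """
    columns = list(zip(*rows))
    baseline = [median(col) for col in columns]
    raw = []
    for j, col in enumerate(columns):
        lo, hi = min(col), max(col)
        responses = []
        for k in range(levels):
            x = list(baseline)
            x[j] = lo + (hi - lo) * k / (levels - 1)  # sweep feature j only
            responses.append(predict(x))
        centre = median(responses)
        # AAD: mean absolute deviation of the responses from their median.
        raw.append(sum(abs(r - centre) for r in responses) / levels)
    total = sum(raw)
    return [v / total for v in raw] if total else raw


# Hypothetical linear model: feature 0 matters three times as much as
# feature 1, and feature 2 is ignored.
def toy_model(x):
    return 0.3 * x[0] + 0.1 * x[1] + 0.0 * x[2]


rows = [[0.0, 0.0, 0.0], [0.5, 0.5, 0.5], [1.0, 1.0, 1.0]]
importance = sensitivity_importance(toy_model, rows)
print([round(v, 3) for v in importance])  # [0.75, 0.25, 0.0]
```

The importances in the table sum to approximately 1 within each model, which is consistent with a normalised measure of this kind. Because the sweep changes one feature at a time, a scheme like this does not capture interactions between features.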