Chuang Zhang1, Qiongchan Guan1, Jie Qin1, Daochao Huang1, Jinhong Wu2.
Abstract
This study aimed to establish an auxiliary risk-scoring model based on random forest (RF) for patients with acute pulmonary embolism (APE) complicated with atrial fibrillation (AF) and to evaluate its application effect. A retrospective analysis was performed on the general data, underlying diseases, laboratory indicators, and cardiac indicators of 100 patients with APE admitted to our hospital from 2018 to 2021. The occurrence of AF in patients with APE was taken as the categorical outcome variable, and the general data, underlying diseases, laboratory indicators, and cardiac indicators were taken as input variables. Risk auxiliary scoring models for patients with APE complicated with AF were then established using RF and logistic regression. Finally, the accuracy, sensitivity, specificity, recall rate, accuracy, F1 value, and receiver operating characteristic (ROC) curve were used to evaluate the predictive value of the models. In the RF model, the optimal node value was 3 and the optimal number of decision trees was 500. In descending order of importance, the predictors were Hcy, diabetes mellitus, FT3 level, UA level, left atrial diameter, hypertension, and smoking history. For the RF model, the prediction accuracy was 0.934, sensitivity 0.966, specificity 0.876, recall rate 0.966, accuracy 0.934, and F1 value 0.950. For the logistic regression model, the prediction accuracy was 0.816, sensitivity 0.915, specificity 0.125, recall rate 0.902, accuracy 0.811, and F1 value 0.896. The AUC values of the RF and logistic regression models were 0.984 and 0.883, respectively. We conclude that the RF model was better than the logistic regression model at predicting AF in APE patients and therefore has clinical application value.
Year: 2022 PMID: 36046058 PMCID: PMC9424024 DOI: 10.1155/2022/2596839
Source DB: PubMed Journal: Emerg Med Int ISSN: 2090-2840 Impact factor: 1.621
Univariate analysis of risk factors for AF in patients with APE (x̄ ± s / n (%)).
| Variable | AF group (n = 24) | Non-AF group (n = 76) | t/χ² | P |
|---|---|---|---|---|
| General data | | | | |
| Age (year) | 51.31 ± 3.50 | 50.59 ± 3.20 | 0.940 | 0.350 |
| Sex (male/female) | 14 (58.33)/10 (41.67) | 43 (56.58)/33 (43.42) | | |
| Height (cm) | 166.42 ± 3.06 | 167.42 ± 2.39 | 1.666 | 0.099 |
| Time from onset to admission ( | 7.66 ± 1.79 | 7.96 ± 1.15 | 0.965 | 0.337 |
| BMI (kg/m2) | 21.53 ± 1.12 | 21.47 ± 1.22 | 0.214 | 0.831 |
| Obesity (yes/no) | 5 (20.83)/19 (79.17) | 17 (22.37)/59 (77.63) | 0.025 | 0.874 |
| Pregnant women (yes/no) | 3 (12.50)/21 (87.50) | 5 (6.58)/71 (93.42) | 0.869 | 0.351 |
| Risk degree of disease (low risk/moderate risk/high risk) | 3 (12.50)/9 (37.50)/12 (50.00) | 11 (14.47)/27 (35.53)/38 (50.00) | 0.070 | 0.965 |
| AF history (have/not have) | 6 (25.00)/18 (75.00) | 18 (23.68)/58 (76.32) | 0.017 | 0.895 |
| APE (first confirmed/not first confirmed) | 20 (83.33)/4 (16.67) | 69 (90.79)/7 (9.21) | 1.036 | 0.309 |
| Drinking history (have/not have) | 7 (29.17)/17 (70.83) | 26 (34.21)/50 (65.79) | 0.210 | 0.647 |
| Smoking history (have/not have) | 8 (33.33)/16 (66.67) | 7 (9.21)/69 (90.79) | 8.325 | 0.004 |
| Underlying diseases | | | | |
| Diabetes (have/not have) | 12 (50.00)/12 (50.00) | 8 (10.53)/68 (89.47) | 17.763 | <0.001 |
| Hypertension (have/not have) | 12 (50.00)/12 (50.00) | 15 (19.74)/61 (80.26) | 8.476 | 0.004 |
| Hyperlipidemia (have/not have) | 6 (25.00)/18 (75.00) | 17 (22.37)/59 (77.63) | 0.071 | 0.789 |
| Laboratory indicators | | | | |
| D-Dimer ( | 234.66 ± 30.36 | 235.18 ± 28.27 | 0.077 | 0.939 |
| Albumin (g/L) | 33.45 ± 4.86 | 33.27 ± 5.03 | 0.154 | 0.878 |
| Hcy ( | 18.13 ± 3.34 | 15.85 ± 2.59 | 3.497 | <0.001 |
| UA (μmol/L) | 446.26 ± 35.81 | 415.86 ± 44.87 | 3.025 | 0.003 |
| Creatinine ( | 80.26 ± 7.75 | 82.16 ± 6.41 | 1.202 | 0.232 |
| FT3 (pmol/L) | 3.85 ± 0.90 | 4.26 ± 0.85 | 2.031 | 0.044 |
| FT4 (pmol/L) | 17.55 ± 3.41 | 17.46 ± 3.35 | 0.114 | 0.909 |
| Cardiac indicators | | | | |
| Ventricular rate (time/min) | 83.26 ± 15.39 | 82.88 ± 16.42 | 0.100 | 0.920 |
| LVEF (%) | 45.58 ± 12.25 | 45.35 ± 8.64 | 0.102 | 0.919 |
| Left atrial internal diameter (mm) | 40.00 ± 4.38 | 37.51 ± 4.18 | 2.515 | 0.014 |
| Right atrial internal diameter (mm) | 47.39 ± 5.52 | 47.28 ± 5.41 | 0.086 | 0.931 |
Note. BMI, body mass index; AF, atrial fibrillation; APE, acute pulmonary embolism; Hcy, homocysteine; UA, uric acid; FT3, free triiodothyronine; FT4, free tetraiodothyronine; LVEF, left ventricular ejection fraction.
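The test statistics in the table can be checked from the reported counts and summary statistics. The sketch below is a minimal illustration, assuming an uncorrected Pearson chi-square test for categorical rows and a pooled-variance two-sample t-test for continuous rows; the source does not name the tests, and the function names are hypothetical. Group sizes of 24 and 76 follow from the counts in each row.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom, the chi-square tail probability is erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic with pooled variance, computed from summary statistics."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return (mean1 - mean2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))

# Smoking history: 8 have / 16 not have (AF group), 7 have / 69 not have (non-AF group).
smoking_chi2, smoking_p = chi_square_2x2(8, 16, 7, 69)
# Hcy: 18.13 ± 3.34 in 24 patients versus 15.85 ± 2.59 in 76 patients.
hcy_t = pooled_t(18.13, 3.34, 24, 15.85, 2.59, 76)

print(f"Smoking history: chi2 = {smoking_chi2:.3f}, P = {smoking_p:.3f}")
print(f"Hcy: t = {hcy_t:.3f}")
```

Under these assumptions the two statistics agree with the smoking history and Hcy rows of the table to three decimal places.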
Variable assignment.
| Independent variable | Assignment |
|---|---|
| Smoking history | have = 1, not have = 0 |
| Diabetes | have = 1, not have = 0 |
| Hypertension | have = 1, not have = 0 |
| Hcy | Continuous variable |
| UA | Continuous variable |
| FT3 | Continuous variable |
| Left atrial diameter | Continuous variable |
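The assignment scheme can be expressed as a small encoding step. This is a sketch only: the field names, the `encode` helper, and the example patient are hypothetical and are not taken from the study.

```python
# Binary predictors are coded have = 1, not have = 0; the others stay continuous.
BINARY_VARS = ("smoking_history", "diabetes", "hypertension")
CONTINUOUS_VARS = ("hcy", "ua", "ft3", "left_atrial_diameter")

def encode(record):
    """Turn one patient record (a dict) into a numeric feature list."""
    features = [1 if record[name] else 0 for name in BINARY_VARS]
    features += [float(record[name]) for name in CONTINUOUS_VARS]
    return features

# Hypothetical patient; the values are chosen only for illustration.
patient = {
    "smoking_history": True,
    "diabetes": False,
    "hypertension": True,
    "hcy": 18.0,
    "ua": 446.0,
    "ft3": 3.9,
    "left_atrial_diameter": 40.0,
}

print(encode(patient))
```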
Multivariate logistic regression analysis of APE patients complicated with AF.
| Variable | β | SE | Wald | P | OR | 95% CI lower limit | 95% CI upper limit |
|---|---|---|---|---|---|---|---|
| Smoking history | 2.785 | 1.078 | 6.676 | 0.010 | 16.196 | 1.959 | 133.908 |
| Diabetes | 1.758 | 0.773 | 5.172 | 0.023 | 5.800 | 1.275 | 26.383 |
| Hypertension | 1.882 | 0.772 | 5.941 | 0.015 | 6.567 | 1.446 | 29.832 |
| Hcy | 0.024 | 0.009 | 6.540 | 0.011 | 1.025 | 1.006 | 1.044 |
| UA | 0.159 | 0.079 | 4.091 | 0.043 | 1.172 | 1.005 | 1.367 |
| FT3 | 0.377 | 0.150 | 6.303 | 0.012 | 1.457 | 1.086 | 1.956 |
| Left atrial diameter | −0.932 | 0.435 | 4.594 | 0.032 | 0.394 | 0.168 | 0.923 |
| Constant | −22.071 | 7.021 | 9.881 | 0.002 | <0.001 | — | — |
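The columns of this table are linked by standard formulas: the Wald statistic is the squared ratio of the coefficient to its standard error, the OR is the exponentiated coefficient, and the 95% CI is obtained by exponentiating the coefficient plus or minus 1.96 standard errors. The sketch below recomputes these values for three rows, assuming the first numeric column is the regression coefficient; `odds_ratio_summary` is a hypothetical helper name.

```python
import math

def odds_ratio_summary(beta, se, z=1.96):
    """Return (Wald, OR, CI lower, CI upper) for a logistic regression coefficient."""
    wald = (beta / se) ** 2
    odds_ratio = math.exp(beta)
    lower = math.exp(beta - z * se)
    upper = math.exp(beta + z * se)
    return wald, odds_ratio, lower, upper

# (coefficient, SE) pairs copied from the table.
rows = {
    "Smoking history": (2.785, 1.078),
    "Diabetes": (1.758, 0.773),
    "Hypertension": (1.882, 0.772),
}

summary = {name: odds_ratio_summary(beta, se) for name, (beta, se) in rows.items()}
for name, (wald, odds_ratio, lower, upper) in summary.items():
    print(f"{name}: Wald = {wald:.3f}, OR = {odds_ratio:.3f}, 95% CI = {lower:.3f} to {upper:.3f}")
```

Small differences from the tabulated values are expected because the coefficients and standard errors are rounded to three decimals.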
Figure 1. Relationship between prediction model error and number of random trees.
Mean decrease Gini of RF model variables.
| Variable | Mean decrease Gini |
|---|---|
| Hcy | 6.290 |
| Diabetes mellitus | 4.115 |
| FT3 | 3.859 |
| UA | 3.646 |
| Left atrial internal diameter | 3.600 |
| Hypertension | 2.390 |
| Smoking history | 1.755 |
Figure 2. Order of importance of modeling variables.
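Mean decrease Gini accumulates, over the splits in a forest, how much each variable reduces node impurity. As a rough illustration of the quantity being averaged, the sketch below computes the Gini decrease of one split of the whole cohort on each binary predictor, using the counts from the univariate table. It is not the forest-level value reported above, and the function names are hypothetical.

```python
def gini(af, non_af):
    """Gini impurity of a node holding af AF cases and non_af non-AF cases."""
    p = af / (af + non_af)
    return 1.0 - (p ** 2 + (1 - p) ** 2)

def gini_decrease(have, not_have):
    """Impurity drop when the cohort is split into 'have' and 'not have' nodes.

    Each argument is an (AF count, non-AF count) pair.
    """
    n_have, n_not = sum(have), sum(not_have)
    n = n_have + n_not
    parent = gini(have[0] + not_have[0], have[1] + not_have[1])
    children = n_have / n * gini(*have) + n_not / n * gini(*not_have)
    return parent - children

# (AF, non-AF) counts for the 'have' and 'not have' groups of each predictor.
splits = {
    "Diabetes": ((12, 8), (12, 68)),
    "Hypertension": ((12, 15), (12, 61)),
    "Smoking history": ((8, 7), (16, 69)),
}

decreases = {name: gini_decrease(*counts) for name, counts in splits.items()}
for name, value in decreases.items():
    print(f"{name}: Gini decrease = {value:.4f}")
```

In this single-split view, diabetes gives the largest decrease, followed by hypertension and smoking history, which is the same relative order these three variables take in the table.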
Comparison of prediction performance between the two models.
| Model | Accuracy | Sensitivity | Specificity | Recall rate | Accuracy | F1 value |
|---|---|---|---|---|---|---|
| RF model | 0.934 | 0.966 | 0.876 | 0.966 | 0.934 | 0.950 |
| Logistic regression model | 0.816 | 0.915 | 0.125 | 0.902 | 0.811 | 0.896 |
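The metrics in this table are all derived from a confusion matrix. The sketch below shows the usual definitions on hypothetical counts (the study's confusion matrices are not given here), then recomputes the F1 value of the RF row under the assumption that the second "Accuracy" column holds precision; that reading is not confirmed by the source.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

def metrics_from_counts(tp, fp, fn, tn):
    """Standard binary classification metrics from confusion-matrix counts."""
    recall = tp / (tp + fn)  # sensitivity and recall share this definition
    precision = tp / (tp + fp)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "sensitivity": recall,
        "specificity": tn / (tn + fp),
        "precision": precision,
        "f1": f1_score(precision, recall),
    }

# Hypothetical counts, for illustration only.
example = metrics_from_counts(tp=20, fp=6, fn=4, tn=70)
print({name: round(value, 3) for name, value in example.items()})

# RF row, assuming precision = 0.934 and recall = 0.966.
rf_f1 = f1_score(0.934, 0.966)
print(f"RF F1 = {rf_f1:.3f}")
```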
Figure 3. ROC curve based on the RF prediction model.
Figure 4. ROC curve based on the logistic prediction model.
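The area under a ROC curve equals the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case, with ties counted as one half. The sketch below computes the AUC this way on hypothetical scores; the values are illustrative and are not the study's predictions.

```python
def auc(scores_pos, scores_neg):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    wins = 0.0
    for p in scores_pos:
        for q in scores_neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical predicted probabilities for AF and non-AF patients.
af_scores = [0.9, 0.8, 0.6, 0.4]
non_af_scores = [0.7, 0.5, 0.3, 0.2, 0.1]

example_auc = auc(af_scores, non_af_scores)
print(f"AUC = {example_auc:.2f}")
```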