Lv Zheng, Lv Wen, Wang Lei, Zhang Ning.
Abstract
There is a high unmet need for candidate markers that predict the clinical outcomes of pulmonary infection in stroke patients. This study aimed to develop machine learning (ML)-based predictive models for pulmonary infection.

A retrospective analysis was performed of 1397 stroke patients who underwent CT angiography from skull to diaphragm (including CT of the chest) within 24 hours of symptom onset between January 2008 and April 2021. A total of 21 variables were included, and prediction models of pulmonary infection were built with multiple ML-based algorithms. Risk factors for pulmonary infection were determined by a feature selection method. Area under the curve (AUC) was used to identify the model with the best discrimination, and decision curve analysis was used to assess the net clinical benefit of using the predictive models.

A total of 889 cases were included as the training group and 508 cases as the validation group. Feature selection indicated that the top 6 predictors were procalcitonin, C-reactive protein, soluble interleukin-2 receptor, consciousness disorder, dysphagia, and invasive procedure. The AUCs of the 5 models ranged from 0.78 to 0.87 in the training cohort. When the ML-based models were applied to the validation set, the results remained consistent, with AUCs between 0.804 and 0.891. In the decision curve analysis, the models also performed better than the positive and negative reference lines, indicating favorable predictive performance and clinical value.

By incorporating clinical characteristics and systemic inflammation markers, it is feasible to develop ML-based models for the presence and consequences of signs of pulmonary infection in stroke patients, and use of these models may be of considerable benefit to clinicians in risk stratification and management decisions.
Year: 2021 PMID: 34967381 PMCID: PMC8718201 DOI: 10.1097/MD.0000000000028439
Source DB: PubMed Journal: Medicine (Baltimore) ISSN: 0025-7974 Impact factor: 1.889
Figure 1. Flow chart of patient selection and data processing.
Clinical and serological characteristics of stroke patients with or without pulmonary infection.
| | Training set | | | Validation set | | | |
| Variables | Overall (N = 889) | Noninfection (N = 645) | Infection (N = 244) | Overall (N = 508) | Noninfection (N = 363) | Infection (N = 145) | P |
| Age (median [IQR]) | 58.00 [47.00, 68.00] | 58.00 [47.00, 69.00] | 57.00 [47.00, 65.25] | 58.00 [47.00, 67.25] | 58.00 [48.00, 68.00] | 57.00 [47.00, 65.00] | .21 |
| BMI (median [IQR]) | 23.00 [21.00, 25.00] | 24.00 [22.00, 26.00] | 22.00 [20.00, 24.00] | 23.00 [21.00, 25.00] | 24.00 [22.00, 26.00] | 22.00 [20.00, 24.00] | <.01 |
| Sex (%) | |||||||
| Female | 462 (52.0) | 348 (54.0) | 114 (46.7) | 269 (53.0) | 200 (55.1) | 69 (47.6) | .15 |
| Male | 427 (48.0) | 297 (46.0) | 130 (53.3) | 239 (47.0) | 163 (44.9) | 76 (52.4) | |
| Smoking (%) | |||||||
| Yes | 444 (49.9) | 330 (51.2) | 114 (46.7) | 272 (53.5) | 198 (54.5) | 74 (51.0) | |
| No | 445 (50.1) | 315 (48.8) | 130 (53.3) | 236 (46.5) | 165 (45.5) | 71 (49.0) | .53 |
| Diabetes (%) | |||||||
| Yes | 514 (57.8) | 436 (67.6) | 78 (32.0) | 291 (57.3) | 246 (67.8) | 45 (31.0) | |
| No | 375 (42.2) | 209 (32.4) | 166 (68.0) | 217 (42.7) | 117 (32.2) | 100 (69.0) | <.01 |
| Hypertension (%) | |||||||
| Yes | 421 (47.4) | 304 (47.1) | 117 (48.0) | 249 (49.0) | 174 (47.9) | 75 (51.7) | |
| No | 468 (52.6) | 341 (52.9) | 127 (52.0) | 259 (51.0) | 189 (52.1) | 70 (48.3) | .51 |
| Coronary heart disease (%) | |||||||
| Yes | 442 (49.7) | 316 (49.0) | 126 (51.6) | 256 (50.4) | 185 (51.0) | 71 (49.0) | |
| No | 447 (50.3) | 329 (51.0) | 118 (48.4) | 252 (49.6) | 178 (49.0) | 74 (51.0) | .75 |
| Hyperlipidemia (%) | |||||||
| Yes | 443 (49.8) | 318 (49.3) | 125 (51.2) | 255 (50.2) | 192 (52.9) | 63 (43.4) | |
| No | 446 (50.2) | 327 (50.7) | 119 (48.8) | 253 (49.8) | 171 (47.1) | 82 (56.6) | .06 |
| Consciousness disorder (%) | |||||||
| Yes | 304 (34.2) | 132 (20.5) | 172 (70.5) | 194 (38.2) | 93 (25.6) | 101 (69.7) | |
| No | 585 (65.8) | 513 (79.5) | 72 (29.5) | 314 (61.8) | 270 (74.4) | 44 (30.3) | <.01 |
| Dysphagia (%) | |||||||
| Yes | 286 (32.2) | 87 (13.5) | 199 (81.6) | 176 (34.6) | 60 (16.5) | 116 (80.0) | |
| No | 603 (67.8) | 558 (86.5) | 45 (18.4) | 332 (65.4) | 303 (83.5) | 29 (20.0) | <.01 |
| Invasive procedure (%) | |||||||
| Yes | 356 (40.0) | 157 (24.3) | 199 (81.6) | 214 (42.1) | 100 (27.5) | 114 (78.6) | |
| No | 533 (60.0) | 488 (75.7) | 45 (18.4) | 294 (57.9) | 263 (72.5) | 31 (21.4) | <.01 |
| Time to ambulation (%) | |||||||
| >7 d | 308 (34.6) | 120 (18.6) | 188 (77.0) | 168 (33.1) | 63 (17.4) | 105 (72.4) | |
| ≤7 d | 581 (65.4) | 525 (81.4) | 56 (23.0) | 340 (66.9) | 300 (82.6) | 40 (27.6) | <.01 |
| WBC (median [IQR]) | 9.89 [8.43, 11.32] | 9.12 [8.08, 10.50] | 12.23 [10.80, 13.91] | 9.95 [8.42, 11.35] | 9.19 [8.07, 10.37] | 12.39 [10.83, 14.14] | <.01 |
| CRP (median [IQR]) | 20.18 [16.89, 23.12] | 18.68 [16.02, 21.31] | 27.19 [22.95, 30.39] | 20.44 [17.11, 23.41] | 18.82 [16.07, 21.61] | 27.53 [23.50, 30.79] | <.01 |
| PCT (median [IQR]) | 2.12 [1.70, 2.48] | 1.89 [1.58, 2.19] | 3.44 [2.87, 3.99] | 2.13 [1.67, 2.60] | 1.88 [1.56, 2.20] | 3.49 [2.78, 3.99] | <.01 |
| SIL-2R (median [IQR]) | 367.00 [321.00, 400.00] | 342.00 [308.00, 377.00] | 438.50 [394.00, 482.00] | 368.00 [320.75, 403.00] | 338.00 [308.00, 381.00] | 440.00 [407.00, 488.00] | <.01 |
| NC (median [IQR]) | 3.21 [2.54, 3.87] | 2.88 [2.31, 3.47] | 4.66 [3.78, 5.27] | 3.25 [2.55, 3.89] | 2.83 [2.33, 3.48] | 4.63 [3.87, 5.27] | <.01 |
| LC (median [IQR]) | 1.81 [1.35, 2.20] | 1.62 [1.20, 2.03] | 2.21 [1.82, 2.50] | 1.78 [1.34, 2.22] | 1.59 [1.19, 2.06] | 2.22 [1.79, 2.52] | <.01 |
| PLT (median [IQR]) | 176.00 [116.00, 232.00] | 177.00 [119.00, 237.00] | 172.00 [111.50, 225.25] | 179.00 [116.75, 237.25] | 187.00 [122.50, 244.50] | 162.00 [108.00, 218.00] | <.01 |
| PLR (median [IQR]) | 97.00 [64.80, 138.33] | 109.48 [73.26, 152.73] | 76.71 [51.16, 102.09] | 99.60 [63.38, 142.08] | 114.85 [75.99, 163.97] | 72.43 [47.62, 101.36] | <.01 |
| NLR (median [IQR]) | 1.89 [1.45, 2.44] | 1.82 [1.35, 2.35] | 2.13 [1.70, 2.53] | 1.90 [1.44, 2.50] | 1.80 [1.36, 2.45] | 2.17 [1.67, 2.52] | <.01 |
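As a quick consistency check on the table, the following sketch recomputes the infection prevalence in each cohort and a crude (unadjusted) odds ratio for dysphagia from the training-set counts. The helper names are illustrative, and the crude odds ratio is not a statistic reported in the source; it is shown only to make the size of the univariate association concrete.

```python
# Counts are copied from the training-set and validation-set columns of the table.

def prevalence(cases: int, total: int) -> float:
    """Proportion of patients with pulmonary infection."""
    return cases / total


def crude_odds_ratio(exposed_cases: int, exposed_noncases: int,
                     unexposed_cases: int, unexposed_noncases: int) -> float:
    """Unadjusted odds ratio from a 2x2 table (illustrative helper, not from the paper)."""
    return (exposed_cases * unexposed_noncases) / (exposed_noncases * unexposed_cases)


train_prev = prevalence(244, 889)   # infection / overall, training set
valid_prev = prevalence(145, 508)   # infection / overall, validation set

# Dysphagia, training set:
#   infection:    199 with dysphagia, 45 without
#   noninfection:  87 with dysphagia, 558 without
dysphagia_or = crude_odds_ratio(199, 87, 45, 558)

print(f"training prevalence:   {train_prev:.3f}")
print(f"validation prevalence: {valid_prev:.3f}")
print(f"crude OR for dysphagia (training): {dysphagia_or:.1f}")
```

The two cohorts have a similar infection prevalence (roughly 27% and 29%), which is consistent with the table's Overall and Infection counts.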
Figure 2. Generalized linear model. A. Nomograms conveying the results of the candidate factors for predicting pulmonary infection. B. Calibration curves for internal validation of the nomogram. C. Predicted risk histogram comparing the predicted risk of the nomogram with the observed frequency.
Figure 3. Random forest classifier model. A. The candidate factors associated with pulmonary infection, ordered according to the mean decrease in Gini index. B. Relationship between the prediction error and the number of decision trees. C. Performance of the prediction model with increasing numbers of features in the principal component analysis.
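To illustrate what a decrease in Gini index measures, the sketch below computes the Gini impurity reduction for a single binary split, using the training-set counts from the table for dysphagia and hypertension. This is a simplification: a random forest averages such decreases over many trees and bootstrap samples, so the numbers here are not the importances plotted in the figure. The function names are hypothetical.

```python
def gini(noninfection: int, infection: int) -> float:
    """Gini impurity of a node with two classes."""
    p = infection / (noninfection + infection)
    return 2 * p * (1 - p)


def gini_decrease(yes: tuple, no: tuple) -> float:
    """Impurity reduction from splitting on a binary feature.

    `yes` and `no` are (noninfection, infection) counts in each branch.
    """
    n_yes, n_no = sum(yes), sum(no)
    n = n_yes + n_no
    parent = gini(yes[0] + no[0], yes[1] + no[1])
    children = (n_yes / n) * gini(*yes) + (n_no / n) * gini(*no)
    return parent - children


# Training-set counts from the table: (noninfection, infection) per branch.
dysphagia = gini_decrease(yes=(87, 199), no=(558, 45))
hypertension = gini_decrease(yes=(304, 117), no=(341, 127))

print(f"Gini decrease, dysphagia split:    {dysphagia:.4f}")
print(f"Gini decrease, hypertension split: {hypertension:.4f}")
```

A split on dysphagia separates the two outcome groups far better than a split on hypertension, which matches dysphagia appearing among the top predictors while hypertension shows no significant difference in the table.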
Figure 4. ROC curve analysis comparing the predictive performance of the machine learning algorithms for pulmonary infection. A. Internal training set. B. External validation set.
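The AUC summarizing each ROC curve can be read as the probability that a randomly chosen infected patient receives a higher predicted risk than a randomly chosen noninfected patient. The following minimal sketch computes it by direct pairwise comparison. The scores and labels are made-up toy values, not study data, and the source does not state how its AUCs were computed.

```python
def auc_pairwise(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Hypothetical predicted risks for 3 infected (1) and 3 noninfected (0) patients.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.90, 0.80, 0.35, 0.60, 0.20, 0.10]

auc = auc_pairwise(labels, scores)
print(f"AUC = {auc:.3f}")
```

In this toy example 8 of the 9 infected/noninfected pairs are ranked correctly. The pairwise form is quadratic in sample size, so rank-based implementations are preferable for cohorts of realistic size.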
Figure 5. Decision curve analysis comparing the net benefits of the RFC and GLR models for predicting pulmonary infection. A. Internal training set. B. External validation set.
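Decision curve analysis plots net benefit against a threshold probability and compares each model with two reference strategies, treating every patient and treating none; these are presumably the "positive" and "negative" lines mentioned in the abstract. The sketch below uses the standard net benefit definition, TP/N − (FP/N) · pt/(1 − pt), on made-up toy data. The function names and values are assumptions, not taken from the study.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted risk is >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    weight = threshold / (1 - threshold)  # harm of a false positive relative to a true positive
    return tp / n - (fp / n) * weight


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: treat everyone regardless of predicted risk."""
    prev = sum(y_true) / len(y_true)
    return prev - (1 - prev) * threshold / (1 - threshold)


# Hypothetical labels (1 = infection) and predicted risks; not study data.
labels = [1, 1, 1, 0, 0, 0]
risks = [0.90, 0.80, 0.35, 0.60, 0.20, 0.10]

for pt in (0.25, 0.50):
    model_nb = net_benefit(labels, risks, pt)
    all_nb = net_benefit_treat_all(labels, pt)
    # Treating no one always has a net benefit of 0.
    print(f"pt={pt:.2f}  model={model_nb:.3f}  treat-all={all_nb:.3f}  treat-none=0.000")
```

A model is useful at a given threshold when its net benefit exceeds both reference strategies, which is the comparison the decision curves make across the full range of thresholds.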