Chen Cui, Fei Mu, Meng Tang, Rui Lin, Mingming Wang, Xian Zhao, Yue Guan, Jingwen Wang.
Abstract
Pseudomonas aeruginosa is a ubiquitous opportunistic bacterial pathogen and a leading cause of nosocomial pneumonia. Early identification of risk factors is urgently needed for patients with severe P. aeruginosa infection. However, no detailed investigation based on machine learning has been reported, and little research has explored the relationships between key clinical risk variables and patient outcomes. In this study, we collected data on 571 patients with severe P. aeruginosa infection admitted to the Xijing Hospital of the Fourth Military Medical University from January 2010 to July 2021. Basic clinical information, clinical signs and symptoms, laboratory indicators, bacterial culture results, and drug-related information were recorded. The XGBoost machine learning algorithm was applied to build a model for predicting mortality risk in patients with severe P. aeruginosa infection. The XGBoost model (AUROC = 0.94 ± 0.01, AUPRC = 0.94 ± 0.03) outperformed the support vector machine (AUROC = 0.90 ± 0.03, AUPRC = 0.91 ± 0.02) and random forest (AUROC = 0.93 ± 0.03, AUPRC = 0.89 ± 0.04) models. This study also aimed to interpret the model and to explore the impact of clinical variables. The interpretation analysis highlighted the effects of age, high-alert drugs, and the number of drug varieties. Further stratification clarified the need for different treatment of severe infection in different populations.
Keywords: Pseudomonas aeruginosa; interpretation; machine learning; risk factors; severe infection; stratification analysis
Year: 2022 PMID: 35957862 PMCID: PMC9358029 DOI: 10.3389/fmed.2022.942356
Source DB: PubMed Journal: Front Med (Lausanne) ISSN: 2296-858X
Figure 1. Flow chart of the study.
Characteristics of patients at baseline and clinical outcomes.
| Category | Variable | Value |
|---|---|---|
| Basic information | Age (years) [median (IQR)] | 64 (47–81) |
| | Male [No. (%)] | 428 (74.86%) |
| | Hosp (days) [median (IQR)] | 23 (13–40) |
| | Drug Allergy [No. (%)] | 69 (12%) |
| | Smoking [No. (%)] | 105 (18%) |
| | Alcohol User [No. (%)] | 55 (10%) |
| Drug related | Number of Drug Varieties [median (IQR)] | 52 (39–66) |
| | Number of Intravenous Drugs Varieties [median (IQR)] | 7 (4–10) |
| Clinical signs and symptoms | Headache [No. (%)] | 91 (16%) |
| | Cough [No. (%)] | 365 (64%) |
| | Expectoration [No. (%)] | 322 (56%) |
| | Sore Throat [No. (%)] | 15 (3%) |
| | Hemoptysis [No. (%)] | 7 (1%) |
| | Dyspnea [No. (%)] | 149 (26%) |
| | Vomiting [No. (%)] | 187 (33%) |
| | Diarrhea [No. (%)] | 76 (13%) |
| | Lymphadenopathy [No. (%)] | 14 (2%) |
| | Drainage [No. (%)] | 222 (39%) |
| | Tracheotomy [No. (%)] | 104 (18%) |
| | Endotracheal Intubation [No. (%)] | 150 (26%) |
| | Central Venous Catheter [No. (%)] | 43 (8%) |
| | Indwelling Catheter [No. (%)] | 302 (53%) |
| | PICC Catheter [No. (%)] | 141 (25%) |
| | Temperature (°C) [median (IQR)] | 36.9 (36.5–37.6) |
| | Respiratory Rate (min⁻¹) [median (IQR)] | 21.0 (19.0–25.0) |
| | Heart Rate (min⁻¹) [median (IQR)] | 89.0 (78.0–105.0) |
| | DBP (mmHg) [median (IQR)] | 68.0 (60.0–76.0) |
| | SBP (mmHg) [median (IQR)] | 116.0 (102.0–129.0) |
| Bacterial culture | Blood [No. (%)] | 57 (10%) |
| | Urine [No. (%)] | 16 (3%) |
| | Phlegm [No. (%)] | 455 (80%) |
| | Secretions [No. (%)] | 54 (9%) |
| | Cerebrospinal Fluid [No. (%)] | 7 (1%) |
| | Feces [No. (%)] | 0 (0%) |
| | Number of Concurrent Infection [No. (%)] | 399 (70%) |
| Laboratory indicators | WBC (×10⁹/L) [median (IQR)] | 10.08 (6.9–14.39) |
| | NEUT# (×10⁹/L) [median (IQR)] | 7.96 (5.12–11.82) |
| | NEUT% [median (IQR)] | 0.83 (0.74–0.89) |
| | RBC (×10¹²/L) [median (IQR)] | 3.19 (2.77–3.66) |
| | PLA (×10⁹/L) [median (IQR)] | 166.0 (91.0–258.0) |
| | HGB (g/L) [median (IQR)] | 95.0 (84.0–110.0) |
| | ALT (IU/L) [median (IQR)] | 29.0 (17.0–57.0) |
| | AST (IU/L) [median (IQR)] | 31.0 (20.0–55.0) |
| | DBIL (μmol/L) [median (IQR)] | 8.4 (4.6–16.0) |
| | CREA (μmol/L) [median (IQR)] | 78.0 (59.0–115.0) |
| | Urea (mmol/L) [median (IQR)] | 8.87 (5.7–15.0) |
| | ALB (g/L) [median (IQR)] | 31.6 (28.5–34.8) |
| | SAA (mg/L) [median (IQR)] | 202.0 (72.1–421.0) |
| | ESR (mm/h) [median (IQR)] | 56.0 (26.25–84.5) |
| | CRP (mg/L) [median (IQR)] | 60.5 (25.25–116.15) |
| | IL-6 (pg/mL) [median (IQR)] | 56.96 (25.03–139.2) |
| | PCT (ng/mL) [median (IQR)] | 0.95 (0.31–3.42) |
NEUT# represents the neutrophil count.
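The baseline table reports continuous variables as median (IQR) and binary variables as No. (%). As a minimal sketch of how such summaries can be computed, the snippet below uses only the Python standard library. The helper names and the toy values are hypothetical and are not data from the study; the quartile convention (`method="inclusive"`) is also an assumption, since the record does not state which one was used.

```python
import statistics

def median_iqr(values):
    """Return (median, Q1, Q3) using the inclusive quartile convention."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q2, q1, q3

def count_pct(flags):
    """Return (count, percentage) of truthy entries in a binary variable."""
    n = sum(1 for f in flags if f)
    return n, 100.0 * n / len(flags)

# Hypothetical toy values for illustration only (not patient data).
ages = [47, 52, 64, 70, 81]
male = [True, True, False, True, False]

med, q1, q3 = median_iqr(ages)
n_male, pct_male = count_pct(male)
print(f"Age (years) [median (IQR)]: {med:g} ({q1:g}-{q3:g})")
print(f"Male [No. (%)]: {n_male} ({pct_male:.0f}%)")
```

Different quartile conventions can shift the IQR bounds slightly on small samples, so the convention is worth stating whenever such a table is reproduced.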
Figure 2. Receiver operating characteristic (ROC) curves and precision-recall (PR) curves of five machine learning models.
Methods comparison based on AUROC and AUPRC.
| Method | | AUROC | AUPRC |
|---|---|---|---|
| XGBoost | 0.88 ± 0.02 | 0.94 ± 0.01 | 0.94 ± 0.03 |
| LightGBM | 0.86 ± 0.05 | 0.92 ± 0.02 | 0.93 ± 0.05 |
| CatBoost | 0.86 ± 0.02 | 0.93 ± 0.03 | 0.93 ± 0.03 |
| Random Forest | 0.86 ± 0.03 | 0.93 ± 0.03 | 0.89 ± 0.04 |
| Support Vector Machine | 0.84 ± 0.03 | 0.90 ± 0.03 | 0.91 ± 0.02 |
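The comparison above reports AUROC and AUPRC as mean ± spread. The sketch below shows one way these two metrics can be computed from predicted scores and then summarized across folds, using only the Python standard library. It is an illustration under stated assumptions, not the authors' pipeline: the function names and toy folds are hypothetical, AUPRC is approximated by the step-wise average precision rule, and the "±" is taken to be the sample standard deviation across folds, which the record does not confirm.

```python
import statistics

def auroc(y_true, scores):
    """AUROC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_precision(y_true, scores):
    """Step-wise area under the PR curve; assumes no tied scores."""
    ranked = sorted(zip(scores, y_true), reverse=True)
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at each positive
    return total / hits

def summarize(values):
    """Format per-fold metrics as 'mean ± SD' (sample SD, an assumption)."""
    return f"{statistics.mean(values):.2f} ± {statistics.stdev(values):.2f}"

# Hypothetical folds: (labels, predicted mortality scores). Not study data.
folds = [
    ([1, 1, 0, 1, 0, 0], [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]),
    ([1, 0, 1, 0, 1, 0], [0.95, 0.85, 0.75, 0.5, 0.3, 0.1]),
]
print("AUROC:", summarize([auroc(y, s) for y, s in folds]))
print("AUPRC:", summarize([average_precision(y, s) for y, s in folds]))
```

AUPRC is sensitive to the proportion of positive cases, so it is best compared between models evaluated on the same data splits, as in the table.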
Figure 3. Summary bee swarm plot of SHAP values for the top 15 clinical variables. Each point corresponds to one P. aeruginosa-infected patient in the data set. The position of a point on the horizontal axis indicates the effect of that feature on the model prediction, and its color reflects the feature value for that patient. For binary variables (such as drainage or not), red and blue dots correspond to 1 and 0, respectively. For numerical variables (such as age), red and blue dots represent high and low values, respectively. Points that overlap at the same horizontal position are scattered vertically to show density.
Figure 4. (A,B) SHAP dependence plots for interactions between crucial clinical variables. The x axis shows the value of the feature named in the axis title, and the y axis shows the corresponding SHAP value, that is, the contribution of this feature to the model's prediction. The color of each dot reflects the value of the feature named on the right axis. The further right a sample point lies, the larger the value of the x-axis variable; the higher it lies, the greater the risk of mortality for that sample; and the redder it is, the higher the value of the right-axis variable.
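The SHAP values plotted in Figures 3 and 4 are per-patient, per-feature contributions that sum to the difference between a patient's prediction and a reference prediction. The sketch below computes exact Shapley values by brute force for a tiny hypothetical risk function, to make that additive property concrete. It is not the study's model or the `shap` library's tree algorithm: `toy_risk`, its coefficients, the patient vector, and the baseline are all invented for illustration, and "absent" features are simply replaced by baseline values.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x; absent features take baseline values."""
    n = len(x)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):  # size of the coalition that feature i joins
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi

def toy_risk(z):
    """Hypothetical risk score with an age x drug-count interaction (not from the study)."""
    age, n_iv_drugs, drainage = z
    return 0.01 * age + 0.02 * n_iv_drugs + 0.1 * drainage + 0.001 * age * n_iv_drugs

x = [80, 10, 1]        # hypothetical patient: age, IV drug varieties, drainage
baseline = [64, 7, 0]  # hypothetical reference patient

phi = shapley_values(toy_risk, x, baseline)
for name, contribution in zip(["age", "n_iv_drugs", "drainage"], phi):
    print(f"{name}: {contribution:+.4f}")
print("sum of contributions:", round(sum(phi), 4))
print("prediction - baseline:", round(toy_risk(x) - toy_risk(baseline), 4))
```

Brute-force enumeration grows exponentially with the number of features, which is why practical tools rely on model-specific shortcuts for tree ensembles such as XGBoost.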
Performance of model on external validation sets.
| Dataset | | AUROC | AUPRC |
|---|---|---|---|
| P. aeruginosa | 0.88 ± 0.02 | 0.94 ± 0.01 | 0.94 ± 0.03 |
| K. pneumoniae | 0.85 ± 0.04 | 0.91 ± 0.03 | 0.92 ± 0.05 |
Figure 5. Stratified analysis of infection sites. (A) Histogram showing the number of P. aeruginosa isolates cultured from different infection sites. (B) Histogram showing the proportion of P. aeruginosa isolates cultured from different infection sites. (C) Violin plots showing the number of concurrent infections at different infection sites. (D) Violin plots showing the number of A-alert drugs at different infection sites.
Figure 6. Stratified analysis of age. (A) Histogram showing the number of patients in each age stratum. (B) Histogram showing the proportion of patients in each age stratum. (C) Violin plots showing the maximum decrease in respiratory rate across age strata. (D) Violin plots showing the maximum increase in platelets across age strata. (E) Violin plots showing the number of A-alert drugs across age strata. (F) Violin plots showing the number of B-alert drugs across age strata.
Figure 7. Stratified analysis of intravenous drug varieties. (A) Histogram showing the number of patients in each intravenous drug varieties stratum. (B) Histogram showing the proportion of patients in each intravenous drug varieties stratum. (C) Violin plots showing the maximum increase in creatinine across intravenous drug varieties strata. (D) Violin plots showing the maximum increase in urea across intravenous drug varieties strata. (E) Violin plots showing the maximum decrease in respiratory rate across medication varieties strata. (F) Violin plots showing the maximum decrease in diastolic blood pressure across medication varieties strata.