Fei Guo1, Xishun Zhu2, Zhiheng Wu3, Li Zhu3, Jianhua Wu4, Fan Zhang5.
Abstract
BACKGROUND: Sepsis is a life-threatening syndrome eliciting highly heterogeneous host responses. The prognostic evaluation methods currently used in clinical practice are inadequate at predicting sepsis mortality. Rapid identification of patients with high mortality risk is urgently needed. Phenotyping patients will be invaluable in tailoring treatments.
Keywords: Coagulation; Convolution neural network; Deep learning; Sepsis; Survival prediction
Year: 2022 PMID: 35690822 PMCID: PMC9187899 DOI: 10.1186/s12967-022-03469-6
Source DB: PubMed Journal: J Transl Med ISSN: 1479-5876 Impact factor: 8.440
Fig. 1 The flow chart of data processing (A) and the feature maps of the CNN-based survival rate (B)
Comparison of the performance of multiple prediction models
| Methods | Dataset | Accuracy | Precision | Recall | F1 score | AUC |
|---|---|---|---|---|---|---|
| Random Forest | Training | 0.851 | 1.000 | 0.238 | 0.384 | 0.619 |
| | Test | 0.808 | 0.909 | 0.068 | 0.127 | 0.533 |
| Logistic Regression | Training | 0.825 | 0.629 | 0.256 | 0.364 | 0.610 |
| | Test | 0.808 | 0.567 | 0.260 | 0.357 | 0.605 |
| Lasso Regression | Training | 0.825 | 0.762 | 0.148 | 0.248 | 0.568 |
| | Test | 0.813 | 0.710 | 0.151 | 0.249 | 0.567 |
| Radial SVM [ | Training | 0.515 | 0.970 | 0.491 | 0.652 | 0.701 |
| | Test | 0.337 | 0.896 | 0.204 | 0.333 | 0.586 |
| | Val | 0.806 | 0.849 | 0.920 | 0.883 | 0.642 |
| Gradient boosting [ | Training | 0.851 | 0.934 | 0.899 | 0.916 | 0.690 |
| | Test | 0.718 | 0.822 | 0.816 | 0.819 | 0.574 |
| | Val | 0.828 | 0.885 | 0.905 | 0.895 | 0.682 |
| Bayes [ | Training | 0.567 | 0.965 | 0.553 | 0.703 | 0.649 |
| | Test | 0.465 | 0.861 | 0.405 | 0.551 | 0.562 |
| | Val | 0.828 | 0.891 | 0.895 | 0.893 | 0.713 |
| Linear regression [ | Training | 0.801 | 0.943 | 0.835 | 0.886 | 0.599 |
| | Test | 0.679 | 0.828 | 0.763 | 0.794 | 0.541 |
| | Val | 0.788 | 0.885 | 0.842 | 0.863 | 0.689 |
| Linear SVM [ | Training | 0.337 | 0.896 | 0.205 | 0.333 | 0.586 |
| | Test | 0.467 | 0.861 | 0.407 | 0.553 | 0.586 |
| | Val | 0.818 | 0.873 | 0.906 | 0.889 | 0.676 |
| SOFA score [ | All data | 0.752 | 0.371 | 0.327 | 0.348 | 0.807 |
| | Training | | | | | |
| | Test | | | | | |
| | Val | | | | | |
| CNN (Proposed) | Training | | | | | |
| | Test | | | | | |
| | Val | | | | | |
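In each row of the comparison table, the value that follows recall is consistent with the F1 score, the harmonic mean of precision and recall. The sketch below shows how the threshold-based columns relate to confusion-matrix counts. The function names and the example counts are illustrative assumptions, not taken from the paper; AUC is omitted because it needs the ranked prediction scores rather than counts.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "f1": f1_score(precision, recall),
    }


# Precision/recall pairs copied from the training rows of the table.
table_rows = {
    "Random Forest": (1.000, 0.238),
    "Logistic Regression": (0.629, 0.256),
    "Lasso Regression": (0.762, 0.148),
}
recomputed_f1 = {name: round(f1_score(p, r), 3) for name, (p, r) in table_rows.items()}

# Hypothetical counts, only to exercise classification_metrics.
example = classification_metrics(tp=30, fp=10, tn=50, fn=10)
```

Recomputing F1 this way reproduces the tabulated values for these three rows, which is why a model with perfect precision but low recall (Random Forest, training set) still has a low F1.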
Fig. 2 Identification of the subgroup phenotype with k-means clustering. The K value was chosen as a compromise between the elbow method (A) and the silhouette coefficient method (B). C 3D PCA plot visualizing the 4 clusters; D Survival curves; E SIC score; F SOFA score. *P < 0.05, ****P < 0.0001, analyzed by the Mantel log-rank test or the Gehan-Breslow-Wilcoxon test, with the higher P value presented
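The two criteria named in the Fig. 2 caption can be sketched as follows: the elbow method tracks the within-cluster sum of squares as K grows, and the silhouette method tracks the mean silhouette coefficient. This is a minimal standard-library sketch on toy 2D points, not the authors' pipeline. The deterministic farthest-point seeding, the toy data, and the choice of the silhouette maximum as the final K are all assumptions made for illustration.

```python
import math


def farthest_point_init(points, k):
    """Deterministic seeding (an assumption): start at points[0], then add
    the point farthest from its nearest already-chosen center."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    return centers


def kmeans(points, k, max_iter=100):
    """Plain Lloyd iterations; returns (labels, centers)."""
    centers = farthest_point_init(points, k)
    labels = []
    for _ in range(max_iter):
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            # An empty cluster keeps its previous center.
            updated.append(tuple(sum(v) / len(members) for v in zip(*members))
                           if members else centers[c])
        if updated == centers:
            break
        centers = updated
    return labels, centers


def inertia(points, labels, centers):
    """Within-cluster sum of squared distances (the elbow-method curve)."""
    return sum(math.dist(p, centers[lab]) ** 2 for p, lab in zip(points, labels))


def mean_silhouette(points, labels):
    """Mean of (b - a) / max(a, b); singleton clusters score 0 by convention.
    Assumes the points are distinct, so max(a, b) is never zero."""
    scores = []
    for i, p in enumerate(points):
        by_cluster = {}
        for j, q in enumerate(points):
            if j != i:
                by_cluster.setdefault(labels[j], []).append(math.dist(p, q))
        own = by_cluster.pop(labels[i], None)
        if not own or not by_cluster:
            scores.append(0.0)
            continue
        a = sum(own) / len(own)                                  # mean intra-cluster distance
        b = min(sum(d) / len(d) for d in by_cluster.values())    # nearest other cluster
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Toy data: three well-separated blobs (hypothetical, not patient data).
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (0, 10), (0, 11), (1, 10), (1, 11)]

results = {}
for k in range(2, 5):
    labels, centers = kmeans(points, k)
    results[k] = (inertia(points, labels, centers), mean_silhouette(points, labels))

best_k = max(results, key=lambda k: results[k][1])
```

On this toy data the sum of squares keeps falling as K increases, while the silhouette coefficient peaks at the true number of blobs. That divergence is the reason the two curves are read together rather than minimizing one of them.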
Fig. 3 Major features of the four clusters. A White blood cell count (K/µL); B Neutrophil proportion (%); C Lymphocyte proportion (%); D PTT (sec); E Proportion of patients in each cluster receiving low-molecular-weight heparin (%)
Heterogeneous features of the training set (1661 cases) according to blood tests
| Features | Cluster one | Cluster two | Cluster three | Cluster four | P value |
|---|---|---|---|---|---|
| Number of cases | 211 | 1215 | 46 | 189 | |
| Survival, no. (%) | 165 (78.2%) | 1005 (82.7%) | 35 (76.1%) | 132 (69.8%) | 0.020 |
| Age, median (IQR), year | 66 (54–76) | 66 (54–76.5) | 67.5 (55–80) | 66 (57–77) | 0.933 |
| Male, no. (%) | 102 (48.3%) | 663 (54.6%) | 17 (37.0%) | 110 (58.2%) | 0.023 |
| Top ten blood test variables, median (IQR) | | | | | |
| PTT, sec | 30.6 (26.5–36.3) | 29.6 (26.2–34.3) | 150 (122.9–150) | 51.6 (43.5–60.6) | <0.001 |
| Neutrophils, % | 51 (33–61.7) | 84 (77–89.9) | 81.8 (77.2–86) | 83 (76.4–89) | <0.001 |
| PT, sec | 14.3 (13.1–16.4) | 14.2 (13.1–16.0) | 17.7 (15.3–23.8) | 30.2 (21.9–43.1) | <0.001 |
| INR (PT) | 1.3 (1.1–1.5) | 1.3 (1.1–1.5) | 1.9 (1.5–3.2) | 3.4 (2.2–5.2) | <0.001 |
| Lymphocytes, % | 26 (15.5–34.3) | 7 (4–11.3) | 7 (4.6–14.0) | 8 (4–13) | <0.001 |
| White blood cells, K/μL | 6.1 (3.4–10.8) | 13.2 (8.9–8.1) | 10.7 (7.0–8.4) | 12.1 (8.5–17.4) | <0.001 |
| Platelet count, K/μL | 2.3 (2.0–2.5) | 2.4 (2.2–2.5) | 2.3 (2.2–2.4) | 2.3 (2.1–2.5) | <0.001 |
| MCHC, % | 33.3 (32.3–34.3) | 33 (32–34.1) | 32.5 (32.2–34) | 32 (31–33.3) | <0.001 |
| Albumin, % | 2.9 (2.5–3.3) | 2.9 (2.5–3.4) | 2.9 (2.5–3.4) | 2.9 (2.4–3.2) | <0.001 |
| Red blood cells, K/μL | 3.5 (3.1–4.0) | 3.7 (3.3–4.2) | 3.7 (3.2–4.0) | 3.7 (3.1–4.0) | <0.001 |
Heterogeneous features of the test set (710 cases) according to blood tests
| Features | Cluster one | Cluster two | Cluster three | Cluster four | P value |
|---|---|---|---|---|---|
| Number of cases | 90 | 520 | 19 | 81 | |
| Survival, no. (%) | 68 (75.56%) | 416 (80%) | 13 (68.42%) | 57 (70.37%) | 0.152 |
| Age, median (IQR), year | 64 (51.5–77.5) | 66 (54–77) | 60 (51.3–69.3) | 65 (54–74) | 0.364 |
| Male, no. (%) | 55 (61.11%) | 296 (56.92%) | 12 (63.16%) | 49 (60.49%) | 0.797 |
| Top ten blood test variables, median (IQR) | | | | | |
| PTT, sec | 31.7 (27.7–36.5) | 30.1 (26.6–34.2) | 150 (119.0–150) | 50.5 (43.6–57.7) | <0.001 |
| Neutrophils, % | 52 (36.3–61.7) | 83 (77–88.7) | 81 (70.2–87.5) | 81 (76.9–88.6) | <0.001 |
| PT, sec | 14.4 (13.3–17.1) | 14.3 (13.1–16.3) | 19.5 (15.4–23.1) | 28 (19.6–39) | <0.001 |
| INR (PT) | 1.3 (1.2–1.7) | 1.3 (1.1–1.6) | 1.9 (1.6–2.7) | 3 (2–4.3) | <0.001 |
| Lymphocytes, % | 27.1 (18.5–35.3) | 7.1 (4–11.9) | 9.7 (3.7–11) | 7.2 (3–12) | <0.001 |
| White blood cells, K/μL | 4.7 (2.8–10.3) | 13.0 (8.4–18.4) | 9.6 (7.4–18.2) | 12.4 (8.2–19.2) | <0.001 |
| Platelet count, K/μL | 2.2 (1.9–2.5) | 2.4 (2.2–2.5) | 2.4 (2.1–2.6) | 2.3 (2.2–2.5) | <0.001 |
| MCHC, % | 33.4 (32.3–34.6) | 33 (32–34) | 32.7 (30.5–33.6) | 32.5 (31.2–33.6) | <0.001 |
| Albumin, % | 2.9 (2.6–3.4) | 2.9 (2.6–3.4) | 2.9 (2.8–2.9) | 2.6 (2.3–3) | <0.001 |
| Red blood cells, K/μL | 3.4 (3.0–4.0) | 3.7 (3.3–4.3) | 3.8 (3.4–4.0) | 3.5 (2.9–3.9) | <0.001 |
Fig. 4 Nomograms predicting 28-day survival from 35 clinical features. The nomograms were generated from all 2371 sepsis cases from MIMIC-III. To use them, locate the patient's value for each variable on the corresponding axis, draw a line to the points axis to read off that variable's points, sum the points across all variables, and draw a line from the total points axis to the 28-day survival probability
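The reading procedure in the Fig. 4 caption amounts to three steps: map each variable to points, sum the points, and map the total to a survival probability. The sketch below illustrates only those mechanics. Every axis, variable name, and number in it is hypothetical; the paper's nomograms use 35 features whose scales are not reproduced here, and the linear interpolation is an assumption.

```python
from bisect import bisect_right


def interpolate(x, xs, ys):
    """Piecewise-linear lookup along an axis, clamped at both ends."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, x)
    x0, x1 = xs[i - 1], xs[i]
    y0, y1 = ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# HYPOTHETICAL variable axes: (variable values, points awarded).
AXES = {
    "age_years": ([20, 100], [0, 40]),
    "pt_sec": ([10, 50], [0, 100]),
}

# HYPOTHETICAL total-points axis: (total points, 28-day survival probability).
TOTAL_TO_SURVIVAL = ([0, 70, 140], [0.95, 0.70, 0.20])


def nomogram_survival(patient):
    """Return (total points, survival probability) for one patient."""
    total = sum(interpolate(patient[name], *AXES[name]) for name in AXES)
    return total, interpolate(total, *TOTAL_TO_SURVIVAL)


total_points, survival = nomogram_survival({"age_years": 60, "pt_sec": 30})
```

With these made-up axes, the example patient scores 20 points for age and 50 for PT, and the total of 70 points maps to a survival probability of 0.70.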
Fig. 5 Survival prediction with the CNN-based and DCQMFF models. A ROC curve of the CNN-based model; B ROC curve of the DCQMFF model; C Demo analysis with the DCQMFF-based application platform; D ROC curve of the SOFA score