Menghan Ding, Yuan Luo.
Abstract
BACKGROUND: Sepsis is a highly lethal and heterogeneous disease. Utilization of an unsupervised method may identify novel clinical phenotypes that lead to targeted therapies and improved care.
Keywords: Clustering; Frequent subgraph mining; Gradient boosting machine; Intensive care unit; Nonnegative matrix factorization; Phenotyping; Physiological measurements; Sepsis; Unsupervised learning
Year: 2021 PMID: 33836745 PMCID: PMC8033653 DOI: 10.1186/s12911-021-01460-7
Source DB: PubMed Journal: BMC Med Inform Decis Mak ISSN: 1472-6947 Impact factor: 2.796
Candidate physiological variables with mean and standard deviation
| Physiological variable | Mean | SD | Physiological variable | Mean | SD |
|---|---|---|---|---|---|
| Heart rate (bpm) | 88.4 | 19.0 | Platelet count (K/uL) | 215.3 | 147.6 |
| Respiration rate (insp/min) | 20.4 | 6.2 | Partial prothrombin time (sec) | 40.4 | 17.1 |
| Glasgow coma scale motor | 5.1 | 1.5 | International normalized ratio | 1.5 | 0.6 |
| Mean arterial blood pressure (mmHg) | 79.2 | 18.2 | Blood urea nitrogen (mg/dL) | 32.8 | 25.8 |
| Diastolic blood pressure (mmHg) | 61.7 | 14.7 | Blood serum creatinine (mg/dL) | 1.5 | 1.3 |
| Systolic blood pressure (mmHg) | 121.6 | 23.7 | Blood total bilirubin (mg/dL) | 3.0 | 4.3 |
| Urine output (mL) | 120.7 | 114.6 | Blood direct bilirubin (mg/dL) | 4.4 | 5.4 |
| Temperature (Celsius) | 37.1 | 0.9 | Aspartate aminotransferase (IU/L) | 61.7 | 43.1 |
| Blood oxygen saturation (%) | 97.0 | 3.2 | Base excess (mEq/L) | −0.1 | 5.6 |
| Fraction of inspired O2 (%) | 47.5 | 14.8 | Glucose (mg/dL) | 136.7 | 51.2 |
| Partial pressure of oxygen (mmHg) | 99.4 | 34.0 | Chloride (mEq/L) | 105.2 | 6.9 |
| PaO2/FiO2 ratio | 208.2 | 110.5 | Bicarbonate (mEq/L) | 24.7 | 5.4 |
| White blood cell count (K/uL) | 12.1 | 6.9 | Lactate (mmol/L) | 2.3 | 1.6 |
| Hemoglobin (g/dL) | 10.0 | 1.8 | Blood albumin (g/dL) | 2.8 | 0.6 |
| Hematocrit (%) | 29.6 | 5.2 | Carbon dioxide (mEq/L) | 26.0 | 6.3 |
| pH (unit) | 7.4 | 0.1 | Blood serum potassium (mEq/L) | 4.1 | 0.7 |
| Magnesium (mg/dL) | 2.1 | 0.4 | Blood sodium (mEq/L) | 139.3 | 5.8 |
SD: standard deviation
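The figure captions further down refer to standardized physiological variable values. A minimal sketch of what such a standardization could look like with the means and SDs listed above, assuming a plain z-score (the exact procedure is not stated in this record); the function name `standardize`, the dictionary keys, and the sample reading are illustrative only:

```python
# Cohort-level (mean, SD) pairs copied from the table above (subset only).
REFERENCE = {
    "heart_rate_bpm": (88.4, 19.0),
    "lactate_mmol_l": (2.3, 1.6),
    "temperature_c": (37.1, 0.9),
}


def standardize(variable: str, value: float) -> float:
    """Return the z-score of `value` (z-score standardization is an assumption)."""
    mean, sd = REFERENCE[variable]
    return (value - mean) / sd


# Hypothetical reading: a heart rate one SD above the cohort mean.
z_hr = standardize("heart_rate_bpm", 107.4)
print(f"heart rate z-score: {z_hr:.2f}")
```

Under this assumption, a value equal to the cohort mean maps to 0 and each unit corresponds to one cohort SD, which puts variables with very different scales on a common axis.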
Demographics and outcome
| | Subgroup 1 | Subgroup 2 | Subgroup 3 | Sepsis-3 cohort |
|---|---|---|---|---|
| Group size | 1218 | 2036 | 2528 | 5782 |
| Age, mean (year) ± SD | 73.12 ± 14.56 | 67.91 ± 16.27 | 59.92 ± 18.24 | 65.52 ± 17.64 |
| Gender (%) | | | | |
| Male | 59.03% | 60.27% | 50.44% | 55.50% |
| Female | 40.97% | 39.73% | 49.56% | 44.30% |
| Weight, mean (kg) ± SD | 84.25 ± 35.08 | 83.01 ± 24.32 | 79.37 ± 25.54 | 81.68 ± 27.56 |
| BMI, mean (kg/m2) ± SD | 29.36 ± | 29.48 ± | 28.36 ± | 29.01 ± 8.73 |
| BMI, strata | | | | |
| Overweight | 32.50% | 32.00% | 33.14% | 32.56% |
| Underweight | 1.88% | 2.72% | 3.78% | 2.96% |
| Obese | 37.50% | 37.61% | 30.67% | 34.81% |
| Healthy weight | 28.12% | 27.67% | 32.40% | 29.66% |
| ICU LOS, mean (day) ± SD | 3.14 ± 4.67 | 6.23 ± 7.10 | 4.21 ± 5.52 | 4.69 ± 6.09 |
| Elixhauser index, Mean ± SD | 3.53 ± 6.90 | 5.65 ± 7.12 | 2.37 ± 6.59 | 3.77 ± 6.99 |
| Ethnicity (%) | | | | |
| Asian | 2.38% | 2.95% | 3.56% | 3.10% |
| Black | 9.61% | 8.64% | 8.23% | 8.66% |
| White | 76.68% | 72.15% | 71.08% | 72.64% |
| Hispanic | 1.40% | 3.09% | 4.27% | 3.25% |
| Other | 9.93% | 13.16% | 12.86% | 12.35% |
| Day-1 SOFA score, Mean ± SD | 6.51 ± 2.60 | 8.65 ± 3.38 | 6.50 ± 2.53 | 7.29 ± 3.07 |
| 30-Day mortality (%) | 17.00% | 28.39% | 10.13% | 18.00% |
| In-hospital mortality (%) | 11.99% | 24.80% | 7.32% | 14.46% |
SD: standard deviation
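The cohort column can be cross-checked against the subgroup columns, since a cohort-level rate is the size-weighted average of the subgroup rates. A short sketch of that arithmetic, using only figures from the table above; the helper name `pooled_rate` is illustrative:

```python
# Subgroup sizes and mortality rates (%) copied from the table above.
SIZES = [1218, 2036, 2528]
MORTALITY_30D = [17.00, 28.39, 10.13]
MORTALITY_HOSP = [11.99, 24.80, 7.32]


def pooled_rate(sizes, rates):
    """Size-weighted average of per-subgroup rates (in %)."""
    total = sum(sizes)
    return sum(n * r for n, r in zip(sizes, rates)) / total


print(f"cohort size: {sum(SIZES)}")
print(f"30-day mortality: {pooled_rate(SIZES, MORTALITY_30D):.2f}%")
print(f"in-hospital mortality: {pooled_rate(SIZES, MORTALITY_HOSP):.2f}%")
```

Both weighted averages land within a few hundredths of a percentage point of the reported cohort values (18.00% and 14.46%); the small residual is expected because the subgroup rates are themselves rounded.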
Gradient boosting machine error metrics for patient group membership classification on frequent subgraphs
| Measure | Split | Subgroup 1 | Subgroup 2 | Subgroup 3 |
|---|---|---|---|---|
| Precision | Train | 0.987 | 0.981 | 0.988 |
| Precision | Test | 0.917 | 0.889 | 0.939 |
| Recall | Train | 0.978 | 0.984 | 0.990 |
| Recall | Test | 0.835 | 0.921 | 0.951 |
| F-score | Train | 0.982 | 0.983 | 0.989 |
| F-score | Test | 0.874 | 0.905 | 0.945 |
| AUC | Train | 0.954 | 0.964 | 0.977 |
| AUC | Test | 0.891 | 0.914 | 0.944 |
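The per-subgroup F-scores in the table above are consistent with the usual F1 definition, the harmonic mean of precision and recall. A brief sketch that recomputes the test-split values from the reported precision and recall; that F1 (rather than another F-beta) was used is an inference from the numbers, not something stated in this record:

```python
def f_score(precision: float, recall: float) -> float:
    """F1 score: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F-score) on the test split, keyed by subgroup.
TEST_ROWS = {
    1: (0.917, 0.835, 0.874),
    2: (0.889, 0.921, 0.905),
    3: (0.939, 0.951, 0.945),
}

for subgroup, (p, r, reported) in TEST_ROWS.items():
    print(f"subgroup {subgroup}: computed {f_score(p, r):.3f}, reported {reported}")
```

The recomputed values agree with the reported ones to within rounding of the three-decimal inputs.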
Fig. 1 Subgroup 1 trend group selected from representative frequent subgraphs of standardized physiological variable values over measurement period of six time windows
Fig. 2 Subgroup 2 trend group selected from representative frequent subgraphs of standardized physiological variable values over measurement period of six time windows
Fig. 3 Subgroup 3 trend group selected from representative frequent subgraphs of standardized physiological variable values over measurement period of six time windows
Fig. 4 7-Day SOFA score charts
Within-subgroup incidences of the top 17 comorbidities with a minimal 10% incidence by subgroup
| Comorbidity category | Subgroup 1 incidence (%) | Subgroup 2 incidence (%) | Subgroup 3 incidence (%) |
|---|---|---|---|
| Hypertension | 69.87 | 65.18 | 48.58 |
| Fluid electrolyte imbalance | 42.12 | 61.94 | 35.36 |
| Cardiac arrhythmias | 38.59 | 37.13 | 22.47 |
| Congestive heart failure | 32.35 | 30.60 | 14.95 |
| Deficiency anemias | 22.91 | 28.98 | 21.36 |
| Diabetes uncomplicated | 28.08 | 25.74 | 16.57 |
| Chronic pulmonary | 23.40 | 21.02 | 20.49 |
| Renal failure | 28.08 | 28.88 | 6.80 |
| Coagulopathy | 9.93 | 29.57 | 11.95 |
| Neurologic disease | 15.11 | 13.21 | 21.76 |
| Hypothyroidism | 15.35 | 13.26 | 10.72 |
| Depression | 11.66 | 12.97 | 14.04 |
| Liver disease | 6.16 | 17.44 | 9.49 |
| Valvular disease | 11.82 | 13.02 | 7.24 |
| Alcohol abuse | 6.08 | 10.61 | 12.86 |
| Peripheral vascular | 11.33 | 12.03 | 5.42 |
| Pulmonary circulation | 9.85 | 11.10 | 6.29 |
Comorbidity categories are sorted in descending order of their combined incidence in the sepsis cohort
Gradient boosting machine error metrics for patient 30-day mortality model on frequent subgraphs representing patient subgroups
| Measure | Train | Test |
|---|---|---|
| Accuracy | 0.892 | 0.863 |
| Precision | 0.885 | 0.802 |
| Recall | 0.726 | 0.681 |
| F-score | 0.773 | 0.716 |
| AUC | 0.726 | 0.681 |
Gradient boosting machine error metrics for patient 30-day mortality model on mean 7-day SOFA scores
| Measure | Train | Test |
|---|---|---|
| Accuracy | 0.846 | 0.837 |
| Precision | 0.809 | 0.759 |
| Recall | 0.583 | 0.560 |
| F-score | 0.602 | 0.566 |
| AUC | 0.583 | 0.560 |
Gradient boosting machine error metrics for patient 30-day mortality model on Elixhauser comorbidity index
| Measure | Train | Test |
|---|---|---|
| Accuracy | 0.829 | 0.811 |
| Precision | 0.914 | 0.629 |
| Recall | 0.520 | 0.507 |
| F-score | 0.491 | 0.466 |
| AUC | 0.520 | 0.507 |
Two-sample t-test results for distinguishing clinical characteristics by subgroup
| Clinical characteristic | Mean (SD), subgroup 1 | Mean (SD), subgroup 2 | Mean (SD), subgroup 3 | t-statistic (significance level), 1 vs 2 | t-statistic (significance level), 1 vs 3 | t-statistic (significance level), 2 vs 3 |
|---|---|---|---|---|---|---|
| Gender (is male) | 0.59 (0.49) | 0.60 (0.49) | 0.50 (0.50) | − 0.695 ( | 4.954 ( | 6.664 ( |
| Age | 73.12 (14.56) | 67.91 (16.27) | 59.92 (18.24) | 9.194 ( | 22.094 ( | 15.429 ( |
| Weight (kg) | 84.25 (35.08) | 83.01 (24.32) | 79.37 (25.54) | 1.144 ( | 4.659 ( | 4.661 ( |
| Elixhauser index | 3.53 (6.90) | 5.65 (7.12) | 2.37 (6.59) | − 8.334 ( | 4.957 ( | 16.136 ( |
| Day-1 SOFA score | 6.51 (2.60) | 8.65 (3.38) | 6.50 (2.53) | − 17.575 ( | 0.104 ( | 23.652 ( |
| ICU LOS (day) | 3.14 (4.67) | 6.23 (7.10) | 4.21 (5.52) | − 13.561 ( | − 5.846 ( | 10.83 ( |
| 30-Day mortality | 0.17 (0.38) | 0.28 (0.45) | 0.10 (0.30) | − 7.411 ( | 6.01 ( | 16.323 ( |
| In-hospital mortality | 0.12 (0.32) | 0.26 (0.43) | 0.07 (0.26) | − 8.95 ( | 4.729 ( | 16.893 ( |
| Coagulopathy | 0.10 (0.30) | 0.30 (0.46) | 0.12 (0.32) | − 13.388 ( | − 1.823 ( | 15.217 ( |
| Liver disease | 0.06 (0.24) | 0.17 (0.38) | 0.09 (0.29) | − 9.313 ( | − 3.451 ( | 7.975 ( |
| Alcohol abuse | 0.06 (0.24) | 0.11 (0.31) | 0.13 (0.33) | − 4.404 ( | − 6.333 ( | − 2.335 ( |
| Pulmonary circulation | 0.10 (0.30) | 0.11 (0.31) | 0.06 (0.24) | − 1.117 ( | 3.897 ( | 5.833 ( |
| Neurological disease | 0.15 (0.36) | 0.13 (0.34) | 0.22 (0.41) | 1.511 ( | − 4.817 ( | − 7.522 ( |
| Chronic pulmonary | 0.06 (0.24) | 0.11 (0.31) | 0.13 (0.33) | 1.587 ( | 2.033 ( | 0.440 ( |
| Hypertension | 0.70 (0.46) | 0.65 (0.48) | 0.49 (0.50) | 2.756 ( | 12.536 ( | 11.386 ( |
| Fluid electrolyte imbalance | 0.42 (0.49) | 0.62 (0.49) | 0.35 (0.48) | − 11.192 ( | 4.006 ( | 18.530 ( |
| Cardiac arrhythmias | 0.39 (0.49) | 0.37 (0.48) | 0.22 (0.42) | 0.829 ( | 10.473 ( | 10.991 ( |
| Congestive heart failure | 0.32 (0.47) | 0.31 (0.46) | 0.15 (0.36) | 1.041 ( | 12.585 ( | 12.926 ( |
| Deficiency anemias | 0.23 (0.42) | 0.29 (0.45) | 0.21 (0.41) | − 3.796 ( | 1.072 ( | 5.949 ( |
| Diabetes uncomplicated | 0.28 (0.45) | 0.26 (0.44) | 0.17 (0.37) | 1.463 ( | 8.270 ( | 7.646 ( |
| Renal failure | 0.28 (0.45) | 0.29 (0.45) | 0.07 (0.25) | − 0.490 ( | 18.517 ( | 20.819 ( |
| Hypothyroidism | 0.15 (0.36) | 0.13 (0.34) | 0.11 (0.31) | 1.662 ( | 4.063 ( | 2.642 ( |
| Depression | 0.12 (0.32) | 0.13 (0.34) | 0.14 (0.35) | − 1.093 ( | − 2.016 ( | 1.055 ( |
| Valvular disease | 0.12 (0.32) | 0.13 (0.34) | 0.07 (0.26) | − 0.993 ( | 4.668 ( | 6.549 ( |
| Peripheral vascular | 0.11 (0.32) | 0.12 (0.33) | 0.05 (0.23) | − 0.602 ( | 6.533 ( | 8.076 ( |
P values smaller than 0.0001 are shown in scientific notation. Missing values in clinical characteristics were dropped in the t-test.
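The t-statistics in this table can be approximated from the summary statistics alone. The sketch below computes a two-sample t-statistic for age between subgroups 1 and 2, assuming the pooled-variance (Student's) form and the full subgroup sizes from the demographics table; this record does not say whether the pooled or Welch form was used, and the function name `pooled_t` is illustrative:

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student's two-sample t-statistic with pooled (equal) variance."""
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_err


# Age in subgroup 1 vs subgroup 2; means and SDs from the table above,
# group sizes from the demographics table.
# Assumption: no age values were dropped as missing.
t_age = pooled_t(73.12, 14.56, 1218, 67.91, 16.27, 2036)
print(f"t-statistic for age, subgroup 1 vs 2: {t_age:.3f}")
```

Under these assumptions the result comes out near the reported 9.194; the remaining gap is plausibly due to the rounded means and SDs used as inputs.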